Home » SQL & PL/SQL » SQL & PL/SQL » Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle (SQL - ORACLE)
Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687371] Fri, 03 March 2023 10:29 Go to next message
Nicha
Messages: 20
Registered: March 2020
Junior Member
I need to build an Oracle Select Query that identifies, within each Customer Group on Table 1, those who have bought the same Products.

As shown in the attached file and in the image below, the goal is to get the result shown in the "Result Query".
Mark with an 'X' all the records where customers, within the same Group, have bought the same products.

/forum/fa/14694/0/

We can observe that:

- In Group [G1], Customers C1 and C3 bought the same products (P1, P3) : Mark with 'X'.
- No Grupo [G2], os Clientes compraram produtos diferentes.
- No Grupo [G100], os Clientes C310 e C315 compraram o mesmo produto (P1) : Marcar com 'X'

Can anyone please help?
My best regards.
Re: Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687372 is a reply to message #687371] Fri, 03 March 2023 10:49 Go to previous messageGo to next message
Michel Cadot
Messages: 68624
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

With any SQL or PL/SQL question, please, post a working Test case: create table (including all constraints) and insert statements along with the result you want with these data then we will be able work with your table and data. Explain with words and sentences the rules that lead to this result.

Also always post your Oracle version, with 4 decimals (query v$version), as often solution depends on it.

If, in group G100, you have a client, say C300, which has bought P4, are C315 and C300 marked?

The test case must be complete with ALL possible cases (which is not the case of your image) as my previous question showed.

Re: Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687373 is a reply to message #687371] Fri, 03 March 2023 11:24 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9077
Registered: November 2002
Location: California, USA
Senior Member
statements for creating table and data that you should have provided; please do so yourself next time:
CREATE TABLE table1
  (group_id   VARCHAR2(8),
   client_id  VARCHAR2(9),
   product_id VARCHAR2(10),
   prod_qty   NUMBER)
/
INSERT ALL
INTO table1 VALUES ('G1',   'C1',   'P1', 1200)
INTO table1 VALUES ('G1',   'C1',   'P3',   32)
INTO table1 VALUES ('G1',   'C2',   NULL,  421)
INTO table1 VALUES ('G1',   'C3',   'P3',   12)
INTO table1 VALUES ('G1',   'C3',   'P1',  167)
INTO table1 VALUES ('G2',   'C4',   'P4',   39)
INTO table1 VALUES ('G2',   'C4',   'P3',  543)
INTO table1 VALUES ('G2',   'C5',   'P1',  234)
INTO table1 VALUES ('G100', 'C310', 'P1',   56)
INTO table1 VALUES ('G100', 'C315', 'P4',  733)
INTO table1 VALUES ('G100', 'C315', 'P1',  877)
SELECT * FROM DUAL
/
resulting test data after running the above:
SCOTT@orcl_12.1.0.2.0> SELECT * FROM table1
  2  ORDER BY SUBSTR(group_id,1,1), TO_NUMBER(SUBSTR(group_id,2)),
  3  	      SUBSTR(client_id,1,1), TO_NUMBER(SUBSTR(client_id,2)),
  4  	      SUBSTR(product_id,1,1), TO_NUMBER(SUBSTR(product_id,2))
  5  /

GROUP_ID CLIENT_ID PRODUCT_ID   PROD_QTY
-------- --------- ---------- ----------
G1       C1        P1               1200
G1       C1        P3                 32
G1       C2                          421
G1       C3        P1                167
G1       C3        P3                 12
G2       C4        P3                543
G2       C4        P4                 39
G2       C5        P1                234
G100     C310      P1                 56
G100     C315      P1                877
G100     C315      P4                733

11 rows selected.

one method that produces the desired results with the given data;
you may wish to order the results differently, especially if there may be more than 1 non-numeric starting character
and there may be more efficient methods:
SCOTT@orcl_12.1.0.2.0> COLUMN mark_selection FORMAT A14
SCOTT@orcl_12.1.0.2.0> SELECT *
  2  FROM   (SELECT t1.group_id, t1.client_id, t1.product_id, t1.prod_qty, 'X' AS mark_selection
  3  	     FROM   table1 t1
  4  	     WHERE  EXISTS
  5  		    (SELECT *
  6  		     FROM   table1 t2
  7  		     WHERE  t1.group_id = t2.group_id
  8  		     AND    t1.client_id != t2.client_id
  9  		     AND    t1.product_id = t2.product_id)
 10  	     UNION ALL
 11  	     SELECT t1.group_id, t1.client_id, t1.product_id, t1.prod_qty, NULL AS mark_selection
 12  	     FROM   table1 t1
 13  	     WHERE  NOT EXISTS
 14  		    (SELECT *
 15  		     FROM   table1 t2
 16  		     WHERE  t1.group_id = t2.group_id
 17  		     AND    t1.client_id != t2.client_id
 18  		     AND    t1.product_id = t2.product_id))
 19  ORDER BY SUBSTR(group_id,1,1), TO_NUMBER(SUBSTR(group_id,2)),
 20  	      SUBSTR(client_id,1,1), TO_NUMBER(SUBSTR(client_id,2)),
 21  	      SUBSTR(product_id,1,1), TO_NUMBER(SUBSTR(product_id,2))
 22  /

GROUP_ID CLIENT_ID PRODUCT_ID   PROD_QTY MARK_SELECTION
-------- --------- ---------- ---------- --------------
G1       C1        P1               1200 X
G1       C1        P3                 32 X
G1       C2                          421
G1       C3        P1                167 X
G1       C3        P3                 12 X
G2       C4        P3                543
G2       C4        P4                 39
G2       C5        P1                234
G100     C310      P1                 56 X
G100     C315      P1                877 X
G100     C315      P4                733

11 rows selected.
Re: Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687374 is a reply to message #687373] Fri, 03 March 2023 11:33 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9077
Registered: November 2002
Location: California, USA
Senior Member
Here is a litter better method
SCOTT@orcl_12.1.0.2.0> COLUMN mark_selection FORMAT A14
SCOTT@orcl_12.1.0.2.0> SELECT t1.group_id, t1.client_id, t1.product_id, t1.prod_qty,
  2  	    CASE WHEN EXISTS
  3  		    (SELECT *
  4  		     FROM   table1 t2
  5  		     WHERE  t1.group_id = t2.group_id
  6  		     AND    t1.client_id != t2.client_id
  7  		     AND    t1.product_id = t2.product_id)
  8  		 THEN 'X'
  9  		 ELSE NULL
 10  	    END AS mark_selection
 11  FROM   table1 t1
 12  ORDER  BY SUBSTR(group_id,1,1), TO_NUMBER(SUBSTR(group_id,2)),
 13  	      SUBSTR(client_id,1,1), TO_NUMBER(SUBSTR(client_id,2)),
 14  	      SUBSTR(product_id,1,1), TO_NUMBER(SUBSTR(product_id,2))
 15  /

GROUP_ID CLIENT_ID PRODUCT_ID   PROD_QTY MARK_SELECTION
-------- --------- ---------- ---------- --------------
G1       C1        P1               1200 X
G1       C1        P3                 32 X
G1       C2                          421
G1       C3        P1                167 X
G1       C3        P3                 12 X
G2       C4        P3                543
G2       C4        P4                 39
G2       C5        P1                234
G100     C310      P1                 56 X
G100     C315      P1                877 X
G100     C315      P4                733

11 rows selected.

Re: Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687375 is a reply to message #687374] Fri, 03 March 2023 11:41 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9077
Registered: November 2002
Location: California, USA
Senior Member
And another method for your consideration
SCOTT@orcl_12.1.0.2.0> COLUMN mark_selection FORMAT A14
SCOTT@orcl_12.1.0.2.0> SELECT t1.group_id, t1.client_id, t1.product_id, t1.prod_qty,
  2  	    MAX(DECODE(t2.product_id, t1.product_id, 'X', NULL)) AS mark_selection
  3  FROM   table1 t1, table1 t2
  4  WHERE  t1.group_id = t2.group_id
  5  AND    t1.client_id != t2.client_id
  6  GROUP  BY t1.group_id, t1.client_id, t1.product_id, t1.prod_qty
  7  ORDER  BY SUBSTR(group_id,1,1), TO_NUMBER(SUBSTR(group_id,2)),
  8  	       SUBSTR(client_id,1,1), TO_NUMBER(SUBSTR(client_id,2)),
  9  	       SUBSTR(product_id,1,1), TO_NUMBER(SUBSTR(product_id,2))
 10  /

GROUP_ID CLIENT_ID PRODUCT_ID   PROD_QTY MARK_SELECTION
-------- --------- ---------- ---------- --------------
G1       C1        P1               1200 X
G1       C1        P3                 32 X
G1       C2                          421
G1       C3        P1                167 X
G1       C3        P3                 12 X
G2       C4        P3                543
G2       C4        P4                 39
G2       C5        P1                234
G100     C310      P1                 56 X
G100     C315      P1                877 X
G100     C315      P4                733

11 rows selected.
Re: Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687376 is a reply to message #687375] Fri, 03 March 2023 12:02 Go to previous messageGo to next message
Michel Cadot
Messages: 68624
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

Adding this row:
insert into table1 VALUES ('G100', 'C300', 'P4', 0);
Is the following result correct or not?
SQL> SELECT t1.group_id, t1.client_id, t1.product_id, t1.prod_qty,
  2      CASE WHEN EXISTS
  3      (SELECT *
  4       FROM   table1 t2
  5       WHERE  t1.group_id = t2.group_id
  6       AND    t1.client_id != t2.client_id
  7       AND    t1.product_id = t2.product_id)
  8        THEN 'X'
  9      ELSE NULL
 10     END AS mark_selection
 11  FROM   table1 t1
 12  ORDER  BY SUBSTR(group_id,1,1), TO_NUMBER(SUBSTR(group_id,2)),
 13        SUBSTR(client_id,1,1), TO_NUMBER(SUBSTR(client_id,2)),
 14        SUBSTR(product_id,1,1), TO_NUMBER(SUBSTR(product_id,2))
 15  /

GROUP_ID CLIENT_ID PRODUCT_ID   PROD_QTY MARK_SELECTION
-------- --------- ---------- ---------- --------------
G1       C1        P1               1200 X
G1       C1        P3                 32 X
G1       C2                          421
G1       C3        P1                167 X
G1       C3        P3                 12 X
G2       C4        P3                543
G2       C4        P4                 39
G2       C5        P1                234
G100     C300      P4                  0 X
G100     C310      P1                 56 X
G100     C315      P1                877 X
G100     C315      P4                733 X

Re: Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687377 is a reply to message #687376] Fri, 03 March 2023 12:22 Go to previous messageGo to next message
Barbara Boehmer
Messages: 9077
Registered: November 2002
Location: California, USA
Senior Member
Michel,

It looks correct to me, but we will have to wait for the original poster. The criteria does not say anything about buying all of the same products and the posted desired results already show that C310 and C315 were both marked for P1 but not P4.

Barbara

P.S. I expect you or Solomon or somebody will probably post better methods as usual. It seems like I usually post something that works, then somebody else posts a more modern method. I am still trying to catch up on the latest after an approximate four-year absence. I wasn't sure if I was ever going to get back into this at all. I enjoy programming problems more than computerized card games like hearts or gin rummy or crazy eights or dice games like yahtzee or solving a rubik's cube, but with programming problems sometimes half the problem is understanding the definition of the problem. Sometimes it seems like one person can tell what someone means and sometimes it seems like we are all clueless. I have limited time, but I find a quick daily game is a good test of whether I am alert enough to do anything. If I fall asleep in my chair, that's my first clue that I am not.
Re: Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687378 is a reply to message #687377] Fri, 03 March 2023 14:23 Go to previous messageGo to next message
Michel Cadot
Messages: 68624
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

Another question is if C1 also bought P4, are C1 and C3 still be marked?
Does the problem "those who have bought the same Products" means exactly the same products and only them or something else?
In other words, is this
  • {products bought by C1} = {products bought by C3} (as the image seems to show)
  • or can it be {products bought by C1/C3} includes {products bought by C3/C1}
  • or {products bought by C1} intersect {products bought by C3} is not empty.
You queries give the later one (adding G2/C5/P3):
SQL> SELECT t1.group_id, t1.client_id, t1.product_id, t1.prod_qty,
  2      CASE WHEN EXISTS
  3      (SELECT *
  4       FROM   table1 t2
  5       WHERE  t1.group_id = t2.group_id
  6       AND    t1.client_id != t2.client_id
  7       AND    t1.product_id = t2.product_id)
  8        THEN 'X'
  9      ELSE NULL
 10     END AS mark_selection
 11  FROM   table1 t1
 12  ORDER  BY SUBSTR(group_id,1,1), TO_NUMBER(SUBSTR(group_id,2)),
 13        SUBSTR(client_id,1,1), TO_NUMBER(SUBSTR(client_id,2)),
 14        SUBSTR(product_id,1,1), TO_NUMBER(SUBSTR(product_id,2))
 15  /

GROUP_ID CLIENT_ID PRODUCT_ID   PROD_QTY MARK_SELECTION
-------- --------- ---------- ---------- --------------
G1       C1        P1               1200 X
         C1        P3                 32 X
         C1        P4                  0
         C2                          421
         C3        P1                167 X
         C3        P3                 12 X

G2       C4        P3                543 X
         C4        P4                 39
         C5        P1                234
         C5        P3                  0 X

G100     C300      P4                  0 X
         C310      P1                 56 X
         C315      P1                877 X
         C315      P4                733 X


14 rows selected.

[Updated on: Sat, 04 March 2023 09:44]

Report message to a moderator

Re: Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687379 is a reply to message #687377] Fri, 03 March 2023 14:49 Go to previous messageGo to next message
Michel Cadot
Messages: 68624
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

Quote:
but with programming problems sometimes half the problem is understanding the definition of the problem. Sometimes it seems like one person can tell what someone means and sometimes it seems like we are all clueless.
I agree this why must of the time I ask for a more detailed specification before tryong to post a solution to prevent a very disappointing tip in a topic spending time to find a solution and to receive at each answer a "does not work for this or that" like in avtaritet asking about dba_hist_snapshots when in the end it has nothing to do with it.
In addition, often when the OP succeeds to clearly specify s/he also find the solution.


Quote:
I find a quick daily game is a good test of whether I am alert enough to do anything.

This is one of the reasons I still try to answer in OraFAQ and also program some little tools.

[Updated on: Fri, 03 March 2023 14:51]

Report message to a moderator

Re: Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687380 is a reply to message #687379] Sun, 05 March 2023 19:03 Go to previous messageGo to next message
mathguy
Messages: 106
Registered: January 2023
Senior Member
If I understand correctly, the output is just the input, with an added column named MARK_SELECTION which may contain the string 'X' or be NULL. 'X' marks a row if there is another row for the same group_id and same product_id. It is not clear if in your data the same product may appear for the same client and in the same group more than once; in other words, if you assume (GROUP_ID, CLIENT_ID, PRODUCT_ID) to be unique. In the solution I proposed I assume that the three-column composite is unique (same product doesn't appear more than once for the same client in the same group), but I will indicate the change needed if you need to relax that assumption.

Your sample data doesn't show a product id for client C2 in group G1 (with a quantity of 421). Barbara assumed that should be NULL, but I think that's a bad assumption. Rather, I attribute that to sloppiness on your part (not an unreasonable assumption - see the incomplete translation of your question into English). It makes no sense to have a definite quantity of 421 associated with an "unknown" product. I will assume that GROUP_ID, CLIENT_ID and PRODUCT_ID are all non-NULL. In any case, if PRODUCT_ID can be NULL (or if any of the other two columns can be NULL) you would need to explain the requirement for those cases, since it's not obvious.

So - assuming (GROUP_ID, CLIENT_ID, PRODUCT_ID) is a candidate key (that is: it is primary key, or could be primary key; in other words: all three columns are non-NULL, and the combination is unique), and my understanding of the problem is correct, the query to get the desired result is quite simple:

select t.*,
       case when count(*) over (partition by group_id, product_id) > 1
            then 'X' end as mark_selection
from   table1 t
;
I didn't attempt to order the result - if you need an order, you should tell us what is required, since your sample image doesn't quite tell us what it is.

If the same product may appear more than once for the same client in the same group, you can modify the query slightly to account for that. Namely, where you see the analytic COUNT(*) function in the CASE expression in the SELECT list, change that to COUNT(DISTINCT CLIENT_ID).
Re: Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687386 is a reply to message #687380] Mon, 06 March 2023 00:38 Go to previous messageGo to next message
Michel Cadot
Messages: 68624
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

Hmm, how to qualify providing a "solution" when you don't know the problem and make so many assumptions (and miss many others)?

Re: Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687391 is a reply to message #687376] Mon, 06 March 2023 03:39 Go to previous messageGo to next message
Nicha
Messages: 20
Registered: March 2020
Junior Member
Hi, Michel Cadot and Barbara Boehmer. I'm trully grateful for all your help.
I didn't answer before because have left my computer at the office.

Michael, the answer to your question, when inserting that line in group G100 - C300 - P4, this Line should be marked as well, because C315 also bought the product P4.


Hi Michel : answering your questions:

Q - Another question is if C1 also bought P4, are C1 and C3 still be marked?
A - The answer is No.

Q - Does the problem "those who have bought the same Products" means exactly the same products and only them or something else?
In other words, is this
{products bought by C1} = {products bought by C3} (as the image seems to show)
or can it be {products bought by C1/C3} includes {products bought by C3/C1}
or {products bought by C1} intersect {products bought by C3} is not empty.

A - The answer is True for the first criteria - ({products bought by C1} = {products bought by C3} (as the image seems to show))


Hi MathGuy : answering your questions:

Q - "Your sample data doesn't show a product id for client C2 in group G1 (with a quantity of 421)"
A - There may exist a client that has no Product yet. But You're right regarding the quantity 421; if the Client has no Products surely there's no quantity.


[Updated on: Mon, 06 March 2023 04:11]

Report message to a moderator

Re: Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687392 is a reply to message #687391] Mon, 06 March 2023 03:56 Go to previous messageGo to next message
Michel Cadot
Messages: 68624
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

What about the group G2 in the previous output from Barbara's query? Is it correct?

The question is: 2 rows, in the same group, should be marked if:
- they have the same product, whatever are the other products brought by the clients (in the same group)
- or all products brought by the 2 clients are same
- or all products brought by one the client are same than the other client which may have bought some other products?

Re: Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687393 is a reply to message #687392] Mon, 06 March 2023 07:44 Go to previous messageGo to next message
Nicha
Messages: 20
Registered: March 2020
Junior Member
Hello Michel. I'll answer below:

The question is: 2 rows, in the same group, should be marked if:
- they have the same product, whatever are the other products brought by the clients (in the same group)
- or all products brought by the 2 clients are same
- or all products brought by one the client are same than the other client which may have bought some other products?

Answer : We must mark only when all products brought by the 2 clients are same. If client have other products it means that they should not have "exactly" the same products.

[Updated on: Mon, 06 March 2023 07:45]

Report message to a moderator

Re: Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687395 is a reply to message #687393] Mon, 06 March 2023 08:19 Go to previous messageGo to next message
Michel Cadot
Messages: 68624
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

In this case, your image is not correct as, in G100, you marked C310 and C315 with product P1 but C315 also bought P4 but C310 did not.

And your note: "when inserting that line in group G100 - C300 - P4, this Line should be marked as well, because C315 also bought the product P4" is also wrong as C315 also bought P1 but C300 did not.

Quote:
We must mark only when all products brought by the 2 clients are same. If client have other products it means that they should not have "exactly" the same products.

Then:
SQL> with
  2    products_bought as (
  3      select group_id, client_id,
  4             listagg(product_id,'/') within group (order by product_id) products
  5      from table1
  6      group by group_id, client_id
  7    ),
  8    equal_sets as (
  9      select group_id, client_id
 10      from products_bought p1
 11      where exists (select null from products_bought p2
 12                    where p2.group_id = p1.group_id
 13                      and p2.client_id != p1.client_id
 14                      and p2.products = p1.products)
 15    )
 16  select t.*, nvl2(e.client_id,'X',null) mark_selection
 17  from table1 t left outer join equal_sets e
 18         on e.group_id = t.group_id and e.client_id = t.client_id
 19  ORDER BY SUBSTR(t.group_id,1,1), TO_NUMBER(SUBSTR(t.group_id,2)),
 20           SUBSTR(t.client_id,1,1), TO_NUMBER(SUBSTR(t.client_id,2)),
 21           SUBSTR(t.product_id,1,1), TO_NUMBER(SUBSTR(t.product_id,2))
 22  /

GROUP_ID CLIENT_ID PRODUCT_ID   PROD_QTY MARK_SELECTION
-------- --------- ---------- ---------- --------------
G1       C1        P1               1200 X
G1       C1        P3                 32 X
G1       C2                          421
G1       C3        P1                167 X
G1       C3        P3                 12 X

G2       C4        P3                543
G2       C4        P4                 39
G2       C5        P1                234
G2       C5        P3                  0

G100     C300      P4                  0
G100     C310      P1                 56
G100     C315      P1                877
G100     C315      P4                733


13 rows selected.

[Updated on: Mon, 06 March 2023 08:22]

Report message to a moderator

Re: Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687398 is a reply to message #687386] Mon, 06 March 2023 08:38 Go to previous messageGo to next message
mathguy
Messages: 106
Registered: January 2023
Senior Member
Michel Cadot wrote on Mon, 06 March 2023 00:38

Hmm, how to qualify providing a "solution" when you don't know the problem and make so many assumptions (and miss many others)?

Interesting comment. Why does one need to "qualify" a solution?

I stated explicitly all the assumptions I made, which doesn't seem to be a common practice on any of the sites I volunteer on. If I missed something, it is because I didn't realize I made an assumption, not because I chose to be silent about it.

You say I missed some assumptions. Which ones? Such comments are more credible if you include examples.

I said "I assume the three relevant columns are non-NULL". I also said if NULL is in fact possible, the OP needs to clarify the question, since the task in that case isn't entirely clear. Do you disagree? (Don't just say "yes" - explain how.)

I assumed uniqueness of the column composite for the solution I proposed, but I also explained in detail how to modify the solution if this assumption is wrong.

In any case, all seems moot now, since the OP said he needs equality of collections (of product id's), not non-empty intersection. I make the non-empty intersection assumption (explicitly!) based on the example he gave us - you already pointed that out to him.
Re: Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687399 is a reply to message #687398] Mon, 06 March 2023 08:45 Go to previous messageGo to next message
Michel Cadot
Messages: 68624
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

Quote:
I stated explicitly all the assumptions I made, which doesn't seem to be a common practice on any of the sites I volunteer on.

I concur but too much specify assumptions is counter-productive and may obfuscate important ones.
In addition, many readers (including me) are not native English speakers and reading long post is exhausting. As one of my previous boss usually said, if a mail (post) has more than 5 lines I don't read it.

Quote:
You say I missed some assumptions. Which ones? Such comments are more credible if you include examples.

The points I asked in my last but one post before yours: there.
I'd say these points about what is the meaning of "Mark with an 'X' all the records where customers, within the same Group, have bought the same products" is the most important ones, before the fact some columns are null or not.
Posting a solution without knowing the answer to this question is just posting without knowing what is the issue and then posting a wrong "solution":
SQL> select t.*,
  2         case when count(*) over (partition by group_id, product_id) > 1
  3              then 'X' end as mark_selection
  4  from   table1 t
  5  ;

GROUP_ID CLIENT_ID PRODUCT_ID   PROD_QTY MARK_SELECTION
-------- --------- ---------- ---------- --------------
G1       C3        P1                167 X
G1       C1        P1               1200 X
G1       C3        P3                 12 X
G1       C1        P3                 32 X
G1       C2                          421

G100     C310      P1                 56 X
G100     C315      P1                877 X
G100     C315      P4                733 X
G100     C300      P4                  0 X

G2       C5        P1                234
G2       C5        P3                  0 X
G2       C4        P3                543 X
G2       C4        P4                 39
Clearly your solution provide wrong marks as it is now clarified by OP.

[Updated on: Mon, 06 March 2023 08:50]

Report message to a moderator

Re: Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687400 is a reply to message #687395] Mon, 06 March 2023 08:56 Go to previous messageGo to next message
Nicha
Messages: 20
Registered: March 2020
Junior Member
Yes Michel. You are 100% correct. I'm sorry. I apologize for misleading you.

I will test your SQL, and give you feed-back. I forgot to mention, but the real [Table 1] has millions of records, and I need to test if SQL is fast enough.

Going back to one of your questions, is it possible to know what the Query would look like if we wanted to mark all customers that match on 2 or more products?


Thanks for your precious help.

P.s. It's my first Post here in OraFaq. How can I mark The answer that solved my problem?

My Best Regards.
Re: Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687401 is a reply to message #687393] Mon, 06 March 2023 08:58 Go to previous messageGo to next message
mathguy
Messages: 106
Registered: January 2023
Senior Member
Nicha wrote on Mon, 06 March 2023 07:44

Hello Michel. I'll answer below:

The question is: 2 rows, in the same group, should be marked if:
- they have the same product, whatever are the other products brought by the clients (in the same group)
- or all products brought by the 2 clients are same
- or all products brought by one the client are same than the other client which may have bought some other products?

Answer : We must mark only when all products brought by the 2 clients are same. If client have other products it means that they should not have "exactly" the same products.
Please answer the additional questions I asked.

Can the group id ever be NULL? If so, how should that be handled? Same for client id. For example: in the same non-NULL group, one client is C1 and another is NULL. Both bought P1 and P2. Can you consider NULL as a single client, and say they bought exactly the same products? The theoretical answer is "no" - for all we know, NULL might be C1 himself, or NULL on the two rows - for products P1 and P2 - may be different clients; but it's your problem, you know the required handling - if this is even a possibility.

You already stated that the product id can be NULL. How should that be treated? Here are just a few examples to consider (always withing the same group):

1. C1 has product P1, C2 has product NULL. These probably don't have "exactly" the same products, but please confirm.
2. C3 has P1, C2 has both P1 and NULL (on different rows). Do they have "exactly" the same products?
3. C5 has P1 and NULL. C6 also has P1 and NULL. Do they have "exactly" the same products?

Note that NULL can be anything - so theoretically the answer should be "no" in all cases (even in the last one). But it's your use case, you know what your business user needs.

Whatever the required handling, in your test data include rows that will reflect each of the three cases I described above, and check that any solution you are considering returns the required output in each case.

Another question - in my answer (which now is clearly answering a different question, not the one you have) I assumed that (group, client, product) is unique. Is that so, or can the same product appear on more than one row, for the same client in the same group? And if it can, how should that be treated? Perhaps as if the product only appeared once, but that is not the only reasonable answer. For example, different rows - even with the same group, client and product - may indicate different orders (on different dates); and if one client bought P1 twice, another client bought "exactly" the same products only if he also bought P1 twice. Again, this is your use case, only you know what the business user means by "exactly" the same in such cases.

And one more question I hadn't thought about before... Can the quantity be 0, or NULL? (It was always non-NULL and strictly positive in your sample data.) If so, how should that be treated? If the quantity is 0, it makes sense to me that the client doesn't actually "have" that product. If the quantity can be NULL, it's even less clear, as that can be either 0 or strictly positive. Please clarify (and make sure you include test cases to make sure the solutions handle those possibilities correctly). Of course, if there is a non-NULL constraint and a check constraint to make sure all quantities are strictly positive, the question is moot - but if that's the case, please say so explicitly.
Re: Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687402 is a reply to message #687399] Mon, 06 March 2023 08:59 Go to previous messageGo to next message
Michel Cadot
Messages: 68624
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

In addition, it seems you love SQL questions and provide quite complex solutions with advanced features like MATCH_RECOGNIZE.
Maybe you can explain the solutions (the MATCH_RECOGNIZE expression), the syntax and how it works, otherwise I think many (including me) will just see it and jump on other things.
Explaining will allow everyone to learn this feature.

We have some SQL puzzles, some of them unanswered, you could have a look at them and provide new solutions: http://www.orafaq.com/forum/t/174976.
Re: Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687403 is a reply to message #687400] Mon, 06 March 2023 09:08 Go to previous messageGo to next message
Michel Cadot
Messages: 68624
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

Quote:
Going back to one of your questions, is it possible to know what the Query would look like if we wanted to mark all customers that match on 2 or more products?

Do you mean:
  • they match on 2 or more of their products but they may have bought other ones.
  • or they match on all their products which should be at least 2.
An interesting question from mathguy is the one about the unique key of your table.
Is this (group_id, client_id, product_id)? Or can there be several times the same product_id for the (group_id, client_id)?

(I assume here that group_id and client_id are not null, otherwise I fail to see the meaning of such table.)


Quote:
P.s. It's my first Post here in OraFaq. How can I mark The answer that solved my problem?

There no way to close a topic in OraFAQ; a topic is always opened to new solution or extension.

Re: Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687405 is a reply to message #687403] Mon, 06 March 2023 09:33 Go to previous messageGo to next message
Michel Cadot
Messages: 68624
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
The second option is easy from my previous query (I added new rows to show):
SQL> -- First case: product sets must be equal:
SQL> with
  2    products_bought as (
  3      select group_id, client_id,
  4             listagg(product_id,'/') within group (order by product_id) products
  5      from table1
  6      group by group_id, client_id
  7    ),
  8    equal_sets as (
  9      select group_id, client_id
 10      from products_bought p1
 11      where exists (select null from products_bought p2
 12                    where p2.group_id = p1.group_id
 13                      and p2.client_id != p1.client_id
 14                      and p2.products = p1.products)
 15    )
 16  select t.*, nvl2(e.client_id,'X',null) mark_selection
 17  from table1 t left outer join equal_sets e
 18         on e.group_id = t.group_id and e.client_id = t.client_id
 19  order by substr(t.group_id,1,1), to_number(substr(t.group_id,2)),
 20           substr(t.client_id,1,1), to_number(substr(t.client_id,2)),
 21           substr(t.product_id,1,1), to_number(substr(t.product_id,2))
 22  /

GROUP_ID CLIENT_ID PRODUCT_ID   PROD_QTY MARK_SELECTION
-------- --------- ---------- ---------- --------------
G1       C1        P1               1200 X
G1       C1        P3                 32 X
G1       C2                          421
G1       C3        P1                167 X
G1       C3        P3                 12 X

G2       C4        P3                543
G2       C4        P4                 39
G2       C5        P1                234
G2       C5        P3                  0
G2       C6        P6                  0 X
G2       C7        P6                  0 X

G100     C300      P4                  0
G100     C310      P1                 56
G100     C315      P1                877
G100     C315      P4                733

21 rows selected.

SQL> -- Second case: product sets must be equal but containing at least 2 product
SQL> with
  2    products_bought as (
  3      select group_id, client_id,
  4             listagg(product_id,'/') within group (order by product_id) products
  5      from table1
  6      group by group_id, client_id
  7      having count(*) >= 2
  8    ),
  9    equal_sets as (
 10      select group_id, client_id
 11      from products_bought p1
 12      where exists (select null from products_bought p2
 13                    where p2.group_id = p1.group_id
 14                      and p2.client_id != p1.client_id
 15                      and p2.products = p1.products)
 16    )
 17  select t.*, nvl2(e.client_id,'X',null) mark_selection
 18  from table1 t left outer join equal_sets e
 19         on e.group_id = t.group_id and e.client_id = t.client_id
 20  order by substr(t.group_id,1,1), to_number(substr(t.group_id,2)),
 21           substr(t.client_id,1,1), to_number(substr(t.client_id,2)),
 22           substr(t.product_id,1,1), to_number(substr(t.product_id,2))
 23  /

GROUP_ID CLIENT_ID PRODUCT_ID   PROD_QTY MARK_SELECTION
-------- --------- ---------- ---------- --------------
G1       C1        P1               1200 X
G1       C1        P3                 32 X
G1       C2                          421
G1       C3        P1                167 X
G1       C3        P3                 12 X

G2       C4        P3                543
G2       C4        P4                 39
G2       C5        P1                234
G2       C5        P3                  0
G2       C6        P6                  0
G2       C7        P6                  0

G100     C300      P4                  0
G100     C310      P1                 56
G100     C315      P1                877
G100     C315      P4                733

21 rows selected.

The first option of my previous post is a new query (not done).

[Updated on: Mon, 06 March 2023 09:34]

Report message to a moderator

Re: Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687407 is a reply to message #687402] Mon, 06 March 2023 10:30 Go to previous messageGo to next message
mathguy
Messages: 106
Registered: January 2023
Senior Member
Michel Cadot wrote on Mon, 06 March 2023 08:59

In addition, it seems you love SQL questions and provide quite complex solutions with advanced features like MATCH_RECOGNIZE.
Maybe you can explain the solutions (the MATCH_RECOGNIZE expression), the syntax and how it works, otherwise I think many (including me) will just see it and jump on other things.
Explaining will allow everyone to learn this feature.

We have some SQL puzzles, some of them unanswered, you could have a look at them and provide new solutions: http://www.orafaq.com/forum/t/174976.
The MATCH_RECOGNIZE clause appeared in Oracle 12.1, almost ten years ago. It is a shame that many developers, including many experienced ones, did not spend time to study it. Many others did study it and use it splendidly; there are some amazing things you can do with it, which are either exceptionally hard or impossible to do without it. Alas, the learning curve is steep - you don't get a lot of power with very little effort.

The main prerequisites are a good understanding of analytic functions and of regular expressions. This latter requirement is surprising, but it is what gives the clause its power.

It makes no sense to give a tutorial on MATCH_RECOGNIZE on a site like this (any more than it makes sense to explain PIVOT, or recursive WITH clause, etc.). In addition to the Oracle documentation, which in many cases is mediocre at best but it is actually quite good for MATCH_RECOGNIZE:

SQL For Pattern Matching

there are at least two other "canonical" tutorials for it. One is on Tim Hall's excellent site, oracle-base.com:

Pattern Matching in Oracle Database

The other - and more advanced, not an intro tutorial - is a series of articles by Keith Laker, one of the developers of the MATCH_RECOGNIZE machinery:

SQP Pattern Matching Deep Dive
Re: Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687408 is a reply to message #687405] Mon, 06 March 2023 10:34 Go to previous messageGo to next message
Nicha
Messages: 20
Registered: March 2020
Junior Member
Remarkable Michel.

That's very well explained.

My best regards.
Re: Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687409 is a reply to message #687407] Mon, 06 March 2023 11:32 Go to previous messageGo to next message
Nicha
Messages: 20
Registered: March 2020
Junior Member
Mathguy I wanted to thank you for the reference to MATCH_RECOGNIZE. This will certainly help many of us to evolve in this area.

thank you very much.
Re: Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687410 is a reply to message #687407] Mon, 06 March 2023 12:03 Go to previous messageGo to next message
Michel Cadot
Messages: 68624
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

Quote:
The MATCH_RECOGNIZE clause appeared in Oracle 12.1, almost ten years ago. It is a shame that many developers, including many experienced ones, did not spend time to study it.

You can say the same thing for MODEL clause (did you study it Wink ) but I never saw it used in real world. Sad

Re: Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687412 is a reply to message #687410] Mon, 06 March 2023 14:16 Go to previous messageGo to next message
mathguy
Messages: 106
Registered: January 2023
Senior Member
Not that it matters, but yes, I did study MODEL, and used it a few times. I would not say the same thing for MODEL, though, for the simple reason that I have not seen the same kind of wonderful applications I saw for MATCH_RECOGNIZE. I can remember exactly one excellent application of MODEL, which I don't know how else one would attack, on the "Oracle Forum" or whatever they call it now (used to be known as OTN). Not enough to justify becoming an expert.
Re: Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687413 is a reply to message #687412] Mon, 06 March 2023 14:54 Go to previous messageGo to next message
Michel Cadot
Messages: 68624
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

I once wrote a query using MODEL clause to solve sudoku. Smile
Is it possible with MATCH_RECOGNIZE?

Solomon and I gave some solutions in this forum with MODEL... about 10-12 years ago.

Re: Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687418 is a reply to message #687405] Tue, 07 March 2023 11:03 Go to previous messageGo to next message
Nicha
Messages: 20
Registered: March 2020
Junior Member
Hi Michel Cadot

I was presented to a new scenario that I need to present to you, to see if it is possible to include in your Query.
I'm refering to the "First case: product sets must be equal:", you presented in your final answer.

In addition to marking the customers who have exactly the same products, it is necessary, in Groups of only 2 customers, in which only one has a product, to mark the latter.

This is the Case of the Client C401 in Group G200.

So, in Groups with only two Clients:
- If Only one has Product or Products, mark the one that has it.
- If both Clients have Product or Products, doesn't mark.
- If both doesn't have, or have different Product or Products, doesn't mark

See the example below, please.

/forum/fa/14696/0/


The idea is to add these scenarios of groups that only have 2 clients to what Michel had already done.

Many thanks once more for your time.



Re: Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687419 is a reply to message #687418] Tue, 07 March 2023 11:20 Go to previous messageGo to next message
Michel Cadot
Messages: 68624
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator
If I understand, you want :

1/ The clients where one set is included (strictly or equal) into the other.
In this case, in G100, C300 and C315 should then be marked: {P1} included in {P1,P4}, as well as, C310 and C315: {P4} included in {P1,P4}.

2/ The client of a group which is alone in its group to have bought some products.
What if it is alone in the group? Is it marked or not?

Re: Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687420 is a reply to message #687419] Tue, 07 March 2023 12:19 Go to previous messageGo to next message
Nicha
Messages: 20
Registered: March 2020
Junior Member
I'll answer your questions below.

1/ The clients where one set is included (strictly or equal) into the other.
In this case, in G100, C300 and C315 should then be marked: {P1} included in {P1,P4}, as well as, C310 and C315: {P4} included in {P1,P4}.

A - No. G100 has 3 clients, and they have different sets of products. In Groups with more than two clients, product sets must be equal. This ones you have already treated very well. C300 and C315 has not equal sets, and C310 - C315 too.
What changes, are the Groups with only two clients, where Only one has Product or Products and the other not - G200 C401 and G300 C501.

2/ The client of a group which is alone in its group to have bought some products.
What if it is alone in the group? Is it marked or not?

A - No. If it is alone never be marked, having or not bought products.

My best regards
Re: Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687422 is a reply to message #687420] Tue, 07 March 2023 14:20 Go to previous messageGo to next message
Michel Cadot
Messages: 68624
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

OK, I inserted a pre-process step ("data") to add the relevant counts and use them in the final query to add the conditions when there are 2 clients in a group (line 26):
SQL> with
  2    data as (
  3      select t.*,
  4             count(product_id) over (partition by group_id, client_id) nb_products,
  5             count(product_id) over (partition by group_id) nb_group_products,
  6             count(distinct client_id) over (partition by group_id) client_cnt
  7      from table1 t
  8    ),
  9    products_bought as (
 10      select group_id, client_id,
 11             listagg(product_id,'/') within group (order by product_id) products
 12      from table1
 13      group by group_id, client_id
 14    ),
 15    equal_sets as (
 16      select group_id, client_id
 17      from products_bought p1
 18      where exists (select null from products_bought p2
 19                    where p2.group_id = p1.group_id
 20                      and p2.client_id != p1.client_id
 21                      and p2.products = p1.products)
 22    )
 23  select t.group_id, t.client_id, t.product_id, t.prod_qty,
 24         case
 25           when    e.client_id is not null
 26                or ( client_cnt = 2 and nb_products > 0 and nb_products = nb_group_products )
 27             then 'X'
 28           end mark_selection
 29  from data t left outer join equal_sets e
 30         on e.group_id = t.group_id and e.client_id = t.client_id
 31  order by substr(t.group_id,1,1), to_number(substr(t.group_id,2)),
 32           substr(t.client_id,1,1), to_number(substr(t.client_id,2)),
 33           substr(t.product_id,1,1), to_number(substr(t.product_id,2))
 34  /
GROUP_ID CLIENT_ID PRODUCT_ID   PROD_QTY MARK_SELECTION
-------- --------- ---------- ---------- --------------
G1       C1        P1               1200 X
G1       C1        P3                 32 X
G1       C2                          421
G1       C3        P1                167 X
G1       C3        P3                 12 X

G2       C4        P3                543
G2       C4        P4                 39
G2       C5        P1                234
G2       C5        P3                  0
G2       C6        P6                  0 X
G2       C7        P6                  0 X

G100     C300      P4                  0
G100     C310      P1                 56
G100     C315      P1                877
G100     C315      P4                733

G200     C400
G200     C401      P1                 15 X
G200     C401      P3                 23 X

G300     C500
G300     C601      P3                 23 X

G400     C600
G400     C601

G500     C700      P4                390
G500     C701      P6                134


24 rows selected.

[Updated on: Tue, 07 March 2023 14:21]

Report message to a moderator

Re: Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687423 is a reply to message #687422] Tue, 07 March 2023 15:07 Go to previous messageGo to next message
Nicha
Messages: 20
Registered: March 2020
Junior Member
Perfect. It will be fast with 1 million records?
Re: Identify Groups where 2 or more Clients have bought the same Products - SQL Oracle [message #687424 is a reply to message #687423] Tue, 07 March 2023 15:17 Go to previous message
Michel Cadot
Messages: 68624
Registered: March 2007
Location: Nanterre, France, http://...
Senior Member
Account Moderator

A priori, the slowest part will be "equal_sets".
You can post the execution plan as explained at http://www.orafaq.com/forum/index.php/mv/msg/206002/433888/#msg_433888.

Previous Topic: Grouping with the previous one if identical
Next Topic: Different UTC time from SYSDATE on different databases
Goto Forum:
  


Current Time: Thu Mar 28 03:08:07 CDT 2024