Multi Registration in a single SQL query

Below is the data in TestingTable1, sorted by date in descending order.

BUYER_ID | ITEM_ID | CREATED_TIME ----------+-----------------+---------------------- 1345653 110909316904 2012-07-09 21:29:06 1345653 151851771618 2012-07-09 19:57:33 1345653 221065796761 2012-07-09 19:31:48 1345653 400307563710 2012-07-09 18:57:33 

And if this is the data below in TestingTable2, sorted by date in descending order, always

 USER_ID | PRODUCT_ID | LAST_TIME ---------+----------------+----------------------- 1345653 110909316904 2012-07-09 22:29:06 1345653 150851771618 2012-07-09 19:57:33 

Each line in TestingTable1 must match TestingTable2 , if it does not match or there is no data in TestingTable2, then I need to show them in the output, as in TestingTable1 I have this data, but the corresponding TestingTable2 I have this data (which will be incorrect data) so that I can see what the discrepancy is and what data is missing.

I need to compare TestingTable2 with TestingTable1 on BUYER_ID and USER_ID . I need to see if the correspondence BUYER_ID and USER_ID comparable, then I need to compare ITEM_ID with PRODUCT_ID and CREATED_TIME with LAST_TIME and if there is a mismatch in TestingTable2 after comparing with TestingTable1 either in one of them or in both of them, then I need to show the result.

So, if you look at the example above - I have three scenarios basically

  • Firstly-In TestingTable1 , in the first row ITEM_ID maps to PRODUCT_ID in the first row of TestingTable2 , but CREATED_TIME does not match LAST_TIME for the first row in both tables
  • Secondly. In TestingTable1 , the second row of CREATED_TIME is mapped to LAST_TIME in the second row of TestingTable2 , but ITEM_ID does not match PRODUCT_ID for the second row in both tables
  • Thirdly. In TestingTable1 last two lines (lines) are generally absent in TestingTable2 . This script is not covered in my request that I wrote. I want this script also in my request .

So these are the three cases that I need to cover when comparing TestingTable2 with TestingTable1 always. And TestingTable1 is the MAIN table where comparisons should always be made, so this means that the data in TestingTable1 always accurate.

So, I need to show the result as shown below, if neither of them matches, or there is no data in TestingTable2 on all TestingTable1 data, and then next to it is the same TestingTable2 data, so that I see what value was in TestingTable1 compared with TestingTable2

 BUYER_ID | ITEM_ID | CREATED_TIME | USER_ID | PRODUCT_ID | LAST_TIME -----------+-----------------+---------------------------+----------------+--------------------+----------------------- 1345653 110909316904 2012-07-09 21:29:06 1345653 110909316904 2012-07-09 22:29:06 1345653 151851771618 2012-07-09 19:57:33 1345653 150851771618 2012-07-09 19:57:33 1345653 221065796761 2012-07-09 19:31:48 NULL NULL NULL 1345653 400307563710 2012-07-09 18:57:33 NULL NULL NULL 

The query below, I only cover my two scenarios , which I mentioned above, and it works fine, and I get the output as above, leaving the last two lines from Output. But I need to add a third scenario also in this (below) request, so that it produces output as described above.

 SELECT * FROM (SELECT T2.buyer_id, T2.item_id, T2.created_time AS created_time, subq.user_id, subq.product_id, subq.LAST_TIME FROM TestingTable2 subq JOIN TestingTable1 T2 ON T2.buyer_id = subq.user_id AND subq.LAST_TIME = ( T2.created_time ) WHERE ( subq.product_id <> T2.item_id ) UNION ALL SELECT T2.buyer_id, T2.item_id AS item_id, T2.created_time, subq.user_id, subq.product_id AS product_id, subq.LAST_TIME FROM TestingTable2 subq JOIN TestingTable1 T2 ON T2.buyer_id = subq.user_id AND subq.product_id = T2.item_id WHERE ( subq.LAST_TIME <> ( T2.created_time ) )) finalResult ORDER BY finalResult.BUYER_ID; 

Any suggestion would be appreciated.

PS I asked a few questions related to JOIN in the last few days, but this only applies to my two scenarios, and not the third scenario that I need in this request.

Update: - I cannot use NOT IN or NOT EXISTS syntax for SQL, since I work with Hive and Hive, it does NOT IN support NOT IN or NOT EXISTS , so I need another way to handle this.

I need to use my query to modify it to work in the third scenario, since Hive will support SQL syntax.

Below is my SQL script that runs my two scripts above, but not the third script. Can someone help me modify my request for work and for the third scenario?

http://sqlfiddle.com/#!3/102dd/1/0 .

The data in table 1 should be in table 2, if there is none, then I need to show the data mismatch after comparing with table 1, and it is also possible, the data from Table 1 will not be present in Table 2, and I also want to show it.

Updated output for script http://sqlfiddle.com/#!3/102dd/3/0

 BUYER_ID | ITEM_ID | CREATED_TIME | USER_ID | PRODUCT_ID | LAST_TIME -----------+-----------------+---------------------------+----------------+--------------------+----------------------- 1345653 151851771618 July, 09 2012 19:57:33 1345653 150851771618 July, 09 2012 19:57:33 1345653 221065796761 July, 09 2012 19:31:48 1345653 221165796761 July, 09 2012 19:31:48 1345653 110909316904 July, 09 2012 21:29:06 1345653 110909316904 July, 09 2012 22:29:06 1345653 400307563710 July, 09 2012 18:57:33 NULL NULL NULL 1345653 310411560125 July, 09 2012 16:09:49 NULL NULL NULL 

UPDATED SQL QUERY THAT GIVES ME AN ERROR

I replaced TestingTable1 with this request -

 (SELECT BUYER_ID, ITEM_ID, rank(BUYER_ID), CREATED_TIME FROM ( SELECT BUYER_ID, ITEM_ID, CREATED_TIME FROM testingtable1 where to_date(from_unixtime(cast(UNIX_TIMESTAMP(CREATED_TIME) as int))) = '2012-07-09' DISTRIBUTE BY BUYER_ID SORT BY BUYER_ID, CREATED_TIME desc ) T1 WHERE rank(BUYER_ID) < 5) 

And TestingTable2 with this request -

 (SELECT USER_ID, PROD_AND_TS.PRODUCT_ID as PRODUCT_ID, PROD_AND_TS.TIMESTAMPS as TIMESTAMPS FROM testingtable2 lateral view explode(PURCHASED_ITEM) exploded_table as PROD_AND_TS where to_date(from_unixtime(cast(PROD_AND_TS.TIMESTAMPS as BIGINT))) = '2012-07-09') 
+2
source share
4 answers

If you cannot use NOT IN or a clean solution posted by @eggyal due to DBMS limitations, another option would be to completely duplicate the original union and use these results in the LEFT JOIN .

For sqlFiddle, the following statement returns the desired results.

SQL statement

 SELECT * FROM( SELECT * FROM TestingTable1 A JOIN TestingTable2 B ON A.BUYER_ID = B.USER_ID AND B.LAST_TIME = A.Created_TIME WHERE B.PRODUCT_ID <> A.ITEM_ID UNION ALL SELECT * FROM TestingTable1 A INNER JOIN TestingTable2 B ON A.BUYER_ID = B.USER_ID AND B.PRODUCT_ID = A.ITEM_ID WHERE B.LAST_TIME <> A.Created_TIME ) X UNION ALL SELECT A.*, NULL, NULL, NULL FROM TestingTable1 A LEFT OUTER JOIN ( SELECT * FROM TestingTable1 A JOIN TestingTable2 B ON A.BUYER_ID = B.USER_ID AND B.LAST_TIME = A.Created_TIME WHERE B.PRODUCT_ID <> A.ITEM_ID UNION ALL SELECT * FROM TestingTable1 A INNER JOIN TestingTable2 B ON A.BUYER_ID = B.USER_ID AND B.PRODUCT_ID = A.ITEM_ID WHERE B.LAST_TIME <> A.Created_TIME ) X ON A.BUYER_ID = X.BUYER_ID AND A.ITEM_ID = X.ITEM_ID WHERE X.BUYER_ID IS NULL 
+2
source

Try this piece of code written in SQL. I already tested this on SQL Fiddle.

  SELECT tt1.buyer_id,tt1.item_id,tt1.created_time, tt2.user_id,tt2.product_id,tt2.last_time FROM testingtable1 tt1 LEFT OUTER JOIN testingtable2 tt2 ON tt1.buyer_id = tt2.user_id AND tt1.item_id = tt2.product_id AND tt1.created_time = tt2.last_time 
+1
source

This answer answers your comment request to do it here: https://stackoverflow.com/a/11440651/1166147 .
I really gave you the answer to this in two of your other duplicates that produce the exact output you are showing. This is the first request I wrote here: fooobar.com/questions/186817 / ... and is mentioned and explained here twice: fooobar.com/questions/186799 / ...

I changed your query back to mine and ran it in your SQLFiddle:

enter image description here

However, your requirements have evolved, and it looks like you can now just do LEFT JOIN if TestingTable1 is really always accurate data.

 SELECT * FROM TestingTable1 A LEFT JOIN TestingTable2 B ON A.BUYER_ID = B.USER_ID AND (B.LAST_TIME = A.Created_TIME OR B.PRODUCT_ID = A.ITEM_ID) 

EDIT COMMENT

MOCK SCENARIOS

 SCENARIO|| TABLE1 || TABLE2 ---------------------------------------------------------- SCENARIO|| BUYER ITEM TIME || USER PRODUCT TIME 1 || 1 A 09 || 1 Z 09 2 || 1 B 10 || NO RECORD IN TABLE 2 3 || 1 C 10 || 1 C 02 4 || 1 D 12 || NO RECORD IN TABLE 2 5 || 1 E 01 || 1 E 01 6 || NO RECORD IN TABLE 1 || 1 Y 05 

You are requesting a SQL solution, but it really is not a SQL issue. You exclude valid SQL answers provided to you because you cannot use them in your environment and then reopen the same question. This is a HQL / HIVE issue. Yes, there are some similarities, but there are some differences.

It seems that HQL will support something like this, although I have no way to test, and SQLFiddle is not suitable for testing this. You will need to decide how to make ISNULL or COALESCE in HQL, as shown in the comments next to select *. This will "merge" the results and return a value from what was not null if there was a non-zero value. I believe HQL supports ISNULL:

 select * --BUYER_ID, isNull(B.USER_ID,C.USER_ID) from (select BUYER_ID,ITEM_ID ,Created_TIME from TestingTable1) a left join (SELECT USER_ID,PRODUCT_ID, last_time FROM TestingTable2 ) b on(a.BUYER_ID = b.user_id and B.last_time = A.Created_TIME) left join (SELECT USER_ID,PRODUCT_ID, last_time FROM TestingTable2 ) c on(a.BUYER_ID = c.user_id and c.PRODUCT_ID = A.ITEM_ID) 

Here is another TSQL answer that may give you some ideas. I would never use this in TSQL, but it really worked.

 SELECT A.BUYER_ID,A.ITEM_ID,CREATED_TIME,COALESCE(B.USER_ID,X.USER_ID,Y.USER_ID), COALESCE(B.PRODUCT_ID,X.PRODUCT_ID,Y.PRODUCT_ID) ,COALESCE(B.last_time,X.last_time,Y.last_time) FROM TestingTable1 A LEFT JOIN TestingTable2 B ON A.BUYER_ID = B.USER_ID AND B.PRODUCT_ID = A.ITEM_ID AND B.last_time = A.Created_TIME LEFT JOIN( SELECT USER_ID,PRODUCT_ID, last_time FROM TestingTable1 A LEFT JOIN TestingTable2 B ON A.BUYER_ID = B.USER_ID AND B.last_time = A.Created_TIME WHERE ISNULL(B.PRODUCT_ID,0) <> A.ITEM_ID AND B.USER_ID IS NOT NULL) X ON X.USER_ID = A.BUYER_ID AND A.Created_TIME = X.last_time LEFT JOIN( SELECT USER_ID,PRODUCT_ID, last_time FROM TestingTable1 A LEFT JOIN TestingTable2 B ON A.BUYER_ID = B.USER_ID AND ISNULL( B.PRODUCT_ID,0) = A.ITEM_ID WHERE B.last_time <> A.Created_TIME AND B.USER_ID IS NOT NULL) Y ON A.BUYER_ID = Y.USER_ID AND A.ITEM_ID = Y.PRODUCT_ID 
+1
source

It looks like what you are looking for is a complete outer join. I did not see in your message which database you are using, so I cannot publish the exact syntax, but this link may point you in the right direction:

http://www.w3resource.com/sql/joins/perform-a-full-outer-join.php

In particular, look at the diagram below.

0
source

All Articles