Exclude NULL fields from the list of concatenated joins

I created the following view "user_details_merged":

SELECT DISTINCT coalesce(own.user_name, join_user_name.user_name) AS user_name, coalesce(own.email, join_mail.email) AS email, coalesce(own.first_name, join_name.first_name) AS first_name, coalesce(own.last_name, join_name.last_name) AS last_name FROM user_details AS own LEFT JOIN user_details AS join_user_name ON own.user_name IS NULL AND ( (join_user_name.email = own.email AND own.email IS NOT NULL) OR (join_user_name.first_name = own.first_name AND join_user_name.last_name = own.last_name AND own.first_name IS NOT NULL AND own.last_name IS NOT NULL)) LEFT JOIN user_details AS join_mail ON own.email IS NULL AND ( (join_mail.user_name = own.user_name AND own.user_name IS NOT NULL) OR (join_mail.first_name = own.first_name AND join_mail.last_name = own.last_name AND own.first_name IS NOT NULL AND own.last_name IS NOT NULL)) LEFT JOIN user_details AS join_name ON own.first_name IS NULL AND own.last_name IS NULL AND ( (join_name.email = own.email AND own.email IS NOT NULL) OR (join_name.user_name = own.user_name AND own.user_name IS NOT NULL)) ORDER BY user_name ASC,email ASC, first_name ASC, last_name ASC 

This combines my columns with:

 user_name | email | first_name | last_name ab NULL NULL NULL bcd a NULL ef NULL xyz 

to

 user_name | email | first_name | last_name ab NULL NULL NULL bcd a NULL ef NULL xyz abcd abef 

I want to:

 user_name | email | first_name | last_name NULL xyz abcd abef 

without NULL containing ROWS when there is a line with the same data that has more information but still retain / NULL xyz / when there is no other line with additional information.

This second view here does exactly what I need:

 SELECT DISTINCT a.user_name,a.email,a.first_name,a.last_name FROM user_details_merged a LEFT JOIN user_details_merged b ON ( ( a.user_name IS NOT NULL OR NOT EXISTS (SELECT user_name FROM user_details_merged b WHERE b.user_name IS NOT NULL AND b.email=ISNULL(a.email,b.email) AND b.first_name=isnull(a.first_name,b.first_name) AND b.last_name=isnull(a.last_name,b.last_name)) ) AND ( a.email IS NOT NULL OR NOT EXISTS (SELECT email FROM user_details_merged b WHERE b.email IS NOT NULL AND b.user_name=ISNULL(a.user_name,b.user_name) AND b.first_name=isnull(a.first_name,b.first_name) AND b.last_name=isnull(a.last_name,b.last_name)) ) AND ( (a.first_name IS NOT NULL AND a.last_name IS NOT NULL) OR NOT EXISTS (SELECT email FROM user_details_merged b WHERE b.email IS NOT NULL AND b.user_name=ISNULL(a.user_name,b.user_name) AND b.email=ISNULL(a.email,b.email)) -- AND b.first_name=isnull(a.first_name,b.first_name) AND b.last_name=isnull(a.last_name,b.last_name)) ) AND NOT (a.first_name = b.first_name AND a.last_name = b.last_name AND a.email = b.email AND a.user_name = b.user_name) ) WHERE coalesce(b.user_name,b.email,b.first_name,b.last_name) IS NOT NULL 

The main problem is that the user_details view where the data comes from consists of many joins of different tables. Some of them contain only the username and email address, as well as only the email address and first / last name, etc. Therefore, there is no unique key, and I cannot index the view due to UNIONS. This makes it impossible to complete the last presentation in an hour. My current workaround is a procedure that stores data in the user_details_merged view in the temp table and allows the second view to use the data from this table. This way I can reduce the runtime to 7 seconds for 8000 lines.

Any other suggestions?

Thank you very much;)

+4
source share
3 answers

Oh! This is a dirty data model. The best solution would be to correct the data model to prevent the need for such a complex query. However, this tends to be even more complex with application dependencies, so I'm going to assume that this has already been entertained.

  • I used a 4-line pattern to come up with an alternative solution.
  • Then I added a couple of lines with only the values โ€‹โ€‹of the first and last name, which exposed the missing script in the request above.
  • I also replicated the same 6 rows to over 12K rows, as the data model supports such a scenario. This led to the fact that the above request worked for more than two hours before I finally just gave up and stopped it.
  • I ran 12K lines against my decision and it returned in less than a second with the expected results.

So, without further adieu:

 -- ================================================================================= -- BEGIN: SETUP TEST DATA -- ================================================================================= SET NOCOUNT ON IF OBJECT_ID('user_details', 'U') IS NOT NULL DROP TABLE user_details; GO CREATE TABLE dbo.user_details ( user_name char(1) NULL, email char(1) NULL, first_name char(1) NULL, last_name char(1) NULL ) GO INSERT dbo.user_details SELECT * FROM ( SELECT * FROM dbo.user_details WHERE 1=2 UNION ALL SELECT 'a', 'b', NULL, NULL UNION ALL SELECT NULL, 'b', 'c', 'd' UNION ALL SELECT 'a', NULL, 'e', 'f' UNION ALL SELECT NULL, 'x', 'y', 'z' UNION ALL SELECT NULL, NULL, 'y', 'z' UNION ALL SELECT NULL, NULL, 'a', 'z' ) A GO --/* -- TURN 6 ROWS INTO OVER 12K ROWS TO TEST PERFORMANCE DECLARE @count int; SELECT @count = 0 WHILE @count < 11 BEGIN INSERT user_details SELECT * FROM user_details SELECT @count = @count + 1 END --*/ -- ================================================================================= -- END: SETUP TEST DATA -- ================================================================================= -- ================================================================================= -- BEGIN: NEW SOLUTION FINAL: <1sec on 12288 rows -- ================================================================================= IF OBJECT_ID('tempdb..#useremail', 'U') IS NOT NULL DROP TABLE #useremail; IF OBJECT_ID('tempdb..#email', 'U') IS NOT NULL DROP TABLE #email; IF OBJECT_ID('tempdb..#user', 'U') IS NOT NULL DROP TABLE #user; IF OBJECT_ID('tempdb..#name', 'U') IS NOT NULL DROP TABLE #name; -- GET YOUR UNIQUE user_name AND email KEY SELECT DISTINCT A.user_name, A.email INTO #useremail FROM user_details A -- GET YOUR UNIQUE email VALUES SELECT DISTINCT A.email, A.first_name, A.last_Name INTO #email FROM user_details A WHERE A.email IS NOT NULL -- GET YOUR UNIQUE user_name VALUES SELECT DISTINCT A.user_name, A.first_name, A.last_Name INTO #user FROM user_details A WHERE A.user_name IS NOT NULL -- GET YOUR UNIQUE first_name AND last_Name VALUES NOT PART OF THE KEY SELECT DISTINCT A.first_name, A.last_Name INTO #name FROM user_details A WHERE A.first_name IS NOT NULL AND A.last_Name IS NOT NULL AND A.user_name IS NULL AND A.email IS NULL -- CLEAN UP YOUR UNIQUE user_name AND email KEY DELETE A -- SELECT * FROM #useremail A JOIN ( SELECT * FROM #useremail WHERE user_name IS NOT NULL AND email IS NOT NULL ) B ON (A.user_name = B.user_name AND A.email IS NULL) OR (A.email = B.email AND A.user_name IS NULL) -- CLEAN UP YOUR UNIQUE email VALUES DELETE A -- SELECT * FROM #email A JOIN ( SELECT * FROM #email WHERE first_name IS NOT NULL AND last_Name IS NOT NULL ) B ON A.email = B.email AND A.first_name IS NULL AND A.last_name IS NULL -- CLEAN UP YOUR UNIQUE user_name VALUES DELETE A -- SELECT * FROM #user A JOIN ( SELECT * FROM #user WHERE first_name IS NOT NULL AND last_Name IS NOT NULL ) B ON A.user_name = B.user_name AND A.first_name IS NULL AND A.last_name IS NULL -- CLEAN UP YOUR UNIQUE #name VALUES DELETE A -- SELECT * FROM #name A JOIN #user B ON A.first_name = B.first_name AND A.last_name = B.last_name DELETE A -- SELECT * FROM #name A JOIN #email B ON A.first_name = B.first_name AND A.last_name = B.last_name -- GET YOUR DATA SELECT A.user_name ,A.email ,U.first_name ,U.last_name --,* FROM #useremail A JOIN #user U ON A.user_name = U.user_name UNION SELECT A.user_name ,A.email ,E.first_name ,E.last_name --,* FROM #useremail A JOIN #email E ON A.email = E.email UNION SELECT NULL as [user_name] ,NULL as [email] ,N.first_name ,N.last_name --,* FROM #name N -- ================================================================================= -- END: NEW SOLUTION FINAL -- ================================================================================= 
+1
source

- If this time I get the right, you can solve it using the following:

  SELECT ISNULL(A.USER_NAME, B.USER_NAME), A.EMAIL, A.FIRST_NAME, A.LAST_NAME FROM user_details A CROSS JOIN user_details B WHERE A.EMAIL IS NOT NULL AND A.FIRST_NAME IS NOT NULL AND A.LAST_NAME IS NOT NULL GROUP BY ISNULL(A.USER_NAME, B.USER_NAME), A.EMAIL, A.FIRST_NAME, A.LAST_NAME 
0
source

try using an external connection with user_details.

0
source

All Articles