Why does "SELECT DISTINCT a, b FROM ..." return fewer records than "SELECT DISTINCT a + '|' + b FROM ..."?

I have a query that selects a bunch of fields related to the names and addresses of clients, but it comes down to:

SELECT DISTINCT a, b, c, ... FROM big_dumb_flat_table 

It returns a bunch of records (10986590). When I replace the commas in the select list so that the output is a single pipe-delimited concatenated string:

 SELECT DISTINCT a + '|' + b + '|' + c + '|' + ... FROM big_dumb_flat_table 

it returns 248 more records. I've confirmed that none of the fields contain pipes that could distort the returned set. What's going on here?

+7
sql sql-server sql-server-2005 concatenation
2 answers

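Trailing spaces could cause this. They are ignored when strings are compared, but they are kept when strings are concatenated, so two values that differ only in trailing spaces become distinct once a '|' is appended after them.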

 CREATE TABLE #T
 (
     a varchar(10),
     b varchar(10),
     c varchar(10)
 )

 INSERT INTO #T
 SELECT 'a ' as a, 'b' as b, 'c ' as c
 union all
 SELECT 'a' as a, 'b' as b, 'c ' as c

 SELECT DISTINCT a, b, c
 FROM #T /*1 result*/

 SELECT DISTINCT a + '|' + b + '|' + c + '|'
 FROM #T /*2 results*/

 SELECT DISTINCT LTRIM(RTRIM(a)) + '|' + LTRIM(RTRIM(b)) + '|' + LTRIM(RTRIM(c)) + '|'
 FROM #T /*1 result*/
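If trailing spaces are the suspect, one way to look for the offending rows in the real table might be something like the sketch below. It reuses the placeholder names a, b, c and big_dumb_flat_table from the question, so the WHERE clause would need extending to the actual column list. It relies on DATALENGTH counting trailing spaces while RTRIM strips them.

```sql
-- Sketch only: a, b, c and big_dumb_flat_table are the placeholder names
-- from the question, not a real schema.
-- DATALENGTH() counts every byte, including trailing spaces; RTRIM() removes
-- them, so a mismatch flags a value that ends in one or more spaces.
-- NULL values are never flagged, because the comparison evaluates to UNKNOWN.
SELECT a, b, c
FROM big_dumb_flat_table
WHERE DATALENGTH(a) <> DATALENGTH(RTRIM(a))
   OR DATALENGTH(b) <> DATALENGTH(RTRIM(b))
   OR DATALENGTH(c) <> DATALENGTH(RTRIM(c));
```

If this returns rows, those values are candidates for the 248 extra records. Note that fixed-width char columns are always padded to their declared length, so every non-NULL value in such a column would be flagged.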
+10

There really aren't any scenarios I can think of where you would get MORE records, only fewer. I would simplify the query by selecting only a + '|', and then add more columns one at a time.

+2
