SQL Server 2014 LIKE with 4 B wide character returns entire table

I find it difficult to understand why a SELECT with a LIKE condition on emoji at level 4 B ( http://emojipedia.org/eyes/ ) returns all the rows in the table.

 -- The DB collation is Finnish_Swedish_CI_AS CREATE TABLE #Test ( Number int identity, Value nvarchar(20) NOT NULL ); GO INSERT INTO #Test VALUES (''); INSERT INTO #Test VALUES ('a'); INSERT INTO #Test VALUES ('b'); INSERT INTO #Test VALUES ('c'); INSERT INTO #Test VALUES ('πŸ‘€'); -- should be http://emojipedia.org/eyes/ GO select * from #Test where Value like '%πŸ‘€%' select * from #Test where Value like N'%πŸ‘€%' GO DROP TABLE #Test; GO 

The result of the first selection is a single line of emoji. However, the result of the second choice, where the line is marked as UNICODE with the prefix N, is all the lines.

Why does the second choice match all the rows in the table?

Actual scenario I have a web application in which users can search the directory, and if they search emoji, the whole table is returned instead of the correct matching rows.

 ASP.NET MVC 5 web application <-> Web Api 2 <-> EF -> SQL Server 

Refresh My SQL test is incorrect because @deroby indicated that INSERT must be marked with N '...' for proper installation as Unicode.

Solution . As @deroby stated, it seems that the problem is with matching, changing it to Finnish_Swedish_100_CI_AS seems like everything is working as expected.

 select * from #Test where Value like N'%πŸ‘€%' COLLATE Finnish_Swedish_100_CI_AS 
+5
source share
1 answer

When adding

getting even more fun
 INSERT INTO #Test VALUES ('πŸ‘‚'); -- should be http://emojipedia.org/eyes/ 

Your first SELECT will now return both emoji records; even if they look wrong. The reason for this is that both πŸ‘‚ and πŸ‘€ are invalid varchar () values. Thus, they lead to the fact that some placeholder is placed in the database (displayed as ?? ).

Now, if you insert your inserts correctly and insert data into the correct Unicode lines, then everything works much better:

 INSERT INTO #Test VALUES (N''); INSERT INTO #Test VALUES (N'a'); INSERT INTO #Test VALUES (N'b'); INSERT INTO #Test VALUES (N'c'); INSERT INTO #Test VALUES (N'πŸ‘€'); -- should be http://emojipedia.org/eyes/ INSERT INTO #Test VALUES (N'πŸ‘‚'); -- should be http://emojipedia.org/eyes/ select * from #Test -- returns everything select * from #Test where Value like '%πŸ‘€%' -- returns nothing (because it looks for the '??' placeholder select * from #Test where Value like N'%πŸ‘€%' -- returns just one line 

PS: how it works on my SQL2012 here (Latin1_General_BIN).

Why does the second choice match all the rows in the table?

That I also cannot explain; this is not so for me. Therefore, I assume that this is a sorting problem where these emojis are considered equal.

And indeed, performing the test

 select * from #Test where Value like N'%πŸ‘€%' COLLATE Finnish_Swedish_CI_AS select * from #Test where Value like N'%πŸ‘€%' COLLATE Latin1_General_BIN 

shows that you need to look for a reason. As for the reasons, I have no idea.

+3
source

All Articles