SQL Server Unicode queries with SC collation

Question

SQL Server Unicode queries with SC collation

In SQL Server 2012, I have a table with an nvarchar column with a Latin1_General_100_CI_AS_ SC collation, which should support Unicode surrogate characters or extra characters.

When I run this query:
```
select KeyValue from terms where KeyValue = N'➰' 
```
~~(above - Unicode SC)~~
above - curly curly loop symbol with code 10160 (x27B0)
The result is a hundred different similar single-character entries, although they all have different UTF-16 code points. Is it due to sorting? Why is there no exact match?

EDIT: Now I think this has to do with sorting. There seems to be a "undefined" character group in the UTF-16 range, over 1733 characters, and they map to this mapping. Although characters with codes above 65535 are considered unique, these queries return exact matches.

The two queries below have different results:

 select KeyValue from terms where KeyValue = N'π'

returns 3 lines: π and π and ᴨ

 select KeyValue from terms where KeyValue LIKE N'π'

returns 2 lines: π and π

Why is this?

This is the strangest of all. This request:
```
 select KeyValue from terms where KeyValue like N'➰%' 
```
returns ALMOST ALL records in a table that has many regular Latin characters with multiple characters, such as "8w" or "apple". 90% of those who do not return begin with "æ". What's happening?

NOTE. To give this a little context, these are all Wikipedia articles, not random lines.

+5

sql-server sql-server-2012 unicode

Rahmi aksu Jul 03 '15 at 20:53

source share

1 answer

Bill · Answer 1 · 2015-08-12T17:07:30+0000

SQL Server and therefore tempdb also have their own sorting, and they may not match the database or column sorting. While character literals should be assigned a default column or database mapping, the above (possibly too simplistic) T-SQL examples may or may not show the true problem. For example, an ORDER BY clause could be omitted for simplicity. Are the expected results returned when the above statements explicitly use https://msdn.microsoft.com/en-us/library/ms184391.aspx ('COLLATE Latin1_General_100_CI_AS_SC')?

SQL Server Unicode queries with SC collation

More articles: