Sql Server 2008 - Difference Between Sort Types

I am installing a new SQL Server 2008 server and have some problems getting any useful information about different sorts. I searched SQL Server BOL and google'ed for an answer, but cannot find any useful information.

  • What is the difference between sorting Windows "Finnish_Swedish_100" and "Finnish_Swedish" ?

    I believe the "_100" version is an updated collation in SQL Server 2008, but what has changed from the old version, if so?

  • Is it generally helpful to have an "Accent-sensitive" ? I know that it depends on the task and all this, but are there any known pros and cons to consider?

  • Parameters "Binary" and "Binary-code point" , in what cases should they be activated?

+6
sql-server sql-server-2008 collation
source share
7 answers

_100 indicates the new collation sequence in SQL Server 2008, those that have _90 for 2005, and those that don't have a suffix for 2000. I don’t know what the differences are, and I can’t find any documentation. If you are not performing related server queries to another SQL server of a different version, I will be tempted to go with _100. Sorry, I can not help with the differences.

+3
source share

The letters AÄÖ / åäö do not mix with A and O, just setting the mapping to AI (Accent Insensitive). This, however, is true for â and other "combinations" that are not part of the Swedish alphabet as separate letters. â will mix or not mix depending on the given parameter.

Since I have many old databases that I still need to communicate with, also using linked servers, I selected FINNISH _SWEDISH _CI _AS now that I am installing SQL2008. This was the default value for FINNISH _SWEDISH when Windows mappings first appeared in SQL Server.

+3
source share

To answer question 3 (information taken from MSDN , their wording, my format):

Binary ( _BIN ):

  • Sorts and compares data in SQL Server tables based on the bit patterns defined for each character.
  • Binary sort order is case sensitive and accented.
  • Binary is also the fastest sort order.
  • If this option is not selected, SQL Server follows the collation and comparison rules defined in the dictionaries for the corresponding language or alphabet.

Binary Point ( _BIN2 ):

  • For Unicode data: Sorts and compares data in SQL Server tables based on Unicode code points.
  • For non-Unicode data: comparisons identical to binary sorts will be used.

The advantage of using the binary dot sort order is that no data is used in applications that compare SQL Server sorted data. As a result, the binary dot sort order provides easier application development and increased productivity.

See BIN and BIN2 Collations Recommendations for more information.

+2
source share

Use the query below to try it yourself.

As you can see, å, ä, etc. are not considered accented characters and are sorted according to the Swedish alphabet when using Finnish / Swedish sorting.

However, emphasis is only taken into account if you use AS sorting. To sort the AI their order does not change, as if there was no emphasis at all.

 CREATE TABLE #Test ( Number int identity, Value nvarchar(20) NOT NULL ); GO INSERT INTO #Test VALUES ('àá'); INSERT INTO #Test VALUES ('áa'); INSERT INTO #Test VALUES ('aa'); INSERT INTO #Test VALUES ('aà'); INSERT INTO #Test VALUES ('áb'); INSERT INTO #Test VALUES ('ab'); -- w is considered an accented version of v INSERT INTO #Test VALUES ('wa'); INSERT INTO #Test VALUES ('va'); INSERT INTO #Test VALUES ('zz'); INSERT INTO #Test VALUES ('åä'); GO SELECT Number, Value FROM #Test ORDER BY Value COLLATE Finnish_Swedish_CI_AS; SELECT Number, Value FROM #Test ORDER BY Value COLLATE Finnish_Swedish_CI_AI; GO DROP TABLE #Test; GO 
+2
source share

To ask your question 1. Accent sensitive is good if you have Finnish-Swedish turned on. Otherwise, your "å" and "ä" s will be sorted as "a" and "ö" as "o" s. (Assuming you will use such international characters).

More details here: http://msdn.microsoft.com/en-us/library/ms143515.aspx (discusses both binary code point and accent sensitivity)

0
source share

To access qestion 2:

Yes, if an accent is required for a given language, then grammars for a given language.

0
source share

In questions 2 and 3

Accent sensitivity is what I propose to disable if you accept user data, and ON if you have clean, sanitized data. Not being Finnish myself, I don’t know how many words there are that differ from each other depending on what they have. But if there are users who enter data, you can be sure that they will NOT be consistent in their use, and you will want to compare them. If you are collecting data from a data set in which you know the content and know the sequence, then you will want to turn on Accent Sensitivity ON because you know that the differences are targeted.

The same questions apply when considering question 3. (I mostly get this from the link provided by Tomalak). If the data is case and accent sensitive, then you want _BIN because it will sort faster. If the data is irregular and not case sensitive / accent then you need _BIN2 because it is for Unicode data.

0
source share

All Articles