Behavior of a unique index, varchar column, and (blank) spaces

Question

I am using Microsoft SQL Server 2008 R2 (with the latest service pack and patches), and the database collation is SQL_Latin1_General_CP1_CI_AS.

The following code:

SET ANSI_PADDING ON;
GO

CREATE TABLE Test (
    Code VARCHAR(16) NULL
);
CREATE UNIQUE INDEX UniqueIndex
    ON Test(Code);

INSERT INTO Test VALUES ('sample');
INSERT INTO Test VALUES ('sample ');

SELECT '>' + Code + '<' FROM Test WHERE Code = 'sample ';
GO

gives the following results:

(1 row(s) affected)
Msg 2601, Level 14, State 1, Line 8
Cannot insert duplicate key row in object 'dbo.Test' with unique index 'UniqueIndex'. The duplicate key value is (sample).
The statement has been terminated.
------------
>sample<
(1 row(s) affected)

My questions:

I assume the index cannot store trailing spaces. Can anyone point me to official documentation that specifies or defines this behavior?
Is there a setting to change this behavior, that is, to make it recognize 'sample' and 'sample ' as two different values (which they are, by the way), so that both can be in the index?
Why on earth does the SELECT return a row? SQL Server must be doing something really funny/clever with the spaces in the WHERE clause, because if I remove the uniqueness from the index, both INSERTs work fine and the SELECT returns two rows!

Any help / pointer in the right direction would be greatly appreciated. Thanks.

sql-server tsql string-comparison unique-index

Eric Feb 27 '12 at 6:01

1 answer

Oleg Dok · Accepted Answer · 2012-02-27T06:12:27+0000

Trailing blanks explained:

SQL Server follows the ANSI/ISO SQL-92 specification (Section 8.2, Comparison Predicate, General Rules #3) on how to compare strings with spaces. The ANSI standard requires padding for the character strings used in comparisons so that their lengths match before they are compared. The padding directly affects the semantics of WHERE and HAVING clause predicates and other Transact-SQL string comparisons. For example, Transact-SQL considers the strings 'abc' and 'abc ' to be equivalent for most comparison operations.
The only exception to this rule is the LIKE predicate. When the right side of a LIKE predicate expression features a value with a trailing space, SQL Server does not pad the two values to the same length before the comparison occurs. Because the purpose of the LIKE predicate, by definition, is to facilitate pattern searches rather than simple string equality tests, this does not violate the section of the ANSI SQL-92 specification mentioned earlier.

Here is a well-known example of all the cases mentioned above:

DECLARE @a VARCHAR(10)
DECLARE @b VARCHAR(10)

SET @a = '1'
SET @b = '1 ' --with trailing blank

SELECT 1
WHERE
    @a = @b
AND @a NOT LIKE @b
AND @b LIKE @a
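
To make it visible that the trailing blank is actually stored and only ignored by the comparison, here is a small sketch along the same lines. The literals and column aliases are made up for illustration, and it assumes SQL Server 2008 or later (for the inline DECLARE initialization):

```sql
-- Illustrative sketch: the values and aliases below are hypothetical.
DECLARE @a VARCHAR(10) = 'abc';
DECLARE @b VARCHAR(10) = 'abc ';   -- one trailing blank

SELECT
    -- '=' pads the shorter value before comparing, so this yields 'equal'
    CASE WHEN @a = @b THEN 'equal' ELSE 'different' END AS EqualsResult,
    -- LIKE does not pad when the pattern has a trailing blank, so this yields 'no match'
    CASE WHEN @a LIKE @b THEN 'match' ELSE 'no match' END AS LikeResult,
    -- LEN excludes trailing blanks: 3
    LEN(@b) AS LenOfB,
    -- DATALENGTH counts the bytes actually held, including the blank: 4
    DATALENGTH(@b) AS BytesOfB;
```

So the blank is not lost. It is simply not taken into account by the equality comparison, which is the same comparison a unique index relies on.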

Here is more information on trailing spaces and the LIKE clause.

Regarding indices:

An insertion into a column whose values must be unique will fail if you supply a value that differs from existing values only by trailing spaces. The following strings will all be considered equivalent by a unique constraint, primary key, or unique index. Likewise, if you have an existing table with the data below and try to add a unique constraint, it will fail because the values are considered identical.

PaddedColumn
------------
'abc'
'abc '
'abc  '
'abc   '

(Taken from here.)
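
The second case in that quote (adding a unique constraint to existing data) can be tried with a sketch like the one below. The names PaddedDemo and UQ_PaddedDemo are hypothetical, and I have not reproduced the exact error text here:

```sql
-- Illustrative sketch: PaddedDemo and UQ_PaddedDemo are made-up names.
SET ANSI_PADDING ON;
GO
CREATE TABLE PaddedDemo ( PaddedColumn VARCHAR(16) NULL );
GO
-- Without a unique index, both inserts succeed.
INSERT INTO PaddedDemo VALUES ('abc');
INSERT INTO PaddedDemo VALUES ('abc ');   -- differs only by a trailing blank
GO
-- Both rows are stored as given; DATALENGTH shows the differing byte counts.
SELECT '>' + PaddedColumn + '<' AS Stored,
       DATALENGTH(PaddedColumn) AS Bytes
FROM PaddedDemo;
GO
-- Expected to fail with a duplicate key error,
-- because the two values compare as equal.
ALTER TABLE PaddedDemo
    ADD CONSTRAINT UQ_PaddedDemo UNIQUE (PaddedColumn);
GO
DROP TABLE PaddedDemo;
GO
```

This also suggests the column itself keeps the trailing spaces; it is the uniqueness check that treats the values as identical.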
