SQL Server 2008 empty string versus space

This morning I came across something strange and thought I would post it for comment.

Can someone explain why the following SQL query prints "equal" when run against SQL Server 2008? The db compatibility level is set to 100.

if '' = ' ' print 'equal' else print 'not equal' 

And this returns 0:

 select (LEN(' ')) 

It seems to be auto-trimming the space. I do not know whether this was the case in previous versions of SQL Server, and I can no longer check.

I ran into this because a production query was returning incorrect results. I cannot find this behavior documented anywhere.

Does anyone have any info on this?

+77
sql-server tsql sql-server-2008 string-length datalength
Sep 09 '09 at 13:56
8 answers

varchars and equality are thorny in T-SQL. The documentation for the LEN function says:

Returns the number of characters, not the number of bytes, of a given string expression, excluding trailing spaces.

You need to use DATALENGTH to get the true byte count of the data in question. If you have Unicode data, note that the value you get in this situation will not be the same as the length of the text.

 print(DATALENGTH(' ')) --1
 print(LEN(' ')) --0
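To illustrate the Unicode caveat, here is a short sketch using N'...' literals (nvarchar), which typically take two bytes per character. The literals are chosen only for illustration:

```sql
-- nvarchar: DATALENGTH counts bytes, LEN counts characters minus trailing spaces
print(DATALENGTH(N' '))   -- 2
print(LEN(N' '))          -- 0
print(DATALENGTH(N'ab ')) -- 6
print(LEN(N'ab '))        -- 2
```

So for nvarchar data, DATALENGTH is a byte count and has to be compared against another DATALENGTH, not against a character count.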

When it comes to equality of expressions, two strings are compared for equality as follows:

  • Get the shorter string
  • Pad it with spaces until its length equals that of the longer string
  • Compare the two

It is the middle step that produces the unexpected results: after that step you are effectively comparing whitespace against whitespace, so the two strings are seen as equal.
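The same steps apply to non-empty strings. A minimal sketch, with literals picked only for illustration:

```sql
-- 'abc' is padded on the right to 'abc   ' before comparing, so trailing spaces do not matter
if 'abc' = 'abc   ' print 'eq' else print 'ne'   -- eq

-- padding is only added at the end, so a leading space still counts
if ' abc' = 'abc' print 'eq' else print 'ne'     -- ne

-- DATALENGTH shows that the two "equal" values really do differ
print(DATALENGTH('abc'))     -- 3
print(DATALENGTH('abc   '))  -- 6
```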

LIKE behaves better than = in the "blanks" situation, because it does not perform blank-padding on the pattern you are trying to match:

 if '' = ' ' print 'eq' else print 'ne' 

This gives eq, while:

 if '' LIKE ' ' print 'eq' else print 'ne' 

gives ne.

Be careful with LIKE, though: it is not symmetrical. It treats trailing spaces as significant in the pattern (RHS), but not in the match expression (LHS). The following example shows the asymmetry:

 declare @Space nvarchar(10)
 declare @Space2 nvarchar(10)
 set @Space = ''
 set @Space2 = ' '
 if @Space like @Space2 print '@Space Like @Space2' else print '@Space Not Like @Space2'
 if @Space2 like @Space print '@Space2 Like @Space' else print '@Space2 Not Like @Space'

 @Space Not Like @Space2
 @Space2 Like @Space
+79
Sep 09 '09 at 14:14

The = operator in T-SQL is not so much "equals" as it is "are the same word/phrase, according to the collation of the expression's context", and LEN is "the number of characters in the word/phrase". No collation treats trailing blanks as part of the word/phrase preceding them (though they do treat leading blanks as part of the string they precede).

If you need to distinguish 'this' from 'this ', you should not use the "are the same word or phrase" operator, because 'this' and 'this ' are the same word.

Contributing to the way = works is the idea that the string equality operator should depend on the contents of its arguments and on the collation context of the expression, but it should not depend on the types of the arguments if they are both string types.

The natural-language concept of "these are the same word" is typically not precise enough to be captured by a mathematical operator like =, and there is no concept of string type in natural language. Context (i.e., collation) matters (and exists in natural language) and is part of the story, and additional properties (some of which seem quirky) are part of the definition of = in order to make it well defined in the unnatural world of data.

On the type issue, you would not want words to change when they are stored in different string types. For example, the types VARCHAR(10), CHAR(10), and CHAR(3) can all hold representations of the word 'cat', and ? = 'cat' should let us decide whether a value of any of these types holds the word 'cat' (with issues of case and accent determined by the collation).
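A small sketch of this point. The variable names @v, @c10 and @c3 are hypothetical and only serve the illustration:

```sql
declare @v   varchar(10) = 'cat'
declare @c10 char(10)    = 'cat'  -- stored padded to 10 characters
declare @c3  char(3)     = 'cat'

-- all three hold the same word as far as = is concerned
if @v = 'cat' and @c10 = 'cat' and @c3 = 'cat' and @v = @c10
    print 'same word'
else
    print 'different'
-- prints: same word

-- the stored representations differ nonetheless
print(DATALENGTH(@v))    -- 3
print(DATALENGTH(@c10))  -- 10
print(DATALENGTH(@c3))   -- 3
```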

Response to JohnFx's comment:

See Using char and varchar Data in Books Online. Quoting from that page, emphasis mine:

Each char and varchar data value has a collation. Collations define attributes such as the bit patterns used to represent each character, comparison rules, and sensitivity to case or accenting.

I agree that this could be easier to find, but it is documented.

It is also worth noting that SQL's semantics, where = has to do with the real-world data and the context of the comparison (as opposed to something about bits stored on the computer), have been part of SQL for a long time. The premise of RDBMSs and SQL is the faithful representation of real-world data, hence the support for collations many years before similar ideas (such as CultureInfo) entered the realm of Algol-like languages. The premise of those languages (at least until very recently) was problem solving in engineering, not management of business data. (Recently, the use of similar languages in non-engineering applications such as search has been making some inroads, but Java, C#, and so on are still struggling with their non-business roots.)

In my opinion, it is not fair to criticize SQL for being different from "most programming languages." SQL was designed to support a framework for business data modeling that is very different from engineering, so the language is different (and better for its purpose).

Heck, when SQL was first specified, some languages did not have any built-in string type. And in some languages still, the equals operator between strings does not compare character data at all, but compares references! It would not surprise me if in another decade or two the idea that == is culture-dependent becomes the norm.

+17
Sep 09 '09 at 15:20

I found this blog article, which describes the behavior and explains why.

The SQL standard requires that string comparisons, effectively, pad the shorter string with space characters. This leads to the surprising result that N'' = N' ' (the empty string equals a string of one or more space characters), and more generally any string equals another string if they differ only by trailing spaces. This can be a problem in some contexts.

Additional information is also available in MSKB316626.

+9
Sep 09 '09 at 15:03

There was a similar question a while ago where I looked into a similar problem here.

Instead of LEN(' ') use DATALENGTH(' ') - this gives the correct value.

The solutions were to use a LIKE clause as explained in my answer there, and/or to include a second condition in the WHERE clause that checks DATALENGTH too.

Have a read of that question and the links in it.

+4
Sep 09 '09 at 14:12

To compare the value with literal space, you can also use this method as an alternative to the LIKE expression:

 IF ASCII('') = 32 PRINT 'equal' ELSE PRINT 'not equal' 
+3
Feb 24 2018-11-21T00:

Sometimes you have to deal with spaces in data, with or without any other characters, even though using NULL would be the better idea - it is just not always applicable. I ran into the situation described and solved it like this:

... where ('>' + @space + '<') <> ('>' + @space2 + '<')

Of course, you would not do this for large amounts of data, but it works quickly and easily for a few hundred rows ...
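A self-contained version of this delimiter trick might look like the following. The declarations are assumed here (they mirror the @Space/@Space2 example from the earlier answer) and are not part of the original snippet:

```sql
declare @space  varchar(10) = ''
declare @space2 varchar(10) = ' '

-- plain comparison: the shorter value is padded, so the two look equal
if @space <> @space2 print 'different' else print 'same'   -- same

-- wrapped in delimiters the space is no longer trailing: '><' versus '> <'
if ('>' + @space + '<') <> ('>' + @space2 + '<') print 'different' else print 'same'   -- different
```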

Herbert

0
Jan 16 '15 at 2:37

How to distinguish records in a select with char/varchar fields on SQL Server. Example:

 declare @mayvar as varchar(10)
 set @mayvar = 'data '
 select mykey, myfield from mytable where myfield = @mayvar

expected

mykey (int) | myfield (varchar10)

1 | 'data'

obtained

mykey | myfield

1 | 'data'

2 | 'data'

even if I write select mykey, myfield from mytable where myfield = 'data' (without a final space) I get the same results.

How did I solve it? This way:

 select mykey, myfield from mytable where myfield = @mayvar and DATALENGTH(isnull(myfield,'')) = DATALENGTH(@mayvar) 

and if there is an index on myfield, it will be used in each case.

I hope this will be helpful.

0
Apr 14 '15 at 15:45

Another way is to put the string into a state in which the space has value. For example, replace the space with a known character such as _

 if REPLACE('hello',' ','_') = REPLACE('hello ',' ','_') print 'equal' else print 'not equal' 

returns: not equal

Not ideal, and probably slow, but it is another quick way forward when one is needed.

0
Apr 12 '19 at 3:09


