Query Performance with NULL

I would like to know how NULL values affect query performance in SQL Server 2005.

I have a table like this (simplified):

ID | ImportantData | QuickPickOrder
-----------------------------------
1  | 'Some Text'   | NULL
2  | 'Other Text'  | 3
3  | 'abcdefg'     | NULL
4  | 'whatever'    | 4
5  | 'it is'       | 2
6  | 'technically' | NULL
7  | 'a varchar'   | NULL
8  | 'of course'   | 1
9  | 'but that'    | NULL
10 | 'is not'      | NULL
11 | 'important'   | 5

And I query it as follows:

 SELECT *
 FROM MyTable
 WHERE QuickPickOrder IS NOT NULL
 ORDER BY QuickPickOrder

So, QuickPickOrder is basically a column used to highlight some frequently selected items from a larger list. It also provides the order in which they will be displayed to the user. NULL values mean that the row does not appear in the quick pick list.

I've always been told that NULL values in a database are somehow evil, at least in terms of normalization, but is it an acceptable way to filter out unwanted rows in a WHERE clause?

Would it be better to use a specific numerical value like -1 or 0 to indicate items that are not needed? Are there other alternatives?

EDIT: The example does not accurately reflect the ratio of real values to NULLs. A better example would show at least 10 NULLs for each non-NULL value. The table can have anywhere from 100 to 200 rows. This is a look-up table, so updates are rare.

+3
performance sql database sql-server
8 answers

SQL Server indexes NULL values, so this will most likely just use an Index Seek on an index over QuickPickOrder, both for filtering and for ordering.
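A minimal sketch of the setup this answer relies on, assuming no index exists on the column yet. The index name `IX_MyTable_QuickPickOrder` is made up for illustration; the table and column names come from the question.

```sql
-- Hypothetical index name; MyTable and QuickPickOrder are from the question.
CREATE NONCLUSTERED INDEX IX_MyTable_QuickPickOrder
    ON MyTable (QuickPickOrder);

-- NULL rows are stored in the index as well, so the IS NOT NULL predicate
-- can be satisfied from the index, and the rows come back in key order.
SELECT ID, ImportantData, QuickPickOrder
FROM MyTable
WHERE QuickPickOrder IS NOT NULL
ORDER BY QuickPickOrder;
```

Whether the optimizer actually picks a seek depends on how it costs the plan. On a table of 100 to 200 rows it may well prefer a scan, so the actual execution plan is the thing to check.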

+5

Another alternative could be two tables:

 MyTable:
 ID | ImportantData
 ------------------
 1  | 'Some Text'
 2  | 'Other Text'
 3  | 'abcdefg'
 4  | 'whatever'
 5  | 'it is'
 6  | 'technically'
 7  | 'a varchar'
 8  | 'of course'
 9  | 'but that'
 10 | 'is not'
 11 | 'important'

 QuickPicks:
 MyTableID | QuickPickOrder
 --------------------------
 2         | 3
 4         | 4
 5         | 2
 8         | 1
 11        | 5

 SELECT MyTable.*
 FROM MyTable
 JOIN QuickPicks ON QuickPicks.MyTableID = MyTable.ID
 ORDER BY QuickPickOrder

This would let you update QuickPickOrder without locking anything in MyTable or logging a full-row transaction for that table. So depending on how big MyTable is and how often you update QuickPickOrder, there may be a scalability advantage.

In addition, having a separate table allows you to add a unique index on QuickPickOrder so that there are no duplicates, and it could be extended more easily later to allow different kinds of QuickPicks, for example ones specific to particular contexts or users.
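The separate table with its unique index might be declared roughly as follows. This is only a sketch: it assumes `MyTable.ID` is an `INT` primary key, and the constraint names are invented for the example.

```sql
-- Sketch only: assumes MyTable.ID is an INT primary key.
-- All constraint names are hypothetical.
CREATE TABLE QuickPicks (
    MyTableID INT NOT NULL
        CONSTRAINT PK_QuickPicks PRIMARY KEY
        CONSTRAINT FK_QuickPicks_MyTable REFERENCES MyTable (ID),
    QuickPickOrder INT NOT NULL
        CONSTRAINT UQ_QuickPicks_QuickPickOrder UNIQUE
);

-- One INSERT per row, since multi-row VALUES lists are not
-- available in SQL Server 2005.
INSERT INTO QuickPicks (MyTableID, QuickPickOrder) VALUES (8, 1);
INSERT INTO QuickPicks (MyTableID, QuickPickOrder) VALUES (5, 2);
```

With the primary key on `MyTableID`, each row of `MyTable` can appear at most once in the quick pick list, and the unique constraint rejects two items sharing the same position.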

+3

NULLs do not have a negative performance impact on the database. Remember that NULL is more of a state than a value. Checking for NOT NULL versus comparing against a value like -1 makes no difference, except that -1 possibly breaks your data integrity, imo.

+2

NULL looks fine to me for this purpose. Performance will in all likelihood be basically the same as with a non-NULL column and a constant value, or maybe even better for filtering out all the NULLs.

+1

An alternative is to normalize QuickPickOrder into its own table with a foreign key, and then do an inner join to filter out the NULLs (or a left join with a WHERE clause to filter out the non-NULLs).
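Both join variants could look roughly like this, assuming the `QuickPicks` layout (`MyTableID`, `QuickPickOrder`) from the two-table answer. The aliases `m` and `q` are just for readability.

```sql
-- Inner join: only rows that have a quick pick entry, in display order.
SELECT m.ID, m.ImportantData, q.QuickPickOrder
FROM MyTable AS m
INNER JOIN QuickPicks AS q ON q.MyTableID = m.ID
ORDER BY q.QuickPickOrder;

-- Left join plus WHERE: only rows that are NOT quick picks.
SELECT m.ID, m.ImportantData
FROM MyTable AS m
LEFT JOIN QuickPicks AS q ON q.MyTableID = m.ID
WHERE q.MyTableID IS NULL;
```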

+1

SQL Server performance may be affected by the use of NULLS in your database. There are several reasons for this.

First, NULLs that appear in fixed-length columns (CHARs) take up the entire size of the column. So if you have a column that is 25 characters wide and it stores a NULL, SQL Server must still store 25 characters to represent the NULL value. This added space increases the size of your database, which in turn means that more I/O overhead is required to find the data you need. One way around this is to use variable-length fields instead. When NULLs are added to a variable-length column, space is not wasted as it is with fixed-length columns.

Second, using IS NULL in your WHERE clause means that an index cannot be used for the query, and a table scan will be performed. This can significantly reduce performance.

Third, using NULLs can lead to confusing Transact-SQL code, which can mean code that does not run efficiently or does not work correctly.

Ideally, NULL should be avoided in your SQL Server databases.

Instead of using NULL, use a coding scheme similar to this in your databases:

  • NA: Not applicable
  • NYK: Not yet known
  • TUK: Truly unknown

Such a scheme provides the advantages of using NULL, but without the disadvantages.

+1

NULL looks fine to me. SQL Server has many kinds of indexes to choose from. I forget which ones do this, but some only index values in a given range. If you had such an index on the column being tested, the NULL values would not be in the index, and the index scan would be fast.
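The answer does not name the index type. One feature that fits the description is a filtered index; treating it as the one meant here is an assumption. Filtered indexes were introduced in SQL Server 2008, so this would not work on the SQL Server 2005 instance from the question. A sketch with an invented index name:

```sql
-- Requires SQL Server 2008 or later; the index name is hypothetical.
-- Only rows with a non-NULL QuickPickOrder are stored in this index.
CREATE NONCLUSTERED INDEX IX_MyTable_QuickPickOrder_NotNull
    ON MyTable (QuickPickOrder)
    WHERE QuickPickOrder IS NOT NULL;
```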

0

Having a large number of NULLs in a column that has an index on it (or an index starting with it) is usually beneficial for this kind of query.

NULL values are not entered into the index, which means that inserting or updating rows with NULL in that column does not incur the cost of updating another secondary index. If, say, only 0.001% of your rows have a non-NULL value in this column, the IS NOT NULL query becomes quite efficient, as it simply scans a relatively small index.

Of course, all this is relative; if your table is tiny, it makes no noticeable difference.

0