Table Indexing Strategy

I have a SQL Server 2005 table named EventTable, defined as follows:

EventID, EventTypeCode, EventStatusCode, EventDate

The table currently has a clustered index on the primary key, EventID; there are currently no other indexes.

EventTypeCode and EventStatusCode are CHAR(3) codes (examples: "NEW", "SEN", "SAL") and both are foreign keys.
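For reference, a minimal DDL sketch of the table as described above; the exact data types and the names of the referenced lookup tables are assumptions, not taken from the original post:

 -- Sketch only: the INT/DATETIME types and the EventType/EventStatus table names are assumed.
 CREATE TABLE EventTable
 (
     EventID         INT IDENTITY(1,1) NOT NULL,
     EventTypeCode   CHAR(3) NOT NULL,
     EventStatusCode CHAR(3) NOT NULL,
     EventDate       DATETIME NOT NULL,
     CONSTRAINT PK_EventTable PRIMARY KEY CLUSTERED (EventID),
     CONSTRAINT FK_EventTable_EventType FOREIGN KEY (EventTypeCode) REFERENCES EventType (EventTypeCode),
     CONSTRAINT FK_EventTable_EventStatus FOREIGN KEY (EventStatusCode) REFERENCES EventStatus (EventStatusCode)
 );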

The typical selects will be ...

 select * from EventTable where EventDate = @dateparam;
 select * from EventTable where EventTypeCode = @eventtype;
 select * from EventTable where EventStatusCode = @statustype;

What index strategy would you use to process the above statements?

Is it better to have one covering (composite) index on all three columns? If so, in what order should the columns appear?

Or a separate index for each of the three columns?

The table will grow at a rate of about 300 events per day.


Queries such as the following will also be used:

 where EventDate between '2008-12-01' and '2008-12-31' and EventTypeCode = 'todo' 
  • the table is more likely to grow at 500-800 records per day, rather than 300
  • the queries mentioned in the initial question will be executed many times a day during normal use of an ASP.NET application
  • NHibernate HQL is used to execute these queries
  • there is no initial data load; the table currently holds about 10K records, since this is a new application
  • ... I am mostly trying to avoid the client calling us in a couple of years to complain that the application has become "slow", as this table will be hit hard.
4 answers

Strategy 1: create single-column indexes that can be used for filtering. Bookmark lookups will fetch the remaining data. This almost doubles the space used and quadruples the write I/O cost.

 on EventTable(EventDate)
 on EventTable(EventTypeCode)
 on EventTable(EventStatusCode)
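Spelled out as full CREATE INDEX statements (the index names here are illustrative, not from the original answer):

 CREATE NONCLUSTERED INDEX IX_EventTable_EventDate ON EventTable (EventDate);
 CREATE NONCLUSTERED INDEX IX_EventTable_EventTypeCode ON EventTable (EventTypeCode);
 CREATE NONCLUSTERED INDEX IX_EventTable_EventStatusCode ON EventTable (EventStatusCode);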

Strategy 2: create covering indexes that can be used for filtering. There will be no bookmark lookups. This increases the space used and the write I/O cost further.

 on EventTable(EventDate, EventId, EventTypeCode, EventStatusCode)
 on EventTable(EventTypeCode, EventId, EventDate, EventStatusCode)
 on EventTable(EventStatusCode, EventId, EventDate, EventTypeCode)
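On SQL Server 2005 these covering indexes could also be written with the non-filtering columns in an INCLUDE clause instead of the key, which covers the same queries with a smaller key; a sketch with illustrative names:

 CREATE NONCLUSTERED INDEX IX_EventTable_EventDate_Cover
     ON EventTable (EventDate) INCLUDE (EventTypeCode, EventStatusCode);
 CREATE NONCLUSTERED INDEX IX_EventTable_EventTypeCode_Cover
     ON EventTable (EventTypeCode) INCLUDE (EventDate, EventStatusCode);
 CREATE NONCLUSTERED INDEX IX_EventTable_EventStatusCode_Cover
     ON EventTable (EventStatusCode) INCLUDE (EventDate, EventTypeCode);

EventID does not need to be listed explicitly in either variant, because the clustered key is automatically carried in every nonclustered index.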

The reason that the order of the columns matters in a covering index (in the general case) is that the data is ordered by each column in turn. That is, column 2 acts as a tie-breaker for column 1, and column 3 acts as a tie-breaker for columns 1 and 2.

Since none of your queries filter on multiple columns, the columns after the first one add no filtering value (in your case).

If you had a query such as

 where EventDate = @EventDate and EventTypeCode = @EventTypeCode 

Then this covering index would be useful. EventDate is most likely more selective than EventTypeCode, so it goes first.

 on EventTable(EventDate, EventTypeCode, EventId, EventStatusCode) 
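If you want to sanity-check that selectivity assumption on your own data, a quick count of distinct values per column gives a rough indication (sketch only):

 SELECT COUNT(DISTINCT EventDate)       AS DistinctDates,
        COUNT(DISTINCT EventTypeCode)   AS DistinctTypes,
        COUNT(DISTINCT EventStatusCode) AS DistinctStatuses,
        COUNT(*)                        AS TotalRows
 FROM EventTable;

The column with more distinct values relative to the row count is the more selective one and is usually the better leading key for an equality filter.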

Further edit: if you have a query such as

 where EventDate between '2008-12-01' and '2008-12-31' and EventTypeCode = 'todo' 

Then this index will work best:

 on EventTable(EventTypeCode, EventDate, EventId, EventStatusCode) 

This puts all the "todo" events together, sorted by EventDate as a tie-breaker. SQL Server just has to find the first matching row, read until it finds a row that does not meet the criteria, and stop.

If EventDate were first in the index, the data would be sorted by date, and within each date the "todo" events would be grouped together. SQL Server would find the first todo on 12-01, read until it runs out of todos, then find the first todo on 12-02, read until it runs out of todos... and so on, across 31 dates.

You want to choose an index that places the rows you want next to each other.
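One way to verify which column order works better on your own data is to compare logical reads for the two candidate indexes; a rough sketch, assuming both indexes have been created:

 SET STATISTICS IO ON;

 -- Run once per candidate index and compare the 'logical reads' reported in the Messages tab.
 SELECT *
 FROM EventTable
 WHERE EventDate BETWEEN '2008-12-01' AND '2008-12-31'
   AND EventTypeCode = 'todo';

 SET STATISTICS IO OFF;

If the optimizer keeps picking the same index, a WITH (INDEX(...)) table hint can force each candidate in turn so the read counts can be compared directly.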


At 300 records per day, your table will only reach about 5 million records in 50 years (300 × 365 × 50 ≈ 5.5 million). That is not very big, so any of these strategies will work. Strategy 1 is likely fast enough (err on the side of saving space).


How often do you select from the table? Are the selects usually part of normal processing, or mostly for reporting and/or maintenance and debugging?

Is there any initial data load? If not, the table will be very small and will probably remain that way for years to come.

Although you give a few sample queries, do you know how often each type of select will be executed?

I would just leave the table as it is and run Profiler to see how it is accessed in production. If it turns out to be a table that is constantly accessed and could become a bottleneck for different functions, then I would take my best guess at which columns will most often appear in the WHERE clause and put an index on them. For example, if there is a process that looks at all the events in the last 24 hours and runs every 10 seconds, then an index on the date column may be appropriate, and I might even cluster on that column rather than on the primary key.
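If profiling did point that way, re-clustering on the date column might look roughly like this; the constraint and index names are assumptions, not from the original answer:

 -- Drop the existing clustered primary key (constraint name assumed here).
 ALTER TABLE EventTable DROP CONSTRAINT PK_EventTable;

 -- Re-create the primary key as nonclustered so EventID stays unique.
 ALTER TABLE EventTable ADD CONSTRAINT PK_EventTable PRIMARY KEY NONCLUSTERED (EventID);

 -- Cluster the table on the date column instead.
 CREATE CLUSTERED INDEX IX_EventTable_EventDate ON EventTable (EventDate);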


I would put an index on each of the foreign keys (I usually index most foreign keys), and then probably one on the date field, depending on how frequently it is used in searches.


Please take a look at this good SQL Server indexing article:

http://www.mssqltips.com/tip.asp?tip=1206

