Integer date representation

In a recent project, we had a performance problem with several queries that mostly order their results by a datetime field (MSSQL 2008 database).

When we ran the queries with ORDER BY RecordDate DESC (or ASC), they ran 10 times slower than without it. Ordering by any other field did not cause such a slowdown.

We tried all the indexing options and used the tuning wizard, but nothing changed.

One proposed solution was to convert the datetime field to an integer field holding the number of seconds or milliseconds that the datetime represents. It would be calculated with a simple algorithm, something like "give me the number of seconds from 1980-01-01 to RecordDate." This value would be stored on insert, and all sorting would be done on the integer field instead of the datetime field.

We never tried it, but I'm curious what you think.
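To make the proposal concrete, here is a minimal sketch of that conversion in Python. The 1980-01-01 epoch comes from the example above; the function name `to_epoch_seconds` is made up for illustration, and nothing here says whether the integer column would actually sort faster in SQL Server.

```python
from datetime import datetime, timedelta

# Epoch taken from the example in the question; any fixed date would do.
EPOCH = datetime(1980, 1, 1)

def to_epoch_seconds(record_date: datetime) -> int:
    """Whole seconds from EPOCH to record_date (floored, so ordering is kept)."""
    return (record_date - EPOCH) // timedelta(seconds=1)

# Sorting by the integer key gives the same order as sorting by the datetime.
dates = [datetime(2009, 3, 1, 12, 0), datetime(1980, 1, 2), datetime(2001, 9, 9)]
by_key = sorted(dates, key=to_epoch_seconds)
```

The integer only keeps whole seconds, so two rows within the same second would compare as equal.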

+4
13 answers

I always store dates as ints using the standard Unix timestamp, since most of the languages I program in use it as their default date and time representation. Obviously, this makes sorting by date much more efficient.

So yes, I recommend :)

+3

I think the SQL datetime data type is basically stored as a number behind the scenes in SQL Server anyway, so I would be surprised by these results.

Can you reproduce the slowness in Northwind or Pubs? If so, it might be worth calling MS, as this should not be 10 times slower. If not, then there might be something strange in your table.

If you are using SQL 2008 and you only need to store dates (not the time part), you can try the new date data type. It has less precision and therefore should sort faster.

+2

Are the inserts coming from .NET code ...

You can store the DateTime.Ticks value in a bigint column in the database and index it.

As for updating an existing database, it should be relatively trivial to write a CLR function that converts the existing DateTimes to a TickCount row by row:

ALTER TABLE dbo.MyTable ADD TickCount bigint NULL
GO
UPDATE dbo.MyTable SET TickCount = dbo.CLRFunction(DateTimeColumn)

This is definitely possible and will greatly improve your sorting performance.
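For anyone unsure what the tick value looks like: .NET's DateTime.Ticks counts 100-nanosecond intervals since 0001-01-01 00:00:00. Below is a rough Python sketch of the same arithmetic, only to show the numbers involved. The helper names `to_ticks` and `from_ticks` are made up, and this is not the CLR function itself.

```python
from datetime import datetime, timedelta

TICKS_PER_MICROSECOND = 10          # one tick = 100 ns
TICK_EPOCH = datetime(1, 1, 1)      # 0001-01-01 00:00:00

def to_ticks(dt: datetime) -> int:
    """Number of 100 ns ticks from 0001-01-01 to dt (naive datetimes only)."""
    microseconds = (dt - TICK_EPOCH) // timedelta(microseconds=1)
    return microseconds * TICKS_PER_MICROSECOND

def from_ticks(ticks: int) -> datetime:
    """Inverse of to_ticks; precision below one microsecond is dropped."""
    return TICK_EPOCH + timedelta(microseconds=ticks // TICKS_PER_MICROSECOND)
```

The values need a 64-bit column, which is why bigint is suggested above.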

+1

Isn't the date already stored as a number?

+1

Do you really need a DateTime or, more specifically, the time part? If not, I would look into storing the date as either an integer or a string in the ISO date format (YYYYMMDD) and see whether that gives you the performance you need. Storing ticks, time_t values, etc. would let you keep the time, but I would not bother unless you really need the time component. In addition, storing a readable date makes it slightly easier to debug data-related problems, simply because you can read and understand the data your program is working with.
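A small sketch of the YYYYMMDD integer idea, in Python for illustration. The function names are made up; the point is only that the integer stays human-readable and sorts in the same order as the dates.

```python
from datetime import date

def to_yyyymmdd(d: date) -> int:
    """Pack a date into a readable integer, e.g. 2009-03-07 -> 20090307."""
    return d.year * 10000 + d.month * 100 + d.day

def from_yyyymmdd(n: int) -> date:
    """Unpack the integer again; raises ValueError for impossible dates."""
    return date(n // 10000, n // 100 % 100, n % 100)

dates = [date(2009, 12, 31), date(2009, 3, 7), date(2008, 2, 29)]
keys = sorted(to_yyyymmdd(d) for d in dates)
```

Note that differences between two such integers are not day counts, so date arithmetic still needs a conversion back to a real date.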

+1

The only reasonable way to store dates is as Julian days. Unix timestamps are a shortcut.

By reasonable, I mean in code. In a database it is usually (but not always) better to store dates as datetime.

The database problem you are experiencing sounds like a different issue. I doubt that changing the type of the field will make a huge difference.

It is difficult to be specific without seeing details such as the queries, the number of records, etc., but the general recommendation is to restructure the query so that fewer records have to be sorted, since this can significantly affect performance.

0

I really don't understand why indexing doesn't help if SQL Server stores the date as an integer representation behind the scenes.

Sorting by identity columns, or by any other indexed field, gives excellent results.

0

I will vote for indexing. As I said in the comments above, your dates are stored as two ints behind the scenes anyway (in SQL 2000, at least), so I don't see it making a difference. It's hard to say what the real problem is without more information, but my gut feeling is that this is not it. If you have a dev environment (and you should :)), try adding an int field there and running the raw queries. It will not be hard to do, and you will get a definitive answer on this idea.
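To illustrate the "two ints" point: as far as I know, the classic datetime type is documented as a day count from 1900-01-01 plus a count of 1/300-second ticks since midnight. The Python sketch below mimics that split under that assumption. It does not read real SQL Server bytes, and `split_datetime` is a made-up name.

```python
from datetime import datetime, timedelta

BASE_DATE = datetime(1900, 1, 1)   # assumed base date of the day counter
TICKS_PER_SECOND = 300             # assumed resolution of the time counter

def split_datetime(dt: datetime) -> tuple:
    """Return (days since 1900-01-01, 1/300 s ticks since midnight)."""
    delta = dt - BASE_DATE
    days = delta.days
    since_midnight = delta - timedelta(days=days)
    ticks = round(since_midnight / timedelta(seconds=1) * TICKS_PER_SECOND)
    return days, ticks

# Comparing the (days, ticks) pairs orders rows the same way as the datetimes.
rows = [datetime(2009, 1, 1, 8, 0), datetime(2008, 6, 30, 23, 59), datetime(2009, 1, 1, 7, 59)]
ordered = sorted(rows, key=split_datetime)
```

If that layout holds, an ORDER BY on datetime is already an integer comparison, which is why a separate int column seems unlikely to help much.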

0

Is RecordDate one of the fields in your WHERE clause? Also, is RecordDate your only ORDER BY criterion? Third, does your query join several tables, or is it a single-table query? If you do not filter on RecordDate and only use it as the ORDER BY criterion, this can cause a performance problem, since the indexes will not really help with the sorting in that case. The indexes will be used to resolve the joins, and the sorting will happen afterwards.

If so, then changing the data type of RecordDate may not help you much, since you are still sorting the record set after the fact.

0

I saw a BI database where dates are stored as an integer in YYYYMMDD format. A separate table maps these ints to the equivalent datetime, formatted string, year number, quarter number, month number, day of the week, holiday status, etc. All you have to do is join to this table to get anything date-related that you need. Very convenient.
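Roughly what such a lookup table contains, sketched as a Python dict keyed by the YYYYMMDD integer. The column names and `build_date_dimension` are made up for illustration; a real BI date table would be a database table with more attributes, such as holiday status.

```python
from datetime import date, timedelta

def build_date_dimension(start: date, end: date) -> dict:
    """Map each YYYYMMDD integer in [start, end] to its date attributes."""
    table = {}
    current = start
    while current <= end:
        key = current.year * 10000 + current.month * 100 + current.day
        table[key] = {
            "date": current,
            "year": current.year,
            "quarter": (current.month - 1) // 3 + 1,
            "month": current.month,
            "iso_weekday": current.isoweekday(),  # 1 = Monday ... 7 = Sunday
        }
        current += timedelta(days=1)
    return table

dim = build_date_dimension(date(2009, 1, 1), date(2009, 12, 31))
row = dim[20090704]  # the "join": look up everything about one date key
```

The fact tables then only carry the integer key, and every date attribute comes from the lookup.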

0

I would suggest using the Julian date used in Excel. All financial applications use this representation to improve performance, and it provides a relatively good range of values.
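For reference, a sketch of the Excel-style serial number in Python, assuming the usual 1900 date system, where dates from 1900-03-01 onward equal the number of days since 1899-12-30. The function name is made up, and earlier dates are rejected because Excel treats 1900 as a leap year.

```python
from datetime import date

EXCEL_BASE = date(1899, 12, 30)       # assumed base of the 1900 date system
FIRST_SAFE_DATE = date(1900, 3, 1)    # before this, Excel's numbering differs

def to_excel_serial(d: date) -> int:
    """Day serial as Excel's 1900 date system would show it."""
    if d < FIRST_SAFE_DATE:
        raise ValueError("dates before 1900-03-01 are not handled in this sketch")
    return (d - EXCEL_BASE).days

serial = to_excel_serial(date(2009, 1, 1))
```

Subtracting two serials gives the number of days between the dates, which the YYYYMMDD integer format cannot do.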

0

SELECT CAST(CONVERT(char(8), GETDATE(), 112) AS int) -- style 112 = yyyymmdd

Works well (and fast!).

0

I believe that datetime is physically stored as a float, so the improvement would be the same as when converting a float to an int.

I would rather use indexes, since that is what they are for, and datetime is for storing dates with times. There is a whole set of functions that work on datetime, so if you decide to use your own storage type, you will have to take care of all that yourself.

-1