I have a table containing a lot of data, where we especially care about the field date. The reason for this is that the amount of data has only risen ~ 30 times, and old methods will soon decay. A query I hope can help me optimize:
- take a list of dates (generated by a table function based on cte)
- get one record for each of these dates
- based on some definition of "closest"
For example, the current table contains data after 5 seconds (+/- few) intervals. I need to try this table and get a record that will be closest to the 30 second interval.
What I have now works great. I'm just wondering if there is a way to optimize it more. If I can do this in Linq To SQL, that will be neat too. I'm even interested in index suggestions given the number of date values (~ 2 million rows of mines).
declare @st datetime ; set @st = '2012-01-31 05:05:00';
declare @end datetime ; set @end = '2012-01-31 05:10:00';
select distinct
log.*
from
dbo.fn_GenerateDateSteps(@st, @end, 30) as d
inner join lotsOfLogData log on l.Id = (
select top 1 e.[Id]
from
lotsOfLogData as log
where
log.stationId = 1000
AND utcTime between DateAdd(s, -10, dt) AND DateAdd(s, 5, dt)
order by
abs(datediff(s, dt, UtcTime))
)
The structure of the lotsOfLogData table is shown below. There are a relatively small number of station identifiers (maybe 50), but there are many entries for each entry. We know the identifier of the station upon request.
create table
Id bigint identity(1,1) not null
, StationId int not null
, UtcTime datetime not null
)
fn_GenerateDateSteps returns the following data set for the specified parameters:
[DT]
2012-01-31 05:05:00.000
2012-01-31 05:05:30.000
2012-01-31 05:06:00.000
2012-01-31 05:06:30.000 (and so on, every 30 seconds)
I did it with a temporary table this way, but it turned out a little more expensive.
declare @dates table ( dt datetime, ClosestId bigint);
insert into @dates (dt) select dt from dbo.fn_GenerateDateSteps(@st, @end, 30)
update @dates set closestId = (
select * from lotsOfLogData inner join @dates on Id = ClosestId
Edit: Fixed
200K + . , (id/time + includes (.. ...) . , , ( ) index [id + time]. - . , , : D
-- subtree cost (crossapply) : .0808
-- subtree cost (id based) : .0797
-- see above query for what i ended up with