Aggregate function over a given time interval

My SQL is a little rusty and I am having quite a bit of trouble with this problem. Suppose I have a table with a Timestamp column and a Number column. The goal is to return a result set containing the average value for each interval of some arbitrarily chosen regular length.

So, for example, if I had the following initial data, the result with a 5-minute interval would be as follows:

    time                             value
    -------------------------------  -----
    06-JUN-12 12.40.00.000000000 PM  2
    06-JUN-12 12.41.35.000000000 PM  3
    06-JUN-12 12.43.22.000000000 PM  4
    06-JUN-12 12.47.55.000000000 PM  5
    06-JUN-12 12.52.00.000000000 PM  2
    06-JUN-12 12.54.59.000000000 PM  3
    06-JUN-12 12.56.01.000000000 PM  4

OUTPUT:

    start_time                       avg_value
    -------------------------------  ---------
    06-JUN-12 12.40.00.000000000 PM  3
    06-JUN-12 12.45.00.000000000 PM  5
    06-JUN-12 12.50.00.000000000 PM  2.5
    06-JUN-12 12.55.00.000000000 PM  4

Please note that this is an Oracle database, so Oracle-specific solutions are fine. This could, of course, be done with a stored procedure, but I was hoping to accomplish the task in a single query.

+7
4 answers
    CREATE TABLE tt (time TIMESTAMP, value NUMBER);

    -- The string literals below rely on the session's NLS_TIMESTAMP_FORMAT.
    INSERT INTO tt (time, value) VALUES ('06-JUN-12 12.40.00.000000000 PM', 2);
    INSERT INTO tt (time, value) VALUES ('06-JUN-12 12.41.35.000000000 PM', 3);
    INSERT INTO tt (time, value) VALUES ('06-JUN-12 12.43.22.000000000 PM', 4);
    INSERT INTO tt (time, value) VALUES ('06-JUN-12 12.47.55.000000000 PM', 5);
    INSERT INTO tt (time, value) VALUES ('06-JUN-12 12.52.00.000000000 PM', 2);
    INSERT INTO tt (time, value) VALUES ('06-JUN-12 12.54.59.000000000 PM', 3);
    INSERT INTO tt (time, value) VALUES ('06-JUN-12 12.56.01.000000000 PM', 4);

    WITH tmin AS (SELECT MIN(time) t FROM tt),
         tmax AS (SELECT MAX(time) t FROM tt)
    SELECT ranges.inf, ranges.sup, AVG(tt.value)
      FROM (
            -- One row per 5-minute range, from MIN(time) through MAX(time).
            SELECT 5*(level-1)*(1/24/60) + tmin.t AS inf,
                   5*(level)*(1/24/60)   + tmin.t AS sup
              FROM tmin, tmax
           CONNECT BY (5*(level-1)*(1/24/60) + tmin.t) <= tmax.t
           ) ranges
      JOIN tt
        -- Half-open range, so a row on a boundary falls in exactly one range
        -- (BETWEEN would count it in two).
        ON tt.time >= ranges.inf AND tt.time < ranges.sup
     GROUP BY ranges.inf, ranges.sup
     ORDER BY ranges.inf;

SQL Fiddle: http://sqlfiddle.com/#!4/9e314/11

Edit: beaten to it by Justin, as usual ... :-)

+8

Something like

    WITH st AS (
      SELECT to_timestamp('2012-06-06 12:40:00', 'yyyy-mm-dd hh24:mi:ss')
               + numtodsinterval((level-1)*5, 'MINUTE') start_time,
             to_timestamp('2012-06-06 12:40:00', 'yyyy-mm-dd hh24:mi:ss')
               + numtodsinterval(level*5, 'MINUTE') end_time
        FROM dual
     CONNECT BY level <= 10
    )
    SELECT st.start_time, avg(yt.value)
      FROM your_table yt, st
     -- Half-open range, so a boundary row is counted in only one interval.
     WHERE yt.time >= st.start_time
       AND yt.time <  st.end_time
     GROUP BY st.start_time   -- required because of the AVG aggregate
     ORDER BY st.start_time

should work. Instead of generating 10 intervals and hard-coding the lowest interval, you could extend the query to derive the starting point and the number of rows from MIN(time) and MAX(time) in the table.
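One way that extension might look is sketched below. It keeps the placeholder names from the query above (your_table, time, value); the bounds subquery and its column names first_time and last_time are made up for illustration. The first interval starts at MIN(time) itself, so it is only aligned to a round 5-minute mark if the earliest row happens to be.

```sql
-- Sketch only: bounds, first_time and last_time are illustrative names.
WITH bounds AS (
  SELECT MIN(time) AS first_time,
         MAX(time) AS last_time
    FROM your_table
),
st AS (
  -- Generate 5-minute intervals until the interval start passes MAX(time).
  SELECT first_time + NUMTODSINTERVAL((LEVEL - 1) * 5, 'MINUTE') AS start_time,
         first_time + NUMTODSINTERVAL(LEVEL * 5, 'MINUTE')       AS end_time
    FROM bounds
 CONNECT BY first_time + NUMTODSINTERVAL((LEVEL - 1) * 5, 'MINUTE') <= last_time
)
SELECT st.start_time, AVG(yt.value) AS avg_value
  FROM st
  JOIN your_table yt
    ON yt.time >= st.start_time
   AND yt.time <  st.end_time
 GROUP BY st.start_time
 ORDER BY st.start_time;
```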

+5

The answers from Justin and Sebas can be extended with a LEFT JOIN to fill in the "gaps" (intervals that contain no rows), which is often desirable.
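A minimal sketch of that LEFT JOIN variant, assuming the same placeholder table your_table(time, value) and the hard-coded start time and interval count from Justin's query. Intervals without any matching rows still appear in the result, with a NULL average.

```sql
-- Sketch only: the start time and the count of 10 intervals are hard-coded.
WITH st AS (
  SELECT TIMESTAMP '2012-06-06 12:40:00'
           + NUMTODSINTERVAL((LEVEL - 1) * 5, 'MINUTE') AS start_time,
         TIMESTAMP '2012-06-06 12:40:00'
           + NUMTODSINTERVAL(LEVEL * 5, 'MINUTE')       AS end_time
    FROM dual
 CONNECT BY LEVEL <= 10
)
SELECT st.start_time, AVG(yt.value) AS avg_value
  FROM st
  -- The interval list drives the query, so empty intervals are kept.
  LEFT JOIN your_table yt
    ON yt.time >= st.start_time
   AND yt.time <  st.end_time
 GROUP BY st.start_time
 ORDER BY st.start_time;
```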

If that is not necessary, an alternative is to fall back on old-school Oracle DATE arithmetic ...

    SELECT TRUNC(t.time)+FLOOR(TO_CHAR(t.time,'sssss')/300)*300/86400 AS time
         , AVG(t.value) AS avg_value
      FROM foo t
     WHERE t.time IS NOT NULL
     GROUP BY TRUNC(t.time)+FLOOR(TO_CHAR(t.time,'sssss')/300)*300/86400
     ORDER BY TRUNC(t.time)+FLOOR(TO_CHAR(t.time,'sssss')/300)*300/86400

Let me unpack that a bit. We can separate the date and time components, using TRUNC to get the date portion and TO_CHAR (with the 'sssss' format) to return the number of seconds since midnight. We know that 5 minutes is 300 seconds, and that there are 86,400 seconds in a day. So we divide the number of seconds by 300 and take the FLOOR of that (just the integer portion), which rounds us down to the previous 5-minute boundary. We multiply that back (by 300) to get seconds again, then divide by the number of seconds in a day (86400), and add the result back to the (truncated) date portion.
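To make the arithmetic concrete, here is the same expression applied to a single literal value taken from the question's sample data. The inline view and the names ts and bucket_start are only for illustration.

```sql
-- 12:43:22 is 12*3600 + 43*60 + 22 = 45802 seconds after midnight.
-- FLOOR(45802 / 300) = 152 whole 5-minute slots.
-- 152 * 300 = 45600 seconds, which is 12:40:00.
SELECT TRUNC(ts) + FLOOR(TO_CHAR(ts, 'sssss') / 300) * 300 / 86400 AS bucket_start
  FROM (SELECT TIMESTAMP '2012-06-06 12:43:22' AS ts FROM dual);
```

How the resulting DATE is displayed depends on the session's NLS_DATE_FORMAT, so wrap it in TO_CHAR with an explicit format if you want to see the time portion.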

Painful, yes. But incredibly fast.

NOTE: this returns the rounded time value as a DATE. It can be cast to a TIMESTAMP if necessary, but for even 5-minute boundaries a DATE has sufficient resolution.
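If a TIMESTAMP result is wanted, one possible shape is to compute the bucket in an inline view and cast it in the outer query. This is a sketch against the same foo table; the alias bucket is an illustrative name.

```sql
SELECT CAST(bucket AS TIMESTAMP) AS start_time,
       AVG(value)                AS avg_value
  FROM (
        -- Same DATE arithmetic as above, computed once per row.
        SELECT TRUNC(t.time) + FLOOR(TO_CHAR(t.time, 'sssss') / 300) * 300 / 86400 AS bucket,
               t.value
          FROM foo t
         WHERE t.time IS NOT NULL
       )
 GROUP BY bucket
 ORDER BY bucket;
```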

An advantage of this approach for a large table is that we can improve query performance by adding a covering index for this query:

    -- Column references in CREATE INDEX cannot use the query's table alias (t).
    CREATE INDEX foo_FBX1
        ON foo (TRUNC(time)+FLOOR(TO_CHAR(time,'sssss')/300)*300/86400, value);

ADDENDUM:

MiMo provided an answer for SQL Server, suggesting that it could be adapted for Oracle. Here is an adaptation of that approach in Oracle. Note that Oracle does not provide equivalents for the DATEDIFF and DATEADD functions. Instead, Oracle uses simple arithmetic.

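    -- CAST to DATE so the subtraction yields a number of days
    -- (TIMESTAMP minus DATE would yield an INTERVAL, which FLOOR cannot take).
    SELECT TO_DATE('00010101','YYYYMMDD')+FLOOR((CAST(t.time AS DATE)-TO_DATE('00010101','YYYYMMDD'))*288)/288 AS time
         , AVG(t.value) AS avg_value
      FROM foo t
     WHERE t.time IS NOT NULL
     GROUP BY TO_DATE('00010101','YYYYMMDD')+FLOOR((CAST(t.time AS DATE)-TO_DATE('00010101','YYYYMMDD'))*288)/288
     ORDER BY TO_DATE('00010101','YYYYMMDD')+FLOOR((CAST(t.time AS DATE)-TO_DATE('00010101','YYYYMMDD'))*288)/288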

The choice of January 1, 0001 A.D. as the base date is arbitrary, but I did not want to mess with negative values and work out whether FLOOR would be right or whether we would need CEIL for negative numbers. (The magic number 288 is the result of 1440 minutes per day divided by 5.) In this case, we take the fractional day count, multiply by 1440 and divide by 5, take the integer portion of that, and then convert it back to fractional days.
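As a worked illustration of the 288 factor, the same expression can be applied to one literal DATE. The inline view and the names d and bucket_start are assumptions for the example only.

```sql
-- Every whole day contributes exactly 288 five-minute slots.
-- Within the day, 12:43:22 is 45802 / 86400 of a day; multiplied by 288
-- that is 45802 / 300, roughly 152.67, and FLOOR keeps 152 slots.
-- 152 / 288 of a day is 760 minutes, i.e. 12:40.
SELECT TO_DATE('00010101', 'YYYYMMDD')
         + FLOOR((d - TO_DATE('00010101', 'YYYYMMDD')) * 288) / 288 AS bucket_start
  FROM (SELECT TO_DATE('2012-06-06 12:43:22', 'YYYY-MM-DD HH24:MI:SS') AS d FROM dual);
```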

It is tempting to pull this "base date" from a PL/SQL package or retrieve it from a subquery, but either of those could prevent the expression from being deterministic. And we would really like to keep open the option of creating a function-based index.

My preference is to avoid having to include a "base date" in the calculation.

+3

This is the solution for SQL Server:

    declare @startDate datetime = '2000-01-01T00:00:00'
    declare @interval int = 5

    -- your_table is a placeholder name (TABLE itself is a reserved word)
    select DATEADD(mi, (DATEDIFF(mi, @startDate, time)/@interval)*@interval, @startDate),
           AVG(value)
      from your_table
     group by DATEDIFF(mi, @startDate, time)/@interval
     order by DATEDIFF(mi, @startDate, time)/@interval

The start date is arbitrary. The idea is to compute the number of minutes since the start date and then group by that number divided by the interval (integer division, so every row in the same interval gets the same value).

It should adapt easily to Oracle using the equivalents of DATEADD and DATEDIFF.

+1