How to use subqueries in SQLAlchemy to get a moving average?

I want to retrieve both a list of measurements and a moving average of those measurements. I can do this with the following SQL statement (PostgreSQL interval syntax):

    SELECT time, value,
           ( SELECT AVG(t2.value)
             FROM measurements t2
             WHERE t2.time BETWEEN t1.time - interval '5 days' AND t1.time
           ) moving_average
    FROM measurements t1
    ORDER BY t1.time;
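To see what the correlated scalar subquery in the statement above computes, here is a minimal sketch that runs the same query shape against an in-memory SQLite database. The sample rows are made up for illustration, and because SQLite has no `interval` type, the sketch assumes ISO date strings and substitutes `date(t1.time, '-5 days')` for the PostgreSQL interval arithmetic.

```python
import sqlite3

# Hypothetical sample data: ISO date strings so that string comparison
# orders the rows chronologically.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (time TEXT PRIMARY KEY, value REAL)")
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?)",
    [
        ("2024-01-01", 10.0),
        ("2024-01-03", 20.0),
        ("2024-01-07", 30.0),
        ("2024-01-10", 40.0),
    ],
)

# The subquery sits in the SELECT list, so it may refer to t1.time
# and is evaluated once per outer row.
rows = conn.execute(
    """
    SELECT t1.time, t1.value,
           ( SELECT AVG(t2.value)
             FROM measurements t2
             WHERE t2.time BETWEEN date(t1.time, '-5 days') AND t1.time
           ) AS moving_average
    FROM measurements t1
    ORDER BY t1.time
    """
).fetchall()

for row in rows:
    print(row)
```

Each output row carries the average of all values whose time falls within the five days up to and including that row's time.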

I want SQLAlchemy code that generates a similar statement. I currently have this Python code:

    moving_average_days = 5  # configurable value, defaulting to 5
    t1 = Measurements.alias('t1')
    t2 = Measurements.alias('t2')
    query = select([t1.c.time, t1.c.value,
                    select([func.avg(t2.c.value)],
                           t2.c.time.between(t1.c.time - datetime.timedelta(moving_average_days),
                                             t1.c.time))],
                   t1.c.time > (datetime.datetime.utcnow() - datetime.timedelta(ndays))). \
        order_by(Measurements.c.time)

However, this generates the following SQL:

    SELECT t1.time, t1.value, avg_1
    FROM measurements AS t1,
         ( SELECT avg(t2.value) AS avg_1
           FROM measurements AS t2
           WHERE t2.time BETWEEN t1.time - %(time_1)s AND t1.time )
    WHERE t1.time > %(time_2)s
    ORDER BY t1.time;

This SQL places the subquery in the FROM clause, where it cannot refer to the column values of the outer query's rows, so PostgreSQL rejects it with this error:

    ERROR:  subquery in FROM cannot refer to other relations of same query level
    LINE 6: WHERE t2.time BETWEEN t1.time - interval '5 days' AN...

What I would like to know is: how do I get SQLAlchemy to move the subquery into the SELECT clause?

Alternatively, a different way of getting a moving average (without running a separate query for each (time, value) pair) would also be welcome.

1 answer

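Right, apparently what I needed was a so-called scalar select. Labelling the inner select makes SQLAlchemy treat it as a scalar expression in the SELECT clause instead of a FROM element. With that change I get this Python code, which works the way I want (it generates SQL equivalent to the first statement in my question, which was my goal):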

    moving_average_days = 5  # configurable value, defaulting to 5
    ndays = 90  # configurable value, defaulting to 90
    t1 = Measurements.alias('t1')
    t2 = Measurements.alias('t2')
    query = select([t1.c.time, t1.c.value,
                    select([func.avg(t2.c.value)],
                           t2.c.time.between(t1.c.time - datetime.timedelta(moving_average_days),
                                             t1.c.time)).label('moving_average')],
                   t1.c.time > (datetime.datetime.utcnow() - datetime.timedelta(ndays))). \
        order_by(t1.c.time)

This produces the following SQL:

    SELECT t1.time, t1.value,
           ( SELECT avg(t2.value) AS avg_1
             FROM measurements AS t2
             WHERE t2.time BETWEEN t1.time - :time_1 AND t1.time
           ) AS moving_average
    FROM measurements AS t1
    WHERE t1.time > :time_2
    ORDER BY t1.time;
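
On SQLAlchemy 1.4 or newer, where `select()` takes its columns positionally, the same idea can probably be spelled with an explicit `scalar_subquery()`. The following is only a sketch under assumptions: the table definition (column types, primary key) is hypothetical, since the original `Measurements` declaration is not shown, and the `ndays` filter is left out to keep the example short.

```python
import datetime

from sqlalchemy import Column, DateTime, Float, MetaData, Table, func, select

# Hypothetical table definition; the real Measurements table is not shown.
metadata = MetaData()
measurements = Table(
    "measurements",
    metadata,
    Column("time", DateTime, primary_key=True),
    Column("value", Float),
)

moving_average_days = 5  # configurable value, defaulting to 5

t1 = measurements.alias("t1")
t2 = measurements.alias("t2")

# scalar_subquery() marks the inner SELECT as a single-value expression,
# so it is rendered in the SELECT clause and correlated to t1.
moving_average = (
    select(func.avg(t2.c.value))
    .where(
        t2.c.time.between(
            t1.c.time - datetime.timedelta(days=moving_average_days),
            t1.c.time,
        )
    )
    .scalar_subquery()
    .label("moving_average")
)

query = select(t1.c.time, t1.c.value, moving_average).order_by(t1.c.time)

# Compile with the default dialect just to inspect the generated SQL.
sql = str(query)
print(sql)
```

Printing the query without an engine uses SQLAlchemy's generic dialect, so the exact parameter placeholders may differ from what a PostgreSQL connection would emit.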