I have a table that contains the power values (kW) for devices. Values are read from each device once a minute and inserted into the table with a timestamp. I need to calculate the energy consumption (kWh) for a given time interval and return the 10 most energy-consuming devices. Right now I query the rows for the given time interval and do the calculations in the backend, looping over all the records. This works well with a small number of devices and a short time period, but in real use I could have thousands of devices and a long time period.
So my question is, how can I do all this in PostgreSQL 9.4.4 so that my query returns only the 10 most energy-consuming (device_id, power_consumption) pairs?
Example table:

    CREATE TABLE measurements (
        id serial primary key,
        device_id integer,
        power real,
        created_at timestamp
    );

Simple data example:

| id | device_id | power | created_at               |
|----|-----------|-------|--------------------------|
| 1  | 1         | 10    | August, 26 2015 08:23:25 |
| 2  | 1         | 13    | August, 26 2015 08:24:25 |
| 3  | 1         | 12    | August, 26 2015 08:25:25 |
| 4  | 2         | 103   | August, 26 2015 08:23:25 |
| 5  | 2         | 134   | August, 26 2015 08:24:25 |
| 6  | 2         | 2     | August, 26 2015 08:25:25 |
| 7  | 3         | 10    | August, 26 2015 08:23:25 |
| 8  | 3         | 13    | August, 26 2015 08:24:25 |
| 9  | 3         | 20    | August, 26 2015 08:25:25 |

Required query results:

| id | device_id | power_consumption |
|----|-----------|-------------------|
| 1  | 1         | 24.0              |
| 2  | 2         | 186.5             |
| 3  | 3         | 28.0              |

A simplified example (with created_at in hours) of how I calculate the kWh value:

    data = [
        [
            {'id': 1, 'device_id': 1, 'power': 10.0, 'created_at': 0},
            {'id': 2, 'device_id': 1, 'power': 13.0, 'created_at': 1},
            {'id': 3, 'device_id': 1, 'power': 12.0, 'created_at': 2}
        ],
        [
            {'id': 4, 'device_id': 2, 'power': 103.0, 'created_at': 0},
            {'id': 5, 'device_id': 2, 'power': 134.0, 'created_at': 1},
            {'id': 6, 'device_id': 2, 'power': 2.0, 'created_at': 2}
        ],
        [
            {'id': 7, 'device_id': 3, 'power': 10.0, 'created_at': 0},
            {'id': 8, 'device_id': 3, 'power': 13.0, 'created_at': 1},
            {'id': 9, 'device_id': 3, 'power': 20.0, 'created_at': 2}
        ]
    ]

    # device_id: power_consumption
    results = {1: 0, 2: 0, 3: 0}

    for d in data:
        # Walk over consecutive pairs of records for one device
        for i in range(len(d) - 1):
            # Area between two records gives us kWh
            # X-axis is time (h)
            # Y-axis is power (kW)
            x1 = d[i]['created_at']
            x2 = d[i+1]['created_at']
            y1 = d[i]['power']
            y2 = d[i+1]['power']
            # Trapezoid area: width * average height
            results[d[i]['device_id']] += ((x2 - x1) * (y2 + y1)) / 2

    print(results)

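For rows shaped like the real table (actual timestamps instead of hour numbers), the same backend calculation could look roughly like the sketch below. The names `top_consumers` and `limit` are made up for illustration, and the sample rows are spaced one hour apart (an assumption, unlike the one-minute data above) so the totals match the simplified example.

```python
from collections import defaultdict
from datetime import datetime
import heapq


def top_consumers(rows, limit=10):
    """Return the `limit` largest (device_id, kWh) pairs.

    rows: iterable of (device_id, power_kw, created_at) tuples in any order.
    Energy is the trapezoidal area under each device's power curve.
    """
    by_device = defaultdict(list)
    for device_id, power_kw, created_at in rows:
        by_device[device_id].append((created_at, power_kw))

    energy = {}
    for device_id, samples in by_device.items():
        samples.sort()  # order each device's samples by timestamp
        total = 0.0
        for (t1, p1), (t2, p2) in zip(samples, samples[1:]):
            hours = (t2 - t1).total_seconds() / 3600.0
            total += hours * (p1 + p2) / 2.0  # trapezoid: width * mean height
        energy[device_id] = total

    # Pick the top `limit` devices without sorting the whole dict
    return heapq.nlargest(limit, energy.items(), key=lambda pair: pair[1])


# Hypothetical sample rows, one hour apart
rows = [
    (1, 10.0, datetime(2015, 8, 26, 8, 0)),
    (1, 13.0, datetime(2015, 8, 26, 9, 0)),
    (1, 12.0, datetime(2015, 8, 26, 10, 0)),
    (2, 103.0, datetime(2015, 8, 26, 8, 0)),
    (2, 134.0, datetime(2015, 8, 26, 9, 0)),
    (2, 2.0, datetime(2015, 8, 26, 10, 0)),
    (3, 10.0, datetime(2015, 8, 26, 8, 0)),
    (3, 13.0, datetime(2015, 8, 26, 9, 0)),
    (3, 20.0, datetime(2015, 8, 26, 10, 0)),
]

print(top_consumers(rows))  # [(2, 186.5), (3, 28.0), (1, 24.0)]
```

This still pulls every row into the backend, which is exactly the part I would like to move into the database.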
EDIT: check this one to find out how I decided to solve it.