The database holds about 1 billion rows and runs on a Core2Quad with 8GB RAM under 64-bit Ubuntu. Surely this query should not take half an hour.
It takes half an hour because of the way you have set up your indexes.
Your query has no multi-column index it can use to go straight to the desired rows. It does the next best thing, which is a bitmap index scan on barely selective indexes, followed by a top-N sort of the resulting set.
The two indexes mentioned, on security and on date, yield 1.3M and 2.3M rows respectively. Combining them is painfully slow because you are randomly looking up over a million rows and filtering each one.
To add insult to injury, your data structure stores and handles two highly correlated fields (date and time) separately. This confuses the query planner because Postgres does not collect correlation data between columns. Your queries thus almost always end up filtering through enormous sets of data and then ordering the filtered set by a separate criterion.
I would suggest the following changes:
Alter the table and add a datetime column of type timestamp with time zone. Combine your date and time columns into it.
Drop the date and time fields accordingly, as well as the indexes on them. Also drop the index on security.
Create an index on (security, datetime). (And don't mess around with nulls first / nulls last unless your ordering criteria contain these clauses too.)
Optionally, add a separate index on (datetime) or on (datetime, security) in case you ever need to run queries that compute statistics over all trades in a range of days or dates.
Vacuum analyze the whole mess once you are done with the above.
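The changes above could be sketched roughly as follows. This is only an illustration under assumptions the post does not confirm: that the existing columns are named "DATE" (type date) and "TIME" (type time), and that the old security index has the name shown. All index names here are hypothetical. On a billion-row table the UPDATE rewrites every row, so expect it to take a long time and plan for it.

```sql
-- Sketch only. Assumed: columns "DATE" (date) and "TIME" (time) exist;
-- every index name below is hypothetical.

-- 1. Add the combined column and fill it from the two existing ones.
--    date + time yields a timestamp; assigning it to a timestamptz column
--    interprets it in the session's TimeZone setting.
ALTER TABLE "YEAR" ADD COLUMN "DATETIME" timestamp with time zone;
UPDATE "YEAR" SET "DATETIME" = "DATE" + "TIME";

-- 2. Drop the old columns. Postgres drops indexes that involve a dropped
--    column automatically. Then drop the single-column security index.
ALTER TABLE "YEAR" DROP COLUMN "DATE", DROP COLUMN "TIME";
DROP INDEX IF EXISTS "YEAR_SECURITY_IDX";

-- 3. The composite index the rewritten query relies on.
CREATE INDEX "YEAR_SECURITY_DATETIME_IDX" ON "YEAR" ("SECURITY", "DATETIME");

-- 4. Optional: an index for range queries across all securities.
CREATE INDEX "YEAR_DATETIME_IDX" ON "YEAR" ("DATETIME");

-- 5. Reclaim space and refresh planner statistics.
--    VACUUM cannot run inside a transaction block.
VACUUM ANALYZE "YEAR";
```

The column order in the composite index matters: the equality condition on security comes first, so the matching entries are already sorted by datetime and the ORDER BY needs no separate sort step.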
You can then rewrite your query as follows:
SELECT "DATETIME", "TRADEPRICE"
FROM "YEAR"
WHERE '2010-03-01 00:00:00' <= "DATETIME"
  AND "DATETIME" < '2010-03-01 10:16:00'
  AND "SECURITY" = 'STW.AX'
  AND "TYPE" = 'TRADE'
ORDER BY "DATETIME" ASC
LIMIT 3;
This will yield the optimal plan: retrieving the top 3 rows from a filtered index scan on (security, datetime), which I would expect (since you have a billion rows) to take 25 ms at most.
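One way to check whether the planner actually picks that plan is to prefix the query with EXPLAIN. The sketch below assumes the combined "DATETIME" column and the composite index already exist. Note that the ANALYZE option really executes the query, so the timings it reports are measured, not estimated.

```sql
-- Show the chosen plan together with real timings and buffer usage.
EXPLAIN (ANALYZE, BUFFERS)
SELECT "DATETIME", "TRADEPRICE"
FROM "YEAR"
WHERE '2010-03-01 00:00:00' <= "DATETIME"
  AND "DATETIME" < '2010-03-01 10:16:00'
  AND "SECURITY" = 'STW.AX'
  AND "TYPE" = 'TRADE'
ORDER BY "DATETIME" ASC
LIMIT 3;
```

In the output, look for a Limit node sitting directly on top of an Index Scan on the (security, datetime) index, with the "TYPE" condition applied as a filter. If a Sort or a Bitmap Heap Scan shows up instead, the new index is not being used for this query.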
Denis de Bernardy