How can I prevent Postgres from inlining a subquery?

Here's a slow query on Postgres 9.1.6, even though the maximum possible count is 2 and both rows are already identified by their primary keys: (4.5 seconds)

    EXPLAIN ANALYZE
    SELECT COUNT(*)
    FROM tbl
    WHERE id IN ('6d48fc431d21', 'd9e659e756ad')
      AND data ? 'building_floorspace'
      AND data ?| ARRAY['elec_mean_monthly_use', 'gas_mean_monthly_use'];

                                  QUERY PLAN
    ------------------------------------------------------------------------
     Aggregate  (cost=4.09..4.09 rows=1 width=0) (actual time=4457.886..4457.887 rows=1 loops=1)
       ->  Index Scan using idx_tbl_on_data_gist on tbl  (cost=0.00..4.09 rows=1 width=0) (actual time=4457.880..4457.880 rows=0 loops=1)
             Index Cond: ((data ? 'building_floorspace'::text) AND (data ?| '{elec_mean_monthly_use,gas_mean_monthly_use}'::text[]))
             Filter: ((id)::text = ANY ('{6d48fc431d21,d9e659e756ad}'::text[]))
     Total runtime: 4457.948 ms
    (5 rows)

Hmm, maybe if I put the primary-key lookup in a subquery first ...: (no, still 4.5+ seconds)

    EXPLAIN ANALYZE
    SELECT COUNT(*)
    FROM (
      SELECT * FROM tbl WHERE id IN ('6d48fc431d21', 'd9e659e756ad')
    ) AS t
    WHERE data ? 'building_floorspace'
      AND data ?| ARRAY['elec_mean_monthly_use', 'gas_mean_monthly_use'];

                                  QUERY PLAN
    ------------------------------------------------------------------------
     Aggregate  (cost=4.09..4.09 rows=1 width=0) (actual time=4854.170..4854.171 rows=1 loops=1)
       ->  Index Scan using idx_tbl_on_data_gist on tbl  (cost=0.00..4.09 rows=1 width=0) (actual time=4854.165..4854.165 rows=0 loops=1)
             Index Cond: ((data ? 'building_floorspace'::text) AND (data ?| '{elec_mean_monthly_use,gas_mean_monthly_use}'::text[]))
             Filter: ((id)::text = ANY ('{6d48fc431d21,d9e659e756ad}'::text[]))
     Total runtime: 4854.220 ms
    (5 rows)

How can I prevent Postgres from inlining the subquery?

Background: I have a Postgres 9.1 table using hstore and a GiST index ..
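
For anyone who wants to reproduce the plans above, here is a minimal sketch of a comparable setup. The index names are taken from the EXPLAIN output; the column types are assumptions inferred from the plans (the `(id)::text` casts suggest a varchar key), not the actual table definition.

```sql
-- Hypothetical schema: the real definition of tbl is not shown in the question.
CREATE EXTENSION IF NOT EXISTS hstore;

CREATE TABLE tbl (
    id   varchar PRIMARY KEY,  -- the primary-key index gets the default name tbl_pkey
    data hstore                -- key/value column queried with ? and ?|
);

-- GiST index on the hstore column; it supports the ? and ?| operators,
-- which is why the planner can pick it for the WHERE clause.
CREATE INDEX idx_tbl_on_data_gist ON tbl USING gist (data);
```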

indexing postgresql subquery inlining
Feb 15 '13 at 15:22
2 answers

There seems to be a way to tell Postgres not to inline the subquery: (0.223 ms!)

    EXPLAIN ANALYZE
    SELECT COUNT(*)
    FROM (
      SELECT * FROM tbl WHERE id IN ('6d48fc431d21', 'd9e659e756ad') OFFSET 0
    ) AS t
    WHERE data ? 'building_floorspace'
      AND data ?| ARRAY['elec_mean_monthly_use', 'gas_mean_monthly_use'];

                                  QUERY PLAN
    ------------------------------------------------------------------------
     Aggregate  (cost=8.14..8.15 rows=1 width=0) (actual time=0.165..0.166 rows=1 loops=1)
       ->  Subquery Scan on t  (cost=4.14..8.14 rows=1 width=0) (actual time=0.160..0.160 rows=0 loops=1)
             Filter: ((t.data ? 'building_floorspace'::text) AND (t.data ?| '{elec_mean_monthly_use,gas_mean_monthly_use}'::text[]))
             ->  Limit  (cost=4.14..8.13 rows=2 width=496) (actual time=0.086..0.092 rows=2 loops=1)
                   ->  Bitmap Heap Scan on tbl  (cost=4.14..8.13 rows=2 width=496) (actual time=0.083..0.086 rows=2 loops=1)
                         Recheck Cond: ((id)::text = ANY ('{6d48fc431d21,d9e659e756ad}'::text[]))
                         ->  Bitmap Index Scan on tbl_pkey  (cost=0.00..4.14 rows=2 width=0) (actual time=0.068..0.068 rows=2 loops=1)
                               Index Cond: ((id)::text = ANY ('{6d48fc431d21,d9e659e756ad}'::text[]))
     Total runtime: 0.223 ms
    (9 rows)

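The trick is the OFFSET 0 in the subquery. It adds a Limit node to the plan, which keeps the planner from flattening the subquery, so the primary-key lookup runs first and the hstore conditions are applied only as a filter on the two rows it returns.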

Feb 15 '13 at 15:22

I think OFFSET 0 is the better approach, since it is more obviously a hack that signals something strange is going on, and it is unlikely that we will ever change the optimizer's behavior around OFFSET 0 ... whereas hopefully CTEs will become inlinable at some point. The following explanation is for completeness; use Seamus's answer.

For uncorrelated subqueries, you can exploit PostgreSQL's refusal to inline WITH query terms and rephrase your query as:

    WITH t AS (
      SELECT * FROM tbl WHERE id IN ('6d48fc431d21', 'd9e659e756ad')
    )
    SELECT COUNT(*)
    FROM t
    WHERE data ? 'building_floorspace'
      AND data ?| ARRAY['elec_mean_monthly_use', 'gas_mean_monthly_use'];

This has much the same effect as the OFFSET 0 trick and, like that trick, exploits quirks in the Pg optimizer that people use to work around Pg's lack of query hints ... by using them as query hints.
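
Note for later versions: since PostgreSQL 12, a non-recursive, side-effect-free CTE that is referenced only once is inlined by default, so the plain WITH form stops acting as a fence there. A sketch of the same query with the fence requested explicitly, assuming PostgreSQL 12 or newer (the MATERIALIZED keyword does not exist in 9.1):

```sql
-- Assumes PostgreSQL 12+. MATERIALIZED forces the CTE to be computed on its
-- own, so the primary-key lookup runs before the hstore conditions.
WITH t AS MATERIALIZED (
  SELECT * FROM tbl WHERE id IN ('6d48fc431d21', 'd9e659e756ad')
)
SELECT COUNT(*)
FROM t
WHERE data ? 'building_floorspace'
  AND data ?| ARRAY['elec_mean_monthly_use', 'gas_mean_monthly_use'];
```

The OFFSET 0 form is unaffected by that change and still acts as a fence on those versions.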

Feb 15 '13 at 15:31