I have a table with over a billion records. To improve performance, I split it into 30 partitions. The most frequent queries have (id = ...) in their WHERE clause, so I decided to partition the table on the id column.
Basically, partitions were created this way:
CREATE TABLE foo_0 (CHECK (id % 30 = 0)) INHERITS (foo);
CREATE TABLE foo_1 (CHECK (id % 30 = 1)) INHERITS (foo);
CREATE TABLE foo_2 (CHECK (id % 30 = 2)) INHERITS (foo);
CREATE TABLE foo_3 (CHECK (id % 30 = 3)) INHERITS (foo);
. . .
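The remaining partitions follow the same pattern. As a sketch only (not necessarily how the tables were actually created), a loop like the following would generate all 30 children. It assumes PostgreSQL 9.0 or later for the DO statement and does not create the per-partition indexes that appear in the plans below.

```sql
-- Sketch: create foo_0 .. foo_29, each with a CHECK constraint on id % 30.
-- Assumes the parent table foo already exists.
DO $$
BEGIN
    FOR i IN 0..29 LOOP
        -- Build and run one CREATE TABLE statement per remainder value.
        EXECUTE 'CREATE TABLE foo_' || i
             || ' (CHECK (id % 30 = ' || i || ')) INHERITS (foo)';
    END LOOP;
END
$$;
```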
I ran ANALYZE on the entire database and, in particular, made it collect more statistics on this table's id column by doing:
ALTER TABLE foo ALTER COLUMN id SET STATISTICS 10000;
However, when I run queries that filter on the id column, the planner shows that it is still scanning all the partitions. constraint_exclusion is set to partition, so that is not the problem.
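For reference, one way to confirm the value the current session actually sees, and where it was set, is sketched below. It uses only the standard SHOW command and the pg_settings view.

```sql
-- Value in effect for the current session.
SHOW constraint_exclusion;

-- Same value, plus where it came from (default, configuration file, session, ...).
SELECT name, setting, source
FROM pg_settings
WHERE name = 'constraint_exclusion';
```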
EXPLAIN ANALYZE SELECT * FROM foo WHERE (id = 2);
                                                   QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
 Result (cost=0.00..8106617.40 rows=3620981 width=54) (actual time=30.544..215.540 rows=171477 loops=1)
   -> Append (cost=0.00..8106617.40 rows=3620981 width=54) (actual time=30.539..106.446 rows=171477 loops=1)
         -> Seq Scan on foo (cost=0.00..0.00 rows=1 width=203) (actual time=0.002..0.002 rows=0 loops=1)
               Filter: (id = 2)
         -> Bitmap Heap Scan on foo_0 foo (cost=3293.44..281055.75 rows=122479 width=52) (actual time=0.020..0.020 rows=0 loops=1)
               Recheck Cond: (id = 2)
               -> Bitmap Index Scan on foo_0_idx_1 (cost=0.00..3262.82 rows=122479 width=0) (actual time=0.018..0.018 rows=0 loops=1)
                     Index Cond: (id = 2)
         -> Bitmap Heap Scan on foo_1 foo (cost=3312.59..274769.09 rows=122968 width=56) (actual time=0.012..0.012 rows=0 loops=1)
               Recheck Cond: (id = 2)
               -> Bitmap Index Scan on foo_1_idx_1 (cost=0.00..3281.85 rows=122968 width=0) (actual time=0.010..0.010 rows=0 loops=1)
                     Index Cond: (id = 2)
         -> Bitmap Heap Scan on foo_2 foo (cost=3280.30..272541.10 rows=121903 width=56) (actual time=30.504..77.033 rows=171477 loops=1)
               Recheck Cond: (id = 2)
               -> Bitmap Index Scan on foo_2_idx_1 (cost=0.00..3249.82 rows=121903 width=0) (actual time=29.825..29.825 rows=171477 loops=1)
                     Index Cond: (id = 2)
 . . .
What can I do to make the planner smarter? Do I need to run ALTER TABLE foo ALTER COLUMN id SET STATISTICS 10000; for each partition as well?
EDIT
After applying Erwin's suggested change to the query, the planner scans only the correct partition. However, the execution time is actually worse than when it scans all partitions (at least their indexes).
EXPLAIN ANALYZE select * from foo where (id = 2);
                                                   QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Result (cost=0.00..8106617.40 rows=3620981 width=54) (actual time=32.611..224.934 rows=171477 loops=1)
   -> Append (cost=0.00..8106617.40 rows=3620981 width=54) (actual time=32.606..116.565 rows=171477 loops=1)
         -> Seq Scan on foo (cost=0.00..0.00 rows=1 width=203) (actual time=0.002..0.002 rows=0 loops=1)
               Filter: (id = 2)
         -> Bitmap Heap Scan on foo_0 foo (cost=3293.44..281055.75 rows=122479 width=52) (actual time=0.046..0.046 rows=0 loops=1)
               Recheck Cond: (id = 2)
               -> Bitmap Index Scan on foo_0_idx_1 (cost=0.00..3262.82 rows=122479 width=0) (actual time=0.044..0.044 rows=0 loops=1)
                     Index Cond: (id = 2)
         -> Bitmap Heap Scan on foo_1 foo (cost=3312.59..274769.09 rows=122968 width=56) (actual time=0.021..0.021 rows=0 loops=1)
               Recheck Cond: (id = 2)
               -> Bitmap Index Scan on foo_1_idx_1 (cost=0.00..3281.85 rows=122968 width=0) (actual time=0.020..0.020 rows=0 loops=1)
                     Index Cond: (id = 2)
         -> Bitmap Heap Scan on foo_2 foo (cost=3280.30..272541.10 rows=121903 width=56) (actual time=32.536..86.730 rows=171477 loops=1)
               Recheck Cond: (id = 2)
               -> Bitmap Index Scan on foo_2_idx_1 (cost=0.00..3249.82 rows=121903 width=0) (actual time=31.842..31.842 rows=171477 loops=1)
                     Index Cond: (id = 2)
         -> Bitmap Heap Scan on foo_3 foo (cost=3475.87..285574.05 rows=129032 width=52) (actual time=0.035..0.035 rows=0 loops=1)
               Recheck Cond: (id = 2)
               -> Bitmap Index Scan on foo_3_idx_1 (cost=0.00..3443.61 rows=129032 width=0) (actual time=0.031..0.031 rows=0 loops=1)
 . . .
         -> Bitmap Heap Scan on foo_29 foo (cost=3401.84..276569.90 rows=126245 width=56) (actual time=0.019..0.019 rows=0 loops=1)
               Recheck Cond: (id = 2)
               -> Bitmap Index Scan on foo_29_idx_1 (cost=0.00..3370.28 rows=126245 width=0) (actual time=0.018..0.018 rows=0 loops=1)
                     Index Cond: (id = 2)
 Total runtime: 238.790 ms
Versus:
EXPLAIN ANALYZE select * from foo where (id % 30 = 2) and (id = 2);
                                                   QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Result (cost=0.00..273120.30 rows=611 width=56) (actual time=31.519..257.051 rows=171477 loops=1)
   -> Append (cost=0.00..273120.30 rows=611 width=56) (actual time=31.516..153.356 rows=171477 loops=1)
         -> Seq Scan on foo (cost=0.00..0.00 rows=1 width=203) (actual time=0.002..0.002 rows=0 loops=1)
               Filter: ((id = 2) AND ((id % 30) = 2))
         -> Bitmap Heap Scan on foo_2 foo (cost=3249.97..273120.30 rows=610 width=56) (actual time=31.512..124.177 rows=171477 loops=1)
               Recheck Cond: (id = 2)
               Filter: ((id % 30) = 2)
               -> Bitmap Index Scan on foo_2_idx_1 (cost=0.00..3249.82 rows=121903 width=0) (actual time=30.816..30.816 rows=171477 loops=1)
                     Index Cond: (id = 2)
 Total runtime: 270.384 ms