Choosing the right partitioning rule

I am setting up a new PostgreSQL 9 database that will contain millions (or maybe billions) of rows, so I decided to partition the data using PostgreSQL table inheritance.

I created this master table (simplified):

CREATE TABLE mytable
(
  user_id integer,
  year integer,
  CONSTRAINT pk_mytable PRIMARY KEY (user_id, year)
);

And 10 partition tables:

CREATE TABLE mytable_0 () INHERITS (mytable);
CREATE TABLE mytable_1 () INHERITS (mytable);
...
CREATE TABLE mytable_9 () INHERITS (mytable);

I know that rows will always be accessed from the application with a condition on user_id. Therefore, I would like to distribute the data “perfectly” equally over the 10 tables using a rule based on user_id.

So that queries on the master table can be limited to the right partition, my first idea was to use a modulo CHECK constraint:

ALTER TABLE mytable_0 ADD CONSTRAINT mytable_user_id_check CHECK (user_id % 10 = 0);
ALTER TABLE mytable_1 ADD CONSTRAINT mytable_user_id_check CHECK (user_id % 10 = 1);
...
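
For reference, the ten modulo constraints could also be generated in a loop instead of being typed by hand. This is only a sketch: it assumes PostgreSQL 9.0 or later (for DO blocks), that the tables mytable_0 through mytable_9 already exist, and that none of them has a constraint named mytable_user_id_check yet.

```sql
-- Sketch: add "user_id % 10 = i" to each of the ten child tables.
-- Run this instead of the manual ALTER TABLE statements, not in addition to them.
DO $$
BEGIN
  FOR i IN 0..9 LOOP
    EXECUTE 'ALTER TABLE mytable_' || i
         || ' ADD CONSTRAINT mytable_user_id_check'
         || ' CHECK (user_id % 10 = ' || i || ')';
  END LOOP;
END;
$$;
```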

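The problem is that when I query the master table "mytable" with a condition on user_id, PostgreSQL scans every table and does not take advantage of the CHECK constraint: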

EXPLAIN SELECT * FROM mytable WHERE user_id = 12345;

"Result  (cost=0.00..152.69 rows=64 width=36)"
"  ->  Append  (cost=0.00..152.69 rows=64 width=36)"
"        ->  Seq Scan on mytable  (cost=0.00..25.38 rows=6 width=36)"
"              Filter: (user_id = 12345)"
"        ->  Seq Scan on mytable_0 mytable  (cost=0.00..1.29 rows=1 width=36)"
"              Filter: (user_id = 12345)"
"        ->  Seq Scan on mytable_1 mytable  (cost=0.00..1.52 rows=1 width=36)"
"              Filter: (user_id = 12345)"
...
"        ->  Seq Scan on mytable_9 mytable  (cost=0.00..1.52 rows=1 width=36)"
"              Filter: (user_id = 12345)"

Whereas if I use a classic range-based CHECK constraint like this (with data that matches the constraint):

ALTER TABLE mytable_0 ADD CONSTRAINT mytable_user_id_check CHECK (user_id BETWEEN 1 AND 10000);
ALTER TABLE mytable_1 ADD CONSTRAINT mytable_user_id_check CHECK (user_id BETWEEN 10001 AND 20000);
...

then the planner does use the constraint (only mytable and mytable_1 are scanned):

"Result  (cost=0.00..152.69 rows=64 width=36)"
"  ->  Append  (cost=0.00..152.69 rows=64 width=36)"
"        ->  Seq Scan on mytable  (cost=0.00..25.38 rows=6 width=36)"
"              Filter: (user_id = 12345)"
"        ->  Seq Scan on mytable_1 mytable  (cost=0.00..1.52 rows=1 width=36)"
"              Filter: (user_id = 12345)"

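Whether child tables are skipped like this depends on the constraint_exclusion setting. A minimal sketch for checking it before comparing plans, assuming a session where the setting may have been changed from its default:

```sql
-- Show the current value; 'partition' has been the default since PostgreSQL 8.4.
SHOW constraint_exclusion;

-- Enable it for the current session only, in case it was turned off.
SET constraint_exclusion = partition;

-- Re-run the plan and check which child tables still appear under Append.
EXPLAIN SELECT * FROM mytable WHERE user_id = 12345;
```
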
, , , . , , ...

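What rule could I use to split my data evenly over 10 tables while still benefiting from the CHECK constraint, so that a SELECT on the master table scans only the right table?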


+5
2

The issue is with the planner rather than with partitioning itself. The manual covers it:

http://www.postgresql.org/docs/9.1/static/ddl-partitioning.html

, , .

-, , . , ( , ). , PG , . , - .

-, , . :

  • , . , .
  • 10 .

, - , 100 000 1 . cron-, ( ).

, , , . , , , .

+5

The WHERE clause has to include the same expression as the CHECK constraint, i.e., in addition to user_id = 12345, also specify user_id % 10 = 5.

EXPLAIN SELECT * FROM mytable WHERE user_id = 12345 AND user_id % 10 = 5;
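
To avoid repeating the modulo predicate in every query, it could be wrapped in a function. The sketch below is an assumption, not part of the original answer: the name mytable_by_user is hypothetical, and it builds the statement dynamically so that the planner sees literal constants for both conditions, which constraint exclusion needs at plan time. It also assumes user_id is non-negative.

```sql
-- Hypothetical helper: look up rows for one user while exposing the
-- partition key expression (user_id % 10) to the planner as a constant.
CREATE OR REPLACE FUNCTION mytable_by_user(p_user_id integer)
RETURNS SETOF mytable
LANGUAGE plpgsql STABLE AS $$
BEGIN
  -- Both values are integers, so concatenating them into the SQL text is safe.
  RETURN QUERY EXECUTE
       'SELECT * FROM mytable WHERE user_id = ' || p_user_id
    || ' AND user_id % 10 = ' || (p_user_id % 10);
END;
$$;

-- Usage:
SELECT * FROM mytable_by_user(12345);
```

A plain SQL or static PL/pgSQL query with a parameter would not work the same way, because the parameter value is not necessarily known as a constant when the plan is built.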

, Richard Huxton , , , . Postgres , , .

+1
