Slow insert speed in a PostgreSQL in-memory tablespace

I have a requirement to store records in a database at a rate of 10,000 records per second (with indexes on several fields). Each record has 25 columns. I am doing batch inserts of 100,000 records in one transaction block. To improve insert speed I moved the tablespace from disk to RAM, but even so I can only achieve about 5,000 inserts per second.

I also made the following changes to the PostgreSQL configuration:

  • Indexes: none
  • fsync: off
  • logging: disabled

Additional Information:

  • Tablespace: RAM
  • Number of columns per row: 25 (mostly integer)
  • CPU: 4 cores, 2.5 GHz
  • RAM: 48 GB.

I am wondering why a single insert takes about 0.2 ms on average when the database is not writing anything to disk (since the tablespace is RAM-backed). Is there something I'm doing wrong?
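
Roughly, the setup is equivalent to the sketch below; the table, column names, and the tmpfs mount point are placeholders rather than my real schema:

 -- Tablespace backed by a RAM disk (assumes a tmpfs mounted at /mnt/ramdisk,
 -- with a directory owned by the postgres user)
 CREATE TABLESPACE ramspace LOCATION '/mnt/ramdisk/pgdata';

 -- Table with 25 mostly-integer columns, created without indexes
 CREATE TABLE records (
   id integer,
   f1 integer,
   f2 integer
   -- ... roughly 22 more columns ...
 ) TABLESPACE ramspace;

 -- One transaction block of 100,000 inserts
 BEGIN;
 INSERT INTO records (id, f1, f2) VALUES (1, 10, 100);
 INSERT INTO records (id, f1, f2) VALUES (2, 20, 200);
 -- ... 99,998 more rows ...
 COMMIT;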

Any help is appreciated.

Prashant

+6
postgresql
4 answers

Fast data loading

  • Convert your data to CSV.
  • Create a temporary table (as you noted, without indexes).
  • Run the COPY command: \COPY schema.temp_table FROM /tmp/data.csv WITH CSV
  • Insert the data from the temporary table into the permanent (non-temporary) table.
  • Create the indexes.
  • Set appropriate statistics (a sketch of the full sequence follows this list).
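
Putting those steps together, a minimal sketch might look like the following, using the climate.measurement table defined further down as a stand-in; the temporary table, index, and file names are illustrative only:

 -- 1. Load the CSV into an index-free temporary table
 CREATE TEMPORARY TABLE temp_measurement (LIKE climate.measurement INCLUDING DEFAULTS);
 \COPY temp_measurement FROM /tmp/data.csv WITH CSV

 -- 2. Insert the rows into the permanent table
 INSERT INTO climate.measurement SELECT * FROM temp_measurement;

 -- 3. Create indexes only after the data is in place
 CREATE INDEX measurement_taken_idx ON climate.measurement (taken);

 -- 4. Refresh planner statistics
 ANALYSE climate.measurement;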

Additional recommendations

For large amounts of data:

  • Split the data into child tables.
  • Insert the data in the order in which it will most often be read; in other words, try to align the physical model with the logical model.
  • Adjust the configuration settings.
  • Cluster the table on an index (most important column first). For example, as shown below; the note after the code covers actually running CLUSTER:
  CREATE UNIQUE INDEX measurement_001_stc_index
    ON climate.measurement_001
    USING btree
    (station_id, taken, category_id);
  ALTER TABLE climate.measurement_001 CLUSTER ON measurement_001_stc_index;
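
  Note that ALTER TABLE ... CLUSTER ON only records which index to use; the rows are physically reordered when CLUSTER is actually run, and statistics should be refreshed afterwards:

  -- Physically reorder the table along the recorded index
  CLUSTER climate.measurement_001;
  ANALYSE climate.measurement_001;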

Configuration settings

On a machine with 4 GB of RAM, I did the following ...

Kernel Configuration

Tell the kernel that it is OK for programs to use large amounts of shared memory:

 sysctl -w kernel.shmmax=536870912
 sysctl -p /etc/sysctl.conf

PostgreSQL configuration

  • Edit /etc/postgresql/8.4/main/postgresql.conf and set:
      shared_buffers = 1GB
     temp_buffers = 32MB
     work_mem = 32MB
     maintenance_work_mem = 64MB
     seq_page_cost = 1.0
     random_page_cost = 2.0
     cpu_index_tuple_cost = 0.001
     effective_cache_size = 512MB
     checkpoint_segments = 10 
  • Change the values as necessary and as appropriate for your environment; you may need to adjust them later for suitable read/write optimization.
  • Restart PostgreSQL.

Child tables

For example, let's say you have weather-based data divided into different categories. Instead of having one monstrous table, divide it into several tables (one for each category).

Master table

 CREATE TABLE climate.measurement (
   id bigserial NOT NULL,
   taken date NOT NULL,
   station_id integer NOT NULL,
   amount numeric(8,2) NOT NULL,
   flag character varying(1) NOT NULL,
   category_id smallint NOT NULL,
   CONSTRAINT measurement_pkey PRIMARY KEY (id)
 )
 WITH (
   OIDS=FALSE
 );

Child table

 CREATE TABLE climate.measurement_001 (
   -- Inherited from table climate.measurement: id bigint NOT NULL DEFAULT nextval('climate.measurement_id_seq'::regclass),
   -- Inherited from table climate.measurement: taken date NOT NULL,
   -- Inherited from table climate.measurement: station_id integer NOT NULL,
   -- Inherited from table climate.measurement: amount numeric(8,2) NOT NULL,
   -- Inherited from table climate.measurement: flag character varying(1) NOT NULL,
   -- Inherited from table climate.measurement: category_id smallint NOT NULL,
   CONSTRAINT measurement_001_pkey PRIMARY KEY (id),
   CONSTRAINT measurement_001_category_id_ck CHECK (category_id = 1)
 )
 INHERITS (climate.measurement)
 WITH (
   OIDS=FALSE
 );
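
With this layout, one common approach is for the application to insert directly into the child table that matches the category, while queries against the parent table still see every row. The values below are just an illustration:

 -- Rows for category 1 go straight into the matching child table
 INSERT INTO climate.measurement_001 (taken, station_id, amount, flag, category_id)
 VALUES ('2010-01-01', 42, 12.50, 'M', 1);

 -- A query against the parent table scans the children as well
 SELECT count(*) FROM climate.measurement WHERE taken = '2010-01-01';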

Table statistics

Increase table statistics for important columns:

 ALTER TABLE climate.measurement_001 ALTER COLUMN taken SET STATISTICS 1000;
 ALTER TABLE climate.measurement_001 ALTER COLUMN station_id SET STATISTICS 1000;

Do not forget to VACUUM and ANALYSE afterwards.
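
For example, for the child table above:

 VACUUM ANALYSE climate.measurement_001;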

+16

Are you doing your inserts as a series of single-row statements,

 INSERT INTO tablename (...) VALUES (...);
 INSERT INTO tablename (...) VALUES (...);
 ...

or as a single multi-row insert?

 INSERT INTO tablename (...) VALUES (...),(...),(...); 

The second form will be much faster for 100,000 rows.

source: http://kaiv.wordpress.com/2007/07/19/faster-insert-for-multiple-rows/

+4

Did you also put the xlog (the WAL segments) on your RAM disk? If not, you are still writing to disk. What about your settings for wal_buffers, checkpoint_segments, etc.? You should try to fit all of your 100,000 records (your single transaction) into wal_buffers. Increasing this setting may cause PostgreSQL to request more System V shared memory than your operating system's default configuration allows.
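
For illustration only, the relevant pieces might look like this; the values, data directory, and tmpfs mount point are assumptions that would need adapting, not recommendations:

 # postgresql.conf (illustrative values)
 wal_buffers = 16MB
 checkpoint_segments = 32

 # Shell: relocate pg_xlog to an assumed tmpfs mount, with the server stopped.
 # This trades durability for speed; a crash or power loss discards the WAL.
 mv /var/lib/postgresql/8.4/main/pg_xlog /mnt/ramdisk/pg_xlog
 ln -s /mnt/ramdisk/pg_xlog /var/lib/postgresql/8.4/main/pg_xlog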

+3

I suggest you use COPY instead of INSERT.
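
If the data is generated by an application rather than read from a file, the same idea works with COPY ... FROM STDIN. In psql, a session might look roughly like this (table and column names are placeholders):

 COPY tablename (col1, col2) FROM STDIN WITH CSV;
 1,10
 2,11
 \.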

You must also fine-tune your postgresql.conf file.

See http://wiki.postgresql.org/wiki/Performance_Optimization

+1
