Database temporarily down after multiple pgbench transactions

I am using (PostgreSQL) 9.2.1 and checking the database with pgbench.

pgbench -h 192.168.39.38 -p 5433 -t 1000 -c 40 -j 8 -C -U admin testdb 

When I use the -C option (establish a new connection for each transaction), transactions are always lost after transaction 16381.

 Connection to database "testdb" failed could not connect to server: Can't assign requested address Is the server running on host "192.168.39.38" and accepting TCP/IP connections on port 5433? Client 19 aborted in establishing connection. Connection to database "testdb" failed could not connect to server: Can't assign requested address Is the server running on host "192.168.39.38" and accepting TCP/IP connections on port 5433? Client 19 aborted in establishing connection. .... transaction type: TPC-B (sort of) scaling factor: 30 query mode: simple number of clients: 40 number of threads: 8 number of transactions per client: 1000 number of transactions actually processed: 16381/40000 tps = 1665.221801 (including connections establishing) tps = 9487.779510 (excluding connections establishing) 

And the number of processed transactions is always 16381 in each test. However, pgbench can be successful, and all transactions are processed in circumstances in which

-C not used

or

transaction amount less than 16381

After deleting these transactions, the database may continue to accept the connection in a few seconds. I wonder if I missed some PostgreSQL configuration.

thanks


Change I found that the client is blocked for connection for several seconds, but the rest can still access the database. Does this mean that the same client cannot send too many transactions in a short time?

+7
source share
4 answers

I found the reason he lost connections after about 16,000 transactions. TCP wait_time takes the blame for this error. The following command will show the status of TCP connections:

 $ netstat -n | awk '/^tcp/ {++S[$NF]} END {for(a in S) print a, S[a]}' 

However, it does not show TIME_WAIT on MAC OS X. Therefore, I skipped it. After I configure TCP wait_time with the following command, pgbench works correctly.

 $ sudo sysctl -w net.inet.tcp.msl=1500 net.inet.tcp.msl: 15000 -> 1500 

Thanks for the help.

+4
source

In fact, there is a limit to the maximum connections imposed by the OS. Read the max-connections in the documentation: (in bold)

Determines the maximum number of concurrent connections to the database server. By default, usually 100 connections are established, but may be less if your kernel settings do not support it (as defined during initdb) . This parameter can only be set at server startup.

Increasing this setting may cause PostgreSQL to request more shared memory or System V semaphores than your default configuration for your operating system allows. See section 17.4.1 for information on how to configure these settings, if necessary.

The fact that you can open only 16381 connections is explained by the presence of 2 ^ 14 (= 16384) possible maximum connections minus 3 connections reserved by default for superuser connections (see the documentation).

+1
source

Interestingly, 16381 is so close to power 2.

This is basically an assumption:

I am wondering this is an OS thing. Looking at the TPS data, is a new connection created for each transaction? [Edit yes, now that I have read your question correctly.]

Perhaps the OS has only so many connection resources that it can use and cannot immediately create a new connection after it made 16381 in the recent past (plus a few extra ones)?

An OS parameter can be set to indicate the number of available connection resources, which allows you to use more connections. Can you add some OS details to the question?


In particular, I suspect that the port number with which you are connecting is constantly increasing, and you click the limit. Try "lsof -i" and see if you can catch the connection somehow, and see if the number grows.

0
source

I decided by installing /etc/sysctl.conf:

 net.ipv4.ip_local_port_range = 32768 65000 net.ipv4.tcp_tw_recycle = 1 net.ipv4.tcp_fin_timeout = 10 
0
source

All Articles