PostgreSQL performance issue

In the PostgreSQL logs I see that some simple queries (no joins, only WHERE conditions on indexed columns) take 1 to 3 seconds to complete. I only log statements that take more than a second to execute, so there may be similar queries running just under that threshold that never get reported.
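For context, the logging comes from a postgresql.conf setting along these lines (the exact line here is illustrative; the one-second threshold is the one I described):

    log_min_duration_statement = 1000   # log statements that run longer than 1000 ms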

When I run the same query with EXPLAIN ANALYZE, it completes in a few milliseconds.
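For example, running it like this (the same query as in the log entry below) returns in a few milliseconds:

    EXPLAIN ANALYZE
    SELECT * FROM "answers"
    WHERE ("answers".contest_id = 17469) AND (user_id IS NOT NULL)
    ORDER BY updated_on DESC LIMIT 5;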

The table contains about 8 million records and is heavily written to and read from. Autovacuum is enabled, and VACUUM ANALYZE was run on this table recently (a few hours ago).

An example of a query log entry (wrapped across two syslog lines, reassembled here):

    Dec 30 10:14:57 db01 postgres[7400]: [20-1] LOG:  duration: 3857.322 ms  statement:
        SELECT * FROM "answers" WHERE ("answers".contest_id = 17469)
        AND (user_id IS NOT NULL) ORDER BY updated_on DESC LIMIT 5

contest_id and user_id are indexed; updated_on is not. If I add an index on updated_on, the query planner ignores the contest_id index and scans via updated_on instead, which makes the query even slower. Without the LIMIT, the query above should never match more than about 1000 rows.
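For reference, the setup is roughly this (illustrative DDL, not my actual schema):

    CREATE INDEX answers_contest_id_idx ON answers (contest_id);
    CREATE INDEX answers_user_id_idx    ON answers (user_id);

    -- The experiment that backfires: with this index added, the planner walks
    -- updated_on in order and ignores answers_contest_id_idx, which is slower.
    CREATE INDEX answers_updated_on_idx ON answers (updated_on);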

Any help would be greatly appreciated.

+4
5 answers

This is due to swapping.

0

A few more details would help here, if you can provide them. The most useful would be the actual EXPLAIN ANALYZE output, so we can see what the query is doing when it does finish. The definition of the table in question, along with its indexes, would also be useful. The more information, the more fun this gets. For now I can only guess at what is going on; here are a few shots in the dark:

  • Many other queries run against this database at the same time, and periodically the data and/or indexes this query needs end up falling out of cache.
  • Something else periodically takes a lock on this table for 3-4 seconds before releasing it, and this query sits stuck behind it for that time (see the lock-check sketch below).
  • The table is written to in a way that leaves its statistics out of step with reality, so the query planner keeps flip-flopping on whether or not to use the index for this query.

Other people may have other ideas, but in any case, more information about what is happening would help.
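If the second guess is right, running something like this while the query is stuck should show who is waiting on what (a rough sketch against the 8.4-era catalogs, where pg_stat_activity still has procpid and current_query):

    SELECT l.locktype, l.relation::regclass, l.mode, l.granted, a.current_query
    FROM pg_locks l
    JOIN pg_stat_activity a ON a.procpid = l.pid
    WHERE NOT l.granted;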

+3

pgsql-performance is a great mailing list for questions like this.

You seem to have two problems:

1) You would like an index on updated_on, but when you add one, PostgreSQL chooses the wrong plan.

My first wild guess is that PostgreSQL overestimates the number of tuples matching the predicate (answers.contest_id = 17469) AND (user_id IS NOT NULL). If Postgres applies this predicate first, it then has to sort the matching rows to satisfy the ORDER BY. You say it matches at most 1000 tuples; if PostgreSQL thinks it matches 100,000, it may decide that scanning the table in order via the updated_on index is cheaper. Another factor could be your configuration: if work_mem is set low, the sort may look more expensive to the planner than it really is.
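Two quick sanity checks you could run (a sketch; the work_mem value is just an example):

    -- Does the planner's row estimate for the predicate match the ~1000 rows you expect?
    EXPLAIN SELECT * FROM answers
    WHERE contest_id = 17469 AND user_id IS NOT NULL;

    -- Does giving the sort more memory change the chosen plan?
    SET work_mem = '32MB';
    EXPLAIN SELECT * FROM answers
    WHERE contest_id = 17469 AND user_id IS NOT NULL
    ORDER BY updated_on DESC LIMIT 5;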

You really need to show the EXPLAIN ANALYZE output of the slow query so that we can understand why it chooses an index scan on updated_on.

2) Even without the updated_on index, the query sometimes takes seconds to execute, but you can't see why, because when you run it manually it is fast.

Use the auto_explain contrib module, new in 8.4. It lets you log the EXPLAIN ANALYZE output of statements that run for too long. Logging just the statement text leaves you exactly where you are now: every time you run the query by hand, it is fast.
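A minimal postgresql.conf setup for it might look like this (8.4-era syntax; the threshold is an example, and pre-9.2 releases also need custom_variable_classes to accept the module's parameters in the config file):

    shared_preload_libraries = 'auto_explain'
    custom_variable_classes = 'auto_explain'
    auto_explain.log_min_duration = '1s'   # log plans of statements slower than this
    auto_explain.log_analyze = on          # include actual run times, not just the plan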

0

If exactly the same query takes milliseconds under EXPLAIN ANALYZE and 3 seconds in the logs (and I assume it occasionally takes 3 seconds, not on every call), then it is almost certainly a locking problem.
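One way to confirm that, without catching the query in the act, is to turn on lock-wait logging (available since 8.3; the timeout shown is the default):

    # postgresql.conf
    deadlock_timeout = '1s'
    log_lock_waits = on    # logs any session that waits on a lock longer than deadlock_timeout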

0
  • locking
  • swapping
  • CLUSTER / VACUUM FULL running from cron
  • a saturated network
  • saturated IO

check iostat, vmstat, iptraf ...
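For example (intervals in seconds, chosen arbitrarily):

    iostat -x 5   # per-device utilization and I/O wait
    vmstat 5      # si/so columns show swapping; r column shows run-queue pressure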

0
