A simple query ran fine for years, then suddenly became very slow

I have a query that worked fine for about 2 years. The database table has about 50 million rows and is growing slowly. Last week, one of my queries went from returning almost instantly to taking hours.

Rank.objects.filter(site=Site.objects.get(profile__client=client, profile__is_active=False)).latest('id') 

I narrowed the slow query down to the Rank model. It seems to be related to using the latest() method. If I just evaluate the filtered queryset, it immediately returns an empty queryset.

# count returns 0 and is fast
Rank.objects.filter(site=Site.objects.get(profile__client=client, profile__is_active=False)).count() == 0
Rank.objects.filter(site=Site.objects.get(profile__client=client, profile__is_active=False)) == []  # also very fast

Below are the results of running EXPLAIN: http://explain.depesz.com/s/wPh

And EXPLAIN ANALYZE: http://explain.depesz.com/s/ggi

I tried vacuuming the table, with no change. There is already an index on the "site" field (it is a ForeignKey).

The strange part is that if I run the same query for another client that already has Rank objects associated with their account, the query returns very quickly again. It seems to be a problem only when there are no Rank objects for the client.

Any ideas?

Version: Postgres 9.1, Django 1.4 svn trunk rev 17047

2 answers

Well, you have not shown the actual SQL, so it is difficult to be sure. But the EXPLAIN output suggests the planner thinks the fastest way to find a match is to scan the index on "id" backwards until it finds the client in question.

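Since you said it was fast until recently, this is probably not a stupid choice. However, there is always the possibility that a particular client's records will be at the very far end of this search. For a client with no Rank rows at all, the backward scan has to walk the whole index before it can give up.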

So, first try two things:

  • Run ANALYZE on the table in question and see if that gives the planner enough information.
  • If not, increase the statistics target (ALTER TABLE ... SET STATISTICS) on the relevant columns and run ANALYZE again. See what that does.
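
As a rough sketch of the two steps above, the statements could be generated with a small helper like the one below. The table name myapp_rank, the column site_id, and the target of 1000 are assumptions for illustration only; the question does not show the real names, and the helper itself is hypothetical.

```python
def stats_statements(table, column, target=None):
    """Return the SQL statements that refresh planner statistics.

    Identifiers are not quoted or validated here, so pass only trusted names.
    """
    statements = []
    if target is not None:
        # Raise the per-column statistics target before re-analyzing.
        statements.append(
            "ALTER TABLE {0} ALTER COLUMN {1} SET STATISTICS {2};".format(
                table, column, int(target)
            )
        )
    # ANALYZE re-samples the table so the planner sees fresh statistics.
    statements.append("ANALYZE {0};".format(table))
    return statements


# Hypothetical names; substitute the real table and column.
for sql in stats_statements("myapp_rank", "site_id", target=1000):
    print(sql)
```

Run the printed statements in psql, then repeat EXPLAIN ANALYZE to see whether the plan changes.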

http://www.postgresql.org/docs/9.1/static/planner-stats.html

If this still does not help, consider adding an index on (client, id) and dropping the index on id (if it is not needed elsewhere). That should give you lightning-fast answers.
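
To illustrate the composite-index idea, here is a minimal sketch that uses Python's built-in sqlite3 module instead of PostgreSQL, so the planner and its output differ from the real setup. The table ranking, the column site_id (standing in for the "client" column mentioned above), and the index name are all assumptions. With an index on (site_id, id), the newest row for one site is the last entry in that site's slice of the index, and a site with no rows is an immediate miss rather than a full backward scan.

```python
import sqlite3

LATEST_SQL = "SELECT id FROM ranking WHERE site_id = ? ORDER BY id DESC LIMIT 1"


def build_demo():
    """Create an in-memory table with a composite (site_id, id) index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ranking (id INTEGER PRIMARY KEY, site_id INTEGER NOT NULL)"
    )
    # Sites 1..9 have rows; site 99 has none, like the slow client in the question.
    conn.executemany(
        "INSERT INTO ranking (site_id) VALUES (?)",
        [(i % 9 + 1,) for i in range(9000)],
    )
    conn.execute("CREATE INDEX idx_ranking_site_id ON ranking (site_id, id)")
    return conn


def latest_id(conn, site_id):
    """Return the highest id for the site, or None if it has no rows."""
    row = conn.execute(LATEST_SQL, (site_id,)).fetchone()
    return row[0] if row else None


def query_plan(conn, site_id):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + LATEST_SQL, (site_id,)).fetchall()
    return [row[-1] for row in rows]


conn = build_demo()
print(latest_id(conn, 1))    # a site that has rows
print(latest_id(conn, 99))   # a site with no rows
print(query_plan(conn, 99))  # shows which index SQLite chose
```

On PostgreSQL the equivalent check is to create the index and rerun EXPLAIN ANALYZE on the original query for a client with no rows.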


latest() is usually used with date fields; maybe you should try ordering by id descending and then limiting the result.
