Django: distinct foreign keys

    class Log:
        project = ForeignKey(Project)
        msg = CharField(...)
        date = DateField(...)

I want to select the four most recent log entries, where each entry has a unique project foreign key. I have searched Google for solutions, but none of them work, and the Django documentation is not very searchable.

I tried things like:

    Log.objects.all().distinct('project')[:4]
    Log.objects.values('project').distinct()[:4]
    Log.objects.values_list('project').distinct('project')[:4]

But these either return nothing or return several entries from the same project.

Any help would be appreciated!

+8
django distinct foreign-keys
4 answers

Queries don't work like that, either in the Django ORM or in the underlying SQL. If you want unique IDs, you can only query for the ID. So you need two queries to get the actual log entries. Something like:

    id_list = Log.objects.order_by('-date').values_list('project_id').distinct()[:4]
    entries = Log.objects.filter(id__in=id_list)
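
Independent of the ORM, it can help to pin down the intended result in plain Python first: keep the newest entry of each project, then take the four newest of those. The sketch below is only an illustration of that target; the `LOGS` rows and the `latest_per_project` helper are hypothetical and stand in for real `Log` records.

```python
from datetime import date

# Hypothetical sample rows standing in for Log records: (id, project_id, date).
LOGS = [
    (1, "a", date(2020, 1, 1)),
    (2, "a", date(2020, 1, 5)),
    (3, "b", date(2020, 1, 3)),
    (4, "c", date(2020, 1, 4)),
    (5, "d", date(2020, 1, 2)),
    (6, "e", date(2019, 12, 31)),
]


def latest_per_project(logs, limit=4):
    """Return the newest row of each project, newest first, capped at `limit`."""
    newest = {}
    for row in logs:
        _id, project_id, day = row
        # Keep a row only if it is newer than the one stored for its project.
        if project_id not in newest or day > newest[project_id][2]:
            newest[project_id] = row
    return sorted(newest.values(), key=lambda r: r[2], reverse=True)[:limit]


result = latest_per_project(LOGS)
```

Any query-based solution should return the same entries as this in-memory version, which makes it a convenient reference when checking a queryset against test data.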
+11

Actually, you can get project_ids in SQL. Assuming you need unique project identifiers for four projects with the latest log entries, SQL would look like this:

    SELECT project_id, max(logs.date) AS max_date
    FROM logs
    GROUP BY project_id
    ORDER BY max_date DESC
    LIMIT 4;

Now, you actually want all of the log information. In PostgreSQL 8.4 and later you can use window functions, but that doesn't work in other versions/databases, so I'll do it the more complicated way:

    SELECT logs.*
    FROM logs
    JOIN (
        SELECT project_id, max(logs.date) AS max_date
        FROM logs
        GROUP BY project_id
        ORDER BY max_date DESC
        LIMIT 4
    ) AS latest
      ON logs.project_id = latest.project_id
     AND logs.date = latest.max_date;
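
As a rough way to try the join without a full database setup, the same query can be run against an in-memory SQLite table from Python. This is a sketch under assumptions: the table layout and the sample rows are made up for illustration, and the outer `ORDER BY` is an addition so that the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logs ("
    "id INTEGER PRIMARY KEY, project_id INTEGER, msg TEXT, date TEXT)"
)
# Hypothetical sample data: project 10 has two entries, the others one each.
conn.executemany(
    "INSERT INTO logs (id, project_id, msg, date) VALUES (?, ?, ?, ?)",
    [
        (1, 10, "a-old", "2020-01-01"),
        (2, 10, "a-new", "2020-01-05"),
        (3, 20, "b", "2020-01-03"),
        (4, 30, "c", "2020-01-04"),
        (5, 40, "d", "2020-01-02"),
        (6, 50, "e", "2019-12-31"),
    ],
)

QUERY = """
SELECT logs.id, logs.project_id, logs.msg, logs.date
FROM logs
JOIN (
    -- The four projects with the most recent log entry.
    SELECT project_id, max(logs.date) AS max_date
    FROM logs
    GROUP BY project_id
    ORDER BY max_date DESC
    LIMIT 4
) AS latest
  ON logs.project_id = latest.project_id
 AND logs.date = latest.max_date
ORDER BY logs.date DESC
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

One caveat of joining on the date: if a project has two entries with the same latest date, both rows come back, so the result can contain more than four rows.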

Now, if you have access to window functions, it's a bit neater (I think, anyway), and certainly faster to execute:

    SELECT *
    FROM (
        SELECT logs.field1, logs.field2, logs.field3, logs.date,
               rank() OVER (PARTITION BY project_id ORDER BY "date" DESC) AS dateorder
        FROM logs
    ) AS logsort
    WHERE dateorder = 1
    ORDER BY logsort.date DESC
    LIMIT 4;
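
The window-function variant can be tried the same way. The sketch below assumes a Python build linked against SQLite 3.25 or newer (the first release with window functions) and uses a made-up table and rows. It limits the result to four rows, since the question asks for four entries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE logs ("
    "id INTEGER PRIMARY KEY, project_id INTEGER, msg TEXT, date TEXT)"
)
# Hypothetical sample data: project 10 has two entries, the others one each.
conn.executemany(
    "INSERT INTO logs (id, project_id, msg, date) VALUES (?, ?, ?, ?)",
    [
        (1, 10, "a-old", "2020-01-01"),
        (2, 10, "a-new", "2020-01-05"),
        (3, 20, "b", "2020-01-03"),
        (4, 30, "c", "2020-01-04"),
        (5, 40, "d", "2020-01-02"),
        (6, 50, "e", "2019-12-31"),
    ],
)

QUERY = """
SELECT logsort.id, logsort.project_id, logsort.msg, logsort.date
FROM (
    SELECT logs.id, logs.project_id, logs.msg, logs.date,
           -- Rank each project's entries from newest (1) to oldest.
           rank() OVER (
               PARTITION BY logs.project_id ORDER BY logs.date DESC
           ) AS dateorder
    FROM logs
) AS logsort
WHERE logsort.dateorder = 1
ORDER BY logsort.date DESC
LIMIT 4
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Because `rank()` gives tied rows the same rank, a project with two entries on its latest date would contribute both of them; swapping in `row_number()` is one way to force exactly one row per project.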

Well, maybe it’s not so easy to understand, but honestly, it works faster on a large database.

I'm not quite sure how this translates to ORM syntax, or whether it does at all. Also, if you want other project data, you will need to join to the project table.

+3

I know this is an old post, but in Django 2.0, I think you could just use:

 Log.objects.values('project').distinct().order_by('project')[:4] 
0

You need two querysets. The good news is that this still results in a single trip to the database (although it involves a subquery).

    from django.db.models import Max

    latest_ids_per_project = Log.objects.values_list(
        'project').annotate(latest=Max('date')).order_by(
        '-latest').values_list('project')
    log_objects = Log.objects.filter(
        id__in=latest_ids_per_project[:4]).order_by('-date')

This looks a bit confusing, but actually leads to a surprisingly compact query:

    SELECT "log"."id", "log"."project_id", "log"."msg", "log"."date"
    FROM "log"
    WHERE "log"."id" IN (
        SELECT U0."id"
        FROM "log" U0
        GROUP BY U0."project_id"
        ORDER BY MAX(U0."date") DESC
        LIMIT 4
    )
    ORDER BY "log"."date" DESC
0