Most efficient Django query for returning results that span multiple tables

I am trying to write a rather complicated query in Django in the most efficient way, and I'm not sure how to get started. I have these models (this is a simplified version):

    class Status(models.Model):
        status = models.CharField(max_length=200)

    class User(models.Model):
        name = models.CharField(max_length=200)

    class Event(models.Model):
        user = models.ForeignKey(User)

    class EventItem(models.Model):
        event = models.ForeignKey(Event)
        rev1 = models.ForeignKey(Status, related_name='rev1', blank=True, null=True)
        rev2 = models.ForeignKey(Status, related_name='rev2', blank=True, null=True)
        active = models.BooleanField()

I want to create a query that returns a list of the users who have the most events in which every dependent EventItem has a non-null rev1 and rev2 and active = True.

I know that I can do this by looping over the list of users, checking each of their events against the rev1, rev2 and active criteria, and then returning the events that match, but this is hard on the database. Any suggestions?

Thanks!

3 answers

I want to create a query that returns a list of the users who have the most events in which every dependent EventItem has a non-null rev1 and rev2 and active = True.

First, you want the Event objects whose EventItems meet these criteria.

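    # active lives on EventItem, so filter across the reverse relation.
    # This keeps events with at least one active item;
    # exclude(eventitem__active=False) would require every item to be active.
    events = Event.objects.filter(eventitem__active=True)
    # Drop events that have any EventItem with a missing rev1 or rev2.
    # rev1 and rev2 are foreign keys, so there is no empty-string case to exclude.
    events = events.exclude(eventitem__rev1__isnull=True)
    events = events.exclude(eventitem__rev2__isnull=True)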

In addition, you did not say whether you want to include Event objects that have no EventItem at all. You can filter them out with:

 events = events.exclude(eventitem__isnull=True) 

Please note that events can contain many duplicates. You can add events.distinct() if you want, but that is only worthwhile if the result needs to be read by a person.
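To illustrate where such duplicates come from, here is a minimal sketch using the standard-library sqlite3 module rather than the ORM. It assumes the queryset filters across the eventitem relation, so the database joins the two tables; the table names app_event and app_eventitem and the sample rows are hypothetical.

```python
import sqlite3

# Hypothetical tables standing in for the Event and EventItem models.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE app_event (id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE app_eventitem "
    "(id INTEGER PRIMARY KEY, event_id INTEGER, active INTEGER)"
)
conn.execute("INSERT INTO app_event (id) VALUES (1)")
# One event with three active items.
conn.executemany(
    "INSERT INTO app_eventitem (event_id, active) VALUES (?, ?)",
    [(1, 1), (1, 1), (1, 1)],
)

# A join returns one row per matching item, so the event appears three times.
join_sql = (
    "SELECT e.id FROM app_event e "
    "JOIN app_eventitem i ON i.event_id = e.id WHERE i.active = 1"
)
with_duplicates = conn.execute(join_sql).fetchall()

# DISTINCT collapses the repeated rows, which is roughly what .distinct() asks for.
deduplicated = conn.execute(join_sql.replace("SELECT", "SELECT DISTINCT", 1)).fetchall()

print(len(with_duplicates), len(deduplicated))
```

The duplicates do no harm when the queryset is only used inside an __in lookup, which is why distinct() is optional here.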

After that, you can now retrieve the User objects that you want:

 users = User.objects.filter(event__in=events) 

Note that on some (ahem, MySQL, ahem) databases, you may find that the .filter(field__in=QuerySet) pattern is very slow. In this case, the code should be:

 users = User.objects.filter(event__in=list(events.values_list('pk', flat=True))) 
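The difference between the two forms is that wrapping the inner queryset in list() runs it immediately and hands the outer query literal IDs instead of a nested subquery. A rough sketch of that two-step pattern with the standard-library sqlite3 module follows; the table names, the sample rows and the `id >= 11` condition are hypothetical stand-ins for the real event filter.

```python
import sqlite3

# Hypothetical tables standing in for the User and Event models.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE app_user (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE app_event (id INTEGER PRIMARY KEY, user_id INTEGER)")
conn.executemany(
    "INSERT INTO app_user VALUES (?, ?)",
    [(1, "alice"), (2, "bob"), (3, "carol")],
)
conn.executemany(
    "INSERT INTO app_event VALUES (?, ?)",
    [(10, 1), (11, 1), (20, 2)],
)

# Step 1: run the inner query now and keep the primary keys in Python.
event_ids = [
    row[0]
    for row in conn.execute("SELECT id FROM app_event WHERE id >= 11 ORDER BY id")
]

# Step 2: pass the literal IDs to the outer query, one placeholder per ID.
placeholders = ", ".join("?" for _ in event_ids)
users = conn.execute(
    "SELECT u.name FROM app_user u "
    "JOIN app_event e ON e.user_id = u.id "
    f"WHERE e.id IN ({placeholders}) ORDER BY u.name",
    event_ids,
).fetchall()

print(event_ids, users)
```

The trade-off is an extra round trip and a potentially long ID list, so this is only worth doing when the subquery form is measurably slow.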

Then you can order things by the number of Event objects attached:

    from django.db.models import Count

    active_users = users.annotate(num_events=Count('event')).order_by('-num_events')
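For reference, the overall result these steps aim for can be sketched in plain SQL. The example below uses the standard-library sqlite3 module with hypothetical table names and sample data, and it reads the requirement as "every item of the event is active and has rev1 and rev2 set". It is not the SQL that Django generates, only an illustration of the intended outcome.

```python
import sqlite3

# Hypothetical schema mirroring the question's models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE app_user (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE app_event (id INTEGER PRIMARY KEY, user_id INTEGER);
CREATE TABLE app_eventitem (
    id INTEGER PRIMARY KEY, event_id INTEGER,
    rev1_id INTEGER, rev2_id INTEGER, active INTEGER
);
""")
conn.executemany("INSERT INTO app_user VALUES (?, ?)", [(1, "alice"), (2, "bob")])
conn.executemany(
    "INSERT INTO app_event VALUES (?, ?)",
    [(10, 1), (11, 1), (20, 2), (21, 2)],
)
conn.executemany(
    "INSERT INTO app_eventitem (event_id, rev1_id, rev2_id, active) "
    "VALUES (?, ?, ?, ?)",
    [
        (10, 1, 2, 1), (10, 1, 1, 1),     # event 10: both items qualify
        (11, 2, 2, 1),                    # event 11: qualifies
        (20, 1, 2, 1),                    # event 20: qualifies
        (21, 1, None, 1), (21, 1, 2, 1),  # event 21: one item has no rev2
    ],
)

# Count, per user, the events that have at least one item and no failing item.
rows = conn.execute("""
    SELECT u.name, COUNT(e.id) AS num_events
    FROM app_user u
    JOIN app_event e ON e.user_id = u.id
    WHERE EXISTS (SELECT 1 FROM app_eventitem i WHERE i.event_id = e.id)
      AND NOT EXISTS (
          SELECT 1 FROM app_eventitem i
          WHERE i.event_id = e.id
            AND (i.rev1_id IS NULL OR i.rev2_id IS NULL OR i.active = 0)
      )
    GROUP BY u.id, u.name
    ORDER BY num_events DESC
""").fetchall()

print(rows)
```

Comparing the ORM's output against a small hand-built data set like this is a quick way to check that the filters and excludes select the events you expect.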

Your model is broken, but this should summarize what you did in a cleaner way.

    class Status(models.Model):
        status = models.CharField(max_length=200)

    class User(models.Model):
        name = models.CharField(max_length=200)
        events = models.ManyToManyField('Event')

    class Event(models.Model):
        rev1 = models.ForeignKey(Status, related_name='rev1', blank=True, null=True)
        rev2 = models.ForeignKey(Status, related_name='rev2', blank=True, null=True)
        active = models.BooleanField()

And the query:

    from django.db.models import Count, Q

    users = (
        User.objects.filter(events__active=True)
        .exclude(Q(events__rev1=None) | Q(events__rev2=None))
        .annotate(num_events=Count('events'))
        .order_by('-num_events')
    )

This will return a list of users sorted by the number of events in their set.

For more information, check out the Django documentation on many-to-many relationships.


You can try something like:

 EventItem.objects.exclude(rev1=None).exclude(rev2=None).filter(active=True).values_list('event__user', flat=True) 

This will give you a flat list of user IDs, where the frequency of each ID equals the number of matching EventItem objects that belong to that user.

You may be able to do better and fold this into the query itself using .annotate(), but I'm not sure how to do it right now.
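Until then, the tally can be done on the Python side. A minimal sketch with collections.Counter, where the user_ids list is a hypothetical stand-in for the result of the values_list() call above:

```python
from collections import Counter

# Hypothetical query result: one user ID per matching EventItem.
user_ids = [3, 1, 3, 2, 3, 1]

# Count how often each user ID occurs, most frequent first.
ranked = Counter(user_ids).most_common()

print(ranked)
```

Keep in mind that this ranks users by matching EventItem objects, not by events, so a user with one event and many items can outrank a user with several events.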
