Django / Python - grouping objects into a common set of many-to-many relationships

This is part of the algorithm-logical question (how to do it), the question of implementation of the part (how to do it best!). I work with Django, so I decided to share this.

In Python, it's worth mentioning that the problem is somewhat related to how-do-i-use-pythons-itertoolsgroupby .

Suppose you are given two classes built by the Django Model:

from django.db import models class Car(models.Model): mods = models.ManyToManyField(Representative) 

and

 from django.db import models class Mods(models.Model): ... 

How to get a list of cars grouped by cars with a common set of mods?

those. I want to get class likeso:

 Cars_by_common_mods = [ { mods: { 'a' }, cars: { 'W1', 'W2' } }, { mods: { 'a', 'b' }, cars: { 'X1', 'X2', 'X3' }, }, { mods: { 'b' }, cars: { 'Y1', 'Y2' } }, { mods: { 'a', 'b', 'c' }, cars: { 'Z1' } }, ] 

I was thinking of something like:

 def cars_by_common_mods(): cars = Cars.objects.all() mod_list = [] for car in cars: mod_list.append( { 'car': car, 'mods': list(car.mods.all()) } ret = [] for key, mods_group in groupby(list(mods), lambda x: set(x.mods)): ret.append(mods_group) return ret 

However, this does not work because (perhaps, among other reasons) groupby does not seem to be grouped together by mods. I think mod_list should be sorted to work with groupby. All that I can say, I'm sure that there will be something simple and elegant, which will be both enlightenment and lighting.

Greetings and thanks!

+6
python algorithm django puzzle
source share
5 answers

Have you tried to sort the list first? The algorithm you proposed should work, albeit with a large number of database queries.

 import itertools cars = [ {'car': 'X2', 'mods': [1,2]}, {'car': 'Y2', 'mods': [2]}, {'car': 'W2', 'mods': [1]}, {'car': 'X1', 'mods': [1,2]}, {'car': 'W1', 'mods': [1]}, {'car': 'Y1', 'mods': [2]}, {'car': 'Z1', 'mods': [1,2,3]}, {'car': 'X3', 'mods': [1,2]}, ] cars.sort(key=lambda car: car['mods']) cars_by_common_mods = {} for k, g in itertools.groupby(cars, lambda car: car['mods']): cars_by_common_mods[frozenset(k)] = [car['car'] for car in g] print cars_by_common_mods 

Now about those queries:

 import collections import itertools from operator import itemgetter from django.db import connection cursor = connection.cursor() cursor.execute('SELECT car_id, mod_id FROM someapp_car_mod ORDER BY 1, 2') cars = collections.defaultdict(list) for row in cursor.fetchall(): cars[row[0]].append(row[1]) # Here one I prepared earlier, which emulates the sample data we've been working # with so far, but using the car id instead of the previous string. cars = { 1: [1,2], 2: [2], 3: [1], 4: [1,2], 5: [1], 6: [2], 7: [1,2,3], 8: [1,2], } sorted_cars = sorted(cars.iteritems(), key=itemgetter(1)) cars_by_common_mods = [] for k, g in itertools.groupby(sorted_cars, key=itemgetter(1)): cars_by_common_mods.append({'mods': k, 'cars': map(itemgetter(0), g)}) print cars_by_common_mods # Which, for the sample data gives me (reformatted by hand for clarity) [{'cars': [3, 5], 'mods': [1]}, {'cars': [1, 4, 8], 'mods': [1, 2]}, {'cars': [7], 'mods': [1, 2, 3]}, {'cars': [2, 6], 'mods': [2]}] 

Now that you have lists of car identifiers and model identifiers, if you need complete objects for work, you can make one request for each, to get a complete list for each model and create a dict search for those associated with their identifiers, - then, I suppose, Bob is your proverbial father-brother.

+4
source share

check regroup . this is only for templates, but I assume that this classification applies to the presentation layer anyway.

+2
source share

You have a few problems here.

You did not sort your list before calling groupby, and this is necessary. From the itertools documentation :

As a rule, iterability should already be sorted by the same key function.

Then you do not duplicate the list returned by groupby. Again, the documentation reads:

The returned group itself is an iterator that divides the base iterative group by (). Since the source is shared when the groupby object is advanced, the previous group is no longer displayed. So, if you need this data later, save as a list:

 groups = [] uniquekeys = [] for k, g in groupby(data, keyfunc): groups.append(list(g)) # Store group iterator as a list uniquekeys.append(k) 

And the last mistake is to use sets as keys. They do not work here. A quick solution is to assign them to sorted tuples (there might be a better solution, but I can't think about it now).

So, in your example, the last part should look like this:

 sortMethod = lambda x: tuple(sorted(set(x.mods))) sortedMods = sorted(list(mods), key=sortMethod) for key, mods_group in groupby(sortedMods, sortMethod): ret.append(list(mods_group)) 
+1
source share

If performance is a problem (for example, a lot of cars on a page or a site with high traffic), denormalization makes sense and simplifies your problem as a side effect.

Remember that denormalizing many-to-many relationships can be a bit complicated. I have not come across such code examples yet.

+1
source share

Thanks to everyone for the helpful answers. I am locked on this problem. The "best" solution is still eluding me, but I have some thoughts.

I should mention that the statistics are the dataset I work with. In 75% of cases there will be one mod. In 24% of cases, two. In 1% of cases there will be zero or three or more. There is at least one unique car for each Mod, although the Mod can be applied to many cars.

Having said that, I reviewed (but did not implement) something like-so:

 class ModSet(models.Model): mods = models.ManyToManyField(Mod) 

and change cars to

 class Car(models.Model): modset = models.ForeignKey(ModSet) 

This is trivial for the Car.modset group: I can use a rearrangement, for example, suggested by Javier. This seems like a simpler and reasonably elegant solution; thoughts would be greatly appreciated.

0
source share

All Articles