I am currently taking a data analysis course at university, and I'm a bit stuck on a problem involving sorting with a MultiIndex.
The actual data contains about a million movie reviews, which I'm trying to analyze by US zip code. To work out how to do what I want, I used a much smaller dataset of 250 randomly generated ratings for 10 films, and instead of zip codes I use age groups.
This is what I have now: a multi-indexed DataFrame in Pandas with two levels, "group" and "title":
                      rating
    group  title
    Adults Alien    4.000000
           Argo     2.166667
           Ben-Hur  3.666667
           Gandhi   3.200000
    ...    ...
    Coeds  Alien    3.000000
           Argo     3.750000
           Ben-Hur  3.000000
           Gandhi   2.833333
    ...    ...
    Kids   Alien    2.500000
           Argo     2.750000
           Ben-Hur  3.000000
           Gandhi   3.200000
    ...    ...
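For anyone who wants to reproduce a frame of this shape, here is a minimal sketch of how such a toy dataset could be built. It is not my exact code: the seed, the 1 to 5 integer rating scale, the names `raw` and `means`, and the six placeholder titles ("Film 5" to "Film 10") are assumptions for illustration. Only the four titles and three group names shown above come from my data.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # arbitrary fixed seed, for reproducibility only

# Four titles appear in the post; the other six are hypothetical placeholders.
titles = ["Alien", "Argo", "Ben-Hur", "Gandhi"] + [f"Film {i}" for i in range(5, 11)]
groups = ["Adults", "Coeds", "Kids"]

n_ratings = 250
raw = pd.DataFrame(
    {
        "group": rng.choice(groups, size=n_ratings),
        "title": rng.choice(titles, size=n_ratings),
        "rating": rng.integers(1, 6, size=n_ratings),  # assumed scale: integers 1..5
    }
)

# Mean rating per (group, title); the result is indexed by a two-level MultiIndex.
means = raw.groupby(["group", "title"])[["rating"]].mean()
print(means.head(8))
```

The numbers will differ from the table above, since the ratings are random, but the index layout (group, then title) is the same.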
What I want to do is sort the titles by their rating within each group (and show only the 5 or so most popular in each group).
So something like this (here I show only two titles per group):
                      rating
    group  title
    Adults Alien    4.000000
           Ben-Hur  3.666667
    Coeds  Argo     3.750000
           Alien    3.000000
    Kids   Gandhi   3.200000
           Ben-Hur  3.000000
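To make the target concrete, here is a rough sketch of one way this result might be produced. It is only an illustration, not necessarily the best approach: it assumes the frame is called `means`, uses just the twelve rows shown above (the elided rows are left out), and breaks ties on rating alphabetically by title, which is my own choice.

```python
import pandas as pd

# Mean ratings copied from the first table in the post (elided rows omitted).
data = {
    ("Adults", "Alien"): 4.000000,
    ("Adults", "Argo"): 2.166667,
    ("Adults", "Ben-Hur"): 3.666667,
    ("Adults", "Gandhi"): 3.200000,
    ("Coeds", "Alien"): 3.000000,
    ("Coeds", "Argo"): 3.750000,
    ("Coeds", "Ben-Hur"): 3.000000,
    ("Coeds", "Gandhi"): 2.833333,
    ("Kids", "Alien"): 2.500000,
    ("Kids", "Argo"): 2.750000,
    ("Kids", "Ben-Hur"): 3.000000,
    ("Kids", "Gandhi"): 3.200000,
}
index = pd.MultiIndex.from_tuples(list(data), names=["group", "title"])
means = pd.DataFrame({"rating": list(data.values())}, index=index)

top_n = 2  # number of titles to keep per group

top = (
    means.reset_index()
    # Group ascending, rating descending; title ascending is an assumed tie-breaker.
    .sort_values(["group", "rating", "title"], ascending=[True, False, True])
    # head() keeps the first top_n rows of each group in the sorted order.
    .groupby("group")
    .head(top_n)
    .set_index(["group", "title"])
)
print(top)
```

With the values above, this keeps Alien and Ben-Hur for Adults, Argo and Alien for Coeds, and Gandhi and Ben-Hur for Kids, which matches the table I am after.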
Does anyone know how to do this? I tried sort_order, sort_index, etc., and swapping the levels, but they all mix the groups together, so the result looks like this:
                      rating
    group  title
    Adults Alien    4.000000
    Coeds  Argo     3.750000
    Adults Ben-Hur  3.666667
    Kids   Gandhi   3.666667
    Coeds  Alien    3.000000
    Kids   Ben-Hur  3.000000
I'm looking for something like the question "Sorting multiple indexes in Pandas", except that instead of sorting by another index level, I want to sort by the values. It is as if the person in that question wanted to sort by their sales column.
Thanks!
python sorting pandas multi-index
Nadamir