Analyzing, aggregating and sorting a text file in Python

Question

I have a file called "names.txt" with the following contents:

{"1":[1988, "Anil 4"], "2":[2000, "Chris 4"], "3":[1988, "Rahul 1"], "4":[2001, "Kechit 3"], "5":[2000, "Phil 3"], "6":[2001, "Ravi 4"], "7":[1988, "Ramu 3"], "8":[1988, "Raheem 5"], "9":[1988, "Kranti 2"], "10":[2000, "Wayne 1"], "11":[2000, "Javier 2"], "12":[2000, "Juan 2"], "13":[2001, "Gaston 2"], "14":[2001, "Diego 5"], "15":[2001, "Fernando 1"]}

Problem: The file "names.txt" contains student records in the format -

{"number": [year of birth, "name rank"]}

Parse this file and segregate the records by year, then sort the names by rank. Segregation comes first, then sorting. The output should be in the format -

 {year : [Names of students in sorted order according to rank]}

So the expected result is

 {1988:["Rahul 1","Kranti 2","Ramu 3","Anil 4","Raheem 5"], 2000:["Wayne 1","Javier 2","Juan 2","Phil 3","Chris 4"], 2001:["Fernando 1","Gaston 2","Kechit 3","Ravi 4","Diego 5"]}

First, how do I load the contents of this file into a dictionary object? Then, how do I group the records by year and order the names within each year by rank? How can this be achieved in Python?

Thanks..

+6

python sorting aggregate grouping

user123 Aug 21 '15 at 16:49

2 answers

Segregation can be done with collections.defaultdict in a simple loop. A second loop over the per-year student lists then sorts each list by the integer value of the last part of each student record. pprint() prints the desired output once the defaultdict is converted to a plain dict:

    #!/usr/bin/env python
    from __future__ import absolute_import, division, print_function

    import json
    from collections import defaultdict
    from pprint import pprint


    def main():
        # The file contents are valid JSON, so json.load() yields a dict.
        with open('names.txt') as student_file:
            id2student = json.load(student_file)
        #
        # Segregate by year.
        #
        year2students = defaultdict(list)
        for year, student_and_rank in id2student.values():
            year2students[year].append(student_and_rank)
        #
        # Sort by rank.
        #
        for students in year2students.values():
            students.sort(key=lambda s: int(s.rsplit(' ', 1)[-1]))
        pprint(dict(year2students))


    if __name__ == '__main__':
        main()
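The sort key used above can be checked on its own. The following is a minimal sketch, not part of the original answer: `rank_of` is a helper name made up for illustration, and the list holds the year-2000 records in the order they appear in the question's file. Because `list.sort()` is stable, records with the same rank keep their relative order.

```python
def rank_of(record):
    """Return the integer rank from a record such as "Anil 4"."""
    # rsplit with maxsplit=1 splits only at the last space,
    # so a name that itself contains spaces would still work.
    return int(record.rsplit(" ", 1)[-1])


# Year-2000 records in the order they occur in the sample data.
students_2000 = ["Chris 4", "Phil 3", "Wayne 1", "Javier 2", "Juan 2"]

# Stable sort: "Javier 2" stays ahead of "Juan 2" because it came first.
students_2000.sort(key=rank_of)
print(students_2000)
```

Converting the rank with int() matters once ranks reach two digits: compared as strings, "10" would sort before "2".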

-1

Blackjack Aug 22 '15 at 12:35

Pratik patil · Accepted Answer · 2015-10-14T09:10:57+0000

It's very simple :)

    #!/usr/bin/python
    # Program: Parsing, Aggregating & Sorting text file in Python
    # Developed By: Pratik Patil
    # Date: 22-08-2015
    import ast
    import pprint

    # Open file & store the contents in a dictionary object.
    # ast.literal_eval() parses the literal without executing arbitrary code.
    with open("names.txt", "r") as names_file:
        file_contents = ast.literal_eval(names_file.readline())

    # Extract all lists from file contents
    file_contents_values = list(file_contents.values())

    # Extract Unique Years & apply segregation
    year = sorted(set(x[0] for x in file_contents_values))
    file_contents_values_grouped_by_year = [
        [y[1] for y in file_contents_values if y[0] == x] for x in year
    ]

    # Create Final Dictionary by combining respective keys & values
    output = dict(zip(year, file_contents_values_grouped_by_year))

    # Apply Sorting based on ranking
    for name_rank in output.values():
        name_rank.sort(key=lambda x: int(x.split()[1]))

    # Print Output by ascending order of keys
    pprint.pprint(output)
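To try the group-then-sort steps without creating a file, the sample data from the question can be parsed from a string. This is only a sketch under stated assumptions: it targets Python 3.7 or later (where dicts keep insertion order), and `SAMPLE` and `group_and_sort` are names invented for illustration, not taken from either answer.

```python
import json
from collections import defaultdict

# Sample records copied from the question (hypothetical constant name).
SAMPLE = """
{"1":[1988, "Anil 4"], "2":[2000, "Chris 4"], "3":[1988, "Rahul 1"],
 "4":[2001, "Kechit 3"], "5":[2000, "Phil 3"], "6":[2001, "Ravi 4"],
 "7":[1988, "Ramu 3"], "8":[1988, "Raheem 5"], "9":[1988, "Kranti 2"],
 "10":[2000, "Wayne 1"], "11":[2000, "Javier 2"], "12":[2000, "Juan 2"],
 "13":[2001, "Gaston 2"], "14":[2001, "Diego 5"], "15":[2001, "Fernando 1"]}
"""


def group_and_sort(text):
    """Group "name rank" records by year, then sort each group by rank."""
    records = json.loads(text)

    # Step 1: segregate by year.
    by_year = defaultdict(list)
    for year, name_rank in records.values():
        by_year[year].append(name_rank)

    # Step 2: sort each year's list by the integer rank (stable for ties).
    return {
        year: sorted(names, key=lambda s: int(s.rsplit(" ", 1)[-1]))
        for year, names in sorted(by_year.items())
    }


result = group_and_sort(SAMPLE)
print(result)
```

Returning a plain dict built from sorted(by_year.items()) keeps the years in ascending order when printed, without relying on pprint to sort the keys.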
