Refactoring: Walkthrough
I want to walk you through the refactoring process. Learning to program is not only about knowing the end result, which is what you usually get when you ask a question on Stack Overflow. It is also about how to arrive at that answer. When people post short, tight answers to a question like this one, it is not always obvious how they reached their solutions.
So let's do some refactoring and see what we can do to simplify your code. We will rewrite, delete, rename and reorder code until there are no further improvements to make.
Simplify Algorithms
Python should not be this verbose. It is usually a code smell when you have explicit loops operating on lists and dicts in Python, instead of using list methods and functions that work on containers as a whole.
Use defaultdict to store the number of characters
A defaultdict(int) creates an entry the first time a key is accessed if it does not already exist, starting it at 0. This eliminates the if / else branch when counting characters.

    from collections import defaultdict

    characterDict = defaultdict(int)

    def putEncounteredCharactersInDictionary(lineStr):
        for character in lineStr:
            characterDict[character] += 1
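To see what the defaultdict saves, here is a minimal sketch that puts the two approaches side by side. It is written for Python 3, and the function names count_with_branch and count_with_defaultdict are made up for this illustration; they are not from your program.

```python
from collections import defaultdict


def count_with_branch(text):
    """Count characters with a plain dict and an explicit if/else."""
    counts = {}
    for ch in text:
        if ch in counts:
            counts[ch] += 1
        else:
            counts[ch] = 1
    return counts


def count_with_defaultdict(text):
    """Same count; a missing key starts at int() == 0 automatically."""
    counts = defaultdict(int)
    for ch in text:
        counts[ch] += 1
    return counts


sample = "hello"  # made-up sample input
print(count_with_branch(sample) == count_with_defaultdict(sample))  # True
```

Both functions produce the same counts; the second one simply has no branch to get wrong.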
Sort dicts
In Python 2, which this code targets, dictionaries do not guarantee any order for their keys. (Dicts only preserve insertion order from Python 3.7 onward.) You cannot assume that the items are stored in the order in which you inserted them. So sorting the dict items and then putting them back into another dict simply scrambles them again.
This means that your function basically does not work. After sorting the items, you need to keep them as a list of tuples in order to preserve the sort order. With the dict-rebuilding code removed, we can reduce this method to one line.

    from operator import itemgetter

    def sortCharacterDictionary(characterDict):
        return sorted(characterDict.iteritems(), key=itemgetter(1))
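As a small illustration of what the sorted call hands back, here is a sketch with made-up sample data. It assumes Python 3, so it uses items() where the code above uses iteritems().

```python
from operator import itemgetter

counts = {"a": 3, "b": 1, "c": 2}  # made-up sample data

# sorted() returns a list of (key, value) tuples ordered by the value,
# because itemgetter(1) picks the second element of each tuple.
by_count = sorted(counts.items(), key=itemgetter(1))

print(by_count)     # [('b', 1), ('c', 2), ('a', 3)]
print(by_count[0])  # ('b', 1), the least frequent entry
```

The result is a list, not a dict, which is exactly why the order survives.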
Inverting dicts
Given the previous comment, you will no longer have a dict after sorting. But assuming you did, this function is one of those cases where an explicit loop is discouraged. In Python, always think about how you can operate on a whole collection at once rather than one item at a time.

    def inverseSortedCharacterDictionary(sortedCharDict):
        return dict((v, k) for k, v in sortedCharDict.iteritems())

All in one line, we (1) iterate over the key / value pairs in the dict; (2) swap them to create inverted value / key tuples; (3) create a dict from these inverted tuples.
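The three steps can be spelled out one at a time. This is only a sketch with made-up sample data, written for Python 3 (items() instead of iteritems()); the intermediate variable names are hypothetical.

```python
counts = {"a": 3, "b": 1, "c": 2}  # made-up sample data with distinct values

# (1) iterate over the key/value pairs
pairs = list(counts.items())

# (2) swap each pair into a (value, key) tuple
swapped = [(v, k) for k, v in pairs]

# (3) build a dict from the swapped tuples
inverted = dict(swapped)

print(inverted)  # {3: 'a', 1: 'b', 2: 'c'}
```

Note that if two characters had the same count, only one of them would survive the inversion, because dict keys are unique.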
Comment and name wisely
The names of your methods are long and descriptive. There is no need to repeat the same information in the comments. Use comments only when your code is not self-describing, for example, when you have a complex algorithm or an unusual construction that is not immediately obvious.
Speaking of naming, your names are unnecessarily long. I would use much shorter names and also make them more generic. Instead of inverseSortedCharacterDictionary, try just invertedDict. All the function does is invert a dict. It really doesn't matter whether it is passed a sorted character dict or any other kind of dict.
As a rule, try to use the most generic names you can, so that your methods and variables are as general as possible. More general means more reusable.

    characters = defaultdict(int)

    def countCharacters(string):
        for ch in string:
            characters[ch] += 1

    def sortedCharacters(characters):
        return sorted(characters.iteritems(), key=itemgetter(1))

    def invertedDict(d):
        return dict((v, k) for k, v in d.iteritems())
Reduce volume
Using temporary variables and helper methods is good programming practice, and I applaud you for doing so in your program. However, now that each helper is simple enough to be only one or two lines, we probably do not even need them.
Here is the body of your program after changing the functions as described above:

    f = open('funkymess.txt', 'r')

    for line in f:
        countCharacters(line.rstrip('\n'))

    f.close()

    print sortedCharacters(characters)[0]

And then let's just go ahead and inline those helper methods, since they are so simple. Here is the final program after refactoring:
Final program
    #!/usr/bin/env python

    from operator import itemgetter
    from collections import defaultdict

    characters = defaultdict(int)

    f = open('funkymess.txt','r')

    for line in f:
        for ch in line.rstrip('\n'):
            characters[ch] += 1

    f.close()

    print sorted(characters.iteritems(), key=itemgetter(1))[0]
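If you are on Python 3, the same logic might look roughly like the sketch below. This is an adaptation, not part of the refactoring itself: the function name least_common_character is made up, items() replaces iteritems(), print is a function, and io.StringIO stands in for funkymess.txt so the example runs without the file.

```python
import io
from collections import defaultdict
from operator import itemgetter


def least_common_character(lines):
    """Return the (character, count) pair with the smallest count."""
    characters = defaultdict(int)
    for line in lines:
        for ch in line.rstrip('\n'):
            characters[ch] += 1
    # Sort ascending by count and take the first pair.
    return sorted(characters.items(), key=itemgetter(1))[0]


# io.StringIO stands in for the open file; with the real file you would
# write "with open('funkymess.txt') as f:" and pass f instead.
fake_file = io.StringIO("aab\nabb\nc\n")  # made-up sample contents
print(least_common_character(fake_file))  # ('c', 1)
```

Wrapping the logic in a function is a design choice of this sketch only; it makes the code easy to try on any iterable of lines.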