How to create a text frequency counter with all letters (a-z) in Python 3

Question

I have a text file called textf that looks something like this:

rxgmgcwbd c qcyurr bkxgmq, lwrg grru rrwxtam rwgzwt am quyam cv avrrgdwkxgcr.iwxbdamcz xdalguj qarc ram av vcmfwgmgum. yw'g

I want to count the frequency of each letter in a text file, with the condition that a letter that does not appear in the text must still have a key: value pair, with a value of 0. For example, if z is not in the text, the result should contain "z": 0, and likewise for all letters (a to z). I wrote the following code:

 import string
 from collections import Counter

 with open("textf.txt") as tf:
     letter = tf.read()

 letter_count = Counter(letter.translate(str.maketrans('', '', string.punctuation)))
 print("Frequency count of letter:", "\n", letter_count)

But the result looks something like this:

 Counter({' ': 110, 'r': 12, 'c': 88, 'a': 55, 'g': 57, 'w': 76, 'm': 76, 'x': 72, 'u': 70, 'q': 41, 'y': 40, 'j': 36, 'l': 32, 'b': 18, 'd': 28, 'v': 27, 'k': 22, 't': 19, 'f': 18, 'z': 16, 'i': 7})

I want the space count (' ': 110) not to show up, and I want all the letters (a-z) to be present, so that when a letter does not appear in the text the result prints something like 'n': 0, and so on. Any ideas or suggestions on how I could do this?

+7

python python-3.x

adda.fuentes Oct 3 '17 at 15:51

6 answers

You can do it like this:

 import string

 x = "rxgmgcwbd c qcyurr bkxgmq, lwrg grru rrwxtam rwgzwt am quyam cv avrrgdwkxgcr.iwxbdamcz xdalguj qarc ram av vcmfwgmgum. yw'g"
 freq = {i: 0 for i in string.ascii_lowercase}
 for i in x:
     if i in freq:
         freq[i] += 1

You can also replace the for-loop with a dictionary comprehension (although it is inefficient for what we are trying to do, since it calls count once per letter; it is added here for reference only):

 freq = {i:x.count(i) for i in freq}

This will result in:

 {'a': 9, 'c': 8, 'b': 3, 'e': 0, 'd': 4, 'g': 12, 'f': 1, 'i': 1, 'h': 0, 'k': 2, 'j': 1, 'm': 10, 'l': 2, 'o': 0, 'n': 0, 'q': 4, 'p': 0, 's': 0, 'r': 14, 'u': 5, 't': 2, 'w': 9, 'v': 4, 'y': 3, 'x': 6, 'z': 2}
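A quick way to check that the loop and the comprehension agree is to run both on the same string. This is only a minimal sketch: the helper names `count_with_loop` and `count_with_str_count` and the short sample text are made up for illustration. The loop walks the text once, while the comprehension calls `str.count` once per letter and so scans the text 26 times.

```python
import string


def count_with_loop(text):
    """Single pass over the text; every letter a-z starts at 0."""
    freq = {ch: 0 for ch in string.ascii_lowercase}
    for ch in text:
        if ch in freq:  # ignores spaces, punctuation, digits, uppercase
            freq[ch] += 1
    return freq


def count_with_str_count(text):
    """One str.count call per letter, so the text is scanned 26 times."""
    return {ch: text.count(ch) for ch in string.ascii_lowercase}


sample = "hello world"  # hypothetical sample text
loop_result = count_with_loop(sample)
count_result = count_with_str_count(sample)
print(loop_result == count_result)
```

For short inputs the difference in speed is unlikely to matter; the single-pass loop is the safer default for large files.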

+8

coder Oct 3 '17 at 15:58

You can initialize your Counter() with a dictionary. In this case, a dictionary comprehension is used to initialize all lowercase letters to zero.

Using update() with letter will then add to these existing values:

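 import string
 from collections import Counter

 letter = "hello world "
 # Start every lowercase letter at zero
 letter_counts = Counter({l: 0 for l in string.ascii_lowercase})
 # Strip punctuation and spaces, then add the remaining characters to the counts
 letter_counts.update(letter.translate(str.maketrans('', '', string.punctuation + ' ')))
 print(letter_counts)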

This gives you:

 Counter({'l': 3, 'o': 2, 'd': 1, 'w': 1, 'h': 1, 'r': 1, 'e': 1, 'p': 0, 'c': 0, 'j': 0, 'x': 0, 't': 0, 'g': 0, 'n': 0, 'f': 0, 'u': 0, 'm': 0, 'q': 0, 'z': 0, 's': 0, 'y': 0, 'a': 0, 'b': 0, 'i': 0, 'k': 0, 'v': 0})

To get rid of the space, add it to the punctuation string, as in string.punctuation + ' ' above.
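Note that `string.punctuation` contains neither whitespace nor digits, so a file read with `tf.read()` would still contribute `'\n'` (and any digits) to the counter. One possible way around this, sketched below, is to keep only the characters that are in `string.ascii_lowercase` instead of listing everything to delete. The sample text is made up, and lowercasing with `.lower()` is an assumption: the question does not say how uppercase letters should be treated.

```python
import string
from collections import Counter

text = "Hello, World!\n42 times"  # hypothetical sample with newline and digits

# Start every letter at zero, then count only the characters that are a-z.
letter_counts = Counter(dict.fromkeys(string.ascii_lowercase, 0))
letter_counts.update(ch for ch in text.lower() if ch in string.ascii_lowercase)

print(letter_counts["l"])   # 3
print(letter_counts["z"])   # 0
print(len(letter_counts))   # 26
```

Because the filter is an allow-list, nothing outside a-z can end up as a key, whatever the file contains.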

+6

Martin evans Oct 3 '17 at 15:56

What about

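 import string
 from collections import defaultdict

 row = "rxgmgcwbd c qcyurr bkxgmq, lwrg grru rrwxtam rwgzwt am quyam cv avrrgdwkxgcr.iwxbdamcz xdalguj qarc ram av vcmfwgmgum. yw'g"
 letters = string.ascii_lowercase
 # defaultdict(int), not defaultdict(list): an unseen key must start at 0, not []
 stats = defaultdict(int)
 for l in letters:
     stats[l] = 0  # make sure every letter a-z is present
 for l in row:
     if l.isalpha():  # skip spaces and punctuation
         stats[l] += 1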

+1

Daniele bacarella Oct 3 '17 at 16:08

You can use dict.fromkeys to initialize the dictionary with a default value of 0 for every letter, and then update this dictionary:

 import string

 x = "rxgmgcwbd c qcyurr bkxgmq, lwrg grru rrwxtam rwgzwt am quyam cv avrrgdwkxgcr.iwxbdamcz xdalguj qarc ram av vcmfwgmgum. yw'g"
 letter_count = dict.fromkeys(string.ascii_lowercase, 0)
 for c in x:
     if c in string.ascii_lowercase:
         letter_count[c] += 1
 print(letter_count)

Output:

 {'a': 9, 'c': 8, 'b': 3, 'e': 0, 'd': 4, 'g': 12, 'f': 1, 'i': 1, 'h': 0, 'k': 2, 'j': 1, 'm': 10, 'l': 2, 'o': 0, 'n': 0, 'q': 4, 'p': 0, 's': 0, 'r': 14, 'u': 5, 't': 2, 'w': 9, 'v': 4, 'y': 3, 'x': 6, 'z': 2}

+1

Delimitry Oct 3 '17 at 16:14

... that my result prints something like ...

Other answers focus on choosing a different data structure, but to me it sounds like you have already chosen the right data structure, Counter, and just want to display the result nicely. So something like this:

 display_str = "{" + ", ".join("'{}': {}".format(x, letter_count[x]) for x in string.ascii_lowercase) + "}"
 print("Frequency count of letter:", display_str, sep="\n")
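For completeness, here is the same idea as a self-contained sketch. The sample text and the function name `format_counts` are hypothetical. It relies only on the fact that a Counter returns 0 for a key it has never seen, without adding that key.

```python
import string
from collections import Counter


def format_counts(counter):
    """Render a-z in alphabetical order as "'x': n" pairs inside braces."""
    pairs = ("'{}': {}".format(ch, counter[ch]) for ch in string.ascii_lowercase)
    return "{" + ", ".join(pairs) + "}"


letter_count = Counter("abca")  # hypothetical sample text
display_str = format_counts(letter_count)
print("Frequency count of letter:", display_str, sep="\n")
```

The Counter itself is left unchanged, so letters that never occurred are shown as 0 in the output but are still absent from `letter_count`.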

+1

Arthur tacca Oct 3 '17 at 20:27

PM 2Ring · Accepted Answer · 2017-10-03T16:06:16+0000

One way to do this is to make a normal dict from your counter, using lowercase letters as the keys of the new dict. We use the dict.get method to supply a default value of zero for missing letters.

 import string
 from collections import Counter

 letter = "rxgmgcwbd c qcyurr bkxgmq, lwrg grru rrwxtam rwgzwt am quyam cv avrrgdwkxgcr.iwxbdamcz xdalguj qarc ram av vcmfwgmgum. yw'g"
 letter_count = Counter(letter.translate(str.maketrans('', '', string.punctuation)))
 letter_count = {k: letter_count.get(k, 0) for k in string.ascii_lowercase}
 print("Frequency count of letter:\n", letter_count)

Output

 Frequency count of letter:
  {'a': 9, 'b': 3, 'c': 8, 'd': 4, 'e': 0, 'f': 1, 'g': 12, 'h': 0, 'i': 1, 'j': 1, 'k': 2, 'l': 2, 'm': 10, 'n': 0, 'o': 0, 'p': 0, 'q': 4, 'r': 14, 's': 0, 't': 2, 'u': 5, 'v': 4, 'w': 9, 'x': 6, 'y': 3, 'z': 2}

If you do this in Python 3.6+, you get the side benefit that the new dict keeps its keys in insertion order, which here is alphabetical. In CPython 3.6 this is an implementation detail that should not be relied on; from Python 3.7 onward, insertion order is guaranteed by the language.

As a commenter points out, we do not need to use letter_count.get(k, 0), since a Counter automatically returns zero when we read the value of a nonexistent key. Thus, the dict comprehension can be changed to

 letter_count = {k: letter_count[k] for k in string.ascii_lowercase}
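Putting this together for a file, as in the question, might look like the sketch below. It writes a small temporary file so that it can run anywhere; in practice you would pass the path of your own textf.txt. The function name `letter_frequencies` and the sample contents are hypothetical, and uppercase letters are not folded to lowercase here, which is an assumption about what the question wants.

```python
import os
import string
import tempfile
from collections import Counter


def letter_frequencies(path):
    """Return a dict mapping each letter a-z to its count in the file."""
    with open(path) as tf:
        counts = Counter(tf.read())
    # A Counter returns 0 for keys it has not seen, so no .get() is needed.
    # Iterating over a-z also drops spaces, newlines and punctuation.
    return {k: counts[k] for k in string.ascii_lowercase}


# Demo with a throwaway file standing in for textf.txt.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("am quyam cv, yw'g\n")
    demo_path = tmp.name

result = letter_frequencies(demo_path)
os.remove(demo_path)  # clean up the temporary file
print("Frequency count of letter:\n", result)
```

Since only the keys a-z are copied into the new dict, there is no need to strip punctuation with `translate` first.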
