I wrote the following Python snippet to store the words that start with a capital letter as keys in a dictionary, with the number of times each word appears as the value for that key.
    #!/usr/bin/env python
    import sys
    import re

    hash = {}  # initialize an empty dictionary
    for line in sys.stdin.readlines():
        # strip() removes the newline char at the end of the line
        for word in line.strip().split():
            # re.match anchors at the start of the word, so only the
            # first character is tested for being a capital letter
            x = re.match(r"[A-Z]", word)
            # alternative without a regex:
            # if word[0].isupper():
            if x:
                if word in hash:
                    hash[word] += 1
                else:
                    hash[word] = 1

    # iterating over the dictionary items
    for word, cnt in hash.items():
        sys.stdout.write("%d %s\n" % (cnt, word))
In the above code, I have shown both ways to check whether a word begins with a capital letter: using an array index and using a regular expression. Any suggestions to improve the above code for performance or simplicity are welcome.
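For comparison, here is a minimal sketch of one possible simplification, assuming Python 3 and the standard library's `collections.Counter`. The function name `count_capitalized` is hypothetical and not part of the snippet above; it uses the index check rather than the regular expression.

```python
from collections import Counter


def count_capitalized(lines):
    """Count whitespace-separated words whose first character is uppercase.

    `lines` can be any iterable of strings, for example sys.stdin.
    (Hypothetical helper, named here only for illustration.)
    """
    counts = Counter()
    for line in lines:
        # split() with no arguments discards the trailing newline and
        # never yields empty strings, so word[0] is always safe here.
        counts.update(word for word in line.split() if word[0].isupper())
    return counts
```

Passing `sys.stdin` itself as `lines` reads the input one line at a time instead of loading it all with `readlines()`, and the result can be printed with the same `for word, cnt in ... .items()` loop as before. Note that `str.isupper()` also accepts non-ASCII capitals, whereas the `[A-Z]` pattern does not.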