Suppose:
book = ["once", "upon", "time", ..., "end", "of", "very", "long", "story"]
dct = ["alfa", "anaconda", ..., "zeta-jones"]
And you want to remove from book every word that is present in dct.
Quick solution:
short_story = [word for word in book if word not in dct]
To speed up the lookups in dct, convert it to a set. Membership tests on a set take constant time on average, whereas on a list they scan every element:
dct = set(dct)
short_story = [word for word in book if word not in dct]
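As a small runnable illustration of the set-based filter, here is a sketch with made-up sample data standing in for the elided lists above (the words and the name dct_set are only assumptions for the example):

```python
# Hypothetical sample data; the real book and dct would be much longer.
book = ["once", "upon", "a", "time", "anaconda", "end", "of", "story"]
dct = ["alfa", "anaconda", "time"]

# Build the set once; each "in" test is then O(1) on average.
dct_set = set(dct)

# Keep only the words that do not appear in the dictionary.
short_story = [word for word in book if word not in dct_set]
print(short_story)
```

The order of the remaining words is preserved, because the comprehension still walks book from start to end; only the lookup structure changes.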
If the book is very long and does not fit into memory, you can process it word by word. A generator does this:
def story_words(fname):
    """fname is the name of a text file with a story"""
    with open(fname) as f:
        for line in f:
            # yield the words of each line one at a time
            for word in line.split():
                yield word
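To show how the generator could be combined with the set lookup, here is a minimal sketch. The helper filtered_words, the temporary file, and the sample words are all hypothetical and exist only for this example; story_words is repeated so the snippet runs on its own.

```python
import os
import tempfile


def story_words(fname):
    """Yield the words of the text file fname one at a time."""
    with open(fname) as f:
        for line in f:
            for word in line.split():
                yield word


def filtered_words(words, stop_words):
    """Hypothetical helper: lazily yield the words not found in stop_words."""
    stop_words = set(stop_words)  # fast membership tests
    for word in words:
        if word not in stop_words:
            yield word


# Write a tiny sample story to a temporary file (made-up content).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("once upon a time\nanaconda the end\n")
    path = tmp.name

try:
    # Only one line of the file is held in memory at any moment.
    result = list(filtered_words(story_words(path), ["alfa", "anaconda", "time"]))
finally:
    os.remove(path)

print(result)
```

For a really long book you would probably not call list() on the result, but write each word out (or otherwise consume it) as it arrives, so that the filtered story never has to fit into memory either.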
And if your dictionary is too large to fit into memory as well, you will have to give up some speed and iterate over its contents too. But I'll skip that for now.