I'm not sure about the "is stopwords" test in the function; I suppose it should be "in". Either way, you can use a Counter dict with most_common(10) to get the 10 most frequent words:
    import nltk
    from collections import Counter
    from string import punctuation

    def content_text(text):
        stopwords = set(nltk.corpus.stopwords.words('english'))
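In case Counter is new to you, here is a minimal sketch of what most_common does. The sample sentence is made up for illustration and is not from the question; I ask for 2 items instead of 10 only to keep the output short.

```python
from collections import Counter

# Made-up sample input, already split into words.
words = "the cat and the dog and the bird".split()

# Counter maps each distinct word to the number of times it occurs.
counts = Counter(words)

# most_common(n) returns the n highest counts as (word, count) pairs.
top_two = counts.most_common(2)
print(top_two)  # [('the', 3), ('and', 2)]
```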
If you pass the nltk file object, just iterate over it:
    def content_text(text):
        stopwords = set(nltk.corpus.stopwords.words('english'))
        with_stp = Counter()
        without_stp = Counter()
        for word in text:
            ...
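One way the loop could be completed, as a rough sketch rather than the exact original code. The name count_words and the small STOPWORDS set are hypothetical stand-ins so that this runs without downloading the NLTK corpus; in real use you would pass set(nltk.corpus.stopwords.words('english')) instead. Lowercasing each word is my own assumption, made because the NLTK English stopword list is lowercase.

```python
from collections import Counter

# Hypothetical stand-in for set(nltk.corpus.stopwords.words('english')).
STOPWORDS = {"the", "a", "and", "of", "to", "in", "is"}

def count_words(words, stopwords=STOPWORDS, n=10):
    """Return the n most common words, with and without stopwords."""
    with_stp = Counter()
    without_stp = Counter()
    for word in words:
        word = word.lower()  # assumption: compare case-insensitively
        with_stp[word] += 1
        if word not in stopwords:
            without_stp[word] += 1
    return with_stp.most_common(n), without_stp.most_common(n)

# Made-up sample input; any iterable of words works, including an NLTK corpus view.
sample = "The cat sat in the hat and the cat is happy".split()
top_with, top_without = count_words(sample, n=2)
print(top_with)        # [('the', 3), ('cat', 2)]
print(top_without[0])  # ('cat', 2)
```

Keeping two Counters means a single pass over the text gives you both rankings.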
The nltk method includes punctuation characters, which may not be what you want.
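If you do want to drop those tokens, something along these lines should work. It is a sketch using string.punctuation on a made-up token list; the list is an assumption about what the tokens look like, not output from NLTK.

```python
from collections import Counter
from string import punctuation

# Made-up tokens of the kind a tokenizer might return, punctuation included.
tokens = ["Hello", ",", "world", "!", "Hello", "--", "again", "."]

# Keep a token only if something is left after stripping punctuation
# from both ends, which drops tokens made up entirely of punctuation.
cleaned = [t for t in tokens if t.strip(punctuation)]

counts = Counter(cleaned)
print(cleaned)                # ['Hello', 'world', 'Hello', 'again']
print(counts.most_common(1))  # [('Hello', 2)]
```

Note that string.punctuation only covers ASCII punctuation, so characters such as curly quotes would still get through.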