I have a somewhat large document and want to do stop word removal and stemming on the words of this document with Python. Does anyone know of an off-the-shelf package for these? Code that is fast enough for large documents is also welcome. Thanks
NLTK supports this.
If for some reason you do not want to use NLTK, you can try PyStemmer for the stemming part. For stop words, simply download a list (google it) and filter those words out of your text.
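To make the filtering step concrete, here is a rough sketch using only the standard library. The tiny `STOP_WORDS` set, the `TOKEN_RE` tokenizer, and the `filter_stop_words` name are all made up for illustration; in practice you would load whichever stop word list you downloaded. It works line by line with a generator, so a large document never has to sit in memory all at once.

```python
import re

# Hypothetical, deliberately tiny stop word list for illustration only.
# Replace it with the contents of the list you downloaded.
STOP_WORDS = frozenset({"a", "an", "and", "is", "of", "on", "the", "to"})

# Simplistic tokenizer (an assumption): runs of lowercase letters and apostrophes.
TOKEN_RE = re.compile(r"[a-z']+")


def filter_stop_words(lines, stop_words=STOP_WORDS, stem=None):
    """Yield the tokens from an iterable of lines, skipping stop words.

    `stem` is an optional callable applied to each surviving token.
    """
    for line in lines:
        for token in TOKEN_RE.findall(line.lower()):
            if token in stop_words:
                continue  # set lookup, so this stays cheap per token
            yield stem(token) if stem is not None else token


if __name__ == "__main__":
    sample = ["The cat is on the mat", "A dog and a bone"]
    print(list(filter_stop_words(sample)))
```

A file object can be passed directly as `lines`, since iterating over it yields one line at a time. For stemming, the `stem` argument could be something like `nltk.stem.PorterStemmer().stem` or `Stemmer.Stemmer('english').stemWord` from PyStemmer, assuming the respective package is installed.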