How do we create a stop word list for a topic model?

There are standard stop lists that tell you which common words to remove from a corpus. However, I am wondering whether the stop list should be adapted to each case?

For example, I have 10K articles from a journal, and because of the structure of the articles, you will see words like “introduction, review, conclusion, page” in basically every one of them. My concern is: should these words (words that every document contains) be removed from our corpus? Thanks for any comments and suggestions.
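To make the question concrete, here is a minimal sketch of what removing "words that every document has" could look like. It counts document frequency (in how many articles a word appears) and treats anything above a cutoff as a corpus-specific stop word. The function names, the `max_doc_fraction` parameter, the 0.9 cutoff, and the simple letters-only tokenizer are all assumptions made for illustration, not an established recipe.

```python
import re
from collections import Counter


def tokenize(text):
    # Naive tokenizer (an assumption): lowercase runs of ASCII letters only.
    return re.findall(r"[a-z]+", text.lower())


def corpus_stop_words(documents, max_doc_fraction=0.9):
    """Return words found in more than max_doc_fraction of the documents."""
    doc_freq = Counter()
    for text in documents:
        # Use a set so each word counts once per document
        # (document frequency, not raw term frequency).
        doc_freq.update(set(tokenize(text)))
    n_docs = len(documents)
    return {w for w, df in doc_freq.items() if df / n_docs > max_doc_fraction}


def remove_stop_words(text, stop_words):
    # Keep the remaining tokens in their original order.
    return [w for w in tokenize(text) if w not in stop_words]


# Toy corpus standing in for the 10K journal articles.
docs = [
    "Introduction topic models review conclusion",
    "Introduction neural networks review conclusion",
    "Introduction protein folding review conclusion",
]
stops = corpus_stop_words(docs, max_doc_fraction=0.9)
filtered = [remove_stop_words(d, stops) for d in docs]
```

On this toy corpus, `stops` contains the three structural words shared by every article, and `filtered` keeps only the content words. Whether a cutoff like this actually improves the resulting topics is exactly what I am unsure about.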

2 answers




See this 2017 paper: "Pulling Out the Stops: Rethinking Stopword Removal for Topic Models". http://www.cs.cornell.edu/~xanda/stopwords2017.pdf

(), - LDA, , .

, .. , (, 50%), , , , -. , , , , .
