What exactly you mean by classification matters a great deal.
Classification is a supervised task, which means it requires a pre-labeled corpus. Starting from the labeled corpus, you build a model using one of several methods and approaches, and then you use that model to classify an unlabeled test corpus. If this is what you mean, you can use a multiclass classifier, which is usually built as a binary tree of binary classifiers. A modern approach for this kind of task is the SVM (support vector machine), a family of machine learning methods. Two of the best-known SVM implementations are LibSVM and SVMlight. Their source code is available, they are easy to use, and they come with multiclass classification tools. Finally, you should review the literature to understand what else is needed to get good results, because using these classifiers alone is not enough. You must pre-process your corpus to extract the parts that carry information (e.g. unigrams) and to exclude the noisy parts. In general, you most likely have a long way to go, but NLP is a very interesting topic and worth the work.
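As a rough illustration of the supervised workflow described above (labeled corpus, unigram features, SVM, then prediction on unlabeled text), here is a minimal sketch. It assumes scikit-learn rather than the LibSVM or SVMlight command-line tools, and uses `LinearSVC`, which is a linear SVM and not necessarily the kernel setup you would end up with. The tiny corpus and the label names are made up for the example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical labeled corpus: three toy documents per class.
train_texts = [
    "the striker scored a late goal in the match",
    "the team won the league after a tense final match",
    "the goalkeeper saved a penalty in the final",
    "the new phone has a faster processor and more memory",
    "the laptop ships with a faster processor and a larger screen",
    "the software update improves battery life on the phone",
    "the central bank raised interest rates again",
    "the stock market fell after the bank raised rates",
    "investors expect interest rates to stay high",
]
train_labels = ["sports"] * 3 + ["tech"] * 3 + ["finance"] * 3

# Pre-processing: keep unigrams only and drop English stop words as "noise".
# LinearSVC handles the multiclass case with a one-vs-rest scheme.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 1), stop_words="english"),
    LinearSVC(),
)
model.fit(train_texts, train_labels)

# Unlabeled test documents to classify with the trained model.
test_texts = [
    "the striker scored a goal in the final match",
    "the bank raised interest rates",
    "the phone has a faster processor",
]
predictions = model.predict(test_texts)

for text, label in zip(test_texts, predictions):
    print(f"{label}: {text}")
```

On a real corpus you would also hold out part of the labeled data to measure accuracy, and the pre-processing step is usually where most of the effort goes.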
However, if what you mean by classification is actually clustering, the problem is harder. Clustering is an unsupervised task, which means you have no information about which example belongs to which group / topic / class. There are also academic papers on hybrid semi-supervised approaches, but they diverge slightly from the real purpose of the clustering problem. The pre-processing you should apply to your corpus is similar in nature to what you would do for classification, so I will not go over it again. For clustering there are several approaches. First, you can use LDA (Latent Dirichlet Allocation) to reduce the dimensionality of your corpus (the number of dimensions of your feature space), which makes the features both more efficient and more informative. Alongside or after LDA, you can use hierarchical clustering or similar methods, such as K-Means, to group your unlabeled corpus. You can use Gensim or Scikit-Learn as open source tools for clustering. Both are powerful, well-documented, and easy to use.
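The "LDA first, then cluster" idea above can be sketched as follows. This is only an illustration assuming scikit-learn's `LatentDirichletAllocation` and `KMeans`; the documents, the number of topics and the number of clusters are arbitrary choices for the example, and on a corpus this small the resulting groups are not meaningful.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical unlabeled corpus.
docs = [
    "the striker scored a late goal in the match",
    "the team won the league after a tense final match",
    "the goalkeeper saved a penalty in the final",
    "the central bank raised interest rates again",
    "the stock market fell after the bank raised rates",
    "investors expect interest rates to stay high",
]

# Bag-of-words counts (LDA expects raw term counts, not TF-IDF weights).
counts = CountVectorizer(stop_words="english").fit_transform(docs)

# Reduce each document to a low-dimensional topic distribution.
n_topics = 2  # assumed value; tune for a real corpus
lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
doc_topics = lda.fit_transform(counts)  # shape: (n_docs, n_topics)

# Cluster the documents in the reduced topic space.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(doc_topics)

for doc, label in zip(docs, labels):
    print(label, doc)
```

Swapping `KMeans` for `sklearn.cluster.AgglomerativeClustering` would give the hierarchical variant mentioned above.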
In all cases, do a lot of academic reading and try to understand the theory behind these tasks and problems. That way you can come up with innovative and effective solutions for the specific problem you are working on, because problems in NLP usually depend on your corpus, and you are generally on your own when dealing with your particular case. It is very difficult to find generic, ready-to-use solutions, and I do not recommend relying on that option anyway.
I hope that answers your question; sorry for any irrelevant details.
Good luck =)