When I want to predict a new instance, I need to convert the new instance's data so that its features use the same encoding as in the model. Is there an easy way to achieve this?
I'm not entirely sure how your classification "pipeline" works, but you can simply call the transform method of your fitted LabelEncoder on the new data: le will convert the new data, provided its labels already exist in the training set.
from sklearn import preprocessing
le = preprocessing.LabelEncoder()
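To make that concrete, here is a minimal sketch of fitting the encoder once and then reusing it on new data. The label values (city names) are made up for illustration and are not from the question. Note that transform raises a ValueError for labels the encoder did not see during fitting, so unseen labels have to be handled separately.

```python
from sklearn import preprocessing

# Fit the encoder once on the training labels (hypothetical example data).
le = preprocessing.LabelEncoder()
train_labels = ["paris", "tokyo", "amsterdam", "tokyo"]
le.fit(train_labels)

# Reuse the same fitted encoder on a new instance: classes are sorted,
# so the mapping is amsterdam -> 0, paris -> 1, tokyo -> 2.
new_labels = ["tokyo", "paris"]
encoded = le.transform(new_labels)

# A label that was not in the training data cannot be encoded.
try:
    le.transform(["berlin"])
    unseen_error = None
except ValueError as exc:
    unseen_error = str(exc)
```

If new data may contain labels that were absent at training time, one option is to catch this error and decide explicitly how to treat those rows.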
Also, if I want to save the model and load it later, is there an easy way to save the encoding as well, so that I can use it to convert new instances for the loaded model?
You can save models / parts of your models as follows:
import pickle  # on Python 2 this was: import cPickle as pickle
import joblib  # older scikit-learn releases shipped this as sklearn.externals.joblib
from sklearn import preprocessing
le = preprocessing.LabelEncoder()
train_x = [0,1,2,6,'true','false']
le.fit_transform(train_x)
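The snippet above stops after fitting, so here is a sketch of the save and load step itself, assuming the standard-library pickle module is acceptable. The file name label_encoder.pkl, the temporary directory, and the label values are hypothetical choices for the example. joblib.dump and joblib.load can be used in the same way.

```python
import os
import pickle
import tempfile

from sklearn import preprocessing

# Fit an encoder on some training labels (hypothetical example data).
le = preprocessing.LabelEncoder()
le.fit(["false", "true", "unknown"])

# Save the fitted encoder to disk; the path is an arbitrary example location.
path = os.path.join(tempfile.mkdtemp(), "label_encoder.pkl")
with open(path, "wb") as fh:
    pickle.dump(le, fh)

# Later (for example in another process), load it back and encode new data
# with exactly the same mapping that was learned at training time.
with open(path, "rb") as fh:
    restored = pickle.load(fh)

restored_codes = restored.transform(["true", "false"])
```

Saving the encoder next to the model keeps the two in sync. Only unpickle files you trust, and load them with the same scikit-learn version that wrote them where possible.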