You can fit the encoder on the labels, and then transform the labels to their normalized encoding as follows:
    In [4]: from sklearn import preprocessing
       ...: import numpy as np

    In [5]: le = preprocessing.LabelEncoder()

    In [6]: le.fit(np.unique(df.values))
    Out[6]: LabelEncoder()

    In [7]: list(le.classes_)
    Out[7]: ['A', 'B', 'C', 'D', 'E']

    In [8]: df.apply(le.transform)
    Out[8]:
       Feat1  Feat2  Feat3  Feat4  Feat5
    0      0      0      0      0      4
    1      1      1      2      2      4
    2      2      3      2      2      4
    3      3      0      2      3      4
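For a version that runs on its own, here is a minimal sketch. The original `df` is not shown, so the frame below is an assumption: its values are inferred by decoding the encoded output back through `le.classes_`.

```python
import numpy as np
import pandas as pd
from sklearn import preprocessing

# Assumed input: reconstructed from the encoded output, not the asker's actual data.
df = pd.DataFrame({
    "Feat1": ["A", "B", "C", "D"],
    "Feat2": ["A", "B", "D", "A"],
    "Feat3": ["A", "C", "C", "C"],
    "Feat4": ["A", "C", "C", "D"],
    "Feat5": ["E", "E", "E", "E"],
})

le = preprocessing.LabelEncoder()
le.fit(np.unique(df.values))      # fit once on every label that occurs in the frame

encoded = df.apply(le.transform)  # apply the same label-to-integer mapping to each column
print(encoded)
```

Fitting once on the union of all values is what keeps the codes consistent across columns; fitting a separate encoder per column would give `'E'` the code 0 in `Feat5`.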
One way to specify default labels:
    In [9]: labels = ['A', 'B', 'C', 'D', 'E']

    In [10]: enc = le.fit(labels)

    In [11]: enc.classes_
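As a small sketch of what a fixed label list buys you: `fit` returns the fitted encoder itself, and the codes follow the sorted order of the labels, so data that contains only some of the labels still gets the same integers. The sample values passed to `transform` below are made up for illustration.

```python
from sklearn import preprocessing

labels = ["A", "B", "C", "D", "E"]
enc = preprocessing.LabelEncoder().fit(labels)  # fit() returns the encoder

# Illustrative input that uses only a subset of the known labels.
codes = enc.transform(["E", "A", "C"])
restored = enc.inverse_transform(codes)         # map the integers back to labels

print(codes, list(restored))
```

Note that `transform` raises an error for a value that was not in the list passed to `fit`, so the default list needs to cover every label you expect to see.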