XGBoost plot_importance does not show feature names

I am using XGBoost with Python and have successfully trained a model by calling the XGBoost train() function on DMatrix data. The DMatrix was created from a Pandas DataFrame, which has feature names for its columns.

import xgboost as xgb
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split

Xtrain, Xval, ytrain, yval = train_test_split(df[feature_names], y,
                                              test_size=0.2, random_state=42)
dtrain = xgb.DMatrix(Xtrain, label=ytrain)
dval = xgb.DMatrix(Xval, label=yval)

# early_stopping_rounds requires an evaluation set
model = xgb.train(xgb_params, dtrain, num_boost_round=60,
                  evals=[(dval, "val")],
                  early_stopping_rounds=50, maximize=False, verbose_eval=10)

fig, ax = plt.subplots(1, 1, figsize=(10, 10))
xgb.plot_importance(model, max_num_features=5, ax=ax)

Now I want to see the feature importances using xgboost.plot_importance(), but the resulting plot does not display the feature names. Instead, the features are listed as f1, f2, f3, etc., as shown below.

[feature importance plot with bars labeled f1, f2, f3, …]

I think the problem is that I converted the original Pandas DataFrame into a DMatrix. How can I attach the feature names so that they are displayed in the feature importance plot?


You can pass feature_names when constructing the xgb.DMatrix:

dtrain = xgb.DMatrix(Xtrain, label=ytrain, feature_names=feature_names)
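
For completeness, a minimal end-to-end sketch of this fix (Xtrain, ytrain, feature_names and xgb_params are placeholders carried over from the question):

import xgboost as xgb
import matplotlib.pyplot as plt

# Attach the column names when building the DMatrix
dtrain = xgb.DMatrix(Xtrain, label=ytrain, feature_names=feature_names)
model = xgb.train(xgb_params, dtrain, num_boost_round=60)

# The bars are now labeled with the real column names instead of f0, f1, ...
fig, ax = plt.subplots(1, 1, figsize=(10, 10))
xgb.plot_importance(model, max_num_features=5, ax=ax)
plt.show()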

Note: train_test_split may convert the dataframe to a numpy array, which no longer has the column names.

Following @piRSquared's suggestion, I got the feature names into the DMatrix. Since train_test_split had converted my DataFrame to a numpy array, which no longer has the column names, I converted back to DataFrames first:

import pandas as pd

Xtrain, Xval, ytrain, yval = train_test_split(df[feature_names], y,
                                              test_size=0.2, random_state=42)

# Wrap the arrays back into DataFrames so the column names are restored
Xtrain = pd.DataFrame(data=Xtrain, columns=feature_names)
Xval = pd.DataFrame(data=Xval, columns=feature_names)

dtrain = xgb.DMatrix(Xtrain, label=ytrain)

If you are using the scikit-learn wrapper, you need to access the underlying XGBoost Booster and set the feature names on it, instead of on the scikit-learn model, like this:

import joblib
import xgboost

model = joblib.load("your_saved.model")
model.get_booster().feature_names = ["your", "feature", "name", "list"]
xgboost.plot_importance(model.get_booster())
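
The same idea works on a freshly fitted classifier, without the round trip through disk. A minimal sketch, assuming clf is fit on the columns listed in feature_names (both names are illustrative, not from the answer above):

import xgboost

clf = xgboost.XGBClassifier().fit(Xtrain, ytrain)
booster = clf.get_booster()
booster.feature_names = feature_names  # must match the training columns, in order
xgboost.plot_importance(booster)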

Another option is to save the feature_names alongside the model yourself, because, at least as of XGBoost v0.80 (which I tested), they are not preserved by save_model() / load_model():

## Saving the model to disk
model.save_model('foo.model')
with open('foo_fnames.txt', 'w') as f:
    f.write('\n'.join(model.feature_names))

## Later, when you want to retrieve the model...
model2 = xgb.Booster({"nthread": nThreads})  # nThreads: number of threads to use
model2.load_model("foo.model")

with open("foo_fnames.txt", "r") as f:
    feature_names2 = f.read().split("\n")

model2.feature_names = feature_names2
model2.feature_types = None
fig, ax = plt.subplots(1, 1, figsize=(10, 10))
xgb.plot_importance(model2, max_num_features=5, ax=ax)

So this saves feature_names separately and adds them back later. For some reason, feature_types also needs to be initialized, even if the value is just None.
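
If you control the persistence format, pickling the Booster (e.g. with joblib) is a possible alternative: a pickle carries the Python-side attributes, feature_names included, so no separate file is needed. The trade-off is that pickles are not guaranteed to load across XGBoost versions the way save_model() files are. A minimal sketch:

import joblib

joblib.dump(model, "foo.pkl")    # Python attributes such as feature_names ride along
model2 = joblib.load("foo.pkl")
print(model2.feature_names)      # still set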


With the scikit-learn wrapper interface XGBClassifier, plot_importance returns a matplotlib Axes object, so we can call set_yticklabels on it:

plot_importance(model).set_yticklabels(['feature1','feature2'])
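
One caveat: plot_importance sorts the bars by importance, so a hard-coded label list only lines up by luck. A safer sketch, assuming the plot shows the default f0-style labels and feature_names holds the columns in training order:

ax = plot_importance(model)
# Map each default tick label ('f0', 'f1', ...) back to its real column name
name_map = {'f{0}'.format(i): name for i, name in enumerate(feature_names)}
ax.set_yticklabels([name_map[lbl.get_text()] for lbl in ax.get_yticklabels()])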

