How to get feature importance in xgboost?

I use xgboost to train a model and try to get the importance of each feature with get_fscore(), but it returns {}.

Here is my training code:

import xgboost as xgb

dtrain = xgb.DMatrix(X, label=Y)
watchlist = [(dtrain, 'train')]
param = {'max_depth': 6, 'learning_rate': 0.03}
num_round = 200
bst = xgb.train(param, dtrain, num_round, watchlist)

So is there any mistake in my training code? How do I get feature importances in xgboost?

+14
9 answers

In your code, you can get the importance of each feature as a dict:

bst.get_score(importance_type='gain')

>>{'ftr_col1': 77.21064539577829,
   'ftr_col2': 10.28690566363971,
   'ftr_col3': 24.225014841466294,
   'ftr_col4': 11.234086283060112}

Explanation: get_score() on the Booster object returned by train() is defined as:

get_score(fmap='', importance_type='weight')

  • fmap (str, optional) – the name of the feature map file.
  • importance_type (str) – how importance is measured:
    • 'weight' – the number of times a feature is used to split the data across all trees.
    • 'gain' – the average gain across all splits the feature is used in.
    • 'cover' – the average coverage across all splits the feature is used in.
    • 'total_gain' – the total gain across all splits the feature is used in.
    • 'total_cover' – the total coverage across all splits the feature is used in.

https://xgboost.readthedocs.io/en/latest/python/python_api.html

+17

With the sklearn API of XGBoost >= 0.81, for a classifier:

clf.get_booster().get_score(importance_type="gain")

or for a regressor:

regr.get_booster().get_score(importance_type="gain")

Feature names are preserved in the result if regr.fit (or clf.fit) was called with X as a pandas.DataFrame.

+8

For a model found by a grid search with an older xgboost, where booster was still a method rather than get_booster():

fscore = clf.best_estimator_.booster().get_fscore()
+7

Alternatively, you can plot the importances instead of reading them as numbers:

import matplotlib.pyplot as plt

model = xgb.train(params, d_train, 1000, watchlist)
fig, ax = plt.subplots(figsize=(12, 18))
xgb.plot_importance(model, max_num_features=50, height=0.8, ax=ax)
plt.show()
+7

Two other ways to inspect the importances. As a sorted table:

pd.DataFrame(bst.get_fscore().items(), columns=['feature','importance']).sort_values('importance', ascending=False)

As a plot:

xgb.plot_importance(bst)
+5

First fit a model with the XGBoost sklearn wrapper:

import numpy as np
from matplotlib import pyplot
from xgboost import XGBClassifier, plot_importance

model = XGBClassifier()
model.fit(train, label)

Then sort the features by importance, descending:

sorted_idx = np.argsort(model.feature_importances_)[::-1]

and print them with their names (assuming train is a pandas DataFrame):

for index in sorted_idx:
    print([train.columns[index], model.feature_importances_[index]]) 

Or plot them with XGBoost's built-in helper:

plot_importance(model, max_num_features = 15)
pyplot.show()

The max_num_features argument of plot_importance limits the plot to the top features by importance.

+5

I use xgb.XGBRegressor(), because the sklearn wrapper accepts a pandas.DataFrame() or numpy.array() directly instead of requiring a dmatrix(). Note that parameters such as gamma are passed to the XGBRegressor constructor.

fit = alg.fit(dtrain[ft_cols].values, dtrain['y'].values)
ft_weights = pd.DataFrame(fit.feature_importances_, columns=['weights'], index=ft_cols)

After fitting the regressor, fit.feature_importances_ returns an array of weights, which I assume is in the same order as the feature columns of the pandas dataframe.

My current setup is Ubuntu 16.04, Anaconda distro, Python 3.6, xgboost 0.6 and sklearn 0.18.1.

+3

Build a table of feature names and scores, then plot it:

import pandas as pd

feature_important = model.get_score(importance_type='weight')
keys = list(feature_important.keys())
values = list(feature_important.values())

data = pd.DataFrame(data=values, index=keys, columns=["score"]).sort_values(by="score", ascending=False)
data.plot(kind='barh')

This produces a horizontal bar chart of the scores.

+3
With the sklearn wrapper you can also print and plot the raw array:

import matplotlib.pyplot as plt

print(model.feature_importances_)
plt.bar(range(len(model.feature_importances_)), model.feature_importances_)
plt.show()
0