How to install XGBoost package in python on Windows

I tried installing the XGBoost package in Python. I am using 64-bit Windows. I went through the following.

The PyPI page states that xgboost is unstable on Windows and that pip installation is disabled: "Installation from pip on windows is currently disabled for further investigation, please install from github." https://pypi.python.org/pypi/xgboost/

I am not very experienced with Visual Studio, so building XGBoost myself is a problem for me, and without it I am missing out on using the xgboost package for data science.

Please guide me so that I can import the XGBoost package in Python.

Thanks.

7 answers

If you are using Anaconda (or Miniconda), you can use the following:

  • conda install -c anaconda py-xgboost (updated 2019-09-20)
  • Documentation

Check the installation by:

  • Activating the environment (see below)
  • Running conda list

To activate the environment:

On Windows, at the Anaconda prompt, run (assuming your environment is called myenv):

  • activate myenv

On macOS and Linux, in a terminal window, run (assuming your environment is called myenv):

  • source activate myenv

Conda prepends the path of myenv to your system PATH.
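To verify from Python itself, a minimal sketch (the `is_installed` helper is hypothetical; it only checks that the package can be found by the active interpreter, not that it is the conda build):

```python
import importlib.util

def is_installed(pkg):
    """Return True if `pkg` is importable in the current environment."""
    return importlib.util.find_spec(pkg) is not None

# After `conda install -c anaconda py-xgboost` and activating the
# environment, this should print True.
print(is_installed("xgboost"))
```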


Install it from a pre-built wheel:

  • download the xgboost .whl file from here (make sure it matches your Python version and system architecture, e.g. "xgboost-0.6-cp35-cp35m-win_amd64.whl" for Python 3.5 on a 64-bit machine)
  • open a command prompt
  • go to the Downloads folder (or wherever you saved the .whl file) and run pip install xgboost-0.6-cp35-cp35m-win_amd64.whl (or whatever your .whl file is named)
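To see which tags to look for in a wheel filename, a small sketch (the `wheel_tags` helper is hypothetical, not part of pip) that derives the CPython tag and Windows platform tag from the running interpreter:

```python
import platform
import sys

# Hypothetical helper: derive the CPython tag (e.g. "cp35") and the
# Windows platform tag that a matching wheel filename should contain.
def wheel_tags():
    py_tag = "cp%d%d" % (sys.version_info.major, sys.version_info.minor)
    plat = "win_amd64" if platform.architecture()[0] == "64bit" else "win32"
    return py_tag, plat

print(wheel_tags())
```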

First you need to build the library via "make"; then you can install it using the Anaconda prompt (if you want it in Anaconda) or Git Bash (if you use it only in plain Python).

First follow the official guide with the following procedure (in Git Bash on Windows):

 git clone --recursive https://github.com/dmlc/xgboost
 git submodule init
 git submodule update

then install TDM-GCC from here and follow these steps in Git Bash:

 alias make='mingw32-make'
 cp make/mingw64.mk config.mk; make -j4

Finally, follow these steps in an Anaconda or Git Bash prompt:

 cd xgboost\python-package
 python setup.py install
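As a sanity check after `python setup.py install`, you can ask the interpreter where third-party packages are installed; the xgboost folder should appear there (a sketch using only the standard library):

```python
import sysconfig

# Directory where `python setup.py install` places pure-Python packages
# for the active interpreter (typically its site-packages folder).
site_packages = sysconfig.get_paths()["purelib"]
print(site_packages)
```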

Also see these great resources:

Official guide

Install Xgboost on Windows

Install XGBoost for Anaconda on Windows


You can install the catboost package. It is a recently open-sourced gradient boosting library, which in most cases is more accurate and faster than XGBoost and has support for categorical features. Here is the library site: https://catboost.ai


The following command should work:

conda install -c conda-forge xgboost

If you have problems with it, activate your environment first. Assuming your environment is named <MY_ENV>, write in the conda terminal:

 activate <MY_ENV> 

and then

 pip install xgboost 
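One common pitfall with `pip install xgboost` in a conda setup is that the `pip` on your PATH may belong to a different interpreter than the environment you activated. Running pip through the interpreter itself avoids that; a sketch that builds the safer command (`sys.executable` is the active Python):

```python
import sys

# Build the command that installs into the *active* interpreter's
# environment rather than whichever `pip` happens to be on PATH.
cmd = "%s -m pip install xgboost" % sys.executable
print(cmd)
```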

I installed xgboost on Windows by following the resources above, since it is still not available via pip. I then used the following code to tune the CV parameters:

 #Import libraries:
 import pandas as pd
 import numpy as np
 import xgboost as xgb
 from xgboost.sklearn import XGBClassifier
 from sklearn import cross_validation, metrics   #Additional sklearn functions
 from sklearn.grid_search import GridSearchCV   #Performing grid search
 import matplotlib.pylab as plt
 %matplotlib inline
 from matplotlib.pylab import rcParams
 rcParams['figure.figsize'] = 12, 4

 train = pd.read_csv('train_data.csv')
 target = 'target_value'
 IDcol = 'ID'

A function is defined to obtain the optimal parameters and display the output visually:

 def modelfit(alg, dtrain, predictors, useTrainCV=True, cv_folds=5, early_stopping_rounds=50):
     if useTrainCV:
         xgb_param = alg.get_xgb_params()
         xgtrain = xgb.DMatrix(dtrain[predictors].values, label=dtrain[target].values)
         cvresult = xgb.cv(xgb_param, xgtrain, num_boost_round=alg.get_params()['n_estimators'],
                           nfold=cv_folds, metrics='auc',
                           early_stopping_rounds=early_stopping_rounds, show_progress=False)
         alg.set_params(n_estimators=cvresult.shape[0])

     #Fit the algorithm on the data
     alg.fit(dtrain[predictors], dtrain[target], eval_metric='auc')

     #Predict training set:
     dtrain_predictions = alg.predict(dtrain[predictors])
     dtrain_predprob = alg.predict_proba(dtrain[predictors])[:, 1]

     #Print model report:
     print "\nModel Report"
     print "Accuracy : %.4g" % metrics.accuracy_score(dtrain[target].values, dtrain_predictions)
     print "AUC Score (Train): %f" % metrics.roc_auc_score(dtrain[target], dtrain_predprob)

     feat_imp = pd.Series(alg.booster().get_fscore()).sort_values(ascending=False)
     feat_imp.plot(kind='bar', title='Feature Importances')
     plt.ylabel('Feature Importance Score')

Now the function is called to get the optimal parameters (note that the classifier must not be named xgb, or it would shadow the imported xgboost module used inside modelfit):

 #Choose all predictors except target & IDcols
 predictors = [x for x in train.columns if x not in [target, IDcol]]
 xgb1 = XGBClassifier(
     learning_rate=0.1,
     n_estimators=1000,
     max_depth=5,
     min_child_weight=1,
     gamma=0,
     subsample=0.7,
     colsample_bytree=0.7,
     objective='binary:logistic',
     nthread=4,
     scale_pos_weight=1,
     seed=198)
 modelfit(xgb1, train, predictors)

Although a feature importance chart is displayed, the information about the parameters in the red box at the top of the chart is missing (screenshot omitted). I consulted people using Linux/Mac who installed xgboost, and they do get the above information. I was wondering if this is related to the specific build I made and installed on Windows, and how I can get the parameter information displayed above the chart. At the moment I get the chart, but not the red frame and the information inside it. Thanks.


Besides the builds from source already on GitHub (setting up a C++ environment, etc.), I found an easier way to do this, which I explained here in detail. Essentially, you go to the UC Irvine website and download the .whl file, then cd to the download folder and install xgboost with pip.

