XGBoost crash on Windows 10

GeoffPearl · June 27, 2019, 6:49pm

Hi there

We have two server machines running XGBoost on Windows Server 2016 (one Standard, one DataCenter) - both of these are running fine without issues on version 0.82 installed via pip.

However, on my Windows 10 machine, I get a crash when calling predict on a single row of data. I have validated that the data being passed is identical across all machines (by loading the input DataFrame on each from the same pickled file), and the Python environments are also identical in terms of installed packages and their versions.
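For anyone who wants to repeat this kind of comparison, here is a minimal sketch of how the two checks could be automated. It is not the script I actually used; the function names and the file name in the usage note are hypothetical, and it only relies on the Python standard library.

```python
import hashlib
import platform
from importlib import metadata


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def environment_fingerprint():
    """Collect the interpreter version, OS string and installed package versions."""
    packages = sorted(
        (str(dist.metadata["Name"]), str(dist.version))
        for dist in metadata.distributions()
    )
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": packages,
    }
```

Running `file_sha256("input_df.pkl")` (hypothetical file name) and `environment_fingerprint()["packages"]` on each machine and diffing the results should show whether the inputs and package versions really match. The "platform" entry will differ between Windows 10 and Windows Server 2016, which is expected.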

What is interesting is that when I compile the source code manually in Visual Studio and use the resulting xgboost.dll in place of the one downloaded from pip, I do not get the crash and everything appears to work fine.

Could this be related to an unwanted interaction between the DLL that ships with the package and the DLLs in C:\Windows\System32? The system DLLs loaded by xgboost.dll are the only obvious difference I can make out in Process Explorer, and most of their versions on the Windows 10 machine differ from those on the machines running Windows Server 2016.

My system specs are as follows, along with the call stack from the crash:

The function call that fails is a call to XGBoosterPredict, called from core.py in the package. Note that n_jobs is set to 1, so there should be no attempt at parallelization despite the position in the code where the crash seems to happen in the call stack above.

I should also point out that the same behavior occurs with version 0.90 of XGBoost: the pip-installed DLL crashes, but compiling the source code and using that build does not.

If anyone has any tips as to how I can get the pip repo version of the DLL to work it would be much appreciated. Thanks!

hcho3 · June 27, 2019, 9:08pm

@GeoffPearl

The XGBoost wheel from pip bundles vcomp140.dll, which is the OpenMP runtime. Can you try removing the bundled vcomp140.dll from the XGBoost installation directory and see if that fixes the crash? (I included vcomp140.dll in the wheel so that users won’t have to install the Microsoft Visual C++ Redistributable.)
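As a rough sketch of that experiment, the bundled copy could be located and renamed rather than deleted, so that it is easy to restore. The helper names below are hypothetical, and the sketch assumes the DLL sits somewhere under the importable xgboost package directory; depending on how the wheel lays out its files, it may live elsewhere in the Python installation.

```python
import importlib.util
from pathlib import Path


def xgboost_package_dir():
    """Return the directory of the installed xgboost package, or None if absent."""
    spec = importlib.util.find_spec("xgboost")
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).parent


def find_bundled_dlls(package_dir, dll_name="vcomp140.dll"):
    """Find files with the given name (case-insensitive) under package_dir."""
    wanted = dll_name.lower()
    return sorted(
        p for p in Path(package_dir).rglob("*")
        if p.is_file() and p.name.lower() == wanted
    )


def disable_dll(path, suffix=".disabled"):
    """Rename the DLL so the loader no longer finds it; return the new path."""
    target = path.with_name(path.name + suffix)
    path.rename(target)
    return target
```

Calling `find_bundled_dlls(xgboost_package_dir())` lists the candidates, and `disable_dll(...)` on each one takes it out of play. Renaming the file back undoes the change.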

And here is the build environment I used to build the wheel:

Windows Server 2012 R2
Visual Studio 2017

The wheel has been tested against the following environment (see the build log):

Windows Server 2008 R2
Windows Server 2012 R2
Windows Server 2016
Windows Server 2019

However, it hasn’t been tested against Windows 10. This is because our build server (https://xgboost-ci.net) is currently hosted on AWS, which only supports Server editions of Windows.

GeoffPearl · June 27, 2019, 10:00pm

Hi hcho3 - thanks for your quick response!

I tried this again after removing vcomp140.dll from the installation directories, but unfortunately I still get the crash. Running listdlls does show that the process is now using the copy at c:\windows\system32\vcomp140.dll.
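The same check can also be done from inside the Python process, as an alternative to listdlls. This is only a sketch: the function name is hypothetical, it uses the Win32 calls GetModuleHandleW and GetModuleFileNameW through ctypes, and it returns None on non-Windows platforms or when the module has not been loaded yet.

```python
import sys


def loaded_module_path(module_name):
    """Return the full path of a DLL already loaded in this process, else None."""
    if sys.platform != "win32":
        return None  # the Win32 API is only available on Windows

    import ctypes
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetModuleHandleW.argtypes = [wintypes.LPCWSTR]
    kernel32.GetModuleHandleW.restype = wintypes.HMODULE
    kernel32.GetModuleFileNameW.argtypes = [
        wintypes.HMODULE, wintypes.LPWSTR, wintypes.DWORD,
    ]
    kernel32.GetModuleFileNameW.restype = wintypes.DWORD

    # GetModuleHandleW only succeeds for modules that are already loaded.
    handle = kernel32.GetModuleHandleW(module_name)
    if not handle:
        return None

    buffer = ctypes.create_unicode_buffer(32768)
    length = kernel32.GetModuleFileNameW(handle, buffer, len(buffer))
    return buffer.value if length else None
```

Calling `loaded_module_path("vcomp140.dll")` after `import xgboost` should report which copy of the OpenMP runtime the process picked up, assuming the library has been loaded by that point.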

Thanks for your help - if Windows 10 is not supported yet I will continue to compile the source code for new versions for the time being.

hcho3 · June 27, 2019, 10:18pm

I’m wondering what’s causing incompatibility between Windows Server 2012 R2 and Windows 10. My previous understanding was that Windows Server editions are basically identical to Windows Pro editions.

hcho3 · June 27, 2019, 10:21pm

I created a new issue to keep track: https://github.com/dmlc/xgboost/issues/4616

hcho3 · July 3, 2019, 12:56am

@GeoffPearl Update: I just made a fresh installation of Windows 10 Education Edition inside VirtualBox and was able to run the following script:

import xgboost

# Get data from https://github.com/dmlc/xgboost/tree/master/demo/data
dtrain = xgboost.DMatrix('agaricus.txt.train')
dtest = xgboost.DMatrix('agaricus.txt.test')

params = {'booster': 'gbtree', 'objective': 'binary:logistic',
          'learning_rate': 1.0, 'gamma': 1.0, 'min_child_weight': 1.0,
          'max_depth': 3, 'eval_metric': 'auc'}

bst = xgboost.train(params, dtrain, num_boost_round=10,
                    evals=[(dtrain, 'train'), (dtest, 'test')])

print(bst.predict(dtest))

In short, I was not able to reproduce the crash.

Question: What kind of Python environment are you using on your Windows 10 machine? I had great success with Miniconda, where I installed NumPy and SciPy with

conda install numpy scipy

hcho3 · July 3, 2019, 12:58am

I’ve often had problems with the “vanilla” Python (from the official Python website https://www.python.org/), where import numpy crashes the Python session. The reason appears to be that Windows often lacks the system libraries that NumPy depends on. Thus, I prefer to use Miniconda on Windows.
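Because a crash of this kind takes down the whole interpreter, one way to test an environment is to try the import in a child process and look at its exit code. The following is a minimal sketch under that assumption; the function name is hypothetical.

```python
import subprocess
import sys


def import_survives(module_name, timeout=120):
    """Import a module in a fresh interpreter; return True if it exits cleanly."""
    try:
        completed = subprocess.run(
            [sys.executable, "-c", f"import {module_name}"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False
    # A non-zero code covers both an ImportError and a hard crash of the child.
    return completed.returncode == 0
```

Running `import_survives("numpy")` and `import_survives("xgboost")` in each environment gives a quick yes/no answer without risking the current session. It does not distinguish a missing package from a native crash, so a False result still needs a closer look.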