Loading model from disk uses cpu_predictor instead of gpu_predictor

[Migrated from https://github.com/dmlc/xgboost/issues/3447]

System: Windows 10 x64 Professional
GPU-enabled build compiled from source from the 0.72 release branch
Using R as follows:

> sessionInfo()
R version 3.5.0 (2018-04-23)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252  LC_CTYPE=English_United States.1252    LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                           LC_TIME=English_United States.1252    

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] stringi_1.1.7     xgboost_0.71.2    doParallel_1.0.11 iterators_1.0.9   foreach_1.4.4     raster_2.6-7      rgdal_1.3-3      
[8] sp_1.3-1         

loaded via a namespace (and not attached):
[1] Rcpp_0.12.17      lattice_0.20-35   codetools_0.2-15  grid_3.5.0        magrittr_1.5      data.table_1.11.4 Matrix_1.2-14    
[8] tools_3.5.0       compiler_3.5.0 

Running this script:

require(xgboost)
data(agaricus.train, package='xgboost')
data(agaricus.test, package='xgboost')
dtrain <- xgb.DMatrix(agaricus.train$data, label = agaricus.train$label)
dtest <- xgb.DMatrix(data = agaricus.test$data, label = agaricus.test$label)
param <- list(max_depth=2, eta=0.1, nthread = 8, tree_method="gpu_exact", predictor="gpu_predictor", objective = "gpu:binary:logistic")
xgbmodel <- xgb.train(param, dtrain, nrounds = 5, verbose = 2)

xgb.save(xgbmodel, fname = "xgboost_model")
xgbmodel_from_disk <- xgb.load(modelfile = "xgboost_model")

#Run model from memory
set.seed(123)
head(predict(xgbmodel,dtest,predictor="gpu_predictor"))
head(predict(xgbmodel,dtest))

#Run model from disk
set.seed(123)
head(predict(xgbmodel_from_disk,dtest,predictor="gpu_predictor"))
head(predict(xgbmodel_from_disk,dtest))

Gives me this output:

> require(xgboost)
> data(agaricus.train, package='xgboost')
> data(agaricus.test, package='xgboost')
> dtrain <- xgb.DMatrix(agaricus.train$data, label = agaricus.train$label)
> dtest <- xgb.DMatrix(data = agaricus.test$data, label = agaricus.test$label)
> param <- list(max_depth=2, eta=0.1, nthread = 8, tree_method="gpu_exact", predictor="gpu_predictor", objective = "gpu:binary:logistic")
> xgbmodel <- xgb.train(param, dtrain, nrounds = 5, verbose = 2)
[08:45:43] Allocated 7MB on [0] GeForce GTX 1070, 6769MB remaining.
[08:45:43] G:\xgboost\src\tree\updater_prune.cc:74: tree pruning end, 1 roots, 6 extra nodes, 0 pruned nodes, max_depth=2
[08:45:43] Allocated 1MB on [0] GeForce GTX 1070, 6767MB remaining.
[08:45:43] G:\xgboost\src\tree\updater_prune.cc:74: tree pruning end, 1 roots, 6 extra nodes, 0 pruned nodes, max_depth=2
[08:45:43] G:\xgboost\src\tree\updater_prune.cc:74: tree pruning end, 1 roots, 6 extra nodes, 0 pruned nodes, max_depth=2
[08:45:43] G:\xgboost\src\tree\updater_prune.cc:74: tree pruning end, 1 roots, 6 extra nodes, 0 pruned nodes, max_depth=2
[08:45:43] G:\xgboost\src\tree\updater_prune.cc:74: tree pruning end, 1 roots, 6 extra nodes, 0 pruned nodes, max_depth=2
> xgb.save(xgbmodel, fname = "xgboost_model")
[1] TRUE
> xgbmodel_from_disk <- xgb.load(modelfile = "xgboost_model")
> #Run model from memory
> set.seed(123)
> head(predict(xgbmodel,dtest,predictor="gpu_predictor"))
[08:45:44] Allocated 0MB on [0] GeForce GTX 1070, 6767MB remaining.
[1] 0.4284496 0.5864953 0.4284496 0.4284496 0.3037330 0.5104644
> head(predict(xgbmodel,dtest))
[08:45:44] Allocated 0MB on [0] GeForce GTX 1070, 6767MB remaining.
[1] 0.4284496 0.5864953 0.4284496 0.4284496 0.3037330 0.5104644
> #Run model from disk
> set.seed(123)
> head(predict(xgbmodel_from_disk,dtest,predictor="gpu_predictor"))
[1] 0.4284496 0.5864953 0.4284496 0.4284496 0.3037330 0.5104644
> head(predict(xgbmodel_from_disk,dtest))
[1] 0.4284496 0.5864953 0.4284496 0.4284496 0.3037330 0.5104644

As can be seen above, only the model held in memory runs on the GPU (note the GPU allocation messages before its predictions); the model loaded from disk runs only on the CPU.
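
A possible workaround (I have not verified whether this actually routes prediction back to the GPU on this build) would be to re-apply the predictor parameter to the loaded booster with xgb.parameters<- before predicting, since the saved model file does not appear to carry it:

# Untested workaround sketch: re-apply the runtime parameter to the loaded
# booster handle before predicting; xgb.parameters<- sets it via XGBoosterSetParam.
xgbmodel_from_disk <- xgb.load(modelfile = "xgboost_model")
xgb.parameters(xgbmodel_from_disk) <- list(predictor = "gpu_predictor")
head(predict(xgbmodel_from_disk, dtest))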

@korypostma I will take a look at it within two days.

Thank you, much appreciated!

@korypostma Sorry for the delay. I don’t think the extra arguments to predict.xgb.Booster (the ... argument on line 271 of xgb.Booster.R) are passed through to the Booster object. This should be fixed.
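
If that is the cause, a quick sanity check (just my assumption about the current behaviour, using the objects from the script above) is that any unknown named argument to predict() falls into ... and is silently dropped, so even an invalid predictor value should produce no error and identical output:

# Sanity check only: if the extra arguments are indeed swallowed by '...',
# an invalid predictor name is ignored rather than rejected, and the
# predictions are unchanged.
p_default <- predict(xgbmodel_from_disk, dtest)
p_bogus <- predict(xgbmodel_from_disk, dtest, predictor = "not_a_real_predictor")
identical(p_default, p_bogus)  # expected TRUE if '...' is being dropped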

@hetong007 I’m not familiar with R here. How does the predict method get access to the Booster object?

I still have to try the GPU version. In this case, we access the backend with this line:

If there’s a new parameter, we need to add it explicitly in both predict.xgb.Booster and XGBoosterPredict_R (defined at https://github.com/dmlc/xgboost/blob/74db9757b38e51516289704ba236e14b8454d924/R-package/src/xgboost_R.cc#L303).
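
Roughly, the R side of such a change could look like the sketch below. This is only an illustration of the idea (predict_with_predictor is a made-up wrapper, not the code of the eventual fix, which would live inside predict.xgb.Booster itself), and the C entry point XGBoosterPredict_R would need a matching change to accept any new parameter:

# Hypothetical sketch: expose 'predictor' as an explicit argument and push it
# onto the booster handle before delegating to predict().
predict_with_predictor <- function(booster, newdata, predictor = NULL, ...) {
  if (!is.null(predictor))
    xgb.parameters(booster) <- list(predictor = predictor)
  predict(booster, newdata, ...)
}

head(predict_with_predictor(xgbmodel_from_disk, dtest, predictor = "gpu_predictor"))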

@korypostma The pull request https://github.com/dmlc/xgboost/pull/3856 will fix the issue.

Thanks so much! Sorry I have not had time to come back to this, as I was asked to move on to other things. Perhaps in the future I will be able to return to it and do more testing to determine feasibility for some of our projects. Thanks again!