R session crashes when using gpu_hist tree_method


#1

Hello everyone,
I’ve been trying to install xgboost by cloning the source code from GitHub. I compiled the code with Visual Studio 2019.
Training works well with the hist tree_method, but the R session always crashes when tree_method is gpu_hist.
Does anyone know how to debug it?


My environment:

  • Windows 10
  • cuda 10.1
  • R 3.6.1

devtools::session_info()

- Session info -----------------------------------------------------------------------------------------------
 setting  value                       
 version  R version 3.6.1 (2019-07-05)
 os       Windows 10 x64              
 system   x86_64, mingw32             
 ui       RStudio                     
 language (EN)                        
 collate  English_United States.1252  
 ctype    English_United States.1252  
 tz       America/New_York            
 date     2019-08-02                  

- Packages ---------------------------------------------------------------------------------------------------
 package     * version date       lib source        
 assertthat    0.2.1   2019-03-21 [1] CRAN (R 3.6.1)
 backports     1.1.4   2019-04-10 [1] CRAN (R 3.6.0)
 callr         3.3.1   2019-07-18 [1] CRAN (R 3.6.1)
 cli           1.1.0   2019-03-19 [1] CRAN (R 3.6.1)
 crayon        1.3.4   2017-09-16 [1] CRAN (R 3.6.1)
 data.table    1.12.2  2019-04-07 [1] CRAN (R 3.6.1)
 desc          1.2.0   2018-05-01 [1] CRAN (R 3.6.1)
 devtools    * 2.1.0   2019-07-06 [1] CRAN (R 3.6.1)
 digest        0.6.20  2019-07-04 [1] CRAN (R 3.6.1)
 fs            1.3.1   2019-05-06 [1] CRAN (R 3.6.1)
 glue          1.3.1   2019-03-12 [1] CRAN (R 3.6.1)
 lattice       0.20-38 2018-11-04 [2] CRAN (R 3.6.1)
 magrittr      1.5     2014-11-22 [1] CRAN (R 3.6.1)
 Matrix        1.2-17  2019-03-22 [2] CRAN (R 3.6.1)
 memoise       1.1.0   2017-04-21 [1] CRAN (R 3.6.1)
 pkgbuild      1.0.3   2019-03-20 [1] CRAN (R 3.6.1)
 pkgload       1.0.2   2018-10-29 [1] CRAN (R 3.6.1)
 prettyunits   1.0.2   2015-07-13 [1] CRAN (R 3.6.1)
 processx      3.4.1   2019-07-18 [1] CRAN (R 3.6.1)
 ps            1.3.0   2018-12-21 [1] CRAN (R 3.6.1)
 R6            2.4.0   2019-02-14 [1] CRAN (R 3.6.1)
 Rcpp          1.0.1   2019-03-17 [1] CRAN (R 3.6.0)
 remotes       2.1.0   2019-06-24 [1] CRAN (R 3.6.1)
 rlang         0.4.0   2019-06-25 [1] CRAN (R 3.6.1)
 rprojroot     1.3-2   2018-01-03 [1] CRAN (R 3.6.1)
 rstudioapi    0.10    2019-03-19 [1] CRAN (R 3.6.1)
 sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 3.6.1)
 stringi       1.4.3   2019-03-12 [1] CRAN (R 3.6.0)
 testthat      2.2.1   2019-07-25 [1] CRAN (R 3.6.1)
 usethis     * 1.5.1   2019-07-04 [1] CRAN (R 3.6.1)
 withr         2.1.2   2018-03-15 [1] CRAN (R 3.6.1)
 xgboost     * 1.0.0.1 2019-08-02 [1] local         

[1] C:/Users/alfre/Documents/R/win-library/3.6
[2] C:/Program Files/R/R-3.6.1/library

rsession.log

02 Aug 2019 18:16:23 [rsession-alfre] ERROR system error 2 (The system cannot find the file specified) [path=C:/Users/alfre/AppData/Local/RStudio-Desktop/jobs/E9477DBB-output.json]; OCCURRED AT: auto __cdecl rstudio::core::FilePath::open_r::<lambda_7681044d383654bc1b82d8906e771cc1>::operator ()(void) const c:\jenkins\workspace\ide\windows-v1.2\src\cpp\core\filepath.cpp:1092; LOGGED FROM: class std::vector<class json_spirit::Value_impl<struct json_spirit::Config_map<class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > > >,class std::allocator<class json_spirit::Value_impl<struct json_spirit::Config_map<class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > > > > > __cdecl rstudio::session::modules::jobs::Job::output(int) c:\jenkins\workspace\ide\windows-v1.2\src\cpp\session\modules\jobs\job.cpp:433
02 Aug 2019 18:16:39 [rsession-alfre] ERROR system error 2 (The system cannot find the file specified) [path=C:/Users/alfre/AppData/Local/RStudio-Desktop/jobs/E9477DBB-output.json]; OCCURRED AT: auto __cdecl rstudio::core::FilePath::open_r::<lambda_7681044d383654bc1b82d8906e771cc1>::operator ()(void) const c:\jenkins\workspace\ide\windows-v1.2\src\cpp\core\filepath.cpp:1092; LOGGED FROM: class std::vector<class json_spirit::Value_impl<struct json_spirit::Config_map<class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > > >,class std::allocator<class json_spirit::Value_impl<struct json_spirit::Config_map<class std::basic_string<char,struct std::char_traits<char>,class std::allocator<char> > > > > > __cdecl rstudio::session::modules::jobs::Job::output(int) c:\jenkins\workspace\ide\windows-v1.2\src\cpp\session\modules\jobs\job.cpp:433
02 Aug 2019 19:56:53 [rsession-alfre] ERROR system error 10053 (An established connection was aborted by the software in your host machine) [request-uri=/events/get_events]; OCCURRED AT: void __cdecl rstudio::session::HttpConnectionImpl<class rstudio_boost::asio::ip::tcp>::sendResponse(const class rstudio::core::http::Response &) c:\jenkins\workspace\ide\windows-v1.2\src\cpp\session\http\sessionhttpconnectionimpl.hpp:111; LOGGED FROM: void __cdecl rstudio::session::HttpConnectionImpl<class rstudio_boost::asio::ip::tcp>::sendResponse(const class rstudio::core::http::Response &) c:\jenkins\workspace\ide\windows-v1.2\src\cpp\session\http\sessionhttpconnectionimpl.hpp:116
02 Aug 2019 20:27:48 [rsession-alfre] ERROR system error 10053 (An established connection was aborted by the software in your host machine) [request-uri=/events/get_events]; OCCURRED AT: void __cdecl rstudio::session::HttpConnectionImpl<class rstudio_boost::asio::ip::tcp>::sendResponse(const class rstudio::core::http::Response &) c:\jenkins\workspace\ide\windows-v1.2\src\cpp\session\http\sessionhttpconnectionimpl.hpp:111; LOGGED FROM: void __cdecl rstudio::session::HttpConnectionImpl<class rstudio_boost::asio::ip::tcp>::sendResponse(const class rstudio::core::http::Response &) c:\jenkins\workspace\ide\windows-v1.2\src\cpp\session\http\sessionhttpconnectionimpl.hpp:116

Here is my R code

library('xgboost')

# Simulate N x p random matrix with some binomial response dependent on pp columns
set.seed(111)
N <- 1000000
p <- 50
pp <- 25
X <- matrix(runif(N * p), ncol = p)
betas <- 2 * runif(pp) - 1
sel <- sort(sample(p, pp))
m <- X[, sel] %*% betas - 1 + rnorm(N)
y <- rbinom(N, 1, plogis(m))

tr <- sample.int(N, N * 0.75)
dtrain <- xgb.DMatrix(X[tr,], label = y[tr])
dtest <- xgb.DMatrix(X[-tr,], label = y[-tr])
wl <- list(train = dtrain, test = dtest)

param <- list(objective = 'reg:logistic', eval_metric = 'auc', subsample = 0.5, nthread = 4,
              max_bin = 64, tree_method = 'gpu_hist')
pt <- proc.time()
bst_gpu <- xgb.train(param, dtrain, watchlist = wl, nrounds = 5)
proc.time() - pt

# Compare to the 'hist' algorithm:
param <- list(objective = 'reg:logistic', eval_metric = 'auc', subsample = 0.5, nthread = 4,
              max_bin = 64, tree_method = 'hist')
pt <- proc.time()
bst_hist <- xgb.train(param, dtrain, watchlist = wl, nrounds = 5)
proc.time() - pt

#2

Did you enable GPU when you compiled the R package?


#3

Yes. I compiled it by adding the following arguments to the Visual Studio 2019 CMake settings:

-DUSE_CUDA=ON -DR_LIB=ON -DLIBR_EXECUTABLE="C:/Program Files/R/R-3.6.1/bin/x64/R.exe" -DCMAKE_INSTALL_PREFIX="C:/Program Files (x86)/xgboost"

#4

Actually, when I ran my code with the R package I compiled, it ran for a few seconds. In the GPU-Z sensors monitor, I could see my GPU working. But after that, RStudio popped up an alert, “R Session Aborted”, and I had to restart RStudio.


#5

Can you try lowering GPU memory usage by setting a lower N or subsample?
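
To get a feel for how N drives the problem size, here is a rough estimate of the dense input matrix from the original script. It only counts the R-side matrix of doubles (8 bytes per value); what xgboost actually allocates on the GPU is different and is not estimated here.

```r
# Host-side size of the dense double matrix X from the original script.
N <- 1000000
p <- 50
bytes_per_double <- 8

x_bytes <- N * p * bytes_per_double
x_mib <- x_bytes / 1024^2
cat(sprintf("X holds %.0f values, about %.0f MiB\n", N * p, x_mib))

# The footprint scales linearly with N, so halving N halves it.
x_mib_half <- (N / 2) * p * bytes_per_double / 1024^2
cat(sprintf("With N / 2 rows: about %.0f MiB\n", x_mib_half))
```

Since the size is linear in N, trying N = 500000 or N = 100000 is a quick way to check whether the crash is memory related.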


#6

Also, set a lower number for max_depth, like 4 or 5.


#7

It works!!! Thank you so much! It seems that I cannot include subsample in my params list.
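
For anyone landing here later, a sketch of what the adjusted parameter list might look like. The thread does not say which suggestions were applied together, so this assumes subsample is dropped and max_depth is set to 4 as suggested in #6, with everything else kept from the original param list.

```r
# Same parameters as the original script, with two assumed changes:
#   - subsample removed (reported above as the setting that could not be used)
#   - max_depth = 4 added (value taken from the suggestion in #6)
param <- list(objective = 'reg:logistic', eval_metric = 'auc', nthread = 4,
              max_bin = 64, max_depth = 4, tree_method = 'gpu_hist')

# The training call itself stays as in the original script:
# bst_gpu <- xgb.train(param, dtrain, watchlist = wl, nrounds = 5)
```

If the session still aborts with these settings, lowering N in the data simulation is the next thing to try.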