Before anything, I want to thank the authors and maintainers for their work with xgboost. I’ve used it via R (CPU only) many times and really appreciate it.
Owning an RTX 3080, I want to take advantage of the GPU. I followed all the directions at the install page, and I have a working GPU-version of keras/tensorflow in R, so I know I have the proper files.
I invoke the configuration as: cmake .. -G"Visual Studio 16 2019" -A x64 -DUSE_CUDA=ON -DR_LIB=ON -DR_VERSION=4.0.0 -DGPU_COMPUTE_VER=86 -DLIBR_HOME="C:\R\RCurrent\R-devel".
I don’t notice anything out of the ordinary in the configuration output, except perhaps for a note about backtrace not being found, but I saw no mention of that in the docs. I then invoke the build as: cmake --build . --target install --config Release
A slew of compilation messages ensues, but the build always seems to stop at the following (this is the tail of the output):
C:\Users\Parents\AppData\Local\Temp\tmpxft_00005ba0_00000000-7_updater_gpu_hist.cudafe1.stub.c(150): note: see refere
nce to function template instantiation 'void xgboost::tree::RowPartitioner::UpdatePosition<__nv_dl_wrapper_t<__nv_dl_
tag<void (__cdecl xgboost::tree::GPUHistMakerDevice<GradientSumT>::* )(int,xgboost::RegTree *),void xgboost::tree::GP
UHistMakerDevice<GradientSumT>::UpdatePosition(int,xgboost::RegTree *),1>,xgboost::EllpackDeviceAccessor,xgboost::Reg
Tree::Node,xgboost::FeatureType,xgboost::common::Span<unsigned int,18446744073709551615>>>(xgboost::bst_node_t,xgboos
t::bst_node_t,xgboost::bst_node_t,UpdatePositionOpT)' being compiled
with
[
GradientSumT=xgboost::detail::GradientPairInternal<float>,
UpdatePositionOpT=__nv_dl_wrapper_t<__nv_dl_tag<void (__cdecl xgboost::tree::GPUHistMakerDevice<xgboost::
detail::GradientPairInternal<float>>::* )(int,xgboost::RegTree *),void xgboost::tree::GPUHistMakerDevice<xgboost::det
ail::GradientPairInternal<float>>::UpdatePosition(int,xgboost::RegTree *),1>,xgboost::EllpackDeviceAccessor,xgboost::
RegTree::Node,xgboost::FeatureType,xgboost::common::Span<unsigned int,18446744073709551615>>
]
C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.2/include\thrust/system/cuda/detail/reduce.h(945): warning C4244
: 'initializing': conversion from 'Size' to 'thrust::detail::int32_t', possible loss of data [D:\xgboostgpu\build\src\o
bjxgboost.vcxproj]
with
[
Size=size_type
]
C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.2/include\thrust/system/cuda/detail/reduce.h(1021): note: see
reference to function template instantiation 'T thrust::cuda_cub::detail::reduce_n_impl<Derived,InputIt,Size,T,Binary
Op>(thrust::cuda_cub::execution_policy<Derived> &,InputIt,Size,T,BinaryOp)' being compiled
with
[
T=xgboost::GradientPair,
Derived=thrust::detail::execute_with_allocator<dh::detail::XGBCachingDeviceAllocatorImpl<char> &,thrust::
cuda_cub::execute_on_stream_base>,
InputIt=thrust::device_ptr<const xgboost::detail::GradientPairInternal<float>>,
Size=size_type,
BinaryOp=thrust::plus<xgboost::detail::GradientPairInternal<float>>
]
C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.2/include\thrust/system/cuda/detail/reduce.h(1041): note: see
reference to function template instantiation 'T thrust::cuda_cub::reduce_n<Derived,InputIt,size_type,T,BinaryOp>(thru
st::cuda_cub::execution_policy<Derived> &,InputIt,Size,T,BinaryOp)' being compiled
with
[
T=xgboost::GradientPair,
Derived=thrust::detail::execute_with_allocator<dh::detail::XGBCachingDeviceAllocatorImpl<char> &,thrust::
cuda_cub::execute_on_stream_base>,
InputIt=thrust::device_ptr<const xgboost::detail::GradientPairInternal<float>>,
BinaryOp=thrust::plus<xgboost::detail::GradientPairInternal<float>>,
Size=size_type
]
C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.2/include\thrust/detail/reduce.inl(71): note: see reference to
function template instantiation 'T thrust::cuda_cub::reduce<Derived,InputIterator,T,BinaryFunction>(thrust::cuda_cub
::execution_policy<Derived> &,InputIt,InputIt,T,BinaryOp)' being compiled
with
[
T=xgboost::GradientPair,
Derived=thrust::detail::execute_with_allocator<dh::detail::XGBCachingDeviceAllocatorImpl<char> &,thrust::
cuda_cub::execute_on_stream_base>,
InputIterator=thrust::device_ptr<const xgboost::detail::GradientPairInternal<float>>,
BinaryFunction=thrust::plus<xgboost::detail::GradientPairInternal<float>>,
InputIt=thrust::device_ptr<const xgboost::detail::GradientPairInternal<float>>,
BinaryOp=thrust::plus<xgboost::detail::GradientPairInternal<float>>
]
D:\xgboostgpu\src\tree\../common/device_helpers.cuh(1268): note: see reference to function template instantiation 'T
thrust::reduce<DerivedPolicy,Derived,Init,Func>(const thrust::detail::execution_policy_base<DerivedPolicy> &,InputIte
rator,InputIterator,T,BinaryFunction)' being compiled
with
[
T=xgboost::GradientPair,
DerivedPolicy=thrust::detail::execute_with_allocator<dh::detail::XGBCachingDeviceAllocatorImpl<char> &,th
rust::cuda_cub::execute_on_stream_base>,
Derived=thrust::device_ptr<const xgboost::detail::GradientPairInternal<float>>,
Init=xgboost::GradientPair,
Func=thrust::plus<xgboost::detail::GradientPairInternal<float>>,
InputIterator=thrust::device_ptr<const xgboost::detail::GradientPairInternal<float>>,
BinaryFunction=thrust::plus<xgboost::detail::GradientPairInternal<float>>
]
D:/xgboostgpu/src/tree/updater_gpu_hist.cu(652): note: see reference to function template instantiation 'Ty dh::Reduc
e<thrust::detail::execute_with_allocator<Allocator,ExecutionPolicyCRTPBase>,thrust::device_ptr<const T>,xgboost::Grad
ientPair,thrust::plus<T>>(Policy,InputIt,InputIt,Init,Func)' being compiled
with
[
Allocator=dh::detail::XGBCachingDeviceAllocatorImpl<char> &,
ExecutionPolicyCRTPBase=thrust::cuda_cub::execute_on_stream_base,
T=xgboost::detail::GradientPairInternal<float>,
Policy=thrust::detail::execute_with_allocator<dh::detail::XGBCachingDeviceAllocatorImpl<char> &,thrust::c
uda_cub::execute_on_stream_base>,
InputIt=thrust::device_ptr<const xgboost::detail::GradientPairInternal<float>>,
Init=xgboost::GradientPair,
Func=thrust::plus<xgboost::detail::GradientPairInternal<float>>
]
D:/xgboostgpu/src/tree/updater_gpu_hist.cu(649): note: while compiling class template member function 'xgboost::tree:
:ExpandEntry xgboost::tree::GPUHistMakerDevice<GradientSumT>::InitRoot(xgboost::RegTree *,dh::AllReducer *)'
with
[
GradientSumT=xgboost::detail::GradientPairInternal<double>
]
C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.2/include\thrust/system/cuda/detail/reduce.h(973): warning C4244
: 'initializing': conversion from 'Size' to 'thrust::detail::int32_t', possible loss of data [D:\xgboostgpu\build\src\o
bjxgboost.vcxproj]
with
[
Size=size_type
]
D:\xgboostgpu\src\common\threading_utils.h(139): warning C4267: 'argument': conversion from 'size_t' to 'int', possible
loss of data [D:\xgboostgpu\build\src\objxgboost.vcxproj]
D:\xgboostgpu\src\common\threading_utils.h(148): note: see reference to function template instantiation 'void xgboost
::common::ParallelFor<Index,Func>(Index,size_t,Func)' being compiled
with
[
Index=xgboost::omp_ulong,
Func=xgboost::common::Transform<true>::Evaluator<__nv_hdl_wrapper_t<false,false,__nv_dl_tag<void (__cdecl
xgboost::tree::TreeEvaluator::* )(int,int,int,unsigned int,float,float),void xgboost::tree::TreeEvaluator::AddSplit<
true>(xgboost::bst_node_t,xgboost::bst_node_t,xgboost::bst_node_t,xgboost::bst_feature_t,float,float),1>,void (unsign
ed __int64,xgboost::common::Span<float,18446744073709551615>,xgboost::common::Span<float,18446744073709551615>,xgboos
t::common::Span<int,18446744073709551615>),int,int,int,unsigned int,float,float>>::LaunchCPU::<lambda_8eeb19893970615
efc98e2c89f9a4d91>
]
D:\xgboostgpu\src\tree\../common/transform.h(175): note: see reference to function template instantiation 'void xgboo
st::common::ParallelFor<xgboost::omp_ulong,xgboost::common::Transform<true>::Evaluator<__nv_hdl_wrapper_t<false,false
,__nv_dl_tag<void (__cdecl xgboost::tree::TreeEvaluator::* )(int,int,int,unsigned int,float,float),void xgboost::tree
::TreeEvaluator::AddSplit<true>(xgboost::bst_node_t,xgboost::bst_node_t,xgboost::bst_node_t,xgboost::bst_feature_t,fl
oat,float),1>,void (unsigned __int64,xgboost::common::Span<T,18446744073709551615>,xgboost::common::Span<T,1844674407
3709551615>,xgboost::common::Span<int,18446744073709551615>),int,int,int,unsigned int,float,float>>::LaunchCPU::<lamb
da_8eeb19893970615efc98e2c89f9a4d91>>(Index,Func)' being compiled
with
[
T=float,
Index=xgboost::omp_ulong,
Func=xgboost::common::Transform<true>::Evaluator<__nv_hdl_wrapper_t<false,false,__nv_dl_tag<void (__cdecl
xgboost::tree::TreeEvaluator::* )(int,int,int,unsigned int,float,float),void xgboost::tree::TreeEvaluator::AddSplit<
true>(xgboost::bst_node_t,xgboost::bst_node_t,xgboost::bst_node_t,xgboost::bst_feature_t,float,float),1>,void (unsign
ed __int64,xgboost::common::Span<float,18446744073709551615>,xgboost::common::Span<float,18446744073709551615>,xgboos
t::common::Span<int,18446744073709551615>),int,int,int,unsigned int,float,float>>::LaunchCPU::<lambda_8eeb19893970615
efc98e2c89f9a4d91>
]
D:\xgboostgpu\src\tree\../common/transform.h(82): note: see reference to function template instantiation 'void xgboos
t::common::Transform<true>::Evaluator<__nv_hdl_wrapper_t<false,false,__nv_dl_tag<void (__cdecl xgboost::tree::TreeEva
luator::* )(int,int,int,unsigned int,float,float),void xgboost::tree::TreeEvaluator::AddSplit<true>(xgboost::bst_node
_t,xgboost::bst_node_t,xgboost::bst_node_t,xgboost::bst_feature_t,float,float),1>,void (unsigned __int64,xgboost::com
mon::Span<T,18446744073709551615>,xgboost::common::Span<T,18446744073709551615>,xgboost::common::Span<int,18446744073
709551615>),int,int,int,unsigned int,float,float>>::LaunchCPU<xgboost::HostDeviceVector<T>,xgboost::HostDeviceVector<
T>,xgboost::HostDeviceVector<int>>(Functor,xgboost::HostDeviceVector<T> *,xgboost::HostDeviceVector<T> *,xgboost::Hos
tDeviceVector<int> *) const' being compiled
with
[
T=float,
Functor=__nv_hdl_wrapper_t<false,false,__nv_dl_tag<void (__cdecl xgboost::tree::TreeEvaluator::* )(int,in
t,int,unsigned int,float,float),void xgboost::tree::TreeEvaluator::AddSplit<true>(xgboost::bst_node_t,xgboost::bst_no
de_t,xgboost::bst_node_t,xgboost::bst_feature_t,float,float),1>,void (unsigned __int64,xgboost::common::Span<float,18
446744073709551615>,xgboost::common::Span<float,18446744073709551615>,xgboost::common::Span<int,18446744073709551615>
),int,int,int,unsigned int,float,float>
]
D:\xgboostgpu\src\tree\../common/transform.h(82): note: see reference to function template instantiation 'void xgboos
t::common::Transform<true>::Evaluator<__nv_hdl_wrapper_t<false,false,__nv_dl_tag<void (__cdecl xgboost::tree::TreeEva
luator::* )(int,int,int,unsigned int,float,float),void xgboost::tree::TreeEvaluator::AddSplit<true>(xgboost::bst_node
_t,xgboost::bst_node_t,xgboost::bst_node_t,xgboost::bst_feature_t,float,float),1>,void (unsigned __int64,xgboost::com
mon::Span<T,18446744073709551615>,xgboost::common::Span<T,18446744073709551615>,xgboost::common::Span<int,18446744073
709551615>),int,int,int,unsigned int,float,float>>::LaunchCPU<xgboost::HostDeviceVector<T>,xgboost::HostDeviceVector<
T>,xgboost::HostDeviceVector<int>>(Functor,xgboost::HostDeviceVector<T> *,xgboost::HostDeviceVector<T> *,xgboost::Hos
tDeviceVector<int> *) const' being compiled
with
[
T=float,
Functor=__nv_hdl_wrapper_t<false,false,__nv_dl_tag<void (__cdecl xgboost::tree::TreeEvaluator::* )(int,in
t,int,unsigned int,float,float),void xgboost::tree::TreeEvaluator::AddSplit<true>(xgboost::bst_node_t,xgboost::bst_no
de_t,xgboost::bst_node_t,xgboost::bst_feature_t,float,float),1>,void (unsigned __int64,xgboost::common::Span<float,18
446744073709551615>,xgboost::common::Span<float,18446744073709551615>,xgboost::common::Span<int,18446744073709551615>
),int,int,int,unsigned int,float,float>
]
D:\xgboostgpu\src\tree\split_evaluator.h(172): note: see reference to function template instantiation 'void xgboost::
common::Transform<true>::Evaluator<__nv_hdl_wrapper_t<false,false,__nv_dl_tag<void (__cdecl xgboost::tree::TreeEvalua
tor::* )(int,int,int,unsigned int,float,float),void xgboost::tree::TreeEvaluator::AddSplit<true>(xgboost::bst_node_t,
xgboost::bst_node_t,xgboost::bst_node_t,xgboost::bst_feature_t,float,float),1>,void (unsigned __int64,xgboost::common
::Span<T,18446744073709551615>,xgboost::common::Span<T,18446744073709551615>,xgboost::common::Span<int,18446744073709
551615>),int,int,int,unsigned int,float,float>>::Eval<xgboost::HostDeviceVector<T>*,xgboost::HostDeviceVector<T>*,xgb
oost::HostDeviceVector<int>*>(xgboost::HostDeviceVector<T> *,xgboost::HostDeviceVector<T> *,xgboost::HostDeviceVector
<int> *) const' being compiled
with
[
T=float
]
D:\xgboostgpu\src\tree\split_evaluator.h(172): note: see reference to function template instantiation 'void xgboost::
common::Transform<true>::Evaluator<__nv_hdl_wrapper_t<false,false,__nv_dl_tag<void (__cdecl xgboost::tree::TreeEvalua
tor::* )(int,int,int,unsigned int,float,float),void xgboost::tree::TreeEvaluator::AddSplit<true>(xgboost::bst_node_t,
xgboost::bst_node_t,xgboost::bst_node_t,xgboost::bst_feature_t,float,float),1>,void (unsigned __int64,xgboost::common
::Span<T,18446744073709551615>,xgboost::common::Span<T,18446744073709551615>,xgboost::common::Span<int,18446744073709
551615>),int,int,int,unsigned int,float,float>>::Eval<xgboost::HostDeviceVector<T>*,xgboost::HostDeviceVector<T>*,xgb
oost::HostDeviceVector<int>*>(xgboost::HostDeviceVector<T> *,xgboost::HostDeviceVector<T> *,xgboost::HostDeviceVector
<int> *) const' being compiled
with
[
T=float
]
C:\Users\Parents\AppData\Local\Temp\tmpxft_00005ba0_00000000-7_updater_gpu_hist.cudafe1.stub.c(148): note: see refere
nce to function template instantiation 'void xgboost::tree::TreeEvaluator::AddSplit<true>(xgboost::bst_node_t,xgboost
::bst_node_t,xgboost::bst_node_t,xgboost::bst_feature_t,float,float)' being compiled
D:\xgboostgpu\cub\cub\agent\../grid/grid_even_share.cuh(133): warning C4244: '=': conversion from 'OffsetT' to 'int', p
ossible loss of data [D:\xgboostgpu\build\src\objxgboost.vcxproj]
with
[
OffsetT=__int64
]
D:\xgboostgpu\cub\cub\agent\../grid/grid_even_share.cuh(128): note: while compiling class template member function 'v
oid cub::GridEvenShare<__int64>::DispatchInit(OffsetT,int,int)'
with
[
OffsetT=__int64
]
D:\xgboostgpu\cub\cub\device\dispatch/dispatch_reduce.cuh(481): note: see reference to function template instantiatio
n 'void cub::GridEvenShare<__int64>::DispatchInit(OffsetT,int,int)' being compiled
with
[
OffsetT=__int64
]
C:\Users\Parents\AppData\Local\Temp\tmpxft_00005ba0_00000000-7_updater_gpu_hist.cudafe1.stub.c(980): note: see refere
nce to class template instantiation 'cub::GridEvenShare<__int64>' being compiled
D:\xgboostgpu\cub\cub\agent\../grid/grid_even_share.cuh(135): warning C4244: '=': conversion from 'OffsetT' to 'int', p
ossible loss of data [D:\xgboostgpu\build\src\objxgboost.vcxproj]
with
[
OffsetT=__int64
]
After the build stops there, I cannot find a lib folder or a built R package anywhere in the build tree. What can I do to fix this? Is there more information I could provide that would help diagnose it? Thank you.