As a fairly inexperienced xgboost user, I found documentation on distributed xgboost using YARN, and there are also some mentions of xgboost + Spark, but not much else.
Is it possible to use distributed xgboost “standalone”? Maybe with dmlc-submit --cluster ssh?
I’ve also seen that xgboost can run on MPI. Does that just mean rabit uses MPI instead of directly using TCP sockets? Should I expect a performance difference between running with MPI and without it?
Is xgboost + Spark the “recommended” way to distribute xgboost?
Thanks!