Can the distributed version of XGBoost run on Docker or Kubernetes, without Spark?
Can a distributed version of XGBoost be run on Docker or Kubernetes?
DMLC-Core has recently added a Kubernetes tracker: https://github.com/dmlc/dmlc-core/pull/328
The example there uses MXNet, but it should work for XGBoost as well. An example command would be:
### Not tested!!
dmlc-core/tracker/dmlc-submit \
--cluster kubernetes \
--jobname test \
--num-workers 4 \
--kube-worker-image [xgboost docker image] \
--kube-server-image [xgboost docker image] \
../../xgboost mushroom.aws.conf nthread=2 \
data=s3://${BUCKET}/xgb-demo/train \
eval[test]=s3://${BUCKET}/xgb-demo/test \
model_dir=s3://${BUCKET}/xgb-demo/model
I suggest that you give it a try.
EDIT: The DMLC-Core tracker relies on an existing Kubernetes configuration on the machine that submits the job. See https://github.com/kubernetes-client/python.
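Before submitting, it may help to confirm that a Kubernetes configuration is actually present on the submitting machine. The sketch below is untested against the tracker itself and uses only the Python standard library. It assumes the usual lookup convention (the `KUBECONFIG` environment variable first, then `~/.kube/config`); whether the tracker follows exactly that lookup is an assumption, and in-cluster service-account configuration is not covered. `find_kubeconfig` is a hypothetical helper name, not part of DMLC-Core or the Kubernetes client.

```python
import os
from pathlib import Path


def find_kubeconfig(env=None, home=None):
    """Return the first existing kubeconfig path, or None if none is found.

    Assumed lookup order: the KUBECONFIG environment variable,
    then the conventional default ~/.kube/config.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    # KUBECONFIG may list several files separated by the OS path separator.
    raw = env.get("KUBECONFIG", "")
    candidates = [Path(p).expanduser() for p in raw.split(os.pathsep) if p]

    # Fall back to the default location only when KUBECONFIG is unset or empty.
    if not candidates:
        candidates = [home / ".kube" / "config"]

    for path in candidates:
        if path.is_file():
            return path
    return None


if __name__ == "__main__":
    found = find_kubeconfig()
    if found is None:
        print("No kubeconfig found; set up cluster access before running dmlc-submit.")
    else:
        print(f"Using kubeconfig: {found}")
```

If this reports no file, `dmlc-submit --cluster kubernetes` will most likely have no cluster to talk to, so setting up cluster access first is the thing to fix.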
Also, out of curiosity: what is your reason for avoiding Spark in your use case?