Can a distributed version of XGBoost be run on Docker or Kubernetes?


#1

Can the distributed version of XGBoost be run on Docker or Kubernetes, without Spark?


#2

DMLC-Core recently added a Kubernetes tracker: https://github.com/dmlc/dmlc-core/pull/328
The example there uses MXNet, but it should work for XGBoost as well. An example command would be:

### Not tested!!
dmlc-core/tracker/dmlc-submit \
    --cluster kubernetes \
    --jobname test \
    --num-workers 4 \
    --kube-worker-image [xgboost docker image] \
    --kube-server-image [xgboost docker image] \
    ../../xgboost mushroom.aws.conf nthread=2 \
    data=s3://${BUCKET}/xgb-demo/train \
    eval[test]=s3://${BUCKET}/xgb-demo/test \
    model_dir=s3://${BUCKET}/xgb-demo/model
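
If you want to vary the worker count or bucket without editing the command by hand, the same invocation can be assembled programmatically. This is only a sketch and just as untested against a real cluster as the command above: the flags are copied from that command, while the function name `build_dmlc_submit`, the image name `my-registry/xgboost:latest`, and the bucket `my-bucket` are made-up placeholders.

```python
import shlex


def build_dmlc_submit(image, bucket, num_workers=4, jobname="test"):
    """Return the dmlc-submit argument list for a Kubernetes run.

    Hypothetical helper: it only mirrors the flags from the example
    command and does not validate them against dmlc-submit itself.
    """
    prefix = f"s3://{bucket}/xgb-demo"
    return [
        "dmlc-core/tracker/dmlc-submit",
        "--cluster", "kubernetes",
        "--jobname", jobname,
        "--num-workers", str(num_workers),
        "--kube-worker-image", image,
        "--kube-server-image", image,
        "../../xgboost", "mushroom.aws.conf", "nthread=2",
        f"data={prefix}/train",
        f"eval[test]={prefix}/test",
        f"model_dir={prefix}/model",
    ]


# Placeholder image and bucket names; replace with your own.
argv = build_dmlc_submit("my-registry/xgboost:latest", "my-bucket")

# Print a shell-safe version of the command for inspection.
print(shlex.join(argv))
```

Passing the list to `subprocess.run(argv, check=True)` would launch it; printing it first makes it easier to compare with the hand-written command.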

I suggest that you give it a try. EDIT: The DMLC-Core tracker relies on an existing Kubernetes configuration; see https://github.com/kubernetes-client/python.
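
A quick way to check whether such a configuration is present before submitting is sketched below. It assumes the tracker picks up a kubeconfig file from the usual locations (the `KUBECONFIG` environment variable, otherwise `~/.kube/config`); I have not verified exactly how the tracker loads it, and the helper name `kubeconfig_candidates` is made up for this example.

```python
import os
from pathlib import Path


def kubeconfig_candidates(env=None, home=None):
    """Return the kubeconfig files that exist in the conventional locations.

    Assumption: KUBECONFIG may list several paths separated by os.pathsep;
    if it is unset or empty, the default is ~/.kube/config.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    raw = env.get("KUBECONFIG")
    if raw:
        paths = [Path(p) for p in raw.split(os.pathsep) if p]
    else:
        paths = [home / ".kube" / "config"]
    # Keep only paths that point at an existing regular file.
    return [p for p in paths if p.is_file()]


found = kubeconfig_candidates()
print("kubeconfig found:" if found else "no kubeconfig found", *found)
```

This only confirms that a file exists; it does not check that the credentials in it are valid or that the cluster is reachable.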

Also, out of curiosity: what is your reason for avoiding Spark in your use case?