I'm using distributed XGBoost on YARN, not XGBoost on Spark.
According to the distributed-training demo, it seems that every time I submit a job, I can only train and test with a single set of parameters, specified in a config file. This is the submit script I'm using now:
```shell
$XGB_HOME/dmlc-core/tracker/dmlc-submit --cluster=yarn \
  --num-workers=4 --worker-cores=2 --worker-memory=80g --server-memory=40g \
  $XGB_HOME/xgboost $(pwd)/../conf/xgb.conf
```
and the contents of xgb.conf:
```
booster = gbtree
eta = 0.1
objective = binary:logistic
...
```
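With this setup, the only way I can see to sweep parameters is to generate one config file per parameter set and submit a separate job for each. A rough sketch of what I mean (the `eta` values and file names are just placeholders, and the submit command is only echoed here, not run):

```shell
#!/bin/sh
# Sketch: one config file, and hence one YARN job, per parameter set.
# Parameter values and paths are placeholders for illustration only.
for eta in 0.05 0.1 0.3; do
  conf="xgb_eta_${eta}.conf"
  cat > "$conf" <<EOF
booster = gbtree
eta = ${eta}
objective = binary:logistic
EOF
  # In practice this line would actually submit the job:
  echo "$XGB_HOME/dmlc-core/tracker/dmlc-submit --cluster=yarn --num-workers=4 $XGB_HOME/xgboost $(pwd)/$conf"
done
```

This works for a grid search over parameters, but it still doesn't give me cross-validation folds within a single job.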
Is there any way to do cross-validation with XGBoost on YARN?