XGBoost: How to add a custom sampler (re: subsample)?

Is there a way to add my own custom sampler function to XGBoost? I see there’s a subsample parameter, which controls the fraction of training rows selected. However, I’d like to experiment with different sampling methods to ensure the selected data is IID. Is there a way to do that?

No, currently we only support uniform sampling. You can, however, create bootstrap samples yourself with your own sampling method and then feed them into XGBoost.
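As a rough illustration of that suggestion, here is a minimal sketch that picks rows with a custom sampler and trains on just those rows. The helper names `stratified_sample_indices` and `train_on_sample` are hypothetical, the stratified scheme is only one example of a custom sampler, and the default parameters are assumptions. Note that this draws the sample once before training, whereas subsample draws a new sample at every boosting iteration, so the two are not equivalent.

```python
import random
from collections import defaultdict


def stratified_sample_indices(labels, fraction, seed=0):
    """Hypothetical custom sampler: draw `fraction` of the rows from each
    label group, without replacement, and return the sorted row indices."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for i, label in enumerate(labels):
        groups[label].append(i)
    picked = []
    for label in sorted(groups):
        idx = groups[label]
        k = max(1, round(fraction * len(idx)))  # keep at least one row per group
        picked.extend(rng.sample(idx, k))
    return sorted(picked)


def train_on_sample(X, y, indices, params=None, num_boost_round=50):
    """Train one booster on the selected rows only.
    X and y are assumed to be NumPy arrays; the default objective is an assumption."""
    import xgboost as xgb  # imported lazily so the sampler works without xgboost

    dtrain = xgb.DMatrix(X[indices], label=y[indices])
    params = params or {"objective": "binary:logistic"}
    return xgb.train(params, dtrain, num_boost_round=num_boost_round)
```

Usage would look something like `idx = stratified_sample_indices(y.tolist(), 0.5)` followed by `booster = train_on_sample(X, y, idx)`. Any function that returns row indices can be swapped in for the sampler.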

To clarify, if I create my own samples and train on each of them, I would need to combine the models later, right? For example, say I train on samples of 100 rows:
Model 0 = XGBoost[0…100]
Model 1 = XGBoost[101…200]

The final model would be the average of the predictions from [model0, model1…]. Correct?

It depends on what you want to do. For example, you can use the bootstrapping method to generate a confidence interval for the predictions of an XGBoost model. See https://en.wikipedia.org/wiki/Bootstrapping_(statistics)
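The bootstrap idea could be sketched roughly as follows: resample the training rows with replacement, refit on each resample, predict the same point each time, and read an interval off the sorted predictions. This is a simple percentile-style interval, not the only way to build one. The names `bootstrap_interval` and `xgb_fit_predict` are hypothetical, and the defaults (`n_boot`, `alpha`, `num_boost_round`) are assumptions.

```python
import random


def bootstrap_interval(n_rows, fit_predict, n_boot=200, alpha=0.05, seed=0):
    """Percentile-style bootstrap interval for a single prediction.

    `fit_predict(indices)` must train on the given rows and return one float.
    """
    rng = random.Random(seed)
    preds = []
    for _ in range(n_boot):
        # Resample row indices with replacement (one bootstrap sample).
        idx = [rng.randrange(n_rows) for _ in range(n_rows)]
        preds.append(fit_predict(idx))
    preds.sort()
    lower = preds[int((alpha / 2) * (n_boot - 1))]
    upper = preds[int((1 - alpha / 2) * (n_boot - 1))]
    return lower, upper


def xgb_fit_predict(X, y, x_new, params=None, num_boost_round=50):
    """Build a `fit_predict` callable backed by XGBoost.
    X, y and x_new (a single feature vector) are assumed to be NumPy arrays."""
    import xgboost as xgb  # imported lazily; only needed for the XGBoost variant

    query = xgb.DMatrix(x_new.reshape(1, -1))

    def fit_predict(idx):
        dtrain = xgb.DMatrix(X[idx], label=y[idx])
        booster = xgb.train(params or {}, dtrain, num_boost_round=num_boost_round)
        return float(booster.predict(query)[0])

    return fit_predict
```

With XGBoost this would be called as `bootstrap_interval(len(y), xgb_fit_predict(X, y, x_new))`. Each bootstrap round trains a full model, so `n_boot` trades run time against the stability of the interval.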