The docs state: "uniform: each training instance has an equal probability of being selected. Typically set subsample >= 0.5 for good results."
If I grow a large number of shallow trees with a small subsample value, wouldn't that be better for reducing variance? My dataset is of modest size (50K rows, 100 features), and I was trying to feed each shallow model a bootstrapped sample of 5K-10K rows to avoid overfitting, but I couldn't figure out why a smaller subsample is discouraged. I was also trying to relate this to the small mini-batch sizes that are recommended in deep learning.
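For concreteness, here is a minimal sketch of the setup I mean: many shallow trees, each fit on a small random fraction of the rows. I'm using scikit-learn's GradientBoostingClassifier as a stand-in, since it exposes an analogous `subsample` parameter, and a small synthetic dataset rather than my real 50K-row one:

```python
# Sketch: many shallow trees, each seeing only a small fraction of the data.
# GradientBoostingClassifier stands in for XGBoost; its `subsample` parameter
# plays the same role as XGBoost's. Dataset is synthetic, not my real data.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

model = GradientBoostingClassifier(
    n_estimators=200,  # many trees
    max_depth=2,       # shallow trees
    subsample=0.1,     # each tree sees ~10% of rows, below the docs' >= 0.5 advice
    random_state=0,
)
model.fit(X_tr, y_tr)
print(model.score(X_te, y_te))
```

The question is essentially whether `subsample=0.1` with many such trees is a reasonable variance-reduction strategy, or whether something about boosting makes it behave worse than the bagging intuition suggests.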