Meaning of kRtEps


#1

Hi,

I’m looking at the source code and I see a hardcoded constant, bst_float kRtEps = 1e-6f, in base.h.

In the code it’s used in the condition that decides whether or not to split a node. For example, on line 650 of src/tree/updater_colmaker.cc:

if (e.best.loss_chg > kRtEps) 
...

There is also the gamma regularization parameter (the same as min_split_loss), which is used only in pruning.

So, my question is: is kRtEps effectively a lower bound for gamma?


#2

It’s the lowest threshold that we consider when splitting a leaf. AFAIK it should have the same effect as gamma, meaning that it will prevent splits that provide minimal gain, but is applied every time we evaluate a split, instead of at pruning time.
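To make the difference in timing concrete, here is a minimal sketch of the two checks. It is not the actual library code: AcceptSplit and PruneSplit are hypothetical names, and the pruning condition (collapse a split whose loss change is below gamma) is my simplified reading of how the pruner behaves.

```cpp
#include <cassert>

// Value quoted from base.h in post #1.
constexpr float kRtEps = 1e-6f;

// Growth-time check (hypothetical helper): applied every time a
// candidate split is evaluated, mirroring `e.best.loss_chg > kRtEps`.
bool AcceptSplit(float loss_chg) {
  return loss_chg > kRtEps;
}

// Pruning-time check (hypothetical helper, simplified assumption):
// applied after the tree is grown, to a split that was already accepted.
bool PruneSplit(float loss_chg, float gamma) {
  assert(gamma >= 0.0f);  // gamma (min_split_loss) is assumed non-negative
  return loss_chg < gamma;
}
```

Under these assumptions, with gamma = 0 the pruning check never fires for an accepted split, so kRtEps is the only threshold left. With gamma above kRtEps, gamma is the stricter of the two.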


#3

@thvasilo Thanks for the reply. I wonder why we can’t use the actual gamma value instead of kRtEps. Then we wouldn’t need pruning, right?


#4

Pruning is an optional step. kRtEps is mostly there so that we don’t split because of floating point errors.
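A rough illustration of the floating point concern, assuming a simplified gain of G^2 / H per node with no regularization term. LeafGain and LossChange are hypothetical helpers, not functions from the code base. When both children have the same gradient-to-hessian ratio, the exact loss change is zero, but the float result can come out as a tiny nonzero number:

```cpp
#include <cassert>

// Value quoted from base.h in post #1.
constexpr float kRtEps = 1e-6f;

// Simplified per-node gain G^2 / H (assumption: no lambda term).
float LeafGain(float g, float h) {
  assert(h > 0.0f);  // hessian sum must be positive in this sketch
  return g * g / h;
}

// Loss change of a candidate split: left gain + right gain - parent gain.
float LossChange(float gl, float hl, float gr, float hr) {
  return LeafGain(gl, hl) + LeafGain(gr, hr) - LeafGain(gl + gr, hl + hr);
}

// Same growth-time condition as in updater_colmaker.cc.
bool AcceptSplit(float loss_chg) {
  return loss_chg > kRtEps;
}
```

For example, LossChange(0.1f, 0.3f, 0.2f, 0.6f) is mathematically zero, and any rounding residue stays well below 1e-6, so AcceptSplit rejects it. A plain `loss_chg > 0` test could accept such a split depending on how the rounding falls.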

If someone wanted to create a model that fits the data perfectly (i.e. overfits) they would use the model without pruning.

In order to provide that option, pruning (or regularization) is done separately.