Feature Importance Cover

AtR1an · March 4, 2019, 2:55pm

I am trying to obtain the same feature importance metrics available in Python and R but with Java.
To this end, I parse the model dump, which works fine, but I don’t fully understand what the cover metric means.
In one of my examples, I have two features and 3000 samples and train only a single tree for simplicity.
As I understand it, cover is the number of samples affected by a split, so the root split should have a cover of 3000. In the model dump, however, it only has a cover of 1500.

Can you please explain to me what the meaning of cover really is?

Kind regards,

AtR1an

natesh2310 · March 8, 2019, 5:08pm

This could be of help.

At the root node, every sample's predicted probability is still the initial guess of 0.5.
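
To spell out why the 0.5 matters: XGBoost documents cover as the sum of the second-order gradients (hessians) of the training samples that reach a node, not as a raw sample count. Below is a minimal sketch of that arithmetic, assuming a binary logistic objective, where the hessian of a sample with predicted probability p is p * (1 - p), and assuming unweighted samples. The function names and the 3000-sample setup are illustrative only, not part of any XGBoost API.

```python
def logistic_hessian(p: float) -> float:
    """Second-order gradient of the logistic loss at predicted probability p."""
    return p * (1.0 - p)


def node_cover(probabilities) -> float:
    """Cover of a node: sum of the hessians of all samples that reach it."""
    return sum(logistic_hessian(p) for p in probabilities)


# Hypothetical setup mirroring the question: 3000 samples at the root,
# all still at the initial guess of 0.5 (assumption: no sample weights).
root_probabilities = [0.5] * 3000
root_cover = node_cover(root_probabilities)

print(root_cover)  # 750.0
```

Under these assumptions each sample contributes 0.25, so the root cover would be 750 rather than 3000. The 1500 reported in the question corresponds to an average contribution of 0.5 per sample. The exact per-sample value depends on the objective and any sample weights, which the thread does not state. Either way, cover shrinks as predictions become more confident, so it generally will not equal the number of samples.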