Can someone help me understand how the multi-class probabilities are calculated?

Is it calculated during bagging by simply counting how many ensemble members predicted the label and dividing by the total number of members?

No, we use the one-vs-rest method to classify multi-class data. If you run K boosting rounds, you obtain K * C trees, where C is the number of classes. At prediction time, we split the K * C trees into C groups (one group per class) and compute the partial sum of tree outputs within each group, which gives C scores. Finally, we apply the softmax to convert the C scores into probabilities.
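To make the grouping step concrete, here is a minimal NumPy sketch of the prediction-time arithmetic described above. It is not the library's actual code: the function names `softmax` and `multiclass_proba`, the `base_score` argument, and the random tree outputs are all made up for illustration, and the interleaved layout (tree `t` belongs to class `t % C`) is an assumption of this sketch.

```python
import numpy as np

def softmax(scores):
    """Row-wise softmax, shifted by the row max for numerical stability."""
    shifted = scores - scores.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def multiclass_proba(tree_outputs, n_classes, base_score=0.0):
    """Turn per-tree outputs into class probabilities.

    tree_outputs: array of shape (n_samples, K * C) holding the leaf value
    each tree returns for each sample (hypothetical input for this sketch).
    Assumed layout: tree index t belongs to class t % n_classes.
    """
    n_samples, n_trees = tree_outputs.shape
    if n_trees % n_classes != 0:
        raise ValueError("number of trees must be a multiple of n_classes")
    n_rounds = n_trees // n_classes

    # Reshape to (sample, round, class) so each class's trees line up.
    grouped = tree_outputs.reshape(n_samples, n_rounds, n_classes)

    # Partial sum per class group: C raw scores per sample.
    class_scores = grouped.sum(axis=1) + base_score

    # Softmax converts the C scores into probabilities that sum to 1.
    return softmax(class_scores)

# Toy example: K = 4 rounds, C = 3 classes, 5 samples of made-up tree outputs.
rng = np.random.default_rng(0)
K, C = 4, 3
tree_outputs = rng.normal(size=(5, K * C))
proba = multiclass_proba(tree_outputs, n_classes=C)
print(proba.sum(axis=1))  # each row sums to 1 (up to floating-point error)
```

Note that the probabilities come from summed tree scores passed through a softmax, not from counting votes across trees.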

Thank you for your reply!

Would setting the ‘objective’ in params override the default one-vs-rest approach for multi-class data? Should I set it to binary:logistic or multi:softmax, or not specify it at all? Finally, does the answer change if I want to create per-class precision-recall curves after training to evaluate performance? Thank you.