XGBoost: AUC VS Accuracy?


Which is the more reliable metric between AUC and accuracy for binary classification on imbalanced data, where `setScalePosWeight` is set to sum(negative instances) / sum(positive instances) for the training set?
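For concreteness, that ratio can be computed like this (a minimal sketch; `y_train` is an assumed name for the binary training labels):

```python
import numpy as np

# Binary training labels (illustrative only): 95 negatives, 5 positives.
y_train = np.array([0] * 95 + [1] * 5)

# scale_pos_weight = sum(negative instances) / sum(positive instances)
scale_pos_weight = float((y_train == 0).sum()) / (y_train == 1).sum()

print(scale_pos_weight)  # 19.0
```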

On the test set, the performance is:

  • AUC = 80%
  • Accuracy = 73%

I think that accuracy is naive in the sense that it uses 0.5 as the threshold.
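To illustrate the point: AUC only looks at how the predictions are ranked, while accuracy depends on where you cut. A small sketch with scikit-learn on toy probabilities (illustrative numbers, not from the model above):

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score

# Toy imbalanced labels (8 negatives, 2 positives) and predicted probabilities.
y_true = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
y_prob = np.array([0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.4, 0.6, 0.45, 0.7])

# AUC is threshold-free: it scores the ranking of positives over negatives.
auc = roc_auc_score(y_true, y_prob)

# Accuracy changes with the cut-off we choose.
acc_default = accuracy_score(y_true, y_prob >= 0.5)   # implicit 0.5 cut-off
acc_tuned = accuracy_score(y_true, y_prob >= 0.45)    # a different cut-off

print(auc, acc_default, acc_tuned)
```

Here the same set of scores gives different accuracies at different thresholds, while the AUC stays fixed.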


I suggest that you also look at AUCPR, as we want to consider precision and recall with respect to the minority class: http://www.davidsbatista.net/blog/2018/08/19/NLP_Metrics/
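As a minimal sketch of the difference, scikit-learn's `average_precision_score` summarizes the precision-recall curve (a close relative of AUCPR), while `roc_auc_score` treats both classes symmetrically; the toy numbers below are illustrative only:

```python
import numpy as np
from sklearn.metrics import average_precision_score, roc_auc_score

# Toy labels and scores (illustrative only).
y_true = np.array([0, 0, 1, 1])
y_prob = np.array([0.1, 0.4, 0.35, 0.8])

# ROC AUC ranks positives against negatives over the whole score range;
# average precision focuses on precision/recall for the positive class.
auc = roc_auc_score(y_true, y_prob)
ap = average_precision_score(y_true, y_prob)

print(auc, ap)
```

XGBoost also ships an `aucpr` evaluation metric if you want it reported during training.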


Thanks @hcho3 for your answer

In this example the author used a random split of the (imbalanced) data into train and test sets. This can lead to cases where nearly all the negative examples end up in the train (or test) set, which does not preserve the distribution of the target.
Isn’t this a “split leakage” that makes these metrics unreliable?


You can try using stratified sampling.


Cool, I’ll do it. Thanks.