Very large NDCG result

Using XGBoost with the following parameters:
{'objective': 'rank:ndcg', 'eta': 0.1, 'gamma': 1.0, 'eval_metric': 'ndcg@3', 'min_child_weight': 0.1, 'max_depth': 6}
After building a model I get NDCG@3 larger than 1:
Given what I know of NDCG (it is normalized by the DCG of the ideal ranking, so it has to lie in the [0, 1] range), this has to be a bug, right?
I’m not that concerned with this bug in the evaluation since I can compute NDCG on my own from the ranking and the labels. However, I am concerned about how the model was trained given that the objective may also reach wrong values.
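
For reference, here is a minimal sketch of computing NDCG@k by hand from model scores and labels for a single query group. It assumes the common exponential gain 2^rel - 1 and a log2 position discount; the function names are made up for illustration, and this is not claimed to match XGBoost's internal evaluation exactly. Returning 0.0 for a group with no relevant items is one convention among several.

```python
import math

def dcg_at_k(labels, k):
    """DCG over the first k labels, using gain 2^rel - 1 and a log2 discount."""
    return sum((2 ** rel - 1) / math.log2(i + 2) for i, rel in enumerate(labels[:k]))

def ndcg_at_k(scores, labels, k):
    """NDCG@k for one query group; labels are small non-negative integers."""
    # Order the labels by descending predicted score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked = [labels[i] for i in order]
    # The ideal ranking sorts the labels themselves in descending order.
    idcg = dcg_at_k(sorted(labels, reverse=True), k)
    # Assumed convention: a group with no relevant items scores 0.0.
    return dcg_at_k(ranked, k) / idcg if idcg > 0 else 0.0

perfect = ndcg_at_k([0.9, 0.2, 0.5], [3, 0, 2], k=3)
worst = ndcg_at_k([0.1, 0.9, 0.5], [3, 0, 2], k=3)
print(perfect, worst)
```

With exact arithmetic the result cannot exceed 1, since the ideal DCG is the maximum over all orderings of the same labels.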

Another thing to note: the labels were originally between 0 and 1. But I discovered that XGBoost rounds the labels down to the nearest integer, meaning that most labels became 0. To overcome this issue, I multiplied all the labels by a factor of 1000 (is there another way around this?). That resulted in some very large labels.
Going back to the large NDCG values, I guess it might be caused by an overflow (raising 2 to the power of a large number in the computation of NDCG).

Any suggestions?

Thank you!

the labels were originally between 0 and 1.

I think this is the issue. XGBoost assumes the label is a nonnegative integer, e.g. 0, 1, 2, 3, … Please transform your label to obtain discrete levels of relevance, and do not multiply by 1000.

@hcho3, thanks for your response!
That’s exactly what I was trying to do.
I’m transforming the label values from:
0.0001, 0.001, 0.002, 0.003, 0.01, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1
to:
0, 1, 2, 3, 10, 100, 200, 300, 400,…,1000
These are non-negative integers.
Did you mean something different? Is there a max value for the labels?
As an experiment, I removed all examples with a label above 100. Still got a very large NDCG value.

1000 is a very big number. Typically, relevance judgment is out of 5 or 10. For example:

  • 0: not relevant at all
  • 1: a little relevant
  • 2: moderately relevant
  • 3: quite relevant
  • 4: highly relevant
  • 5: extremely relevant.

Try to use 0-5 or 0-10 for your labels. This is so that you won’t suffer from numerical overflow (seems like that’s what happened to your original example).
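
To see why label magnitude matters, here is a rough sketch of how fast the gain term grows, assuming the usual exponential gain 2^rel - 1. The comparison against 32-bit limits is only illustrative; which numeric types XGBoost actually uses internally is an assumption here, not something this snippet verifies.

```python
def exp_gain(rel: int) -> int:
    """Exponential gain 2^rel - 1, computed with exact Python integers."""
    return 2 ** rel - 1

# Largest values representable in common 32-bit types (illustrative limits only).
INT32_MAX = 2 ** 31 - 1
FLOAT32_MAX = (2 - 2 ** -23) * 2.0 ** 127

for rel in (5, 10, 100, 200, 1000):
    gain = exp_gain(rel)
    # Report whether the gain would fit in each 32-bit type.
    print(rel, "exceeds int32:", gain > INT32_MAX, "exceeds float32:", gain > FLOAT32_MAX)
```

A label of 5 gives a gain of 31 and a label of 10 gives 1023, while a label of 100 is already past the 32-bit integer range and a label of 200 is past the 32-bit float range.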

@hcho3, thanks again!

Is there a recommended method to transform such values:
0.0001, 0.001, 0.002, 0.003, 0.01, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1
into buckets from 0 to 10?

Maybe categorize them in bands? So 0-0.01 would be mapped to 0, 0.01-0.02 would be mapped to 1, and so forth.
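
A minimal sketch of that banding idea, using only the standard library. The band edges below are hypothetical and chosen just to spread the example values over levels 0 to 5; pick edges that reflect what the scores mean in your data.

```python
from bisect import bisect_right

# Hypothetical band edges (an assumption, not a recommendation):
# values below 0.001 map to level 0, values of 0.5 and above map to level 5.
BAND_EDGES = [0.001, 0.01, 0.1, 0.3, 0.5]

def to_level(value: float) -> int:
    """Map a raw score in [0, 1] to an integer relevance level in 0..5."""
    # bisect_right counts how many edges are <= value.
    return bisect_right(BAND_EDGES, value)

raw = [0.0001, 0.001, 0.002, 0.003, 0.01, 0.1, 0.2, 0.3, 0.4, 0.5, 0.9, 1.0]
levels = [to_level(v) for v in raw]
print(levels)
```

Because the original values span several orders of magnitude, uneven (roughly logarithmic) edges like these keep the small values distinguishable, whereas equal-width bands would put most of them in the same level.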