I am using XGBoost 1.3.3 on Windows with Python 3.6.8. When training with the objective set to survival:cox, I repeatedly get this error:
Traceback (most recent call last):
File "xgboost_survival_cv.py", line 94, in <module>
evals=[(data_m, 'train'), (data_v, 'eval')])
File "C:\Program Files\Python36\lib\site-packages\xgboost\training.py", line 235, in train
early_stopping_rounds=early_stopping_rounds)
File "C:\Program Files\Python36\lib\site-packages\xgboost\training.py", line 110, in _train_internal
if callbacks.after_iteration(bst, i, dtrain, evals):
File "C:\Program Files\Python36\lib\site-packages\xgboost\callback.py", line 427, in after_iteration
self._update_history(score, epoch)
File "C:\Program Files\Python36\lib\site-packages\xgboost\callback.py", line 393, in _update_history
name, s = d[0], float(d[1])
ValueError: could not convert string to float: '-nan(ind)'
There are no NaN values in my data.
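For reference, this is roughly how I checked the inputs (the arrays here are small placeholders standing in for my real feature matrix and Cox labels; in survival:cox, a negative label marks a right-censored observation):

```python
import numpy as np

# Hypothetical stand-ins for the real feature matrix and labels.
X = np.array([[0.1, 2.3], [4.5, 0.6]], dtype=np.float32)
y = np.array([12.0, -7.0], dtype=np.float32)  # negative = right-censored

assert not np.isnan(X).any(), "features contain NaN"
assert not np.isnan(y).any(), "labels contain NaN"
```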
I initially assumed this was an overflow, since the error usually appeared once the test loss (cox-nloglik) exceeded 20 in the last successful boosting iteration. Consistent with that, the error disappeared when I used fewer boosting rounds, a smaller learning rate, or smaller trees (no overfitting, hence no blow-up of the test loss), or when I switched off evaluation (empty evals list). Later, however, I got the same error when the test loss was only 6 in the last successful iteration. Furthermore, even with evaluation removed (not a long-term option, since I need early_stopping_rounds), I still get nan (or inf) in the prediction output, though no error is raised. The data is highly censored (90% right-censored), in case that matters.
The run parameters were:
{'colsample_bytree': 0.8, 'eta': 0.1, 'max_delta_step': 0, 'max_depth': 3, 'min_child_weight': 100,
'num_parallel_tree': 20, 'sampling_method': 'uniform', 'subsample': 0.8, 'tree_method': 'gpu_hist',
'verbosity': 1, 'seed': 0, 'objective': 'survival:cox', 'eval_metric': 'cox-nloglik'}
The same error occurs for many different parameter sets. Another example:
{'colsample_bytree': 0.8, 'eta': 0.3, 'max_delta_step': 0, 'max_depth': 3, 'min_child_weight': 100, 'num_parallel_tree': 1, 'sampling_method': 'gradient_based', 'subsample': 0.2}
Let me know if I should post this as a GitHub issue.