XGBoost predictor unstable

  • observations

    • For the same batch of samples, sometimes all results were 0.5; when I retried the prediction later, the results seemed normal. In particular, with multiple processes (launched with bash '&'), the always-0.5 output happens with high probability (1/22). Each process has two predictors, and the always-0.5 output appears in the second predictor.
    • For one sample, the result was sometimes 0.209669 and sometimes 0.201879. Is this stable or not?
  • env (for prediction)

    • os: CentOS release 6.7
    • python: Python 3.5.2 |Anaconda 4.2.0 (64-bit)
    • gcc: GCC 4.4.7
    • xgboost: 0.7 (using pip to install)
  • details
    I used the same Python environment, but trained and predicted on different machines. Training was done on CentOS Linux release 7.2.1511 (GCC version 4.8.5). Could the difference in OS or GCC cause the problem?

import pickle

import xgboost as xgb


class XGBModel():
    def __init__(self, model_path):
        self.model_path = model_path
        self.model = self._load(self.model_path)

    def _load(self, path):
        # Load the pickled model object from disk
        with open(path, 'rb') as fr:
            data = pickle.load(fr)
        return data

    def predict(self, libsvm_filename):
        # Build a DMatrix directly from a libsvm-format file
        dtest = xgb.DMatrix(libsvm_filename)
        pred = self.model.predict(dtest)
        return pred

I think I may have found the key point: I was not passing the ntree_limit parameter when predicting with the DART booster. But how did the always-0.5 output happen?

If you have a reproducible script, consider filing an issue in the GitHub repo. We will look at it.
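
A reproducible script could start from something like the sketch below. It is only an illustration: `check_prediction_stability` and `predict_fn` are hypothetical names, not part of xgboost, and `predict_fn` stands for any zero-argument callable that returns the predictions for a fixed batch (for example, a lambda wrapping the `predict` method of the class above). It counts how many runs differ from the first one and how many runs come back as all 0.5.

```python
def check_prediction_stability(predict_fn, n_runs=22, tol=1e-6):
    """Call predict_fn n_runs times and summarise how the outputs vary.

    predict_fn is a hypothetical zero-argument callable that returns a
    sequence of floats (the predictions for one fixed batch).
    """
    # Collect every run as a plain list of floats
    runs = [[float(p) for p in predict_fn()] for _ in range(n_runs)]
    baseline = runs[0]

    # Runs whose length or values differ from the first run by more than tol
    differing = sum(
        1
        for run in runs[1:]
        if len(run) != len(baseline)
        or any(abs(a - b) > tol for a, b in zip(run, baseline))
    )

    # Non-empty runs in which every prediction is 0.5 (within tol)
    all_half = sum(
        1 for run in runs if run and all(abs(p - 0.5) <= tol for p in run)
    )

    return {"runs": n_runs, "differing_from_first": differing, "all_0.5": all_half}
```

Since the report above involves several processes started from bash, a loop inside a single process may not trigger the problem; in that case the same check could be run once per process and the summaries compared.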

@xgb_7632 thank you for addressing this issue. I am getting a similar problem. When I give input in the range 1500-30000 (number of samples = 54000) with time series data, I at least get some predictions, and they are near the actual values (the algorithm still needs tuning). But when I change the input to the range 241-247 (float values with up to 2 decimals, e.g. 243.5 and 244.67), the prediction is always 0.5.

Why am I getting the same error, and how did you solve it?

Help appreciated.

@mayur25 When using the dart booster, you should explicitly set the ntree_limit parameter, as issue 3485 says.
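
A minimal sketch of what that could look like, assuming an xgboost version from that era in which `Booster.predict` accepts `ntree_limit` and, for DART, a nonzero value turns off dropout at prediction time. `DartPredictor` and `num_round` are hypothetical names; `num_round` is the number of boosting rounds used in training, which you have to supply yourself.

```python
class DartPredictor:
    """Hypothetical wrapper that always passes ntree_limit to predict()."""

    def __init__(self, booster, num_round):
        # Assumption: with DART in older xgboost, ntree_limit=0 (the default)
        # leaves dropout active during prediction, so reject it here.
        if num_round <= 0:
            raise ValueError("num_round must be a positive number of trees")
        self.booster = booster
        self.num_round = num_round

    def predict(self, dmatrix):
        # Forward the call with an explicit, nonzero ntree_limit
        return self.booster.predict(dmatrix, ntree_limit=self.num_round)
```

With the class from the first post, this would be used roughly as `DartPredictor(model.model, num_round).predict(xgb.DMatrix(libsvm_filename))`. Newer xgboost releases replaced `ntree_limit` with `iteration_range`, so check the API of the version you have installed.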