XGBoost Python model reproducibility

I’m trying to make sure my XGBoost Python models are identical when generated with the same random_state, seed, input features, and model parameters. However, although the decision trees are exactly the same (assessed visually using plot_tree), the binary files stored on two different Windows machines differ.

I compare the model files using the “diff” command.
The files are identical as long as they’re generated on the same machine, but they differ when I switch machines.
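
For reference, the byte-level comparison can also be sketched in Python, which has the side benefit of reporting where the files first diverge. This is only a minimal sketch: the helper names `sha256_of` and `first_difference` are made up for illustration, and the file paths in the usage note are placeholders.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def first_difference(path_a, path_b):
    """Return the first byte offset where two files differ, or None if identical."""
    a = Path(path_a).read_bytes()
    b = Path(path_b).read_bytes()
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    # No mismatch in the shared prefix: the files are identical only if
    # they also have the same length.
    return None if len(a) == len(b) else min(len(a), len(b))
```

Calling `first_difference("machine_a.pkl", "machine_b.pkl")` (hypothetical file names) gives an offset to inspect in a hex viewer, which may hint at whether the difference sits in a header or in the tree data itself.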

So far I’ve tried:

  • Using pickle instead of sklearn.externals.joblib to dump the models.
  • Creating the exact same conda environment on the two machines.

Any idea what might be causing this?

If the model dump is the same, what is the issue? Do you get different results when you run prediction?

The results are exactly the same. I know this because

  • accuracies on my test data match
  • the thresholds used in the decision trees match to 10 decimal places.
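
A check along these lines can be automated rather than done by eye. The sketch below compares per-tree text dumps, as returned by `Booster.get_dump()`, and reports which trees differ. It assumes the models are scikit-learn-style wrappers exposing `get_booster()`; `differing_trees`, `model_a`, and `model_b` are hypothetical names. I'm not certain the text dump prints every value at full precision in all XGBoost versions, so treat a match here as a weaker guarantee than byte equality.

```python
def differing_trees(dump_a, dump_b):
    """Return the indices of trees whose text dumps differ between two models.

    Each argument is a list of strings, one per tree, such as the list
    returned by Booster.get_dump().
    """
    if len(dump_a) != len(dump_b):
        raise ValueError("models have different numbers of trees")
    return [i for i, (a, b) in enumerate(zip(dump_a, dump_b)) if a != b]


# Hypothetical usage with two loaded models (placeholder names):
# dump_a = model_a.get_booster().get_dump()
# dump_b = model_b.get_booster().get_dump()
# print(differing_trees(dump_a, dump_b))
```

An empty list means every tree has the same text representation on both machines.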

My concern is, what causes the binary files to be different?

No idea. It might be that Python’s pickle (or joblib) stores some extra machine-dependent information.
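
As a small illustration of why byte-level differences in a pickle need not mean the objects differ, here is a sketch using only the standard library. It does not show what actually differs in the XGBoost case; it only shows that pickle output depends on details, such as dict insertion order, that don't affect equality. The dictionary contents are arbitrary example values.

```python
import pickle

# Two dicts with the same content but different insertion order.
a = {"max_depth": 3, "eta": 0.1}
b = {"eta": 0.1, "max_depth": 3}

# The objects compare equal...
same_object = a == b

# ...but pickle writes items in iteration order, so the bytes differ.
same_bytes = pickle.dumps(a, protocol=4) == pickle.dumps(b, protocol=4)

# After loading, the objects are equal again.
same_after_load = pickle.loads(pickle.dumps(a)) == pickle.loads(pickle.dumps(b))

print(same_object, same_bytes, same_after_load)
```

So comparing the loaded objects, or their predictions, is a more meaningful test than comparing the pickle files themselves.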

Thanks anyway.
I’ll update this answer if I find anything.