While training an XGBoost model on Spark using XGBoost4J-Spark, I see the following warning in the Spark executor logs for some of the datasets:
WARNING: /xgboost/src/learner.cc:979: Number of columns does not match number of features in booster. Columns: 7531 Features: 7535
Most of the time, training gets stuck after this warning and does not progress any further. Any idea what I could be doing wrong here, or is it the dataset?