How to predict on data with fewer variables?

I’ve trained an xgboost model in R with data containing 160 variables. However, I’d like to predict on inference data with only 10 variables.
Is this possible? If yes, how does this work? Do I only give the 10 variables and the model knows which variables are given despite a different ordering? Do I have to set the missing variables to NULL or blank or something?

You need to add those variables back as missing values, so that the test data has the same number of columns as the training data. Please see the missing parameter of xgb.DMatrix.
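A minimal sketch of what "adding back" could look like in R. It is not taken from the xgboost documentation; the names train_cols, newdata and pad_to_training are hypothetical, and it assumes you saved the feature names used in training and that all features are numeric. The missing columns are filled with NA and the columns are put back into training order, since a plain matrix is matched by position.

```r
# Hypothetical stand-ins: `train_cols` for the 160 training feature names
# (in training order), `newdata` for inference data with only some of them.
train_cols <- paste0("v", 1:160)
newdata <- data.frame(v7 = c(0.2, 1.5), v3 = c(10, 12))  # subset, different order

# Add every training column that is absent as NA, then reorder the columns
# so they line up with the training data. Extra columns are dropped.
pad_to_training <- function(newdata, train_cols) {
  missing_cols <- setdiff(train_cols, names(newdata))
  for (col in missing_cols) {
    newdata[[col]] <- NA_real_
  }
  as.matrix(newdata[, train_cols, drop = FALSE])
}

x_new <- pad_to_training(newdata, train_cols)

# Assuming xgboost is loaded and `model` is your fitted booster (not defined
# here), the padded matrix could then be passed on like this:
# dnew  <- xgboost::xgb.DMatrix(x_new, missing = NA)
# preds <- predict(model, dnew)
```

Here missing = NA only tells xgboost which value marks a missing entry; it does not create the columns, which is why they are padded by hand first. At each split on a missing feature the trees follow the default direction learned during training, so predictions made with most features missing may be much less reliable than on complete data.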

Thanks for your reply, @jiamingy!

As far as I understand the documentation, the missing parameter specifies which value in the data should be treated as missing. I don’t understand how this helps to add those variables back so that the number of columns in the test and training data stays the same. Could you please give some more guidance?