Hi,
I’m a beginner so please bear with me for this question.
I have been trying to use XGBoost to handle the missing values in my data. I have a categorical column that has quite a few NaN values. From what I read online, we don’t have to handle the missing values ourselves because XGBoost handles them.
from xgboost import XGBClassifier

# fit model on training data
model = XGBClassifier(enable_categorical=True)
model.fit(X_train, y_train)
But when I run it with my dataset, I get this error.
ValueError: DataFrame.dtypes for data must be int, float, bool or categorical. When
categorical type is supplied, DMatrix parameter
`enable_categorical` must be set to `True`. STATE, OCCUPATION, INCOME_GROUP
STATE, OCCUPATION, and INCOME_GROUP are my categorical variables. I pass them as they are, without any encoding, and they contain missing values.
Do I have to encode my categorical data? Or is there some other way so that I can pass my data to this classifier?
I’m not sure how to encode my data because it has NaNs in it. Can someone please help me with this?
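Going by the wording of the error, my guess is that these columns are still plain strings (object dtype) rather than pandas' category dtype. Here is a minimal sketch of that conversion on made-up data. The column names come from the error message, but the values and the extra AGE column are hypothetical. As far as I can tell, the NaNs stay as NaNs after the conversion and do not become a category of their own.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; only the column names match my real dataset.
X_train = pd.DataFrame({
    "STATE": ["CA", np.nan, "NY", "TX"],
    "OCCUPATION": ["nurse", "clerk", np.nan, "nurse"],
    "INCOME_GROUP": ["low", "high", "mid", np.nan],
    "AGE": [34.0, 51.0, np.nan, 28.0],
})

# Convert the string columns to pandas' category dtype, one at a time.
cat_cols = ["STATE", "OCCUPATION", "INCOME_GROUP"]
for col in cat_cols:
    X_train[col] = X_train[col].astype("category")

# Inspect the result: dtypes should now be category, NaN counts unchanged.
print(X_train.dtypes)
print(X_train.isna().sum())
```

If this is the right direction, the frame could then be passed to XGBClassifier(enable_categorical=True) as in my code above. I have also seen tree_method="hist" mentioned as a requirement for categorical support in some XGBoost versions, but I am not sure whether that applies to mine.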