Not sure if this has been posted here before, but I have a simple binary classification problem (labels 0 or 1) and I am using XGBClassifier. Going back to my dataset, I can see that pretty much all of my numerical columns are positively skewed and contain some negative values. I would like to remove outliers because, as I understand it, XGBClassifier's boosting fits each new tree to the residuals (loss gradients) left by the previous trees.
However, I cannot use a log transform because of the negative values, so I am using the Yeo-Johnson transform instead. I then remove outliers using the IQR rule. Does this seem like a valid approach?
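
For anyone who wants something concrete to comment on, here is a minimal sketch of the two steps described above. It assumes scikit-learn's `PowerTransformer` for the Yeo-Johnson step and the conventional 1.5 x IQR fences. The function name `yeo_johnson_iqr_filter`, the parameter `k`, and the synthetic data are hypothetical and only there for illustration; this is not the actual code or dataset from the question.

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer


def yeo_johnson_iqr_filter(X_train, y_train, k=1.5):
    """Yeo-Johnson transform the features, then drop rows outside the IQR fences.

    Hypothetical helper: k=1.5 is the conventional fence multiplier (an assumption).
    """
    # Yeo-Johnson accepts zero and negative values, unlike a plain log transform.
    pt = PowerTransformer(method="yeo-johnson", standardize=True)
    X_t = pt.fit_transform(X_train)

    # Per-column quartiles and IQR fences, computed on the transformed features.
    q1 = np.percentile(X_t, 25, axis=0)
    q3 = np.percentile(X_t, 75, axis=0)
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr

    # Keep a row only if every column falls inside its fences.
    keep = np.all((X_t >= lower) & (X_t <= upper), axis=1)
    return X_t[keep], y_train[keep], pt, keep


# Synthetic stand-in data (assumption): positively skewed, shifted so some values are negative.
rng = np.random.default_rng(0)
X = rng.lognormal(mean=0.0, sigma=1.0, size=(500, 3)) - 0.5
y = rng.integers(0, 2, size=500)

X_clean, y_clean, pt, keep = yeo_johnson_iqr_filter(X, y)
print("rows before:", X.shape[0], "rows after:", X_clean.shape[0])
```

In this sketch the transformer and the fences are fitted on the training split only, and the fitted `pt` is returned so the same transform can be applied to validation or test rows with `pt.transform(...)` without re-fitting on them.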