Feature Interaction


#1

Hi,

I wanted to raise a question about the definition of feature interaction in the XGBoost documentation. In the statistical sense, a feature interaction occurs when two (or more) features combine to affect the prediction beyond the sum of their individual effects (in a linear regression model this shows up as the multiplicative term). According to XGBoost, however, all the features along a traversal path are interacting features.
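To make the path-based notion concrete, here is a minimal sketch. It uses a hand-written toy tree stored as nested dicts, which is a hypothetical layout chosen for illustration and not XGBoost's model format or API. The leaf values encode X + Y for binary inputs, so the function is purely additive, yet X and Y still share every root-to-leaf path.

```python
from itertools import combinations

# Hypothetical toy tree: internal nodes are dicts, leaves are plain numbers.
# This layout is an assumption for illustration, not XGBoost's model format.
# For inputs in {0, 1} the leaves equal X + Y, a purely additive function.
toy_tree = {
    "feature": "X", "threshold": 0.5,
    "left": {"feature": "Y", "threshold": 0.5, "left": 0.0, "right": 1.0},
    "right": {"feature": "Y", "threshold": 0.5, "left": 1.0, "right": 2.0},
}

def path_feature_sets(node, seen=frozenset()):
    """Yield the set of features used on each root-to-leaf path."""
    if not isinstance(node, dict):  # reached a leaf
        yield seen
        return
    seen = seen | {node["feature"]}
    yield from path_feature_sets(node["left"], seen)
    yield from path_feature_sets(node["right"], seen)

def path_interactions(tree):
    """Return feature pairs that co-occur on at least one path."""
    pairs = set()
    for features in path_feature_sets(tree):
        pairs.update(combinations(sorted(features), 2))
    return pairs

interacting_pairs = path_interactions(toy_tree)
print(interacting_pairs)
```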

Take a linear model with two features X and Y that do not interact in the statistical sense. A decision tree can still split on both X and Y. So according to your definition there is an interaction between these features, but that is not accurate.
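One way to check the statistical sense numerically is the mixed second difference over a 2x2 grid, which vanishes for any function of the form g(x) + h(y). The sketch below is only an illustration: the two functions and their coefficients (2, 3 and 4) are made up for the example.

```python
def interaction_contrast(f, x0=0.0, x1=1.0, y0=0.0, y1=1.0):
    """Mixed second difference of f over a 2x2 grid.

    It is zero whenever f(x, y) = g(x) + h(y), i.e. when there is no
    interaction on the additive scale.
    """
    return f(x1, y1) - f(x1, y0) - f(x0, y1) + f(x0, y0)

def additive(x, y):
    # Hypothetical model with main effects only.
    return 2.0 * x + 3.0 * y

def with_product(x, y):
    # Same main effects plus a multiplicative (interaction) term.
    return 2.0 * x + 3.0 * y + 4.0 * x * y

additive_contrast = interaction_contrast(additive)
product_contrast = interaction_contrast(with_product)
print(additive_contrast, product_contrast)
```

The contrast is zero for the additive function and equals the product coefficient for the second one, even though a tree approximating either function would need splits on both X and Y.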

Maybe feature interaction in XGBoost should be given a different name to avoid this confusion?

Thanks
Kshitij


#2

I disagree. There is an interaction between X and Y because each decision tree can be represented as nested axis-aligned boxes in the 2D plane.

Gradient boosted trees are a predictive model, not a data-generating model. That is, we cannot assume that gradient boosted trees represent the true data-generating distribution; they simply provide a good way to predict y given x. So the “feature interaction” found by XGBoost is really a feature interaction found in the predictive model, and it does not necessarily mean that there is an interaction between the features in the strict statistical sense. (You can generate data with z = x + y and fit a decision tree with low training loss, but that doesn’t mean the decision tree is the true distribution.) Thus, we are justified in using the term “feature interaction”.
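The z = x + y remark can be tried out with a rough sketch. This uses a small hand-rolled greedy regression tree in plain Python rather than XGBoost itself; the 200 sample points, the seed and the depth of 4 are arbitrary assumptions made for the example.

```python
import random

random.seed(0)
# Synthetic data with a purely additive target: z = x + y.
points = [(random.random(), random.random()) for _ in range(200)]
data = [((x, y), x + y) for x, y in points]

def sse(rows):
    """Sum of squared errors around the mean target of rows."""
    if not rows:
        return 0.0
    m = sum(t for _, t in rows) / len(rows)
    return sum((t - m) ** 2 for _, t in rows)

def best_split(rows):
    """Greedy search for the (feature index, threshold) minimising SSE."""
    best = None
    for j in (0, 1):
        for threshold in sorted({f[j] for f, _ in rows}):
            left = [r for r in rows if r[0][j] <= threshold]
            right = [r for r in rows if r[0][j] > threshold]
            if not left or not right:
                continue
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, j, threshold, left, right)
    return best

def grow(rows, depth, used):
    """Grow a tree to the given depth, record split features, return SSE."""
    split = best_split(rows) if depth > 0 and len(rows) > 1 else None
    if split is None:
        return sse(rows)
    _, j, _, left, right = split
    used.add("xy"[j])
    return grow(left, depth - 1, used) + grow(right, depth - 1, used)

used_features = set()
total_sse = sse(data)
tree_sse = grow(data, depth=4, used=used_features)
# Features the tree split on, and the fraction of variance left unexplained.
print(sorted(used_features), tree_sse / total_sse)
```

The tree ends up splitting on both x and y and removes most of the training error, although the data were generated without any product term.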


#3

Hi,

Yes, I agree that the interaction found by XGBoost has merit. I’m just saying that since the term “feature interaction” already has a well-established definition in classical statistics, it’s a bit confusing to use the same terminology for XGBoost. It’s just a suggestion.

For an OR gate, the features don’t interact as per the statistical definition, but as per XGBoost’s definition they would (because the tree will always split on both features).

Thanks