How to measure high-order (3rd-order) feature importance?


#1

Hi everyone, and thanks in advance.
I wonder what a fair and effective way is to measure the importance of high-order (combination) features.
As far as I know, from a trained XGBoost model (binary classification), we can get
avg_gain
total_gain
weight
cover
for a single feature, let's say feature A.
In my scenario, I want to calculate the importance of a feature combination (ABC), where A, B, and C are features that appear together on the same tree path. Order does not matter: ABC is the same as ACB, BAC, and so on.
To be more specific, I set the maximum tree depth (max_depth) to 3, so a three-feature combination is the highest-order combination possible.
My goal is to rank the 3rd-order feature combinations by importance and find the ones that matter most for classifying the dataset.


#2

For starters, you can parse the trees and count the occurrences of 3-feature combinations.
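A minimal sketch of that counting, in case it helps. It assumes the trees come as JSON strings shaped like the output of booster.get_dump(dump_format="json"), where split nodes have "split" and "children" keys and leaves have a "leaf" key. The function name count_feature_combinations, the leaf/split helpers, and the example trees are hypothetical and hand-written for illustration, not output from a real model.

```python
import json
from collections import Counter


def count_feature_combinations(tree_dumps, order=3):
    """Count split chains that use exactly `order` distinct features.

    A combination is stored as a sorted tuple, so ABC, ACB and BAC share one key.
    """
    counts = Counter()

    def walk(node, seen):
        if "leaf" in node:  # leaves carry no split feature
            return
        features = seen | {node["split"]}
        # Count once, at the split where the chain first reaches `order` features.
        if len(features) == order and len(seen) == order - 1:
            counts[tuple(sorted(features))] += 1
        for child in node.get("children", []):
            walk(child, features)

    for dump in tree_dumps:
        walk(json.loads(dump), frozenset())
    return counts


# Hypothetical hand-written trees standing in for a real JSON dump.
def leaf():
    return {"leaf": 0.0}


def split(feature, yes, no):
    return {"split": feature, "children": [yes, no]}


example_dumps = [
    json.dumps(split("f0",
                     split("f1", split("f2", leaf(), leaf()), leaf()),
                     split("f2", split("f1", leaf(), leaf()), leaf()))),
    json.dumps(split("f0", split("f1", split("f3", leaf(), leaf()), leaf()), leaf())),
]

counts = count_feature_combinations(example_dumps)
for combo, n in counts.most_common():
    print(combo, n)
```

With a real model you would pass the list returned by the booster's JSON dump instead of example_dumps. Counting at the split node rather than at each leaf avoids counting the same chain twice when both children are leaves.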


#3

Thanks for the reply.
For now I use: F(ABC) = (gainA + gainB + gainC) / ABC_occurrence
I wonder if there is a better idea.
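Here is a rough sketch of one way to read that formula: for each chain of splits that uses A, B, and C, add up the gains of the splits on the chain, then average over the number of chains where the combination occurs. This interpretation is an assumption, as is the input format (JSON strings shaped like booster.get_dump(dump_format="json", with_stats=True), where each split node has a "gain" value). The function name score_feature_combinations, the helpers, and the gain numbers are hypothetical and made up for illustration.

```python
import json
from collections import Counter, defaultdict


def score_feature_combinations(tree_dumps, order=3):
    """Average summed split gain per occurrence of each feature combination."""
    total_gain = defaultdict(float)
    occurrences = Counter()

    def walk(node, path_gains):
        if "leaf" in node:
            return
        # Copy so sibling branches do not share state; add this split's gain.
        gains = dict(path_gains)
        gains[node["split"]] = gains.get(node["split"], 0.0) + node["gain"]
        # Score once, at the split where the chain first reaches `order` features.
        if len(gains) == order and len(path_gains) == order - 1:
            key = tuple(sorted(gains))
            total_gain[key] += sum(gains.values())
            occurrences[key] += 1
        for child in node.get("children", []):
            walk(child, gains)

    for dump in tree_dumps:
        walk(json.loads(dump), {})
    return {key: total_gain[key] / occurrences[key] for key in occurrences}


# Hypothetical hand-written trees with made-up gains, not from a real model.
def leaf():
    return {"leaf": 0.0}


def split(feature, gain, yes, no):
    return {"split": feature, "gain": gain, "children": [yes, no]}


example_dumps = [
    json.dumps(split("f0", 10.0,
                     split("f1", 4.0, split("f2", 2.0, leaf(), leaf()), leaf()),
                     split("f2", 6.0, split("f1", 2.0, leaf(), leaf()), leaf()))),
    json.dumps(split("f0", 8.0,
                     split("f1", 3.0, split("f3", 1.0, leaf(), leaf()), leaf()),
                     leaf())),
]

scores = score_feature_combinations(example_dumps)
ranking = sorted(scores.items(), key=lambda item: item[1], reverse=True)
for combo, score in ranking:
    print(combo, score)
```

One thing to keep in mind: dividing by the occurrence count gives an average, so a combination that appears often is not rewarded for it. Sorting by the undivided sum instead would behave more like total_gain does for single features.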