Hi,
I’m a university student, and I started working with xgboost a month ago. I built a classification model that predicts a log file’s error type (one of 7 types). The model has an accuracy of 76%. I would like to understand the leaf values. When I print the trees, I get:
booster[2]:
0:[only<3.11304689e-07] yes=1,no=2,missing=1,gain=0.586026371,cover=3.71875
  1:[mgr<3.13217683e-08] yes=3,no=4,missing=3,gain=0.234146357,cover=3.0625
    3:[writeback<8.19934996e-07] yes=5,no=6,missing=5,gain=0.269478679,cover=2.84375
      5:leaf=0.344827592,cover=2.625
      6:leaf=-0.051282052,cover=0.21875
    4:leaf=-0.051282052,cover=0.21875
  2:leaf=-0.113207549,cover=0.65625
and so on… How should I interpret these values, or convert them to labels? If I plot a tree with plot_tree(model), I would like the leaves to show the class letters as labels rather than numeric values. For example, if a leaf’s value is between -0.2 and 0, then it is an ‘A’ error type; if it is between 0.2 and 0.4, then ‘C’. I’m sorry if this has already been asked; I could not find anything.
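To make the question concrete, here is a minimal sketch of how the dumped tree could be walked by hand. It assumes the model uses a multiclass objective (multi:softmax or multi:softprob) with 7 classes. In that setup, booster[i] contributes a raw score to class i % 7, and the leaf values are score contributions rather than labels. The letter mapping, the sample values, and the zero scores for the other six classes are placeholders for illustration only; base_score and all other trees are ignored.

```python
import math
import re

# Text dump of booster[2], copied from the post.
DUMP = """\
0:[only<3.11304689e-07] yes=1,no=2,missing=1,gain=0.586026371,cover=3.71875
1:[mgr<3.13217683e-08] yes=3,no=4,missing=3,gain=0.234146357,cover=3.0625
3:[writeback<8.19934996e-07] yes=5,no=6,missing=5,gain=0.269478679,cover=2.84375
5:leaf=0.344827592,cover=2.625
6:leaf=-0.051282052,cover=0.21875
4:leaf=-0.051282052,cover=0.21875
2:leaf=-0.113207549,cover=0.65625
"""

SPLIT = re.compile(r"(\d+):\[(\w+)<([^\]]+)\] yes=(\d+),no=(\d+),missing=(\d+)")
LEAF = re.compile(r"(\d+):leaf=([-+0-9.eE]+)")


def parse_tree(dump):
    """Turn the text dump into a dict: node id -> split or leaf tuple."""
    nodes = {}
    for raw in dump.splitlines():
        line = raw.strip()
        m = SPLIT.match(line)
        if m:
            nid, feat, thr, yes, no, missing = m.groups()
            nodes[int(nid)] = ("split", feat, float(thr), int(yes), int(no), int(missing))
            continue
        m = LEAF.match(line)
        if m:
            nodes[int(m.group(1))] = ("leaf", float(m.group(2)))
    return nodes


def leaf_value(nodes, sample):
    """Follow yes/no/missing branches from node 0 until a leaf is reached."""
    node = nodes[0]
    while node[0] == "split":
        _, feat, thr, yes, no, missing = node
        value = sample.get(feat)
        if value is None:
            node = nodes[missing]
        elif value < thr:
            node = nodes[yes]
        else:
            node = nodes[no]
    return node[1]


def softmax(scores):
    """Convert raw per-class scores into probabilities."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


NUM_CLASS = 7
TREE_INDEX = 2                      # the "2" in booster[2]
LABELS = list("ABCDEFG")            # hypothetical letter encoding of the 7 classes

nodes = parse_tree(DUMP)
sample = {"only": 0.0, "mgr": 0.0, "writeback": 0.0}   # made-up feature values

# This tree only adds to the score of one class.
class_index = TREE_INDEX % NUM_CLASS
scores = [0.0] * NUM_CLASS          # placeholder: other trees are not shown
scores[class_index] += leaf_value(nodes, sample)

probs = softmax(scores)
print(LABELS[class_index], leaf_value(nodes, sample), probs[class_index])
```

Under these assumptions, a single leaf value cannot be read as a letter on its own. The predicted label comes from summing the leaves reached in every tree of each class and taking the class with the highest total.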
Best wishes,
Peter