Replication of logloss for highly imbalanced dataset

I implemented the custom logloss in C++, but when I try to run my program, I run into this error:

Traceback (most recent call last):
  File "xgboost_debugger.py", line 15, in <module>
    import xgboost as xgb
  File "/home/gvikashb/xgboost/python-package/xgboost/__init__.py", line 9, in <module>
    from .core import DMatrix, DeviceQuantileDMatrix, Booster
  File "/home/gvikashb/xgboost/python-package/xgboost/core.py", line 173, in <module>
    _LIB = _load_lib()
  File "/home/gvikashb/xgboost/python-package/xgboost/core.py", line 164, in _load_lib
    'Error message(s): {}\n'.format(os_error_list))
xgboost.core.XGBoostError: XGBoost Library (libxgboost.so) could not be loaded.
Likely causes:
  * OpenMP runtime is not installed (vcomp140.dll or libgomp-1.dll for Windows, libomp.dylib for Mac OSX, libgomp.so for Linux and other UNIX-like OSes). Mac OSX users: Run `brew install libomp` to install OpenMP runtime.
  * You are running 32-bit Python on a 64-bit OS
Error message(s): ["/lib64/libstdc++.so.6: version `CXXABI_1.3.8' not found (required by /home/gvikashb/xgboost/python-package/xgboost/../../lib/libxgboost.so)"]

Any suggestions please?

It appears that you are using two different machines to build and run XGBoost. Is it possible to use the same machine for both steps? The error message suggests that the Linux machine used to build XGBoost is newer than the one used to run it, so the runtime machine's libstdc++ does not provide the CXXABI_1.3.8 symbol version that the compiled libxgboost.so requires.
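
To confirm, you can check which CXXABI symbol versions the runtime machine's libstdc++ actually provides; the loader needs CXXABI_1.3.8 according to the error message. A minimal sketch (it assumes the strings utility is installed and uses the library path from your error message):

# List the CXXABI symbol versions exported by the runtime libstdc++.
# Path taken from the error message above; adjust it for your machine.
import subprocess
out = subprocess.run(["strings", "/lib64/libstdc++.so.6"],
                     capture_output=True, text=True, check=True).stdout
print(sorted({line for line in out.splitlines() if line.startswith("CXXABI_")}))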

@hcho3 I was able to mimic the logloss in C++. Then I followed the same steps to build the alpha loss (https://arxiv.org/pdf/2006.12406.pdf) by modifying just the first-order and second-order gradient calculations. But this results in only positive '1' labels being predicted and no '0' labels. Can you tell me the life cycle / program flow of how methods are called in the C++ implementation, so that I can make sure my implementation is right? For example, what is the calling order among functions such as GetGradient(), EvalTransform(), PredTransform(), ProbToMargin(), etc.? Please direct me to any documentation that could help. Thanks a lot.

You should add print statements to those functions and see which method gets called when.

If my memory is correct, the workflow goes like this (a rough Python sketch follows the list):

  • Compute prediction
  • Compute gradient pairs
  • Compute the best split candidate (feature_id, threshold)
  • Update the tree by creating the new split
  • Go back to the first step, to perform the next boosting iteration.
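
Here is a rough Python sketch of that loop, just to illustrate the ordering of the steps. It is not XGBoost's actual tree learner (which finds splits directly from the second-order statistics); the sklearn regression tree fit to the Newton step is only a stand-in for steps 3 and 4.

import numpy as np
from sklearn.tree import DecisionTreeRegressor

def toy_boosting(X, y, n_rounds=10, eta=0.1, max_depth=3):
    margin = np.zeros(len(y))            # current raw (untransformed) prediction
    trees = []
    for _ in range(n_rounds):
        # 1. Compute prediction: transform the margin into a probability
        p = 1.0 / (1.0 + np.exp(-margin))
        # 2. Compute gradient pairs for the binary log loss
        grad = p - y
        hess = p * (1.0 - p)
        # 3./4. Find the best splits and update the tree; a plain regression tree
        #       fit to the Newton step stands in for XGBoost's tree learner here
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, -grad / np.maximum(hess, 1e-6))
        margin += eta * tree.predict(X)
        trees.append(tree)
        # 5. The loop then performs the next boosting iteration
    return trees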

See this tutorial for a short overview of the XGBoost algorithm.

Also, have you considered implementing the alpha loss in Python first? Writing C++ means a slower pace of development. And if you are not getting good results from the alpha loss in Python, writing the loss in C++ is not likely to yield good results either. (As for the log-loss, you were able to obtain somewhat sensible results with Python, despite the discrepancy between the native and Python implementations.)
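
Something along these lines would be enough to prototype a custom objective through the Python API. This is only a minimal sketch: the gradients shown are the binary log-loss ones (you would swap in the alpha-loss gradients), and the random data and parameter values are just placeholders.

import numpy as np
import xgboost as xgb

def logloss_obj(preds, dtrain):
    # Custom binary logistic objective: preds are raw margins, not probabilities
    labels = dtrain.get_label()
    p = 1.0 / (1.0 + np.exp(-preds))   # sigmoid: margin -> probability
    grad = p - labels                  # first-order gradient
    hess = p * (1.0 - p)               # second-order gradient
    return grad, hess

# Placeholder data; substitute your imbalanced dataset here
X = np.random.rand(1000, 5)
y = (np.random.rand(1000) > 0.95).astype(float)
dtrain = xgb.DMatrix(X, label=y)
booster = xgb.train({"max_depth": 3, "eta": 0.1}, dtrain,
                    num_boost_round=20, obj=logloss_obj)
print(booster.predict(dtrain)[:5])   # with a custom obj, predictions are raw margins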

Yes, for the custom logloss implementation in Python, I got sensible results, but they were not the same as the native C++ logloss implementation's. That is the reason I wanted to implement the alpha loss in C++. Are you saying that if the alpha loss does not give better results in Python, then it won't give better results in C++?

I said it would be unlikely, especially when it’s giving totally nonsensical results.

@goku_grad_asu1 My suggestion is to step back and first prove whether the alpha loss is even a good idea for your problem. XGBoost has a very complex codebase, so you may profit from tinkering with a more accessible implementation like https://github.com/eriklindernoren/ML-From-Scratch. (In particular, see this file.) Once the idea is shown to be promising with that codebase, you can consider porting it to XGBoost.

Thanks @hcho3, I agree with your suggestion. The alpha loss has proven results in a few other GBM frameworks; this is the first implementation with XGBoost. The results were sensible in the Python implementation, but not up to the expected level. When alpha = 1, the model should replicate the logloss objective's results, but there were a few discrepancies, mainly because the native C++ logloss performs better than the Python implementation. So this issue of all labels being classified as positive/'1' must be a programming issue on my end.