_load_lib() in core.py doesn't stop when it finds the right lib file

Hi All,

I was using a Jupyter notebook on a dev server. The server had version 0.9.0 of XGBoost installed, but I needed version 1.0.1 to run a specific model. I pip-installed 1.0.1 into my home folder and tried to run the model, but I got the following error when loading the pickle file:

  File "/home/user1/.local/lib/python3.6/site-packages/xgboost/core.py", line 1093, in __setstate__
    _LIB.XGBoosterUnserializeFromBuffer(handle, ptr, length))
  File "/usr/local/lib/python3.6/ctypes/__init__.py", line 361, in __getattr__
    func = self.__getitem__(name)
  File "/usr/local/lib/python3.6/ctypes/__init__.py", line 366, in __getitem__
    func = self._FuncPtr((name_or_ordinal, self))
AttributeError: /usr/local/xgboost/libxgboost.so: undefined symbol: XGBoosterUnserializeFromBuffer

After some debugging I traced the issue to the following loop in the _load_lib() function in core.py:

for lib_path in lib_paths:
    try:
        # needed when the lib is linked with non-system-available
        # dependencies
        os.environ['PATH'] = os.pathsep.join(
            pathBackup + [os.path.dirname(lib_path)])
        lib = ctypes.cdll.LoadLibrary(lib_path)
        lib_success = True
    except OSError as e:
        os_error_list.append(str(e))
        continue
    finally:
        os.environ['PATH'] = os.pathsep.join(pathBackup)

That code tries to load the libxgboost.so library file. The problem is that it doesn't exit the loop once it finds the file. In my case it found the file in the correct location, which is my user folder, but instead of exiting, the for loop continued and then picked up the old library from the server. It should have stopped once it found the correct file.
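
To make the suggestion concrete, here is a minimal sketch of the same loop that stops at the first library that loads. It is not the actual XGBoost code or an official fix. The function name `load_first_library` and the `loader` parameter are hypothetical, added only so the loop can be exercised without a real shared library.

```python
import ctypes
import os


def load_first_library(lib_paths, loader=ctypes.cdll.LoadLibrary):
    """Return (lib, lib_path) for the first path that loads successfully.

    `loader` is a hypothetical hook (not part of XGBoost) that defaults to
    ctypes.cdll.LoadLibrary and can be swapped out for testing.
    """
    path_backup = os.environ.get("PATH", "").split(os.pathsep)
    os_error_list = []
    for lib_path in lib_paths:
        try:
            # needed when the lib is linked with non-system-available
            # dependencies
            os.environ["PATH"] = os.pathsep.join(
                path_backup + [os.path.dirname(lib_path)])
            lib = loader(lib_path)
        except OSError as e:
            os_error_list.append(str(e))
            continue
        finally:
            # always restore PATH, whether the load succeeded or failed
            os.environ["PATH"] = os.pathsep.join(path_backup)
        # first successful load wins; later candidates are never tried
        return lib, lib_path
    raise OSError("could not load any library:\n" + "\n".join(os_error_list))
```

With the search paths ordered so that the user-level install comes first, this would return the library from my home folder and never touch /usr/local/xgboost/libxgboost.so.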

The tricky issue is that there isn’t a guarantee that the first found library is the correct library. In some cases, the second or third DLL may be the correct one.
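
One possible way around that, sketched under assumptions: instead of trusting the first or the last library that loads, accept only a library that exports a symbol the Python package needs. The function name `load_library_with_symbol` and the `loader` parameter are hypothetical, and which symbol to check is a judgment call (here, the one from the traceback above). The check relies on ctypes raising AttributeError for a missing symbol, which is what `hasattr` catches.

```python
import ctypes


def load_library_with_symbol(lib_paths, required_symbol,
                             loader=ctypes.cdll.LoadLibrary):
    """Return (lib, lib_path) for the first library exporting required_symbol.

    `loader` is a hypothetical hook (not part of XGBoost) for testing.
    """
    problems = []
    for lib_path in lib_paths:
        try:
            lib = loader(lib_path)
        except OSError as e:
            problems.append("{}: {}".format(lib_path, e))
            continue
        # ctypes looks the symbol up lazily and raises AttributeError if it
        # is undefined, so hasattr() is False for an outdated library
        if hasattr(lib, required_symbol):
            return lib, lib_path
        problems.append("{}: missing symbol {}".format(lib_path, required_symbol))
    raise OSError("no suitable library found:\n" + "\n".join(problems))
```

Called with `required_symbol="XGBoosterUnserializeFromBuffer"`, this would skip an old libxgboost.so that lacks the function and keep searching, regardless of where the correct file sits in the list.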