Cannot make XGBoost library work on IBM AIX


#1

Hello, experts! I could not find XGBoost binaries for IBM AIX, but I also found no evidence that XGBoost cannot work there. I successfully compiled XGBoost 0.9 with gcc 8.3 on IBM AIX 7.1 after minor code changes. Unfortunately, the resulting binaries do not work properly.

Can you please share a successful compilation experience or advise on how to fix this?

Scenario:

  1. Tests from demo/binary_classification/ fail:
:~/xgboost/demo/binary_classification
$ python mapfeat.py
:~/xgboost/demo/binary_classification
$ python mknfold.py agaricus.txt 1
:~/xgboost/demo/binary_classification
$ ../../xgboost mushroom.conf
[09:37:32] 6513x126 matrix with 143286 entries loaded from agaricus.txt.train
[09:37:32] 1611x126 matrix with 35442 entries loaded from agaricus.txt.test
[09:37:32] [0]  test-error:0.016139     train-error:0.014433
[09:37:32] [1]  test-error:0.000000     train-error:0.001228
:~/xgboost/demo/binary_classification
$ ../../xgboost mushroom.conf task=pred model_in=0002.model

Result:

[09:40:28] 1611x126 matrix with 35442 entries loaded from agaricus.txt.test
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
IOT/Abort trap (core dumped)

  2. Importing XGBoost in Python also fails:
>>> import xgboost
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/freeware/lib/python3.7/site-packages/xgboost-0.83.dev0-py3.7.egg/xgboost/__init__.py", line 11, in <module>
    from .core import DMatrix, Booster
  File "/opt/freeware/lib/python3.7/site-packages/xgboost-0.83.dev0-py3.7.egg/xgboost/core.py", line 161, in <module>
    _LIB = _load_lib()
  File "/opt/freeware/lib/python3.7/site-packages/xgboost-0.83.dev0-py3.7.egg/xgboost/core.py", line 152, in _load_lib
    'Error message(s): {}\n'.format(os_error_list))
xgboost.core.XGBoostError: XGBoost Library (libxgboost.so) could not be loaded.
Likely causes:
  * OpenMP runtime is not installed (vcomp140.dll or libgomp-1.dll for Windows, libgomp.so for UNIX-like OSes)
  * You are running 32-bit Python on a 64-bit OS
Error message(s): ['Could not load module /opt/freeware/lib/python3.7/site-packages/xgboost-0.83.dev0-py3.7.egg/xgboost/libxgboost.so.\nSystem error: Exec format error', 'Could not load module /opt/freeware/lib/python3.7/site-packages/xgboost-0.83.dev0-py3.7.egg/xgboost/./lib/libxgboost.so.\nSystem error: Exec format error']

#2

Can you double-check that you are using the right GCC? It looks like you may be using a cross-compiling GCC, which would produce incompatible binaries.

I also suggest that you use CMake to build XGBoost, so that platform-specific details will be better handled.
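
A quick way to check for a binary mismatch is to compare the pointer size of the Python interpreter with the header of the compiled library. Below is a minimal sketch, not something taken from XGBoost: `describe_object_file` and `interpreter_bits` are hypothetical helper names, and the XCOFF magic numbers (0x01DF for 32-bit, 0x01F7 for 64-bit) are as I recall them from the AIX XCOFF documentation, so please verify them against your system headers.

```python
import struct
import sys


def interpreter_bits() -> int:
    """Pointer width of the running Python interpreter, in bits."""
    return struct.calcsize("P") * 8


def describe_object_file(header: bytes) -> str:
    """Classify an object file from its first bytes (hypothetical helper)."""
    if header[:4] == b"\x7fELF" and len(header) > 4:
        # ELF keeps its class in byte 4: 1 means 32-bit, 2 means 64-bit.
        return {1: "ELF 32-bit", 2: "ELF 64-bit"}.get(header[4], "ELF (unknown class)")
    if header[:2] == b"\x01\xdf":
        # Assumed XCOFF32 magic number (0x01DF).
        return "XCOFF 32-bit"
    if header[:2] == b"\x01\xf7":
        # Assumed XCOFF64 magic number (0x01F7).
        return "XCOFF 64-bit"
    return "unknown"


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_lib.py /path/to/libxgboost.so
    with open(sys.argv[1], "rb") as f:
        print("library:", describe_object_file(f.read(8)))
    print("python :", interpreter_bits(), "bit")
```

If the library turns out to be 32-bit while Python is 64-bit (or the other way round), rebuilding with the matching GCC mode (`-maix32` or `-maix64`) would be the first thing to try.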


#3

Hello, hcho3! Thank you for your feedback.

  1. We took gcc-8.3.0-1 built for aix7.1 from http://www.bullfreeware.com/affichage.php?id=4944. It does not look like a cross-compiler, and it passed the self-compilation test during installation.
  2. cmake-3.14.3-1 (same source) did not detect libgomp-8.3.0-1 (same source). We could not fix this and fell back to make.
  3. It looks like the code is not quite ready for the AIX platform: we had to make the changes listed below to get it to compile. I guess they are not enough, since the compiled binary fails with a memory allocation error. This is not a shortage of memory either: 37 GB of RAM was free before the test started.

=== BACKUP: DIFF listing of the minimal code changes we had to apply before compilation succeeded ===

diff -bur ./src_orig/dmlc-core/include/dmlc/endian.h ./src_patch/dmlc-core/include/dmlc/endian.h
--- ./src_orig/dmlc-core/include/dmlc/endian.h	2019-05-18 17:23:44.814216600 +0300
+++ ./src_patch/dmlc-core/include/dmlc/endian.h	2019-05-31 12:57:18.000000000 +0300
@@ -29,6 +29,8 @@
     #else
       #define DMLC_LITTLE_ENDIAN 0
     #endif
+  #elif defined(_AIX)
+    #define DMLC_LITTLE_ENDIAN 0
   #else
     #error "Unable to determine endianness of your machine; use CMake to compile"
   #endif
diff -bur ./src_orig/dmlc-core/include/dmlc/serializer.h ./src_patch/dmlc-core/include/dmlc/serializer.h
--- ./src_orig/dmlc-core/include/dmlc/serializer.h	2019-05-18 17:23:44.836272200 +0300
+++ ./src_patch/dmlc-core/include/dmlc/serializer.h	2019-05-31 12:57:18.000000000 +0300
@@ -118,6 +118,11 @@
  */
 template<typename T>
 struct UndefinedSerializerFor {
+  inline static void Write(Stream *strm, const T &data) {
+  }
+  inline static bool Read(Stream *strm, T *data) {
+    return false;
+  }
 };
 
 /*!
diff -bur ./src_orig/dmlc-core/Makefile ./src_patch/dmlc-core/Makefile
--- ./src_orig/dmlc-core/Makefile	2019-05-18 17:23:44.763042400 +0300
+++ ./src_patch/dmlc-core/Makefile	2019-05-31 13:21:44.000000000 +0300
@@ -22,13 +22,7 @@
 LDFLAGS+= $(DMLC_LDFLAGS) $(ADD_LDFLAGS)
 CFLAGS+= $(DMLC_CFLAGS) $(ADD_CFLAGS)
 
-ifndef USE_SSE
-	USE_SSE = 1
-endif
-
-ifeq ($(USE_SSE), 1)
-	CFLAGS += -msse2
-endif
+USE_SSE = 0
 
 ifdef DEPS_PATH
 CFLAGS+= -I$(DEPS_PATH)/include
diff -bur ./src_orig/Makefile ./src_patch/Makefile
--- ./src_orig/Makefile	2019-05-18 17:23:27.444739100 +0300
+++ ./src_patch/Makefile	2019-05-31 13:03:46.000000000 +0300
@@ -77,6 +77,8 @@
 	CFLAGS += -g -O0 -fprofile-arcs -ftest-coverage
 else
 	CFLAGS += -O3 -funroll-loops
+
+USE_SSE = 0
 ifeq ($(USE_SSE), 1)
 	CFLAGS += -msse2
 endif
diff -bur ./src_orig/python-package/xgboost/libpath.py ./src_patch/python-package/xgboost/libpath.py
--- ./src_orig/python-package/xgboost/libpath.py	2019-05-18 17:23:28.069662400 +0300
+++ ./src_patch/python-package/xgboost/libpath.py	2019-05-31 13:04:34.000000000 +0300
@@ -33,7 +33,7 @@
             # hack for pip installation when copy all parent source directory here
             dll_path.append(os.path.join(curr_path, './windows/Release/'))
         dll_path = [os.path.join(p, 'xgboost.dll') for p in dll_path]
-    elif sys.platform.startswith('linux') or sys.platform.startswith('freebsd'):
+    elif sys.platform.startswith('linux') or sys.platform.startswith('freebsd') or sys.platform.startswith('aix'):
         dll_path = [os.path.join(p, 'libxgboost.so') for p in dll_path]
     elif sys.platform == 'darwin':
         dll_path = [os.path.join(p, 'libxgboost.dylib') for p in dll_path]
diff -bur ./src_orig/rabit/Makefile ./src_patch/rabit/Makefile
--- ./src_orig/rabit/Makefile	2019-05-18 17:23:46.783429700 +0300
+++ ./src_patch/rabit/Makefile	2019-05-31 13:22:54.000000000 +0300
@@ -9,7 +9,7 @@
 endif
 
 export WARNFLAGS= -Wall -Wextra -Wno-unused-parameter -Wno-unknown-pragmas -std=c++11
-export CFLAGS = -O3 $(WARNFLAGS) -I $(DMLC)/include -I include/
+export CFLAGS = -O3 $(WARNFLAGS) -I $(DMLC)/include -I include/ -pthread
 export LDFLAGS =-Llib
 
 #download mpi
@@ -42,23 +42,7 @@
     endif
 endif
 
-#----------------------------
-# Settings for power and arm arch
-#----------------------------
-ARCH := $(shell uname -a)
-ifneq (,$(filter $(ARCH), powerpc64le ppc64le ))
-	USE_SSE=0
-else
-	USE_SSE=1
-endif
-
-ifndef USE_SSE
-	USE_SSE = 1
-endif
-
-ifeq ($(USE_SSE), 1)
-	CFLAGS += -msse2
-endif
+USE_SSE=0
 
 ifndef WITH_FPIC
 	WITH_FPIC = 1
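
Regarding the endian.h change above: the patch hard-codes DMLC_LITTLE_ENDIAN to 0 for AIX, and `sys.byteorder` in Python is one way to confirm that assumption on the target machine. The sketch below also illustrates one unconfirmed hypothesis for how std::bad_alloc can appear with plenty of free RAM: if a 64-bit length field is written in one byte order and read in the other, the decoded size becomes enormous. This is only an illustration of that failure mode, not a diagnosis of XGBoost; the value 126 is simply the column count from the log, used as an example.

```python
import struct
import sys

# Byte order of the machine running this script ("big" is expected on AIX/POWER).
print("host byte order:", sys.byteorder)

# A 64-bit length field as a little-endian writer would store it.
# 126 is an arbitrary example value (the column count from the log).
raw = struct.pack("<Q", 126)

# Decoding with the same byte order recovers the value.
as_little = struct.unpack("<Q", raw)[0]

# Decoding with the opposite byte order moves the low byte to the top,
# producing 126 << 56, far more than any machine could allocate.
as_big = struct.unpack(">Q", raw)[0]

print("read as little-endian:", as_little)
print("read as big-endian   :", as_big)
```

If a mismatch like this were the cause, the byte-order handling in the serialization code would be the place to look, rather than available memory.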

#4

Yes, XGBoost has not been tested with IBM AIX. Can you install Docker and run a Linux container instead?