Help with Installing XGBoost on Alpine (HomeAssistant/Appdaemon)

Good afternoon, folks. How are you? My name is Jefferson, I’m from Brazil, and I currently need to run some regression code built with XGBoost on Home Assistant, through the AppDaemon add-on.

Context:
Home Assistant is a platform for home automation integration, and AppDaemon is an add-on that allows inserting Python code to run in Home Assistant.

In this setup, Home Assistant runs on the Alpine Linux operating system.

To install anything in AppDaemon, packages are declared in the add-on's settings section, as follows:

system_packages:
  - [enter package name here]
python_packages:
  - [enter package name here]
init_commands: []

I have always used XGBoost in AppDaemon, and installing it has always been absurdly simple. It only required defining:

system_packages:
  - cmake
  - python3-dev
  - build-base
python_packages:
  - joblib
  - xgboost #XGBOOST definition
init_commands: []
log_level: info

My last installation of XGBoost in AppDaemon, about 4 months ago, went through without any difficulty. At the time I just listed xgboost as a Python package and it installed on the first attempt, with no extra configuration. So the problem is recent: something has changed in the last 4 months that now requires an additional adjustment.

However, now, after a few months without using xgboost, I tried to install it and I’m getting a new and very specific error:

[ 28%] Building CXX object src/CMakeFiles/objxgboost.dir/common/quantile.cc.o
      /tmp/pip-install-20s9s1bl/xgboost_2731b94a814045d3af3ad8ce437fddab/cpp_src/src/common/io.cc: In function 'std::unique_ptr<xgboost::common::MMAPFile> xgboost::common::Open(std::string, std::size_t, std::size_t)':
      /tmp/pip-install-20s9s1bl/xgboost_2731b94a814045d3af3ad8ce437fddab/cpp_src/src/common/io.cc:260:38: error: 'mmap64' was not declared in this scope; did you mean 'mmap'?
        260 |   ptr = reinterpret_cast<std::byte*>(mmap64(nullptr, view_size, prot, MAP_PRIVATE, fd, view_start));
            |                                      ^~~~~~
            |                                      mmap

The mention of mmap64 suggests that the build expects a 64-bit memory-mapping function that glibc provides but that musl libc, used by Alpine Linux, names or exposes differently.

This has never been an issue, and I never needed to add any other dependency to execute, but currently, I can’t solve it alone.

I’ve tried everything, but I’m not sure how to overcome this limitation. As I need to execute some mathematical models to advance in my postgraduate research, I’m completely stuck JUST BECAUSE OF THIS SIMPLE DETAIL.

Would anyone have a suggestion on some dependency that I could install to fix this mmap64 issue?

Any help would be greatly appreciated.

Thank you in advance.

COMPLETE ERROR:

-- Could NOT find Backtrace (missing: Backtrace_LIBRARY Backtrace_INCLUDE_DIR)
      -- /tmp/pip-install-pc_4jb3u/xgboost_7ced82f2690f492f8f0a8d295645e2bf/cpp_src/dmlc-core/cmake/build_config.h.in -> include/dmlc/build_config.h
      -- Configuring done (0.1s)
      -- Generating done (0.0s)
      -- Build files have been written to: /tmp/tmp95dd5ehz/libbuild
      [  2%] Building CXX object dmlc-core/CMakeFiles/dmlc.dir/src/io.cc.o
      [  2%] Building CXX object dmlc-core/CMakeFiles/dmlc.dir/src/config.cc.o
      [  3%] Building CXX object dmlc-core/CMakeFiles/dmlc.dir/src/recordio.cc.o
      [  4%] Building CXX object dmlc-core/CMakeFiles/dmlc.dir/src/data.cc.o
      [  5%] Building CXX object dmlc-core/CMakeFiles/dmlc.dir/src/io/recordio_split.cc.o
      [  6%] Building CXX object dmlc-core/CMakeFiles/dmlc.dir/src/io/line_split.cc.o
      [  7%] Building CXX object dmlc-core/CMakeFiles/dmlc.dir/src/io/indexed_recordio_split.cc.o
      [  8%] Building CXX object dmlc-core/CMakeFiles/dmlc.dir/src/io/input_split_base.cc.o
      [  9%] Building CXX object dmlc-core/CMakeFiles/dmlc.dir/src/io/filesys.cc.o
      [ 10%] Building CXX object dmlc-core/CMakeFiles/dmlc.dir/src/io/local_filesys.cc.o
      [ 11%] Linking CXX static library libdmlc.a
      [ 11%] Built target dmlc
      [ 12%] Building CXX object src/CMakeFiles/objxgboost.dir/collective/socket.cc.o
      [ 13%] Building CXX object src/CMakeFiles/objxgboost.dir/c_api/c_api_error.cc.o
      [ 14%] Building CXX object src/CMakeFiles/objxgboost.dir/c_api/c_api.cc.o
      [ 15%] Building CXX object src/CMakeFiles/objxgboost.dir/collective/communicator.cc.o
      [ 16%] Building CXX object src/CMakeFiles/objxgboost.dir/collective/in_memory_communicator.cc.o
      [ 17%] Building CXX object src/CMakeFiles/objxgboost.dir/collective/in_memory_handler.cc.o
      [ 18%] Building CXX object src/CMakeFiles/objxgboost.dir/common/charconv.cc.o
      [ 19%] Building CXX object src/CMakeFiles/objxgboost.dir/common/column_matrix.cc.o
      [ 20%] Building CXX object src/CMakeFiles/objxgboost.dir/common/common.cc.o
      [ 21%] Building CXX object src/CMakeFiles/objxgboost.dir/common/error_msg.cc.o
      [ 22%] Building CXX object src/CMakeFiles/objxgboost.dir/common/hist_util.cc.o
      [ 23%] Building CXX object src/CMakeFiles/objxgboost.dir/common/host_device_vector.cc.o
      [ 24%] Building CXX object src/CMakeFiles/objxgboost.dir/common/io.cc.o
      [ 25%] Building CXX object src/CMakeFiles/objxgboost.dir/common/json.cc.o
      [ 26%] Building CXX object src/CMakeFiles/objxgboost.dir/common/numeric.cc.o
      [ 27%] Building CXX object src/CMakeFiles/objxgboost.dir/common/pseudo_huber.cc.o
      [ 28%] Building CXX object src/CMakeFiles/objxgboost.dir/common/quantile.cc.o
      /tmp/pip-install-pc_4jb3u/xgboost_7ced82f2690f492f8f0a8d295645e2bf/cpp_src/src/common/io.cc: In function 'std::unique_ptr<xgboost::common::MMAPFile> xgboost::common::Open(std::string, std::size_t, std::size_t)':
      /tmp/pip-install-pc_4jb3u/xgboost_7ced82f2690f492f8f0a8d295645e2bf/cpp_src/src/common/io.cc:260:38: error: 'mmap64' was not declared in this scope; did you mean 'mmap'?
        260 |   ptr = reinterpret_cast<std::byte*>(mmap64(nullptr, view_size, prot, MAP_PRIVATE, fd, view_start));
            |                                      ^~~~~~
            |                                      mmap
      make[2]: *** [src/CMakeFiles/objxgboost.dir/build.make:244: src/CMakeFiles/objxgboost.dir/common/io.cc.o] Error 1
      make[2]: *** Waiting for unfinished jobs....
      make[1]: *** [CMakeFiles/Makefile2:264: src/CMakeFiles/objxgboost.dir/all] Error 2
      make: *** [Makefile:156: all] Error 2
      Traceback (most recent call last):
        File "/tmp/pip-install-pc_4jb3u/xgboost_7ced82f2690f492f8f0a8d295645e2bf/packager/nativelib.py", line 102, in build_libxgboost
          _build(generator=generator)
        File "/tmp/pip-install-pc_4jb3u/xgboost_7ced82f2690f492f8f0a8d295645e2bf/packager/nativelib.py", line 68, in _build
          subprocess.check_call([build_tool, f"-j{nproc}"], cwd=build_dir)
        File "/usr/lib/python3.11/subprocess.py", line 413, in check_call
          raise CalledProcessError(retcode, cmd)
      subprocess.CalledProcessError: Command '['make', '-j8']' returned non-zero exit status 2.
      
      During handling of the above exception, another exception occurred:
      
      Traceback (most recent call last):
        File "/usr/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/usr/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/usr/lib/python3.11/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 152, in prepare_metadata_for_build_wheel
          whl_basename = backend.build_wheel(metadata_directory, config_settings)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/pip-install-pc_4jb3u/xgboost_7ced82f2690f492f8f0a8d295645e2bf/packager/pep517.py", line 79, in build_wheel
          libxgboost = locate_or_build_libxgboost(
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/pip-install-pc_4jb3u/xgboost_7ced82f2690f492f8f0a8d295645e2bf/packager/nativelib.py", line 170, in locate_or_build_libxgboost
          return build_libxgboost(cpp_src_dir, build_dir=build_dir, build_config=build_config)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/tmp/pip-install-pc_4jb3u/xgboost_7ced82f2690f492f8f0a8d295645e2bf/packager/nativelib.py", line 106, in build_libxgboost
          _build(generator=generator)
        File "/tmp/pip-install-pc_4jb3u/xgboost_7ced82f2690f492f8f0a8d295645e2bf/packager/nativelib.py", line 68, in _build
          subprocess.check_call([build_tool, f"-j{nproc}"], cwd=build_dir)
        File "/usr/lib/python3.11/subprocess.py", line 413, in check_call
          raise CalledProcessError(retcode, cmd)
      subprocess.CalledProcessError: Command '['make', '-j8']' returned non-zero exit status 2.
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
[18:00:58] FATAL: Failed installing package xgboost
s6-rc: warning: unable to start service init-appdaemon: command exited 1
/run/s6/basedir/scripts/rc.init: warning: s6-rc failed to properly bring all the services up! Check your logs (in /run/uncaught-logs/current if you have in-container logging) for more information.
/run/s6/basedir/scripts/rc.init: fatal: stopping the container.
s6-rc: info: service legacy-cont-init: stopping
s6-rc: info: service legacy-cont-init successfully stopped
s6-rc: info: service fix-attrs: stopping
s6-rc: info: service base-addon-log-level: stopping
s6-rc: info: service fix-attrs successfully stopped
s6-rc: info: service base-addon-log-level successfully stopped
s6-rc: info: service base-addon-banner: stopping
s6-rc: info: service base-addon-banner successfully stopped
s6-rc: info: service s6rc-oneshot-runner: stopping
s6-rc: info: service s6rc-oneshot-runner successfully stopped

XGBoost does not officially support musl environments. You will need to patch XGBoost in order to build it.

To use mmap64 in musl, you’ll need to define the macro _LARGEFILE64_SOURCE.
Try this patch:

From b14518564be08f1d062cf855502a2c60d4d09959 Mon Sep 17 00:00:00 2001
From: Hyunsu Cho <chohyu01@cs.washington.edu>
Date: Tue, 19 Mar 2024 15:07:45 -0700
Subject: [PATCH] Enable mmap64 on Musl

---
 src/CMakeLists.txt | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/src/CMakeLists.txt b/src/CMakeLists.txt
index 161889f9e..7012607c9 100644
--- a/src/CMakeLists.txt
+++ b/src/CMakeLists.txt
@@ -30,6 +30,8 @@ if(LOG_CAPI_INVOCATION)
   target_compile_definitions(objxgboost PRIVATE -DLOG_CAPI_INVOCATION=1)
 endif()

+target_compile_definitions(objxgboost PUBLIC -D_LARGEFILE64_SOURCE=1)
+
 # For MSVC: Call msvc_use_static_runtime() once again to completely
 # replace /MD with /MT. See https://github.com/dmlc/xgboost/issues/4462
 # for issues caused by mixing of /MD and /MT flags
--
2.34.1
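
The patch adds a single line to src/CMakeLists.txt, so the same definition can also be added without depending on the hunk context, which may help if that file differs between XGBoost revisions. This is only a minimal sketch: it assumes the `objxgboost` target is still declared in src/CMakeLists.txt and that appending the line at the end of that file is acceptable. The function name `ensure_largefile64_define` is hypothetical and not part of XGBoost.

```shell
#!/bin/sh
# Hypothetical helper (not part of XGBoost): make sure the given CMakeLists.txt
# defines _LARGEFILE64_SOURCE for the objxgboost target.
# It is idempotent, so it can run on every container start.
ensure_largefile64_define() {
  cmake_file="$1"   # e.g. /homeassistant/appdaemon/xgboost/src/CMakeLists.txt
  define_line='target_compile_definitions(objxgboost PUBLIC -D_LARGEFILE64_SOURCE=1)'

  if [ ! -f "$cmake_file" ]; then
    echo "not found: $cmake_file" >&2
    return 1
  fi

  # Nothing to do if the line is already there (patched tree or earlier run).
  if grep -qF -e "$define_line" "$cmake_file"; then
    return 0
  fi

  # Assumption: the objxgboost target is declared earlier in this same file,
  # so a definition appended at the end still refers to an existing target.
  printf '\n%s\n' "$define_line" >> "$cmake_file"
}
```

It could be called in place of the patch step, for example `ensure_largefile64_define /homeassistant/appdaemon/xgboost/src/CMakeLists.txt`, before running cmake.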

Hello, hcho3

First of all, thank you very much for the response you provided a few months ago. I followed your guidance at the time and applied the patch without any difficulty. Consequently, I was able to install XGBoost perfectly in appdaemon.

However, a few days ago, I needed to restart the environment to update Home Assistant/AppDaemon, and every time AppDaemon restarts, everything has to be reinstalled. Now, when reinstalling with the same method, the patch you supplied no longer applies, apparently because the xgboost repository has been updated.

A few months ago, I ran the following commands, which clone the repository, copy the patch into the folder, and apply it (defining _LARGEFILE64_SOURCE):

git clone --recursive https://github.com/dmlc/xgboost.git /homeassistant/appdaemon/xgboost
cp /homeassistant/appdaemon/xgboost-musl.patch /homeassistant/appdaemon/xgboost/
cd /homeassistant/appdaemon/xgboost && patch -p1 < xgboost-musl.patch
cd /homeassistant/appdaemon/xgboost && mkdir build && cd build && cmake .. && make -j$(nproc)

Everything executed perfectly. Now, after the repository update, the following error occurs:


787259b412c18ab8d5f24bf2b8bd6a59ff8208f3’
patching file src/CMakeLists.txt

Hunk #1 FAILED at 30.
1 out of 1 hunk FAILED -- saving rejects to file src/CMakeLists.txt.rej
[20:33:33] FATAL: Failed executing init command: cd /homeassistant/appdaemon/xgboost && patch -p1 < xgboost-musl.patch
s6-rc: warning: unable to start service init-appdaemon: command exited 1
...

Is there an updated version of your patch that can be used to enable the XGBoost installation again?

Note:
From a quick check, line 30 of src/CMakeLists.txt still contains the context line from your patch:
target_compile_definitions(objxgboost PRIVATE -DLOG_CAPI_INVOCATION=1)

Even so, the patch you provided above, which used to solve the mmap64 issue, now fails with “Hunk #1 FAILED at 30”.

Make sure to check out a consistent commit of XGBoost rather than the latest default branch, for example the v2.0.3 tag:

git clone --recursive https://github.com/dmlc/xgboost.git -b v2.0.3 /homeassistant/appdaemon/xgboost
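
Because the init commands run again on every AppDaemon restart, it may be worth confirming that the tree really is at the expected tag before applying the patch. A rough sketch, assuming git is available in the container; `verify_checkout_tag` is a hypothetical helper name, and the path in the usage note is the one used earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the repository in $1 has HEAD
# exactly at tag $2 (for example v2.0.3).
verify_checkout_tag() {
  repo_dir="$1"
  expected_tag="$2"

  # List every tag pointing at HEAD and look for an exact, whole-line match.
  if git -C "$repo_dir" tag --points-at HEAD 2>/dev/null \
      | grep -qxF -e "$expected_tag"; then
    return 0
  fi

  echo "HEAD in $repo_dir is not at tag $expected_tag" >&2
  return 1
}
```

For example, `verify_checkout_tag /homeassistant/appdaemon/xgboost v2.0.3 && patch -p1 < xgboost-musl.patch` would stop before patching if the clone ended up on a different revision.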