nix-builder: switch to manylinux_2_28, dynamically link libstdc++ #558

Open
danieldk wants to merge 3 commits into main from manylinux-2.28

@danieldk
Member

Before this change, we were ensuring manylinux_2_28 compliance by:

  • Building and linking against glibc 2.27 (there was no nixpkgs revision with 2.28).
  • Linking libstdc++ statically.

Linking libstdc++ statically has turned out to be a pain point. If a kernel uses libstdc++ functionality that performs global initialization, the initialization of the dynamically linked libstdc++ (e.g. loaded through Torch) and that of the statically linked libstdc++ can conflict. One example is `std::regex`, which indirectly triggers global initialization for locale handling.

As an alternative, I explored just linking libstdc++ dynamically, which worked up to a point as long as certain C++ features were avoided. However, gcc 13 moved the initialization of the standard stream objects into the shared library. As a result, a library compiled with gcc 13 or later requires a libstdc++ that is newer than the one manylinux_2_28 allows.
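One way to spot this kind of problem is to look at the versioned symbols a built library requires. The following is a minimal sketch, not part of this PR: it scans `objdump -T`-style text for `GLIBC`, `GLIBCXX` and `CXXABI` version tags and reports any that exceed a given limit. The function names, the sample dump and the limits in `ASSUMED_LIMITS` are assumptions for illustration; the real limits should be taken from the auditwheel policy for manylinux_2_28.

```python
import re

# Matches version tags such as "GLIBCXX_3.4.32" or "GLIBC_2.28".
# Tags without a numeric version (e.g. GLIBC_PRIVATE) are ignored.
_VERSION_RE = re.compile(r"\b(GLIBCXX|GLIBC|CXXABI)_(\d+(?:\.\d+)*)\b")


def parse_version(text):
    """Turn "3.4.32" into (3, 4, 32) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def max_required_versions(dump):
    """Return the highest version seen per symbol family in the given text."""
    highest = {}
    for family, version in _VERSION_RE.findall(dump):
        parsed = parse_version(version)
        if family not in highest or parsed > highest[family]:
            highest[family] = parsed
    return highest


def violations(dump, limits):
    """Return the families whose highest required version exceeds its limit."""
    required = max_required_versions(dump)
    return {
        family: version
        for family, version in required.items()
        if family in limits and version > limits[family]
    }


# Assumed limits for illustration only; verify against the auditwheel policy.
ASSUMED_LIMITS = {
    "GLIBC": (2, 28),
    "GLIBCXX": (3, 4, 24),
    "CXXABI": (1, 3, 11),
}

# Hypothetical sample in the style of `objdump -T`; real output will differ.
SAMPLE_DUMP = """\
0000000000000000      DF *UND*  0000000000000000 (GLIBC_2.2.5) memcpy
0000000000000000      DF *UND*  0000000000000000 (CXXABI_1.3) __gxx_personality_v0
0000000000000000      DF *UND*  0000000000000000 (GLIBCXX_3.4.32) _ZSt21ios_base_library_initv
"""

if __name__ == "__main__":
    print(violations(SAMPLE_DUMP, ASSUMED_LIMITS))
```

In practice the text would come from running `objdump -T` on the built shared object; an empty result means no scanned family exceeds the configured limits.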

The EL8 (RHEL, AlmaLinux, etc.) gcc toolset used in the manylinux_2_28 images solves this by linking against the EL8 libstdc++ and providing a static library, linked into the binary, that supplies the newer C++ features. Since this is part of a fairly large gcc patchset, it does not seem feasible to reproduce it in the Nix gcc derivations. Instead, this change repackages the gcc toolsets from AlmaLinux as Nix derivations. This gives us a toolchain that is as close to the official manylinux_2_28 toolchain as possible.

The packaged toolchains are exposed as a standard nixpkgs stdenv, so that they can be used with other Nix derivations (such as the Torch/tvm-ffi extensions).

One exception is made in reproducing the toolchain: we build glibc 2.28 ourselves. glibc carries the dynamic loader, and the dynamic loader in AlmaLinux embeds FHS paths such as `/lib64`. These paths are not valid in Nix and lead to linking errors, since the library directory of e.g. `libc.so.6` cannot be found. By rebuilding glibc 2.28, we get a dynamic loader with the correct paths into the Nix store.

@@ -0,0 +1,272 @@
/*
Member Author

Everything in the glibc_2_28 directory comes from Nix and does not need a review. Changes are:

  • Bump the version to 2.28.
  • Move postFixup outside the condition to remove a stray dependency on glibc 2.27.

@HuggingFaceDocBuilderDev
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@@ -0,0 +1,40 @@
{
Member Author

The JSON files in this directory are auto-generated from repomd data and do not need to be reviewed.

2 participants