@ddelnano ddelnano commented Oct 9, 2024

Summary: Upgrade bcc and libbpf to fix BPF program compilation on 6.10 and later kernels

Bcc provides some "virtual" includes to BPF programs. The compat/linux/virtual_bpf.h file in particular needs to be kept in sync with libbpf, and it uses the same header guard as the include/uapi/linux/bpf.h file, so bcc's copy takes the place of the kernel's header. This means that although our linux headers were updated, our older bcc install was inserting an older copy of the uapi/linux/bpf.h file, one that didn't contain the struct bpf_wq definition. Compilation then failed with:

  include/linux/bpf.h:348:10: error: invalid application of 'sizeof' to an incomplete type 'struct bpf_wq'
                  return sizeof(struct bpf_wq);
                         ^     ~~~~~~~~~~~~~~~
  include/linux/bpf.h:348:24: note: forward declaration of 'struct bpf_wq'
                  return sizeof(struct bpf_wq);
                                       ^
  include/linux/bpf.h:377:10: error: invalid application of '__alignof' to an incomplete type 'struct bpf_wq'
                  return __alignof__(struct bpf_wq);
                         ^          ~~~~~~~~~~~~~~~
  include/linux/bpf.h:377:29: note: forward declaration of 'struct bpf_wq'
                  return __alignof__(struct bpf_wq);
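
The header guard clash described above can be illustrated with a toy model of the preprocessor. This is only a sketch, not bcc or clang code: `visible_declarations` and the two header labels are hypothetical names made up for illustration, and the declaration lists are heavily abbreviated. The guard macro `_UAPI__LINUX_BPF_H__` is the one both files share.

```python
GUARD = "_UAPI__LINUX_BPF_H__"

# (guard macro, declarations) per header; contents abbreviated for illustration.
HEADERS = {
    "stale virtual_bpf.h": (GUARD, ["struct bpf_timer"]),
    "6.10 uapi/linux/bpf.h": (GUARD, ["struct bpf_timer", "struct bpf_wq"]),
}


def visible_declarations(include_order):
    """Return the declarations that survive #ifndef/#define include guards."""
    defined_guards = set()
    visible = []
    for name in include_order:
        guard, declarations = HEADERS[name]
        if guard in defined_guards:
            # Guard already defined: the preprocessor skips this header body.
            continue
        defined_guards.add(guard)
        visible.extend(declarations)
    return visible


# bcc injects its virtual header first, so the kernel's newer copy is skipped.
seen = visible_declarations(["stale virtual_bpf.h", "6.10 uapi/linux/bpf.h"])
print("struct bpf_wq" in seen)  # prints False
```

In this model the newer header is never read, which matches the "incomplete type" errors: only a forward declaration of `struct bpf_wq` is visible when `include/linux/bpf.h` asks for its size.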

Note: while this fixes the 6.10 compilation issue, our 6.10 qemu build still fails unless the socket tracer's loop limit logic is disabled. 6.10 kernels added BPF token support, which changes the BPF permission model slightly and makes the BPF instruction limit depend on the permissions of the BPF syscall caller (linux source).

This new BPF token logic, coupled with our qemu setup, causes our 6.10 build to fall back to the 4096 instruction limit. I'll be addressing this in #2040 and #2042. Those issues shouldn't block this change since that loop limit code can be bypassed at runtime with our current cli flags.
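
As a rough model of the limit discussed here: the kernel accepts a large program only when the caller is considered BPF-capable, and otherwise applies the 4096 instruction cap. The sketch below is a simplification of the check performed at program load time, not kernel code. `max_insn_count` and `load_allowed` are hypothetical names; the two constants mirror the kernel's `BPF_MAXINSNS` and `BPF_COMPLEXITY_LIMIT_INSNS`.

```python
BPF_MAXINSNS = 4096                     # cap for callers without BPF capability
BPF_COMPLEXITY_LIMIT_INSNS = 1_000_000  # cap for BPF-capable callers


def max_insn_count(bpf_capable: bool) -> int:
    """Instruction cap applied to a program, given the caller's capability."""
    return BPF_COMPLEXITY_LIMIT_INSNS if bpf_capable else BPF_MAXINSNS


def load_allowed(insn_cnt: int, bpf_capable: bool) -> bool:
    """True if a program of insn_cnt instructions passes the size check."""
    return 0 < insn_cnt <= max_insn_count(bpf_capable)


print(load_allowed(5000, bpf_capable=True))   # prints True
print(load_allowed(5000, bpf_capable=False))  # prints False
```

A program sized for the larger limit is therefore rejected as soon as the caller is treated as lacking the capability, which is the fallback seen in the qemu environment.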

Relevant Issues: Closes #2035

Type of change: /kind bugfix

Test Plan: Built 6.10 and 6.11 kernels and the associated linux headers from #2036 and verified that a local qemu build passes

  • Verify #ci:bpf-build-all-kernels build passes

Changelog Message: Upgraded bcc and libbpf to support kernels 6.10 and later

@ddelnano ddelnano requested a review from a team as a code owner October 9, 2024 14:51
…er kernel

Signed-off-by: Dom Del Nano <ddelnano@gmail.com>
Signed-off-by: Dom Del Nano <ddelnano@gmail.com>
@ddelnano ddelnano force-pushed the ddelnano/upgrade-bcc-to-fix-6.10+-kernel-issues branch from 898bc47 to 34162ea Compare October 9, 2024 14:56
Comment on lines -149 to +152
-    sha256 = "4d503428c7aead070a59630dd0906318a430b3e279a35f51ec601fbdd7d31eb6",
-    strip_prefix = "libbpf-3b0973892891744d20ae79e99c0d1a26a59c4222",
+    sha256 = "859a31e9101237338d46eb62a62cb8fcb342c9ce0f9b9137e5a3a728c088c338",
+    strip_prefix = "libbpf-42065ea6627ff6e1ab4c65e51042a70fbf30ff7c",
     urls = [
-        "https://github.com/libbpf/libbpf/archive/3b0973892891744d20ae79e99c0d1a26a59c4222.tar.gz",
+        "https://github.com/libbpf/libbpf/archive/42065ea6627ff6e1ab4c65e51042a70fbf30ff7c.tar.gz",
Member Author

This is the libbpf version that bcc v0.31.1 uses.

bcc

Comment on lines -130 to 135
 com_github_iovisor_bcc = dict(
-    sha256 = "d34f9484588a9c25be936c910c86f8b25b04e5b0c802d0630e77cc9a8a272aed",
-    strip_prefix = "bcc-e0698be7b797129cb113912e96ad741a551e2291",
+    sha256 = "416426fbe22d617d8aed088062f4489e69176136e99dc0b933df58e83d9175da",
+    strip_prefix = "bcc-e57be8465b9cf238f1c04b1c7e154fd1db85326d",
     urls = [
-        "https://github.com/pixie-io/bcc/archive/e0698be7b797129cb113912e96ad741a551e2291.tar.gz",
+        "https://github.com/pixie-io/bcc/archive/e57be8465b9cf238f1c04b1c7e154fd1db85326d.tar.gz",
     ],
 ),
Member Author

This is the pixie9 branch, which rebased our patches on top of v0.31.0 tag (latest bcc release).
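
Both pins in this PR follow the same pattern: the commit hash appears in `strip_prefix` and in the archive URL, so the hash, the prefix, and the checksum have to change together. A minimal sketch of that relationship, using a hypothetical helper `github_archive` that is not part of Pixie's build files; the sha256 is passed in because it can only be computed from the downloaded tarball:

```python
def github_archive(repo: str, commit: str, sha256: str) -> dict:
    """Build a pinned archive entry for a GitHub repo at a given commit."""
    name = repo.split("/")[-1]
    return dict(
        sha256=sha256,
        # GitHub commit tarballs unpack into "<repo name>-<commit>".
        strip_prefix=f"{name}-{commit}",
        urls=[f"https://github.com/{repo}/archive/{commit}.tar.gz"],
    )


bcc = github_archive(
    "pixie-io/bcc",
    "e57be8465b9cf238f1c04b1c7e154fd1db85326d",
    "416426fbe22d617d8aed088062f4489e69176136e99dc0b933df58e83d9175da",
)
print(bcc["strip_prefix"])
print(bcc["urls"][0])
```

Deriving the prefix and URL from one commit value avoids the easy mistake of updating the URL while leaving a stale `strip_prefix` behind.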

@ddelnano ddelnano changed the title Upgrade bcc and libbpf to fix BPF program compilation on 6.10 and later kernel Upgrade bcc and libbpf to fix BPF program compilation on 6.10 and later kernels Oct 11, 2024
Contributor

@oazizi000 oazizi000 left a comment

This should be safe for pre-6.10 kernels, so we should land this.

For 6.10+, investigation is needed to see if just our qemu test environment is broken or if there is a bigger issue that will affect the PEM more generally.

We have consciously decided to go ahead with this and do the investigation as a follow-up.

@ddelnano ddelnano merged commit 738111f into pixie-io:main Oct 11, 2024
@ddelnano ddelnano deleted the ddelnano/upgrade-bcc-to-fix-6.10+-kernel-issues branch October 11, 2024 22:19
ddelnano added a commit that referenced this pull request Dec 16, 2024
Summary: Fix release note generation script

Our releases have blank release notes. This makes it difficult for end
users to understand what has changed between releases. This PR updates
the existing script that was built to auto generate changelog notes.

Relevant Issues: N/A

Type of change: /kind bug

Test Plan: Ran the script for each artifact type and verified the output
was expected
- [x] cli release notes are expected
```
$ ./scripts/create_release_tag.sh cli -n
$ git tag -l --format='%(contents)' release/cli/v0.9.0-pre-ddelnano-fix-release-note-generation.4
### New Features
- (#2048) Enhanced the `px` cli to detect OpenShift clusters
and prompt to install the appropriate SecurityContextConstraints before
proceeding with a deploy
```
- [x] vizier release notes are expected
```
# Needed to modify prev_tag in script since v0.14.13 to main's HEAD doesn't have vizier changelog messages
$ ./scripts/create_release_tag.sh vizier -n
$ git tag -l --format='%(contents)' release/vizier/v0.15.0-pre-main.4
### Bug Fixes
- (#2047) Ensures that the `--stirling_bpf_loop_limit` and
`--stirling_bpf_chunk_limit` values are respected if explicitly provided
on the command line. For 5.1 and later kernels, cli provided values
would have been ignored

```
- [x] cloud release notes are generated correctly
```
$ ./scripts/create_release_tag.sh cloud -n
Generating changelog from release/cloud/v0.1.8..release/cloud/v0.2.0-pre-ddelnano-fix-release-note-generation.1

$ git tag -l --format='%(contents)' release/cloud/v0.2.0-pre-ddelnano-fix-release-note-generation.1
### New Features
- (#2043) Add support for rendering differential flamegraphs in
the `StackTraceFlameGraph` display spec
### Bug Fixes
- (#2041) Upgraded bcc and libbpf to support kernels 6.10 and
later
```
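
The #2047 changelog entry above describes a precedence rule: a limit given explicitly on the command line should win over a default chosen from the kernel version. A minimal sketch of that rule follows. The function name `resolve_limit` and the default values are hypothetical placeholders, not Stirling's actual code or defaults; only the 5.1 threshold comes from the changelog text.

```python
from typing import Optional, Tuple


def resolve_limit(
    cli_value: Optional[int],
    kernel_version: Tuple[int, int],
    small_default: int,
    large_default: int,
) -> int:
    """Pick a limit: an explicit flag wins, else a kernel-based default."""
    if cli_value is not None:
        # The value was explicitly provided on the command line: respect it.
        return cli_value
    # Otherwise choose the default based on the kernel version.
    return large_default if kernel_version >= (5, 1) else small_default


# Placeholder defaults (10 and 100) are for illustration only.
print(resolve_limit(None, (6, 10), small_default=10, large_default=100))  # prints 100
print(resolve_limit(25, (6, 10), small_default=10, large_default=100))    # prints 25
```

The behavior the changelog describes as broken corresponds to skipping the first branch on 5.1 and later kernels, so the flag value was silently replaced by the default.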
ddelnano added a commit to ddelnano/pixie that referenced this pull request Aug 6, 2025
…er kernels (pixie-io#2041)

Summary: Upgrade bcc and libbpf to fix BPF program compilation on 6.10
and later kernels

Bcc provides some
"[virtual](https://github.com/iovisor/bcc/blob/cb1ba20f4800f556dc940682ba7016c50bd0a3ac/src/cc/exported_files.cc#L28-L48)"
includes to BPF programs. The `compat/linux/virtual_bpf.h` file in
particular needs to be kept in sync with libbpf and matches the [header
guard](https://github.com/iovisor/bcc/blob/cb1ba20f4800f556dc940682ba7016c50bd0a3ac/src/cc/compat/linux/virtual_bpf.h#L9)
of the `include/uapi/linux/bpf.h` file. This means that while our linux
headers were updated, our older bcc install was inserting an older copy
of the `uapi/linux/bpf.h` file -- one that didn't contain the `bpf_wq`
declaration.

```
  include/linux/bpf.h:348:10: error: invalid application of 'sizeof' to an incomplete type 'struct bpf_wq'
                  return sizeof(struct bpf_wq);
                         ^     ~~~~~~~~~~~~~~~
  include/linux/bpf.h:348:24: note: forward declaration of 'struct bpf_wq'
                  return sizeof(struct bpf_wq);
                                       ^
  include/linux/bpf.h:377:10: error: invalid application of '__alignof' to an incomplete type 'struct bpf_wq'
                  return __alignof__(struct bpf_wq);
                         ^          ~~~~~~~~~~~~~~~
  include/linux/bpf.h:377:29: note: forward declaration of 'struct bpf_wq'
                  return __alignof__(struct bpf_wq);
```

Note: while this fixes the 6.10 compilation issue, our 6.10 qemu build
fails without disabling [this
logic](https://github.com/pixie-io/pixie/blob/3c41d554215528e688328aef94192e696db617dc/src/stirling/source_connectors/socket_tracer/socket_trace_connector.cc#L464-L472).
6.10 kernels added BPF token support. This changes the BPF permission
model slightly and causes the BPF instruction limit to be dependent on
the permissions of the BPF syscall caller ([linux
source](https://elixir.bootlin.com/linux/v6.11.1/source/kernel/bpf/syscall.c#L2757)).

This new BPF token logic, coupled with our qemu setup, causes our 6.10
build to fall back to the 4096 instruction limit. I'll be addressing this
in pixie-io#2040 and pixie-io#2042. Those issues shouldn't block this change since that
loop limit code can be bypassed at runtime with our current cli flags.

Relevant Issues: Closes pixie-io#2035

Type of change: /kind bugfix

Test Plan: Built 6.10 and 6.11 kernels and the associated linux headers
from pixie-io#2036 and verified that a local qemu build passes
- [x] Verify `#ci:bpf-build-all-kernels` build passes

Changelog Message: Upgraded bcc and libbpf to support kernels 6.10 and
later

---------

Signed-off-by: Dom Del Nano <ddelnano@gmail.com>
GitOrigin-RevId: 738111f
ddelnano added a commit to ddelnano/pixie that referenced this pull request Aug 6, 2025
Summary: Fix release note generation script


GitOrigin-RevId: e2a6737
Development

Successfully merging this pull request may close these issues.

Socket tracer unable to start on 6.10 and later kernels
