Preserve pointer provenance during string table offset arithmetic#641

Open
Flowdalic wants to merge 1 commit into NixOS:master from Flowdalic:shstrtab-ptr-provenance
@Flowdalic

When calculating the address of the section header string table, the previous implementation cast the base data pointer to a size_t to perform integer arithmetic, and then cast the resulting integer back to a const char *.

This breaks standard C/C++ pointer provenance rules. Compilers rely on pointer arithmetic to track object origins, bounds, and aliasing. Losing this provenance chain by laundering the pointer through an integer type can lead to undefined behavior, as the compiler may no longer recognize the resulting pointer as safely pointing within the original allocation.

This patch resolves the issue by using standard pointer arithmetic instead of integer arithmetic. The offset is added directly to the base pointer (fileContents->data() + shstrtabOffset), and only the final, derived pointer is cast to the target type, so the provenance chain remains intact.

Adhering to these provenance rules is strictly required on capability-based architectures like CHERI (e.g., CHERI RISC-V). On these systems, pointers are hardware capabilities containing bounds and validity tags. The cast to size_t, which holds only the address, stripped this metadata, resulting in an untagged, invalid pointer that caused a hardware trap (SIGSEGV) when reading the string table. This fix allows the compiler to use correctly derived and valid capabilities.

Additionally, this patch replaces the __builtin_add_overflow check on the absolute memory address with a more straightforward bounds check against the total file size.

@Flowdalic Flowdalic force-pushed the shstrtab-ptr-provenance branch from 18ec58c to f999432 Compare May 21, 2026 13:53