Skip to content

feat: Implement SORT() directive for linker scripts#1994

Open
Abh10705 wants to merge 10 commits into
wild-linker:mainfrom
Abh10705:feature/section-sorting
Open

feat: Implement SORT() directive for linker scripts#1994
Abh10705 wants to merge 10 commits into
wild-linker:mainfrom
Abh10705:feature/section-sorting

Conversation

@Abh10705
Copy link
Copy Markdown

@Abh10705 Abh10705 commented May 28, 2026

Transitioned from hashmaps to flat vectors (Vec) for harvested sections. This ensures continuous memory layout and significantly improves L1 cache locality during the final layout binary search.
Fixes #1661

- Introduces Harvester engine in layout.rs to collect and sort sections lexicographically.

- Implements physical section reordering in elf_writer.rs with O(1) skip logic for optimized performance.

- Adds comprehensive integration test and 50k-section stress test verification.

- Verified physical byte layout is contiguous and alphabetical via objdump.

Benchmark (Warm cache):

- GNU ld: 0.221s

- wild: 0.093s

Fixes wild-linker#1661
Copy link
Copy Markdown
Member

@davidlattimore davidlattimore left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've only partially reviewed, but I just noticed that this is a new PR. Was there a reason to open a new PR rather than continuing with 1857? Also, I just noticed that some of the comments I made on that PR haven't yet been addressed in this PR, so it's probably worthwhile looking back at the comments on that PR

Comment thread libwild/src/elf_writer.rs Outdated
let _span = debug_span!("write_file", filename = %object.input).entered();
let _file_span = layout.args().common().trace_span_for_file(object.file_id);

// Fast O(1) lookup to skip harvested sections
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anything that involves a loop is unlikely to be O(1).

Comment thread libwild/src/elf_writer.rs Outdated
let _file_span = layout.args().common().trace_span_for_file(object.file_id);

// Fast O(1) lookup to skip harvested sections
let epilogue = layout
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like there's an easier way to get the epilogue. Pretty sure we store the FileId of the epilogue somewhere

Comment thread libwild/src/elf_writer.rs
// The actual build-id will be filled in later once all writing has completed. It's important
// that we fill it with zeros now however, since if we're overwriting an existing file, there
// might be other data there and we don't zero it, then the build ID will be hashing that data.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This extra blank line seems unnecessary and separates the above comment from the code it's referring to

Comment thread libwild/src/layout.rs Outdated
harvested_sections_registry.push(sec.clone());
}

// Crucial: We MUST sort by the IDs here, otherwise the Binary Search in Step 3 will fail.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To what extent was this PR written by an LLM and to what extent do you understand the code that was produced? I ask here because referring to "step 3" seems like the kind of thing an LLM would do. There doesn't seem to be numbered "steps" defined anywhere.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeh that was during the merge conflict i coudn't figure it out, so i pasted my file and the file from the wild repo and asked it to make a new merged file, it asked for some context as to what I was doing/ changing to the code and added these comments I should have paid more attention in removing them

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you go over our conversation on zulip, you should get the idea if I understand what Im doing, respectfully

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation. Hope I didn't offend, it's just that I know this PR is a difficult undertaking for someone without previous experience in the codebase and I worry when I see signs of LLM use in those sorts of circumstances. That makes sense that you used it for fixing merge conflicts and that it introduced changes that were perhaps unintended.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wasn't offended at all, it didn't even merge those files properly I had to restart and fix those myself line by line, while i asked it to merge the file I also asked it to review my work and that is where the comments are from, I removed them as i was fixing the merge conflicts but ig this one got left

Comment thread libwild/src/layout.rs
pub(crate) size: u64,
pub(crate) alignment: Alignment,
pub(crate) mem_offset: u64,
pub(crate) _name: &'data [u8],
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this here?


__attribute__((used, section(".text.func_a"))) int func_a() { return 1; }

__attribute__((used, section(".text.func_b"))) int func_b() { return 2; }
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test doesn't seem to actually be testing that the sections are sorted. If I disable sorting, e.g. by always setting SectionPattern.sorted to false, the test still passes. It'd also be good to test that sorted sections from multiple input objects get correctly sorted together.

Comment thread libwild/src/linker_script.rs Outdated

Ok(SectionPattern { name, sorted: true })
} else {
// Original behavior: bare pattern
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Documenting "original behaviour" makes sense from the perspective of reading a PR, but doesn't make sense for someone reading the code later, after the PR is merged. Comments should always be written with the future reader in mind. This comment is probably unnecessary.

Comment thread libwild/src/linker_script.rs Outdated
fn parse_pattern<'input>(input: &mut &'input BStr) -> winnow::Result<SectionPattern<'input>> {
// Handling SORT(...) wrapper: SORT is an alias for SORT_BY_NAME in GNU ld.
if input.starts_with(b"SORT") {
// Consume "SORT"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments like this don't really add anything. It's pretty much just repeating what the next line says.

Comment thread libwild/src/linker_script.rs Outdated
Ok(pattern)
fn parse_pattern<'input>(input: &mut &'input BStr) -> winnow::Result<SectionPattern<'input>> {
// Handling SORT(...) wrapper: SORT is an alias for SORT_BY_NAME in GNU ld.
if input.starts_with(b"SORT") {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure there is a helper that checks starts_with and then consumes

*(.text)

/* Protect our harvested sections from the GC! */
KEEP(*(SORT(.text.*)))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be good to test both with and without KEEP and make sure that GC works as expected for sorted sections that aren't referenced.

@Abh10705
Copy link
Copy Markdown
Author

I've only partially reviewed, but I just noticed that this is a new PR. Was there a reason to open a new PR rather than continuing with 1857? Also, I just noticed that some of the comments I made on that PR haven't yet been addressed in this PR, so it's probably worthwhile looking back at the comments on that PR

yeh that just skipped my mind, i will close this one and go back to the previous one

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Implement linker-script SORT(...) for input sections

2 participants