feat: Support GDB index#1990
Conversation
| fallback_cu: u32, | ||
| mut on_entry: impl FnMut(&'data [u8], u32), | ||
| ) { | ||
| for section_name in [".debug_gnu_pubnames", ".debug_gnu_pubtypes"] { |
There was a problem hiding this comment.
The symbol table construction assumes the presence of these two sections by adding the compiler flag -ggnu-pubnames (see the added integration test). Without these sections, the symbol table will not be built and only the CU list and address table will be constructed. This effectively means you won't benefit from GDB index, but since lld and mold behave this way, I've aligned the implementation.
There was a problem hiding this comment.
Can we emit a warning to the user telling the option is leading to a GDB index result?
|
I did I have no idea what happened... 🤷 |
marxin
left a comment
There was a problem hiding this comment.
Given the numerous findings I created - can you explain what was the genesis of the PR and how much was a LLM involved? I know we had a PR for the very same functionality that was closed right after it was opened. Is it an incarnation of the changes, or was it created completely independently?
| fn parse_cu_boundaries(data: &[u8]) -> Vec<CuBoundary> { | ||
| let mut cus = Vec::new(); | ||
| let mut offset = 0usize; | ||
| while offset + 4 <= data.len() { |
There was a problem hiding this comment.
Might be good linking a place where the CU encoding is actually defined.
| 4 + init_len as usize | ||
| }; | ||
| if total == 0 || offset + total > data.len() { | ||
| break; |
There was a problem hiding this comment.
Shouldn't we rather report a warning as the emitted GDB index will be likely incomeplete/corrupted?
| } | ||
|
|
||
| /// Walk `.debug_info` bytes and return `(offset, total_length)` for each CU. | ||
| fn parse_cu_boundaries(data: &[u8]) -> Vec<CuBoundary> { |
There was a problem hiding this comment.
As I am more thinking about it - in general, can't we just the gimli crate for the parsing of the various DWARF-related data structures? Like here, using https://docs.rs/gimli/latest/gimli/read/struct.DebugInfo.html#method.units. I know there might be some performance penalty, but we should definitely measure it.
There was a problem hiding this comment.
From this, the only function that can currently be written using gimli is this one. It feels somewhat excessive to bring in gimli just for this purpose (since constructing indices requires only offset and length)... 🤔
| /// | ||
| /// Each set has a header pointing to a CU in `.debug_info`, followed by | ||
| /// (die_offset, attrs_byte, NUL-terminated name) entries. | ||
| fn parse_pubnames_sets(data: &[u8]) -> Vec<PubnamesSet<'_>> { |
There was a problem hiding this comment.
There was a problem hiding this comment.
gimli doesn't appear to support GNU attributes bytes, so I think we can't use them as-is.
https://github.com/gimli-rs/gimli/blob/45dde2309bda8f385acdf0c945ebc4f275ffe124/src/read/lookup.rs#L186-L199
|
|
||
| // Emit into the output buffer. | ||
| let total = cp_off as usize + cv_data.len() + str_data.len(); | ||
| let len = buf.len().min(total); |
There was a problem hiding this comment.
This should be a hard error if the allocated buffer does not match the expected written data.
|
@marxin (I haven't seen the review details yet)
I think you mean #1605. I used that as a reference when implementing this, since I had also reviewed it at the time and therefore already had some familiarity with the surrounding context and implementation details. However, I did not reuse it directly for licensing concerns (and in any case, it does not support version 9 as-is). I mainly used LLMs to help refine the implementation through review and iterative revisions. |
Yes - fine, that's a good approach you chose ;)
All good! Let's iterate on the PR, given you knowledge, I guess you can address the findings pretty fast. Happy to review and help you with the feature. |
This patch adds support for
--gdb-indexand--no-gdb-index. Although we support version 9 here, the test environment may produce a version 7 GDB index when using lld. This patch also adds a test that checks whether specific symbols are included in the symbol table of the.gdb_indexsection. Because of this requirement, the integration test includes logic to support versions earlier than version 9 as well.Closes #811