Fix zero-length group matches #40

Open
Avlaak wants to merge 1 commit into cesanta:master from Avlaak:fix/zero-length-group-matches
Avlaak commented Mar 26, 2026

Fix zero-length group matches

Summary

When a capturing group matched but captured zero characters (e.g. ([0-9]*) on
a non-digit input), the corresponding slre_cap entry was left uninitialized.
This change explicitly sets ptr = NULL and len = 0 for such captures.

Root cause

In bar(), the capture assignment only ran when n > 0:

if (info->caps != NULL && n > 0) {
    info->caps[bi - 1].ptr = s + j;
    info->caps[bi - 1].len = n;
}

When a group like ([0-9]*) legitimately matched zero characters (n == 0),
the caps slot was never written to. The caller received stale or
uninitialized data, making it impossible to distinguish "group participated
but captured nothing" from "group was never evaluated".

Fix

Added an else if branch for n == 0:

} else if (info->caps != NULL && n == 0) {
    info->caps[bi - 1].ptr = NULL;
    info->caps[bi - 1].len = 0;
}
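The two branches can be read together as one small routine. The sketch below is an illustration only, not the code in bar(): record_capture is a hypothetical helper, and struct cap is a local stand-in assumed to mirror the ptr and len fields of slre_cap. The parameter names bi, s, j and n follow the snippet above.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-in for SLRE's capture struct; only ptr and len are assumed. */
struct cap {
    const char *ptr;
    int len;
};

/* Hypothetical helper: record the result n of matching group bi (1-based)
 * that started at offset j of the subject string s. */
static void record_capture(struct cap *caps, int bi, const char *s, int j, int n) {
    assert(bi >= 1);
    if (caps == NULL) {
        return;                      /* caller did not ask for captures */
    }
    if (n > 0) {
        caps[bi - 1].ptr = s + j;    /* group captured n characters */
        caps[bi - 1].len = n;
    } else if (n == 0) {
        caps[bi - 1].ptr = NULL;     /* group participated, captured nothing */
        caps[bi - 1].len = 0;
    }
    /* n < 0: the match failed, so the slot is left untouched. */
}
```

With this shape, a slot that previously held stale data is always overwritten once its group participates in a match.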

This gives callers a clear contract:

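| n    | caps[i].ptr     | caps[i].len | Meaning                              |
|------|-----------------|-------------|--------------------------------------|
| > 0  | points to match | length      | Group captured content               |
| == 0 | NULL            | 0           | Group participated, captured nothing |
| < 0  |                 |             | Match failed (returned early)        |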
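On the caller side, the NULL/0 case needs a guard before the capture is copied, since passing a null pointer to memcpy is undefined behavior under C11 and earlier even when the length is zero. The following is a minimal sketch under that contract: cap_to_str is a hypothetical helper, and struct cap is a local stand-in assumed to have the same ptr and len fields as slre_cap.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Local stand-in for SLRE's capture struct; only ptr and len are assumed. */
struct cap {
    const char *ptr;
    int len;
};

/* Hypothetical helper: copy a capture into dst as a NUL-terminated string.
 * Returns the number of bytes copied (0 for an empty capture),
 * or -1 if dst is too small to hold the capture plus the terminator. */
static int cap_to_str(const struct cap *c, char *dst, size_t dst_size) {
    assert(c != NULL && dst != NULL);
    if (dst_size == 0) {
        return -1;
    }
    if (c->ptr == NULL || c->len <= 0) {
        dst[0] = '\0';               /* group participated, captured nothing */
        return 0;
    }
    if ((size_t)c->len >= dst_size) {
        return -1;
    }
    memcpy(dst, c->ptr, (size_t)c->len);
    dst[c->len] = '\0';
    return c->len;
}
```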

Tests

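Added regression tests for zero-length captures. In the first and last cases the quotes only delimit the input; in the second and third cases the double quotes are literal characters in both the pattern and the input, which is why the match on "" returns 2:

  • ([0-9]*) on "abc": match returns 0, ptr == NULL, len == 0
  • "([0-9]*)" on "": match returns 2, ptr == NULL, len == 0
  • "([0-9]*)" "([0-9]*)" on "" "1": first cap NULL/0, second cap "1"/1
  • ([0-9]*) ([0-9]*) on "1 abc": first cap "1"/1, second cap NULL/0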
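The expectations above compare each capture either to a string or to the NULL/0 pair. One way to express that check is sketched below. cap_is is a hypothetical helper, not part of SLRE or of this PR's test code, and struct cap is a local stand-in for slre_cap with the same two fields.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Local stand-in for SLRE's capture struct; only ptr and len are assumed. */
struct cap {
    const char *ptr;
    int len;
};

/* Hypothetical test helper: returns 1 if the capture equals expected.
 * Passing expected == NULL checks for the zero-length form (NULL, 0). */
static int cap_is(const struct cap *c, const char *expected) {
    size_t n;
    assert(c != NULL);
    if (expected == NULL) {
        return c->ptr == NULL && c->len == 0;
    }
    n = strlen(expected);
    return c->ptr != NULL && (size_t)c->len == n &&
           memcmp(c->ptr, expected, n) == 0;
}
```

For the last case in the list, a test could then assert cap_is(&caps[0], "1") and cap_is(&caps[1], NULL) after the match call.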
