Skip to content

fix: Transliterate retries with larger buffer on overflow (#159)#223

Open
imnasnainaec wants to merge 8 commits into
masterfrom
worktree-agent-a11b72db1418df08e
Open

fix: Transliterate retries with larger buffer on overflow (#159)#223
imnasnainaec wants to merge 8 commits into
masterfrom
worktree-agent-a11b72db1418df08e

Conversation

@imnasnainaec

@imnasnainaec imnasnainaec commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Transliterator.Transliterate now automatically retries with a doubled buffer when ICU reports BUFFER_OVERFLOW_ERROR, instead of immediately throwing.
  • Characters that expand dramatically during transliteration (e.g. U+FDFA) now work with the default multiplier.

Fixes #159 (including the memory leak observed in a comment)

This fix is technically a breaking change, since Transliterate no longer throws an error in situations it used to...

🤖 Generated with Claude Code


This fix is technically a breaking change, since Transliterate no longer throws an error in situations it used to...

Devin review: https://app.devin.ai/review/sillsdev/icu-dotnet/pull/223

…#159)

Characters that expand dramatically during transliteration (e.g. U+FDFA ﷺ)
previously caused an immediate OverflowException when the default multiplier
was too small. The method now retries once with a doubled buffer before throwing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions

github-actions Bot commented Jun 2, 2026

Copy link
Copy Markdown

Test Results

       8 files  ±  0     648 suites  ±0   9s ⏱️ -1s
   439 tests +  3     435 ✔️ +  3      4 💤 ±0  0 ±0 
3 600 runs  +24  3 390 ✔️ +24  210 💤 ±0  0 ±0 

Results for commit 23fa6dd. ± Comparison against base commit 293cca1.

♻️ This comment has been updated with latest results.

imnasnainaec and others added 5 commits June 2, 2026 14:00
On BUFFER_OVERFLOW_ERROR, utrans_transUChars updates textLength to the
actual number of UChars required, so use that directly rather than
doubling the current capacity.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Use Max(textLength, textCapacity * 2) on retry so the buffer is always
at least doubled even if ICU reports a partial output length.

Update Transliterate_Overflow test to assert success rather than
OverflowException, since the retry now handles the case correctly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Assert exact output value in Transliterate_Overflow (not just non-empty)
- Add test for multiple high-expansion chars with multiplier=1
- Add test for empty string input

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@imnasnainaec imnasnainaec self-assigned this Jun 2, 2026
@imnasnainaec imnasnainaec marked this pull request as ready for review June 3, 2026 16:26
@imnasnainaec imnasnainaec requested a review from ermshiperete June 8, 2026 18:30
Comment thread source/icu.net.tests/TransliteratorTests.cs Outdated
Comment thread source/icu.net.tests/TransliteratorTests.cs
Comment thread source/icu.net/Transliterator.cs Outdated
Comment thread source/icu.net/Transliterator.cs
@imnasnainaec imnasnainaec requested a review from ermshiperete June 9, 2026 17:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Overflow exception is thrown by Transliterate method for certain characters

2 participants