Skip to content

fix(translation): only extract chunks from the marked text, not its c… #f408e83#49

Merged
navidshad merged 1 commit into
devfrom
CU-86exw91re_Fix-chunks-confusion-on-translation-function_Navid-Shad
Jun 8, 2026
Merged

fix(translation): only extract chunks from the marked text, not its c… #f408e83#49
navidshad merged 1 commit into
devfrom
CU-86exw91re_Fix-chunks-confusion-on-translation-function_Navid-Shad

Conversation

@navidshad

@navidshad navidshad commented Jun 8, 2026

Copy link
Copy Markdown
Contributor

📋 Summary

This PR fixes the translation extraction process to only extract chunks from the explicitly marked text segments, excluding their surrounding context.

🔗 Related Tasks

#f408e83 - Fix translation extraction to exclude context outside marked text

📝 Additional Details

Previously, translation extraction included surrounding context which could lead to incorrect or extraneous translation chunks. This update ensures extraction is limited strictly to the marked text.

📜 Commit List

f408e83 fix(translation): only extract chunks from the marked text, not its context

…ontext

The detailed-translation and advice models intermittently surfaced
reusable patterns (chunks) drawn from the surrounding `context` rather
than the user's marked selection. e.g. selecting "improvements are
included" returned "related to" and "based on", which appear only in the
context paragraph.

Add a deterministic guardrail, `filterChunksToSelection`, that drops any
chunk whose text does not appear (case- and whitespace-insensitively)
inside the marked `phrase`, and apply it to both `getDetailedTranslation`
and `getTranslationAdvice`. Also harden both prompts to state explicitly
that the context is only for disambiguating meaning and must never be a
source of chunks.

https://claude.ai/code/session_01UTb6tc1wE8UyWxCKvBHmwT
@navidshad

Copy link
Copy Markdown
Contributor Author

@navidshad navidshad added the review Claude Routine will take this and review the PR label Jun 8, 2026
@navidshad navidshad removed the review Claude Routine will take this and review the PR label Jun 8, 2026 — with Claude

@navidshad navidshad left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated PR Review

Primary Task: CU-86exw91re — Fix chunks confusion on translation function

Task alignment

The task describes a concrete bug: selecting "improvements are included" was returning "related to" and "based on" as chunks — words pulled from the surrounding context, not from the marked phrase itself.

  • Prompt hardening (service.ts:82, service.ts:155) — both getDetailedTranslation and getTranslationAdvice system prompts now explicitly distinguish context (disambiguation only) from phrase (the only valid source of chunks).
  • Deterministic guardrail (chunk-filter.ts) — filterChunksToSelection enforces the contract post-model regardless of what the LLM returns. Applied in both call sites (service.ts:121, service.ts:189).
  • Tests (__tests__/chunk-filter.test.ts) — 6 cases covering: the exact real-world failing example from the task, sub-phrase matching, case/whitespace normalisation, empty/whitespace chunks, empty selection, and undefined/null input. The test file directly references the task's failing case, making the regression net self-documenting.
  • ✅ Scope is appropriate — no over-engineering, nothing missing.

One subtle correctness point worth calling out: the advice guardrail correctly uses if (result.chunks) { result.chunks = filterChunksToSelection(...) } rather than an unconditional assignment. In the advice flow, chunks: undefined means "don't touch the existing highlights", whereas chunks: [] means "remove all highlights" — preserving undefined is the right call here.

Commit messages

  • fix(translation): only extract chunks from the marked text, not its context ✅ — type fix matches the real impact (bug, not a new capability), scope is correct, description is clear.

Prior review follow-up

No prior reviews.

Convention check

  • ✅ Module placement follows the established pattern (server/src/modules/translation/chunk-filter.ts, __tests__/chunk-filter.test.ts).
  • ✅ No new DB collections; config.ts untouched.
  • ✅ Conventional Commits: fix: → patch bump, correctly applied.
  • ⚠️ PR body — the "Related Tasks" and "Commit List" links point to https://app.clickup.com/t/f408e83 (a git commit SHA), not the ClickUp task. The correct URL is https://app.clickup.com/t/86exw91re. Cosmetic only — the auto-linked comment is correct.

Verdict

APPROVE — Clean, well-tested fix that directly addresses the exact bug in the task. The dual approach (prompt hardening + deterministic post-processing guardrail) makes the fix resilient to future model drift.


Generated by Claude Code

@navidshad navidshad merged commit 8ded3db into dev Jun 8, 2026
6 checks passed
@navidshad navidshad changed the title fix(translation): only extract chunks from the marked text, not its c… fix(translation): only extract chunks from the marked text, not its c… #f408e83 Jun 8, 2026
@github-actions

Copy link
Copy Markdown

🎉 This PR is included in version 1.0.0-dev.1 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants