From 66bbf9a6b9fc5cc0b3a5baa505d68548a4370b97 Mon Sep 17 00:00:00 2001
From: Min Huang <70873102+min0625@users.noreply.github.com>
Date: Sun, 28 Jun 2026 05:45:53 +0000
Subject: [PATCH] fix: LC_ALL priority, empty lang filtering, and Windows arch detection

- Prioritize LC_ALL over LANG in getSystemLanguage (LC_ALL has higher precedence per POSIX)
- Filter empty strings from MINT_TARGET_LANG so trailing/double commas are gracefully ignored
- Fix install.ps1 arch detection to use PROCESSOR_ARCHITECTURE env var instead of RuntimeInformation.OSArchitecture
- Add tests covering LC_ALL override and empty-segment filtering
- Document security model (system/user separation + nonce) in READMEs and AGENTS.md

Co-Authored-By: Claude Sonnet 4.6
---
 AGENTS.md             |  4 +-
 README.ja.md          |  1 +
 README.md             |  1 +
 README.zh-TW.md       |  1 +
 cmd/mint/main.go      | 21 +++++++----
 cmd/mint/main_test.go |  3 ++
 script/install.ps1    | 87 ++++++++++++++++++++++++++++---------------
 script/install.sh     | 26 +++++++++++--
 8 files changed, 103 insertions(+), 41 deletions(-)

diff --git a/AGENTS.md b/AGENTS.md
index 8c3d2df..ef24d33 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -91,11 +91,11 @@ bin/mint # compiled binary (gitignored)
 - CLI framework: `github.com/spf13/cobra` — root command with `--target` / `-t` (target language), `--source` / `-s` (source language), and `--verbose` / `-v` (diagnostic output to stderr) flags.
 - Configuration: `github.com/spf13/viper` — reads env vars with `MINT_` prefix; no config files.
 - LLM backends called directly via raw `net/http` (no heavy SDKs); keeps binary minimal.
-- `Completer` interface in `internal/llm/` allows provider backends without breaking changes.
+- `Completer` interface in `internal/llm/` allows provider backends without breaking changes. `Complete(ctx, system, user, w)` keeps task instructions (`system`) separate from untrusted user input (`user`) so translated content cannot contaminate the instruction context. Each request also embeds a random nonce as a delimiter in the system prompt; the same nonce wraps the user text, preventing injected content from escaping its boundary even if it mimics delimiter syntax. Weaker models that echo the nonce lines back are filtered before the output reaches the caller.
 - Language detection: when no source language is given, the input language is inferred — by the LLM in rotation mode, or implicitly by the rewrite prompt for a single target.
 - Source language: optional `--source` / `-s` flag (BCP-47 tag); flag-only, no env var (a source is per-input, not a persistent preference). When set it skips detection and anchors the rewrite prompt to translate *from* that language, so cross-language homographs (e.g. French `chat` → English `cat`) and romanized input (e.g. `konnichiwa` → `hello`) are translated rather than treated as already-target text. Empty (the default) preserves the original auto-detect behavior. Pure language-neutral input still passes through unchanged regardless.
 - Language-neutral pass-through: if detected language is `neutral`, input is printed unchanged with no translation call.
 - Same-language behavior: if detected input language matches the target language, the tool performs grammar and spelling correction instead of translation.
-- Target language priority: `--target` flag → `MINT_TARGET_LANG` env var → system locale (`$LANG` / `$LC_ALL`) → `en`.
+- Target language priority: `--target` flag → `MINT_TARGET_LANG` env var → system locale (`$LC_ALL` / `$LANG`) → `en`.
 - Language rotation: `MINT_TARGET_LANG` accepts a comma-separated list (e.g., `en,zh-TW,ja`); when the detected input language matches a tag in the list, the tool translates to the next tag (wraps around). BCP-47 variants sharing the same primary subtag (e.g., `zh-HK` and `zh-TW`) are treated as equivalent.
 - Input from positional arg or stdin (auto-detected).
diff --git a/README.ja.md b/README.ja.md
index b236189..b73033d 100644
--- a/README.ja.md
+++ b/README.ja.md
@@ -30,6 +30,7 @@ cat document.txt | mint -t fr # ファイル全体を翻訳
 - **スマート修正** — 入力言語とターゲット言語が同じ場合は、翻訳ではなく文法やスペルの修正を行います。
 - **ストリーミング出力** — レスポンスをリアルタイムでストリーミングするため、長文翻訳でも待たされません。
 - **コンポーザブル** — stdin/stdoutを重視した設計により、`grep`、`sed`、`xargs`などのツールとシームレスに連携可能です。
+- **セキュア** — system/userメッセージの分離とリクエストごとのランダムnonceデリミタにより、信頼できない入力はモデルの指示から隔離されます。悪意あるコンテンツを翻訳してもLLMの動作をハイジャックすることはできません。
 
 ---
 
diff --git a/README.md b/README.md
index d2f3747..4ab0ee8 100644
--- a/README.md
+++ b/README.md
@@ -30,6 +30,7 @@ cat document.txt | mint -t fr # translate a whole file
 - **Smart correction** — Same-language input? Auto-corrects grammar & spelling instead of translating
 - **Streaming** — Output streams in real-time, no waiting for long translations
 - **Composable** — Pipe-friendly stdin/stdout; pairs seamlessly with `grep`, `sed`, `xargs`, and friends
+- **Secure** — Untrusted input is isolated from model instructions via system/user message separation and per-request random-nonce delimiters; translating adversarial content cannot hijack the LLM's behavior
 
 ---
 
diff --git a/README.zh-TW.md b/README.zh-TW.md
index f05cd9b..b93ba2d 100644
--- a/README.zh-TW.md
+++ b/README.zh-TW.md
@@ -30,6 +30,7 @@ cat document.txt | mint -t fr # 翻譯整個檔案
 - **智慧修正** — 輸入語言與目標語言相同?自動修正語法與拼字,而非翻譯
 - **串流輸出** — 即時串流回應,翻譯長文不需等待
 - **可組合** — 友善的 stdin/stdout 設計;與 `grep`、`sed`、`xargs` 等工具無縫搭配
+- **安全** — 透過 system/user 訊息分離與每次請求的隨機 nonce 分隔符,將不可信任的輸入與模型指令隔離;翻譯惡意內容時無法劫持 LLM 的行為
 
 ---
 
diff --git a/cmd/mint/main.go b/cmd/mint/main.go
index 2ff3a85..f1fb185 100644
--- a/cmd/mint/main.go
+++ b/cmd/mint/main.go
@@ -261,6 +261,9 @@ func resolveInput(args []string) (string, error) {
 	return text, nil
 }
 
+// normLang lowercases and trims whitespace from a language tag.
+func normLang(s string) string { return strings.ToLower(strings.TrimSpace(s)) }
+
 // resolveTargetLangs resolves target languages based on priority:
 // 1. Flag Lang Code (--target / -t) - single language only
 // 2. Config Lang Code (MINT_TARGET_LANG) - single or comma-separated languages
@@ -270,7 +273,7 @@ func resolveTargetLangs(flagLang, configLang string) []string {
 	// Priority 1: Flag Lang Code (single language only)
 	if flagLang != "" {
 		// Remove any whitespace and normalize
-		flagLang = strings.ToLower(strings.TrimSpace(flagLang))
+		flagLang = normLang(flagLang)
 		// Flag should not contain commas - use only the first part if present
 		if first, _, found := strings.Cut(flagLang, ","); found {
 			flagLang = first
@@ -280,12 +283,16 @@ func resolveTargetLangs(flagLang, configLang string) []string {
 	}
 
 	// Priority 2: Config Lang Code (supports multiple comma-separated languages)
-	if configLang != "" {
-		langs := strings.Split(configLang, ",")
-		for i, lang := range langs {
-			langs[i] = strings.ToLower(strings.TrimSpace(lang))
+	raw := strings.Split(configLang, ",")
+
+	langs := make([]string, 0, len(raw))
+	for _, lang := range raw {
+		if l := normLang(lang); l != "" {
+			langs = append(langs, l)
 		}
+	}
 
+	if len(langs) > 0 {
 		return langs
 	}
 
@@ -307,7 +314,7 @@ func resolveSourceLang(flagLang string) string {
 		return ""
 	}
 
-	flagLang = strings.ToLower(strings.TrimSpace(flagLang))
+	flagLang = normLang(flagLang)
 
 	// A source is a single language; ignore anything after the first comma.
 	if first, _, found := strings.Cut(flagLang, ","); found {
@@ -456,7 +463,7 @@ func buildDetectPrompt(text string) (system, user string) {
 
 // getSystemLanguage gets the system language from the OS locale.
 func getSystemLanguage() string {
-	for _, env := range []string{"LANG", "LC_ALL"} {
+	for _, env := range []string{"LC_ALL", "LANG"} {
 		if lang := os.Getenv(env); lang != "" {
 			// Strip encoding suffix before cutting on "_":
 			// "C.UTF-8" → "C"; "en_US.UTF-8" → "en_US" (no change here)
diff --git a/cmd/mint/main_test.go b/cmd/mint/main_test.go
index ffea94c..f8d7eea 100644
--- a/cmd/mint/main_test.go
+++ b/cmd/mint/main_test.go
@@ -102,6 +102,8 @@ func TestResolveTargetLangs(t *testing.T) {
 		{"config single lang", "", "fr", []string{"fr"}},
 		{"config multiple langs", "", "en,zh-TW,ja", []string{"en", langZhTw, "ja"}},
 		{"config langs trimmed and lowercased", "", " EN , ZH-TW ", []string{"en", langZhTw}},
+		{"config trailing comma ignored", "", "en,", []string{"en"}},
+		{"config double comma ignored", "", "en,,zh-TW", []string{"en", langZhTw}},
 	}
 	for _, tt := range tests {
 		t.Run(tt.name, func(t *testing.T) {
@@ -241,6 +243,7 @@ func TestGetSystemLanguage(t *testing.T) {
 		{"C.UTF-8 locale skipped, uses LC_ALL", "C.UTF-8", "fr_FR.UTF-8", "fr"},
 		{"POSIX locale skipped, uses LC_ALL", "POSIX", "de_DE.UTF-8", "de"},
 		{"LC_ALL used when LANG empty", "", "ja_JP.UTF-8", "ja"},
+		{"LC_ALL overrides LANG when both set", "en_US.UTF-8", "ja_JP.UTF-8", "ja"},
 		{"both empty returns empty string", "", "", ""},
 	}
 	for _, tt := range tests {
diff --git a/script/install.ps1 b/script/install.ps1
index 995f2de..aa8af30 100644
--- a/script/install.ps1
+++ b/script/install.ps1
@@ -19,6 +19,13 @@ param()
 Set-StrictMode -Version Latest
 $ErrorActionPreference = 'Stop'
 
+# Windows PowerShell 5.1 may default to TLS 1.0, which GitHub rejects; force TLS 1.2.
+[Net.ServicePointManager]::SecurityProtocol =
+    [Net.ServicePointManager]::SecurityProtocol -bor [Net.SecurityProtocolType]::Tls12
+
+# The progress bar throttles Invoke-WebRequest downloads 10-50x in PowerShell 5.1.
+$ProgressPreference = 'SilentlyContinue'
+
 $Repo = 'min0625/mint'
 $Binary = 'mint'
 $InstallDir = if ($env:MINT_INSTALL_DIR) { $env:MINT_INSTALL_DIR } else { Join-Path $HOME '.local\bin' }
@@ -28,18 +35,27 @@ function Write-Success { param($Msg) Write-Host "[ok] $Msg" -ForegroundColor Gre
 function Write-Warn { param($Msg) Write-Host "[!] $Msg" -ForegroundColor Yellow }
 
 function Get-Arch {
-    $arch = [System.Runtime.InteropServices.RuntimeInformation]::OSArchitecture
+    # PROCESSOR_ARCHITEW6432 is set by Windows when a 32-bit process runs on a 64-bit OS (WOW64);
+    # it reflects the native OS architecture. For 64-bit processes it is empty, so fall back to
+    # PROCESSOR_ARCHITECTURE which is always the process/OS architecture for 64-bit processes.
+    $arch = if ($env:PROCESSOR_ARCHITEW6432) { $env:PROCESSOR_ARCHITEW6432 } else { $env:PROCESSOR_ARCHITECTURE }
     switch ($arch) {
-        'X64' { return 'amd64' }
-        'Arm64' { return 'arm64' }
+        'AMD64' { return 'amd64' }
+        'ARM64' { return 'arm64' }
         default { throw "Unsupported architecture: $arch. Only x86_64 (amd64) and arm64 are supported on Windows." }
     }
 }
 
 function Get-LatestVersion {
-    $api = "https://api.github.com/repos/$Repo/releases/latest"
-    $response = Invoke-RestMethod -Uri $api -UseBasicParsing
-    return $response.tag_name
+    $api = "https://api.github.com/repos/$Repo/releases/latest"
+    try {
+        $response = Invoke-RestMethod -Uri $api -UseBasicParsing
+        return $response.tag_name
+    } catch {
+        # Covers both a failed request and a response whose JSON lacks tag_name
+        # (accessing a missing property throws under Set-StrictMode Latest).
+        throw "Could not determine latest version. Set MINT_VERSION manually."
+    }
 }
 
 function Get-RemoteFile {
@@ -52,18 +68,24 @@ function Test-PathContains {
     ($env:PATH -split ';') -contains $Dir
 }
 
-function Show-PathHint {
+function Add-ToUserPath {
     param([string]$Dir)
-    Write-Warn "$Dir is not in your PATH"
-    Write-Host ""
-    Write-Host " Add it permanently (current user):"
-    Write-Host ""
-    Write-Host " [Environment]::SetEnvironmentVariable('PATH', `"$Dir;`$env:PATH`", 'User')"
-    Write-Host ""
-    Write-Host " Or for the current session only:"
-    Write-Host ""
-    Write-Host " `$env:PATH = `"$Dir;`$env:PATH`""
-    Write-Host ""
+    # Read the persisted user PATH from the registry (not $env:PATH, which is the
+    # merged Machine+User+process value); prepend $Dir if it isn't already there.
+    $userPath = [Environment]::GetEnvironmentVariable('PATH', 'User')
+    $entries = if ($userPath) { $userPath -split ';' } else { @() }
+    if ($entries -notcontains $Dir) {
+        $newPath = if ($userPath) { "$Dir;$userPath" } else { $Dir }
+        # SetEnvironmentVariable at User scope persists the change and broadcasts
+        # WM_SETTINGCHANGE so newly launched processes pick it up.
+        [Environment]::SetEnvironmentVariable('PATH', $newPath, 'User')
+        Write-Success "Added $Dir to your user PATH"
+    }
+    # Make mint usable in the current session immediately, without a new terminal.
+    if (($env:PATH -split ';') -notcontains $Dir) {
+        $env:PATH = "$Dir;$env:PATH"
+    }
+    Write-Warn "Open a new terminal for the PATH change to apply everywhere."
 }
 
 # =============================================================================
@@ -75,7 +97,7 @@ Write-Host ""
 
 $arch = Get-Arch
 
-$version = if ($env:MINT_VERSION) { $env:MINT_VERSION } else { $null }
+$version = $env:MINT_VERSION
 if (-not $version) {
     Write-Info "Fetching latest version..."
     $version = Get-LatestVersion
@@ -100,10 +122,18 @@ try {
 
     # --- verify checksum -------------------------------------------------------
     Write-Info "Verifying checksum..."
-    $checksumPath = Join-Path $tmpDir 'SHA256SUMS'
+    $checksumPath = Join-Path $tmpDir 'SHA256SUMS'
+    $checksumAvailable = $false
     try {
         Get-RemoteFile -Url $checksumUrl -Dest $checksumPath
-        $line = Get-Content $checksumPath | Where-Object { $_ -match [regex]::Escape($archive) }
+        $checksumAvailable = $true
+    } catch {
+        Write-Warn "Could not download checksum file: $($_.Exception.Message)"
+    }
+
+    if ($checksumAvailable) {
+        $line = Get-Content $checksumPath |
+            Where-Object { $_ -match "^[0-9a-f]+\s+$([regex]::Escape($archive))\s*$" }
         if ($line) {
             $expected = ($line -split '\s+')[0].ToLower()
             $actual = (Get-FileHash -Path $archivePath -Algorithm SHA256).Hash.ToLower()
@@ -111,10 +141,9 @@ try {
                 throw "Checksum mismatch — download may be corrupted"
             }
             Write-Success "Checksum verified"
+        } else {
+            Write-Warn "No checksum entry found for $archive — skipping verification"
         }
-    } catch {
-        if ($_.Exception.Message -match 'Checksum mismatch') { throw }
-        Write-Warn "Could not verify checksum: $($_.Exception.Message)"
     }
 
     # --- extract ---------------------------------------------------------------
@@ -132,13 +161,13 @@ try {
 
     Write-Success "$Binary $version installed to $dstExe"
     Write-Host ""
-    if (Test-PathContains $InstallDir) {
-        Write-Host "Run " -NoNewline
-        Write-Host "mint --help" -ForegroundColor Cyan -NoNewline
-        Write-Host " to get started."
-    } else {
-        Show-PathHint $InstallDir
+    if (-not (Test-PathContains $InstallDir)) {
+        Add-ToUserPath $InstallDir
+        Write-Host ""
     }
+    Write-Host "Run " -NoNewline
+    Write-Host "mint --help" -ForegroundColor Cyan -NoNewline
+    Write-Host " to get started."
 } finally {
     Remove-Item -Recurse -Force $tmpDir -ErrorAction SilentlyContinue
 }
diff --git a/script/install.sh b/script/install.sh
index 72045e4..9d98acc 100644
--- a/script/install.sh
+++ b/script/install.sh
@@ -75,10 +75,21 @@ check_path() {
 # --- print PATH setup hint ---------------------------------------------------
 print_path_hint() {
   local shell_config
+  # $SHELL is the user's login shell — the right target for a persistent PATH
+  # hint (the script itself runs under bash via the curl|bash pipe, so $0 would
+  # be misleading).
   # shellcheck disable=SC2088 # intentional: ~ is displayed as a hint to the user, not expanded
   case "${SHELL}" in
     */zsh) shell_config='~/.zshrc' ;;
-    */bash) shell_config='~/.bashrc' ;;
+    # macOS terminals start bash as a login shell, which reads ~/.bash_profile;
+    # Linux interactive shells read ~/.bashrc.
+    */bash)
+      if [[ "$(uname -s)" == Darwin ]]; then
+        shell_config='~/.bash_profile'
+      else
+        shell_config='~/.bashrc'
+      fi
+      ;;
     */fish) shell_config='~/.config/fish/config.fish' ;;
     *) shell_config="your shell config file" ;;
   esac
@@ -125,7 +136,12 @@ main() {
   version="${MINT_VERSION:-}"
   if [[ -z "${version}" ]]; then
     info "Fetching latest version..."
-    version="$(fetch_latest_version)"
+    # A failed fetch (404 / API rate-limit) makes the command substitution
+    # exit non-zero; the `if` context suppresses set -e so we reach the
+    # friendly hint below instead of aborting with a raw curl error.
+    if ! version="$(fetch_latest_version)"; then
+      version=""
+    fi
     [[ -z "${version}" ]] && error "Could not determine latest version. Set MINT_VERSION manually."
   fi
 
@@ -153,7 +169,9 @@ main() {
     info "Verifying checksum..."
     if curl -fsSL "${checksum_url}" -o "${tmp_dir}/SHA256SUMS" 2> /dev/null; then
       local expected actual
-      expected="$(grep -F " ${archive}" "${tmp_dir}/SHA256SUMS" | awk '{print $1}')"
+      # Match the filename field exactly ($2) so a longer entry that merely
+      # contains ${archive} as a substring (e.g. an .sbom artifact) can't match.
+      expected="$(awk -v f="${archive}" '$2 == f {print $1}' "${tmp_dir}/SHA256SUMS")"
       if [[ -n "${expected}" ]]; then
         if [[ "${checksum_bin}" == "sha256sum" ]]; then
           actual="$(sha256sum "${tmp_dir}/${archive}" | awk '{print $1}')"
@@ -168,6 +186,8 @@ main() {
     else
       warn "Could not verify checksum: failed to download SHA256SUMS"
     fi
+  else
+    warn "No sha256sum or shasum found — skipping checksum verification"
   fi
 
   # --- extract ---------------------------------------------------------------
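
Reviewer note: the two behavioral changes in cmd/mint/main.go can be tried in isolation with the minimal sketch below. It is not the repository's code. `splitLangs` and `firstLocale` are hypothetical names introduced here for illustration, and `firstLocale` takes the environment as a map and shows only the lookup order; it omits the C/POSIX skipping and the suffix stripping that the real getSystemLanguage performs.

```go
package main

import (
	"fmt"
	"strings"
)

// normLang mirrors the helper added in the patch: lowercase and trim a tag.
func normLang(s string) string { return strings.ToLower(strings.TrimSpace(s)) }

// splitLangs is a hypothetical stand-in for the MINT_TARGET_LANG branch of
// resolveTargetLangs: split on commas and drop segments that normalize to "".
func splitLangs(configLang string) []string {
	raw := strings.Split(configLang, ",")
	langs := make([]string, 0, len(raw))
	for _, lang := range raw {
		if l := normLang(lang); l != "" {
			langs = append(langs, l)
		}
	}
	return langs
}

// firstLocale is a hypothetical stand-in for the lookup order only:
// LC_ALL is consulted before LANG. The environment is passed in as a map
// (an assumption of this sketch) so no real env vars are needed.
func firstLocale(env map[string]string) string {
	for _, name := range []string{"LC_ALL", "LANG"} {
		if v := env[name]; v != "" {
			return v
		}
	}
	return ""
}

func main() {
	// Trailing and doubled commas produce empty segments, which are dropped.
	fmt.Println(splitLangs("en,,zh-TW,"))
	// A value made only of separators yields no languages at all.
	fmt.Println(len(splitLangs(" , ")))
	// With both variables set, LC_ALL wins.
	fmt.Println(firstLocale(map[string]string{
		"LANG":   "en_US.UTF-8",
		"LC_ALL": "ja_JP.UTF-8",
	}))
}
```

When the filtered list comes back empty, the patched resolveTargetLangs falls through to the next priority level instead of returning a slice containing an empty tag, which is what the new `len(langs) > 0` guard is for.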