From 9708f5b894ead9fa08f45452676f69c03748871d Mon Sep 17 00:00:00 2001 From: Artur Shiriev Date: Fri, 26 Jun 2026 21:41:54 +0300 Subject: [PATCH 1/6] docs(decision): constants stay local, no central settings module Records the design choice (made while planning the fixer deepening) to keep constants beside their owning behavior rather than gathering them into a settings/constants module; revisit when config becomes runtime-loaded. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../2026-06-26-no-central-settings-module.md | 42 +++++++++++++++++++ 1 file changed, 42 insertions(+) create mode 100644 planning/decisions/2026-06-26-no-central-settings-module.md diff --git a/planning/decisions/2026-06-26-no-central-settings-module.md b/planning/decisions/2026-06-26-no-central-settings-module.md new file mode 100644 index 0000000..bbc6b97 --- /dev/null +++ b/planning/decisions/2026-06-26-no-central-settings-module.md @@ -0,0 +1,42 @@ +--- +status: accepted +summary: Keep constants local to the module that owns the behavior they parameterize; no central settings/constants module until config becomes runtime-loaded. +--- + +# Constants stay local; no central settings module + +**Decision:** Promote magic literals to named module-level constants in the +module whose behavior they parameterize. Do **not** introduce a central +`settings.py` / `constants.py`. + +## Context + +While deepening the fix behavior into `eof_fixer/fixer.py`, the question arose +of where defaults like `.cache` / `.uv-cache`, the `1024`-byte binary sample, +and the terminator bytes should live — co-located, or gathered into one +settings/constants module. + +`discovery.py` already keeps `GITIGNORE_NAME` and `ALWAYS_PRUNE` as local +module-level constants. The remaining literals partition cleanly by capability: +`.git` / `.gitignore` → file discovery; `.cache` / `.uv-cache` default skips → +the fix capability; the binary-sample size and terminator bytes → +eof-normalization. None of them are user-tunable: there is no env var, config +file, or `[tool.eof-fixer]` section — they are fixed constants, not settings. + +## Decision & rationale + +A central module would group these by *type* (they are all constants) rather +than by *responsibility*, which hurts locality: reading `_is_binary` would mean +bouncing to `constants.py` for `1024` and back. Keeping each constant beside the +logic it governs — e.g. `DEFAULT_EXCLUDES = (".cache", ".uv-cache")` in +`fixer.py` — names the value without scattering the behavior, and mirrors the +pattern `discovery.py` already uses. This is a deliberate locality choice, not +an oversight, so future architecture reviews should not re-suggest a settings +module on the grounds that "the constants are scattered." + +## Revisit trigger + +Reopen when configuration becomes **runtime-loaded** — a config file, +environment variables, or a `[tool.eof-fixer]` pyproject section. At that point a +dedicated config/settings module that parses and validates that input is the +right home, and this decision no longer applies. From 2e49ca93af4aac665e3497599cf3f9d1df193176 Mon Sep 17 00:00:00 2001 From: Artur Shiriev Date: Fri, 26 Jun 2026 21:50:54 +0300 Subject: [PATCH 2/6] docs(planning): spec + plan for deepening the fix into fixer.py Full-lane bundle 2026-06-26.02-deepen-fixer: extract fix_file + fix_directory into eof_fixer/fixer.py, make main() a thin adapter, and move the test surface off the argv/stdout/cwd harness. Three-task TDD plan with complete code. Co-Authored-By: Claude Opus 4.8 (1M context) --- .../2026-06-26.02-deepen-fixer/design.md | 102 ++++ .../2026-06-26.02-deepen-fixer/plan.md | 508 ++++++++++++++++++ 2 files changed, 610 insertions(+) create mode 100644 planning/changes/2026-06-26.02-deepen-fixer/design.md create mode 100644 planning/changes/2026-06-26.02-deepen-fixer/plan.md diff --git a/planning/changes/2026-06-26.02-deepen-fixer/design.md b/planning/changes/2026-06-26.02-deepen-fixer/design.md new file mode 100644 index 0000000..6b0ea83 --- /dev/null +++ b/planning/changes/2026-06-26.02-deepen-fixer/design.md @@ -0,0 +1,102 @@ +--- +summary: Deepen the fix behavior into eof_fixer/fixer.py (fix_file + fix_directory) so main() becomes a thin CLI adapter and the test surface moves off the process/global-state harness. +--- + +# Design: Deepen the fix into `fixer.py`; `main()` becomes an adapter + +## Summary + +Today `main()` fuses three jobs — CLI parsing, the fix-loop, and `stdout`/exit +rendering — so the fix behavior has no interface of its own. Every one of the 16 +tests reaches it through the process: swapping `sys.argv`, `sys.stdout`, and +`os.chdir` via a `_run_main_in` harness, even to assert pure byte-level facts +like "`\r\n\r\n` truncates to `\r\n`." This change extracts the EOF-fixing +capability into a deep module `eof_fixer/fixer.py` exposing two honest +interfaces, leaving `main()` a thin adapter. The interface becomes the test +surface. + +## Goal + +Move the test surface off the global-state harness onto a direct interface, +improving testability and locality. No change to *what* the tool does (which +files it fixes, the `Fixing` lines, the exit codes) — only *where the behavior +lives* and one non-semantic output-timing detail. + +## Design + +### New module `eof_fixer/fixer.py` + +Owns the eof-normalization capability and the directory orchestration that +drives it: + +```python +DEFAULT_EXCLUDES = (".cache", ".uv-cache") # local named constant +_BINARY_SAMPLE_SIZE = 1024 # local named constant + +_is_binary(file_obj) -> bool # private internal (uses _BINARY_SAMPLE_SIZE) +_detect_trailing(file_obj) -> tuple[str, int] # private internal (unchanged) +fix_file(file_obj, *, check) -> bool # public: fix one open file; True if (would be) fixed +fix_directory(root, *, check=False, extra_excludes=()) -> list[pathlib.Path] +``` + +`fix_directory` builds `[*DEFAULT_EXCLUDES, *extra_excludes]`, calls +`discovery.iter_text_files`, opens each yielded file (`rb` under `check`, else +`rb+`), runs `fix_file`, and returns the relative paths it fixed (or, under +`check`, would fix) in walk order. The file-level helpers `_is_binary` / +`_detect_trailing` / the old `_fix_file` move here from `main.py`; `fix_file` is +the de-underscored, `bool`-returning form of `_fix_file`. + +### `main.py` becomes a thin adapter + +```python +parse args (path, --check, --exclude) +if not path.is_dir(): parser.error(...) # CLI-shaped, exit 2 — stays here +fixed = fix_directory(path, check=args.check, extra_excludes=args.exclude or []) +for p in fixed: sys.stdout.write(f"Fixing {p}\n") +return 1 if fixed else 0 +``` + +`main()` no longer imports `os`/`IO`; it holds only argparse, the `is_dir` +guard, the render loop, and the exit code. + +## Key decisions (from the design conversation) + +- **Return shape:** `list[pathlib.Path]`, walk order — both the `Fixing` lines + and the exit code derive from it. No `FixReport` wrapper (YAGNI); promote + later if a counts consumer appears. +- **Silent module, batch output:** `fix_directory` touches no `stdout`; `main()` + renders after the walk. Output shifts streaming → batch — same lines, order, + and exit code; only timing differs. Acceptable for a fast CI/pre-commit gate. +- **Default-skip policy lives in `fixer.py`** (`DEFAULT_EXCLUDES`), so tests get + production defaults for free and `main()` carries no behavior policy. + `extra_excludes` means "user additions"; `.git` stays inside `discovery`. +- **Constants stay local**, no central settings module — see + `planning/decisions/2026-06-26-no-central-settings-module.md`. +- **Absorbs the open-mode "leak"** (review Candidate 3): the open and the fix + loop now live in one module, so no rule crosses a `main` ↔ `_fix_file` seam. +- The stringly-typed `_detect_trailing` seam (review Candidate 2) is **left + as-is** — out of scope for this change. + +## Testing — three layers + +- **Content** → `fix_file(BytesIO(...), check=...)`: CRLF/CR/BOM/null-byte/ + perfect-unchanged, no filesystem. +- **Orchestration** → `fix_directory(tmp_path, ...)`: gitignore respected, + binary skipped, symlink, `--exclude`, check-vs-fix, cwd-independence — assert + on the returned list and on-disk bytes. +- **Adapter** → keep ~2 through `main()`: path-rejection (exit 2) and one test + that `main()` renders `Fixing` lines + the exit code from the list. + `_run_main_in` shrinks to serve only these. + +## Architecture promotion + +In the implementing PR: `architecture/eof-normalization.md` (capability now +behind `fixer.fix_file` / `fix_directory`) and `architecture/cli.md` (`main()` +is an adapter over `fix_directory`). + +## Risk + +- **Output timing change (streaming → batch):** non-semantic; noted in the + release notes when next shipped. +- **Test migration churn:** large but mechanical; the full suite must stay green + at each task. Coverage stays 100%. diff --git a/planning/changes/2026-06-26.02-deepen-fixer/plan.md b/planning/changes/2026-06-26.02-deepen-fixer/plan.md new file mode 100644 index 0000000..cfe8d9d --- /dev/null +++ b/planning/changes/2026-06-26.02-deepen-fixer/plan.md @@ -0,0 +1,508 @@ +# Deepen the Fix into `fixer.py` — Implementation Plan + +> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. + +**Goal:** Extract the EOF-fixing capability into a deep module `eof_fixer/fixer.py` (`fix_file` + `fix_directory`) so `main()` becomes a thin CLI adapter and tests target the interface instead of the `sys.argv`/`sys.stdout`/`os.chdir` harness. + +**Architecture:** `fixer.py` owns the file-level helpers (`_is_binary`, `_detect_trailing`), a public `fix_file(file_obj, *, check) -> bool`, and `fix_directory(root, *, check=False, extra_excludes=()) -> list[pathlib.Path]` that drives `discovery.iter_text_files`. `main.py` shrinks to argparse + `is_dir` guard + render + exit. No behavior change except output timing (streaming → batch). + +**Tech Stack:** Python 3.10+, `pathspec` (via `discovery`), pytest + pytest-cov (100% enforced). + +## Global Constraints + +- Python `>=3.10,<4`; no 3.11+-only syntax. +- Runtime dependencies: `pathspec` only — add none. +- All imports at module level, never inside function bodies (including test files). +- Type-checker suppressions use `ty: ignore`, never `# type: ignore`. +- No `print()` in source — output via `sys.stdout.write(...)` with explicit `\n`. +- Annotate test function arguments (e.g. `tmp_path: pathlib.Path`). +- ruff line length 120. +- 100% test coverage is enforced by `just test`; every branch of new code must be covered. +- Conventional-commit messages ending with the trailer: + `Co-Authored-By: Claude Opus 4.8 (1M context) ` +- Full-lane planning change: the implementing work hand-edits the matching `architecture/*.md` (Task 3). + +--- + +### Task 1: Create `fixer.py` and rewire `main()` to an adapter + +**Files:** +- Create: `eof_fixer/fixer.py` +- Create: `tests/test_fixer.py` +- Modify: `eof_fixer/main.py` (remove `_is_binary`/`_detect_trailing`/`_fix_file`; `main()` calls `fix_directory`) + +**Interfaces:** +- Consumes: `eof_fixer.discovery.iter_text_files(root, extra_excludes)`. +- Produces: + - `fix_file(file_obj: IO[bytes], *, check: bool) -> bool` — True if the file was (or, under `check`, would be) fixed. + - `fix_directory(root: pathlib.Path, *, check: bool = False, extra_excludes: Sequence[str] = ()) -> list[pathlib.Path]` — relative paths fixed/would-be-fixed, in walk order. + - `DEFAULT_EXCLUDES = (".cache", ".uv-cache")`. + +- [ ] **Step 1: Write the failing tests** (`tests/test_fixer.py`) + +```python +import io +import pathlib +import stat + +import pytest + +from eof_fixer.fixer import fix_directory, fix_file + + +# ---- content layer: fix_file on BytesIO (no filesystem) ---- + +def _run(content: bytes, *, check: bool = False) -> tuple[bool, bytes]: + buffer = io.BytesIO(content) + changed = fix_file(buffer, check=check) + return changed, buffer.getvalue() + + +def test_no_trailing_newline_appends_lf() -> None: + assert _run(b"abc") == (True, b"abc\n") + + +def test_single_newline_unchanged() -> None: + assert _run(b"abc\n") == (False, b"abc\n") + + +def test_multiple_newlines_truncated_to_one() -> None: + assert _run(b"abc\n\n\n") == (True, b"abc\n") + + +def test_newlines_only_truncated_to_empty() -> None: + assert _run(b"\n\n") == (True, b"") + + +def test_empty_unchanged() -> None: + assert _run(b"") == (False, b"") + + +def test_crlf_no_trailing_appends_lf() -> None: + assert _run(b"a\r\nb") == (True, b"a\r\nb\n") + + +def test_crlf_perfect_unchanged() -> None: + assert _run(b"a\r\nb\r\n") == (False, b"a\r\nb\r\n") + + +def test_crlf_multiple_truncated_to_one() -> None: + assert _run(b"a\r\nb\r\n\r\n") == (True, b"a\r\nb\r\n") + + +def test_cr_only_multiple_truncated_to_one() -> None: + assert _run(b"a\rb\r\r") == (True, b"a\rb\r") + + +def test_bom_prefixed_is_treated_as_text() -> None: + assert _run(b"\xef\xbb\xbfabc") == (True, b"\xef\xbb\xbfabc\n") + + +def test_null_byte_in_first_1024_is_binary_skipped() -> None: + assert _run(b"a\x00b") == (False, b"a\x00b") + + +def test_null_byte_beyond_first_1024_is_treated_as_text() -> None: + content = b"a" * 1100 + b"\x00data" + changed, result = _run(content) + assert changed is True + assert result == content + b"\n" + + +def test_check_mode_returns_true_but_does_not_write() -> None: + assert _run(b"abc", check=True) == (True, b"abc") + + +# ---- orchestration layer: fix_directory on a temp tree ---- + +def _names(paths: list[pathlib.Path]) -> set[str]: + return {str(p) for p in paths} + + +def test_fixes_files_and_returns_list(tmp_path: pathlib.Path) -> None: + (tmp_path / "a.txt").write_bytes(b"no nl") + (tmp_path / "b.txt").write_bytes(b"ok\n") + fixed = fix_directory(tmp_path) + assert _names(fixed) == {"a.txt"} + assert (tmp_path / "a.txt").read_bytes() == b"no nl\n" + assert (tmp_path / "b.txt").read_bytes() == b"ok\n" + + +def test_check_mode_writes_nothing(tmp_path: pathlib.Path) -> None: + (tmp_path / "a.txt").write_bytes(b"no nl") + fixed = fix_directory(tmp_path, check=True) + assert _names(fixed) == {"a.txt"} + assert (tmp_path / "a.txt").read_bytes() == b"no nl" + + +def test_respects_nested_gitignore(tmp_path: pathlib.Path) -> None: + sub = tmp_path / "sub" + sub.mkdir() + (sub / ".gitignore").write_bytes(b"*.log\n") + (sub / "x.log").write_bytes(b"no nl") + (sub / "x.txt").write_bytes(b"no nl") + fixed = _names(fix_directory(tmp_path)) + assert "sub/x.txt" in fixed + assert "sub/x.log" not in fixed + + +def test_skips_binary(tmp_path: pathlib.Path) -> None: + (tmp_path / "bin").write_bytes(b"a\x00b") + assert fix_directory(tmp_path) == [] + assert (tmp_path / "bin").read_bytes() == b"a\x00b" + + +def test_default_excludes_cache_dirs(tmp_path: pathlib.Path) -> None: + cache = tmp_path / ".cache" + cache.mkdir() + (cache / "blob").write_bytes(b"no nl") + (tmp_path / "keep.txt").write_bytes(b"no nl") + fixed = _names(fix_directory(tmp_path)) + assert "keep.txt" in fixed + assert not any(name.startswith(".cache") for name in fixed) + + +def test_extra_excludes_augment_defaults(tmp_path: pathlib.Path) -> None: + vendor = tmp_path / "vendor" + vendor.mkdir() + (vendor / "lib.txt").write_bytes(b"no nl") + (tmp_path / "keep.txt").write_bytes(b"no nl") + fixed = _names(fix_directory(tmp_path, extra_excludes=["vendor"])) + assert "keep.txt" in fixed + assert not any(name.startswith("vendor") for name in fixed) + + +def test_symlink_does_not_crash(tmp_path: pathlib.Path) -> None: + target = tmp_path / "t.txt" + target.write_bytes(b"no nl") + link = tmp_path / "l.txt" + try: + link.symlink_to(target) + except (OSError, NotImplementedError): # pragma: no cover + pytest.skip("symlinks not available on this platform") + fixed = _names(fix_directory(tmp_path)) + assert "t.txt" in fixed + assert "l.txt" not in fixed + + +def test_runs_with_absolute_path_independent_of_cwd(tmp_path: pathlib.Path) -> None: + (tmp_path / "a.txt").write_bytes(b"no nl") + assert _names(fix_directory(tmp_path.resolve())) == {"a.txt"} + + +def test_check_mode_opens_readonly_file_without_error(tmp_path: pathlib.Path) -> None: + readonly = tmp_path / "ro.txt" + readonly.write_bytes(b"no nl") + readonly.chmod(stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH) + try: + fixed = _names(fix_directory(tmp_path, check=True)) + assert "ro.txt" in fixed + assert readonly.read_bytes() == b"no nl" + finally: + readonly.chmod(stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP | stat.S_IROTH) +``` + +- [ ] **Step 2: Run the new tests to verify they fail** + +Run: `just test tests/test_fixer.py` +Expected: FAIL — `ModuleNotFoundError: No module named 'eof_fixer.fixer'`. + +- [ ] **Step 3: Create `eof_fixer/fixer.py`** + +```python +import os +import pathlib +from collections.abc import Iterator, Sequence +from typing import IO + +from eof_fixer.discovery import iter_text_files + +DEFAULT_EXCLUDES = (".cache", ".uv-cache") +_BINARY_SAMPLE_SIZE = 1024 + + +def _is_binary(file_obj: IO[bytes]) -> bool: + current_pos = file_obj.tell() + file_obj.seek(0) + sample = file_obj.read(_BINARY_SAMPLE_SIZE) + file_obj.seek(current_pos) + return b"\x00" in sample + + +def _detect_trailing(file_obj: IO[bytes]) -> tuple[str, int]: + """Inspect the end of `file_obj` and return the action needed. + + Returns one of: + - ("none", 0) — file is empty, or already ends with exactly one terminator. + - ("append_lf", 0) — file lacks a trailing newline; an LF should be appended. + - ("truncate", offset) — file has excess trailing newlines; truncate to `offset` + (0 means truncate to empty). + """ + try: + file_obj.seek(-1, os.SEEK_END) + except OSError: + return ("none", 0) + + last_character = file_obj.read(1) + if last_character not in {b"\n", b"\r"}: + return ("append_lf", 0) + + while last_character in {b"\n", b"\r"}: + if file_obj.tell() == 1: + # All bytes are line terminators — truncate to empty. + return ("truncate", 0) + file_obj.seek(-2, os.SEEK_CUR) + last_character = file_obj.read(1) + + position = file_obj.tell() + remaining = file_obj.read() + for sequence in (b"\n", b"\r\n", b"\r"): + if remaining == sequence: + return ("none", 0) + if remaining.startswith(sequence): + return ("truncate", position + len(sequence)) + + return ("none", 0) # pragma: no cover + + +def fix_file(file_obj: IO[bytes], *, check: bool) -> bool: + """Normalize one open binary file to end with exactly one terminator. + + Returns True if the file was fixed (or, under `check`, would be fixed). + Binary files (null byte in the first sample) are skipped, returning False. + """ + if _is_binary(file_obj): + return False + + action, offset = _detect_trailing(file_obj) + if action == "none": + return False + + if not check: + if action == "append_lf": + # Needs this seek for windows, otherwise IOError + file_obj.seek(0, os.SEEK_END) + file_obj.write(b"\n") + else: # action == "truncate" + file_obj.seek(offset) + file_obj.truncate() + + return True + + +def fix_directory( + root: pathlib.Path, + *, + check: bool = False, + extra_excludes: Sequence[str] = (), +) -> list[pathlib.Path]: + """Fix every non-ignored text file under `root`; return the relative paths fixed. + + `.git` and `DEFAULT_EXCLUDES` (.cache/.uv-cache) are always skipped; + `extra_excludes` adds more skip names on top. Under `check`, nothing is + written but the would-be-fixed paths are still returned. + """ + excludes = [*DEFAULT_EXCLUDES, *extra_excludes] + open_mode = "rb" if check else "rb+" + fixed: list[pathlib.Path] = [] + paths: Iterator[pathlib.Path] = iter_text_files(root, excludes) + for relative_path in paths: + with (root / relative_path).open(open_mode) as file_obj: + if fix_file(file_obj, check=check): + fixed.append(relative_path) + return fixed +``` + +- [ ] **Step 4: Run the new tests to verify they pass** + +Run: `just test tests/test_fixer.py` +Expected: PASS — all content + orchestration tests green. + +- [ ] **Step 5: Rewire `eof_fixer/main.py` to a thin adapter** + +Replace the entire file with: + +```python +import argparse +import pathlib +import sys + +from eof_fixer.fixer import fix_directory + + +def main() -> int: + parser = argparse.ArgumentParser() + parser.add_argument("path", help="path to directory", type=pathlib.Path) + parser.add_argument("--check", action="store_true") + parser.add_argument( + "--exclude", + action="append", + default=None, + metavar="DIR", + help="extra file/directory name to skip, in addition to .git, .cache, .uv-cache (repeatable)", + ) + args = parser.parse_args() + + path: pathlib.Path = args.path + if not path.is_dir(): + parser.error(f"path is not a directory: {path}") + + fixed = fix_directory(path, check=args.check, extra_excludes=args.exclude or []) + for relative_path in fixed: + sys.stdout.write(f"Fixing {relative_path}\n") + return 1 if fixed else 0 +``` + +(`_is_binary`, `_detect_trailing`, `_fix_file`, and the `os`/`IO` imports are gone — they live in `fixer.py` now.) + +- [ ] **Step 6: Run the full suite** + +Run: `just test` +Expected: PASS — `tests/test_fixer.py` plus the still-present `tests/test_end_of_file_fixer.py` (which drives `main()`); the `Fixing` lines are now batch-rendered but the substring/order assertions still hold. Coverage 100%. + +- [ ] **Step 7: Lint** + +Run: `just lint` (format as you go), then confirm a clean `just lint-ci` against the committed tree. + +- [ ] **Step 8: Commit** + +```bash +git add eof_fixer/fixer.py eof_fixer/main.py tests/test_fixer.py +git commit -m "refactor: deepen fix into fixer.py; main() becomes an adapter + +Co-Authored-By: Claude Opus 4.8 (1M context) " +``` + +--- + +### Task 2: Migrate `test_end_of_file_fixer.py` to adapter-only tests + +**Files:** +- Modify: `tests/test_end_of_file_fixer.py` + +**Interfaces:** Test-only. After Task 1, the behaviors in this file are now covered at the `fixer` layer by `tests/test_fixer.py`; this task removes the redundant process-harness tests, keeping only what genuinely exercises the `main()` adapter. + +- [ ] **Step 1: Delete the now-redundant process-harness tests** + +Remove these functions from `tests/test_end_of_file_fixer.py` (each is covered by `tests/test_fixer.py` as noted): +- `test_end_of_file_fixer_command_with_check_false` → `test_fixes_files_and_returns_list` +- `test_end_of_file_fixer_command_with_check_true` → `test_check_mode_writes_nothing` +- `test_end_of_file_fixer_with_gitignore` → `test_respects_nested_gitignore` +- `test_end_of_file_fixer_skips_binary_files` → `test_skips_binary` +- `test_crlf_no_trailing_newline_appends_lf` → `test_crlf_no_trailing_appends_lf` +- `test_crlf_perfect_unchanged` → `test_crlf_perfect_unchanged` +- `test_crlf_multiple_trailing_truncated_to_one` → `test_crlf_multiple_truncated_to_one` +- `test_cr_only_multiple_trailing_truncated_to_one` → `test_cr_only_multiple_truncated_to_one` +- `test_bom_prefixed_file_is_treated_as_text` → `test_bom_prefixed_is_treated_as_text` +- `test_null_byte_beyond_first_1024_bytes_is_treated_as_text` → `test_null_byte_beyond_first_1024_is_treated_as_text` +- `test_symlink_in_tree_does_not_crash` → `test_symlink_does_not_crash` +- `test_readonly_file_in_check_mode_does_not_raise` → `test_check_mode_opens_readonly_file_without_error` +- `test_exclude_flag_augments_default_skips` → `test_extra_excludes_augment_defaults` +- `test_runs_from_unrelated_cwd_with_absolute_path` → `test_runs_with_absolute_path_independent_of_cwd` + +Keep `test_path_arg_rejects_file` and `test_path_arg_rejects_nonexistent_path` (they exercise the argparse `is_dir` guard → exit 2, which only `main()` does). + +- [ ] **Step 2: Add two adapter tests that exercise `main()`'s rendering** + +Keep the existing `_run_main_in` helper (the kept + new tests use it). Add: + +```python +def test_main_renders_fixing_lines_and_exit_code() -> None: + with tempfile.TemporaryDirectory() as temp_dir: + temp_path = Path(temp_dir) + (temp_path / "a.txt").write_bytes(b"no nl") + skipped = temp_path / "vendor" + skipped.mkdir() + (skipped / "lib.txt").write_bytes(b"no nl") + + with _run_main_in(temp_path, ["eof-fixer", ".", "--exclude", "vendor"]) as (stdout, _stderr): + result = main() + + output = stdout.getvalue() + assert result == 1 + assert "Fixing a.txt\n" in output + assert "Fixing vendor/lib.txt\n" not in output + + +def test_main_returns_zero_when_nothing_to_fix() -> None: + with tempfile.TemporaryDirectory() as temp_dir: + temp_path = Path(temp_dir) + (temp_path / "ok.txt").write_bytes(b"already fine\n") + + with _run_main_in(temp_path, ["eof-fixer", "."]) as (stdout, _stderr): + result = main() + + assert result == 0 + assert stdout.getvalue() == "" +``` + +- [ ] **Step 3: Prune now-unused imports** + +After deletion, remove any imports left unused in `tests/test_end_of_file_fixer.py` (e.g. `shutil`, `stat`, `StringIO` if no longer referenced). Let `just lint` (ruff `F401`) tell you which; remove exactly those. + +- [ ] **Step 4: Run the full suite + coverage** + +Run: `just test` +Expected: PASS, coverage 100%. The adapter tests cover `main()`'s render loop, exit-0 path, `--exclude` wiring, and both `is_dir` rejections; `fixer.py` is covered by `tests/test_fixer.py`. If coverage flags an uncovered line in `main.py`, add the minimal adapter test that reaches it (do not add `# pragma: no cover` to adapter logic). + +- [ ] **Step 5: Lint** + +Run: `just lint`, then confirm `just lint-ci` clean against the committed tree. + +- [ ] **Step 6: Commit** + +```bash +git add tests/test_end_of_file_fixer.py +git commit -m "test: move fix behavior off the process harness to the fixer interface + +Co-Authored-By: Claude Opus 4.8 (1M context) " +``` + +--- + +### Task 3: Promote to architecture + finalize the bundle summary + +**Files:** +- Modify: `architecture/eof-normalization.md` +- Modify: `architecture/cli.md` +- Modify: `planning/changes/2026-06-26.02-deepen-fixer/design.md` (finalize `summary`) + +**Interfaces:** Documentation only. + +- [ ] **Step 1: Update `architecture/eof-normalization.md`** + +Read the file first. Update it so the capability is described as living in `eof_fixer/fixer.py`: the public `fix_file(file_obj, *, check) -> bool` (one open file) and `fix_directory(root, *, check, extra_excludes) -> list[Path]` (drives `discovery.iter_text_files`, owns the `DEFAULT_EXCLUDES` = `.cache`/`.uv-cache` policy, opens `rb`/`rb+` by `check`, returns the fixed relative paths). Note `main()` no longer holds the file-level logic. Preserve the existing action-model / terminator prose. + +- [ ] **Step 2: Update `architecture/cli.md`** + +Read the file first. Update it so `main()` is described as a thin adapter over `fix_directory`: it parses args, guards `is_dir` (exit 2), then renders one `Fixing ` line per returned path and exits `1`/`0`. Note the output is now batch-rendered after the walk (not streamed). The `--exclude` and exit-code contract are unchanged. + +- [ ] **Step 3: Finalize the bundle summary** + +In `planning/changes/2026-06-26.02-deepen-fixer/design.md`, change the frontmatter `summary:` to the realized result, e.g.: +`summary: Fix behavior now lives behind eof_fixer/fixer.py (fix_file + fix_directory); main() is a thin adapter and tests target the interface instead of the argv/stdout/cwd harness.` + +- [ ] **Step 4: Validate + lint** + +Run: `just check-planning` → expect `planning: OK`. +Run: `just lint-ci` → expect clean. + +- [ ] **Step 5: Commit** + +```bash +git add architecture/eof-normalization.md architecture/cli.md \ + planning/changes/2026-06-26.02-deepen-fixer/design.md +git commit -m "docs: promote the fixer deepening to architecture + +Co-Authored-By: Claude Opus 4.8 (1M context) " +``` + +--- + +## Done criteria + +- `eof_fixer/fixer.py` exposes `fix_file` + `fix_directory`; `main.py` is a thin adapter with no file-level logic. +- Behavior unchanged except streaming → batch output; existing exit codes, `Fixing` lines, gitignore/`--exclude`/binary/symlink behavior preserved. +- Tests target the `fixer` interface (content via `BytesIO`, orchestration via `fix_directory`); only path-rejection + two render tests go through `main()`. `_run_main_in` serves just those. +- `just test` green, coverage 100%; `just lint-ci` clean; `just check-planning` → `planning: OK`. +- `architecture/eof-normalization.md` + `architecture/cli.md` describe the new structure; bundle summary finalized. From 92b79f4e1dd52948dcfb3ad9eb28be81b341ea92 Mon Sep 17 00:00:00 2001 From: Artur Shiriev Date: Fri, 26 Jun 2026 21:57:17 +0300 Subject: [PATCH 3/6] refactor: deepen fix into fixer.py; main() becomes an adapter One deviation from the brief: added an empty-read guard in _detect_trailing (after read(1) returns b"") because BytesIO silently clamps seek(-1, SEEK_END) to position 0 on an empty buffer rather than raising OSError as real files do. Added # noqa: PLR0911 for the extra return statement this introduces. Co-Authored-By: Claude Opus 4.8 (1M context) --- eof_fixer/fixer.py | 104 ++++++++++++++++++++++++++++ eof_fixer/main.py | 86 ++--------------------- tests/test_fixer.py | 161 ++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 270 insertions(+), 81 deletions(-) create mode 100644 eof_fixer/fixer.py create mode 100644 tests/test_fixer.py diff --git a/eof_fixer/fixer.py b/eof_fixer/fixer.py new file mode 100644 index 0000000..d509250 --- /dev/null +++ b/eof_fixer/fixer.py @@ -0,0 +1,104 @@ +import os +import pathlib +from collections.abc import Iterator, Sequence +from typing import IO + +from eof_fixer.discovery import iter_text_files + + +DEFAULT_EXCLUDES = (".cache", ".uv-cache") +_BINARY_SAMPLE_SIZE = 1024 + + +def _is_binary(file_obj: IO[bytes]) -> bool: + current_pos = file_obj.tell() + file_obj.seek(0) + sample = file_obj.read(_BINARY_SAMPLE_SIZE) + file_obj.seek(current_pos) + return b"\x00" in sample + + +def _detect_trailing(file_obj: IO[bytes]) -> tuple[str, int]: # noqa: PLR0911 + """Inspect the end of `file_obj` and return the action needed. + + Returns one of: + - ("none", 0) — file is empty, or already ends with exactly one terminator. + - ("append_lf", 0) — file lacks a trailing newline; an LF should be appended. + - ("truncate", offset) — file has excess trailing newlines; truncate to `offset` + (0 means truncate to empty). + """ + try: + file_obj.seek(-1, os.SEEK_END) + except OSError: + return ("none", 0) + + last_character = file_obj.read(1) + if not last_character: + return ("none", 0) + if last_character not in {b"\n", b"\r"}: + return ("append_lf", 0) + + while last_character in {b"\n", b"\r"}: + if file_obj.tell() == 1: + # All bytes are line terminators — truncate to empty. + return ("truncate", 0) + file_obj.seek(-2, os.SEEK_CUR) + last_character = file_obj.read(1) + + position = file_obj.tell() + remaining = file_obj.read() + for sequence in (b"\n", b"\r\n", b"\r"): + if remaining == sequence: + return ("none", 0) + if remaining.startswith(sequence): + return ("truncate", position + len(sequence)) + + return ("none", 0) # pragma: no cover + + +def fix_file(file_obj: IO[bytes], *, check: bool) -> bool: + """Normalize one open binary file to end with exactly one terminator. + + Returns True if the file was fixed (or, under `check`, would be fixed). + Binary files (null byte in the first sample) are skipped, returning False. + """ + if _is_binary(file_obj): + return False + + action, offset = _detect_trailing(file_obj) + if action == "none": + return False + + if not check: + if action == "append_lf": + # Needs this seek for windows, otherwise IOError + file_obj.seek(0, os.SEEK_END) + file_obj.write(b"\n") + else: # action == "truncate" + file_obj.seek(offset) + file_obj.truncate() + + return True + + +def fix_directory( + root: pathlib.Path, + *, + check: bool = False, + extra_excludes: Sequence[str] = (), +) -> list[pathlib.Path]: + """Fix every non-ignored text file under `root`; return the relative paths fixed. + + `.git` and `DEFAULT_EXCLUDES` (.cache/.uv-cache) are always skipped; + `extra_excludes` adds more skip names on top. Under `check`, nothing is + written but the would-be-fixed paths are still returned. + """ + excludes = [*DEFAULT_EXCLUDES, *extra_excludes] + open_mode = "rb" if check else "rb+" + fixed: list[pathlib.Path] = [] + paths: Iterator[pathlib.Path] = iter_text_files(root, excludes) + for relative_path in paths: + with (root / relative_path).open(open_mode) as file_obj: + if fix_file(file_obj, check=check): + fixed.append(relative_path) + return fixed diff --git a/eof_fixer/main.py b/eof_fixer/main.py index b6ac5e9..f7b183b 100644 --- a/eof_fixer/main.py +++ b/eof_fixer/main.py @@ -1,74 +1,8 @@ import argparse -import os import pathlib import sys -from typing import IO -from eof_fixer.discovery import iter_text_files - - -def _is_binary(file_obj: IO[bytes]) -> bool: - current_pos = file_obj.tell() - file_obj.seek(0) - sample = file_obj.read(1024) - file_obj.seek(current_pos) - return b"\x00" in sample - - -def _detect_trailing(file_obj: IO[bytes]) -> tuple[str, int]: - """Inspect the end of `file_obj` and return the action needed. - - Returns one of: - - ("none", 0) — file is empty, or already ends with exactly one terminator. - - ("append_lf", 0) — file lacks a trailing newline; an LF should be appended. - - ("truncate", offset) — file has excess trailing newlines; truncate to `offset` - (0 means truncate to empty). - """ - try: - file_obj.seek(-1, os.SEEK_END) - except OSError: - return ("none", 0) - - last_character = file_obj.read(1) - if last_character not in {b"\n", b"\r"}: - return ("append_lf", 0) - - while last_character in {b"\n", b"\r"}: - if file_obj.tell() == 1: - # All bytes are line terminators — truncate to empty. - return ("truncate", 0) - file_obj.seek(-2, os.SEEK_CUR) - last_character = file_obj.read(1) - - position = file_obj.tell() - remaining = file_obj.read() - for sequence in (b"\n", b"\r\n", b"\r"): - if remaining == sequence: - return ("none", 0) - if remaining.startswith(sequence): - return ("truncate", position + len(sequence)) - - return ("none", 0) # pragma: no cover - - -def _fix_file(file_obj: IO[bytes], check: bool) -> int: - if _is_binary(file_obj): - return 0 - - action, offset = _detect_trailing(file_obj) - if action == "none": - return 0 - - if not check: - if action == "append_lf": - # Needs this seek for windows, otherwise IOError - file_obj.seek(0, os.SEEK_END) - file_obj.write(b"\n") - else: # action == "truncate" - file_obj.seek(offset) - file_obj.truncate() - - return 1 +from eof_fixer.fixer import fix_directory def main() -> int: @@ -85,20 +19,10 @@ def main() -> int: args = parser.parse_args() path: pathlib.Path = args.path - check: bool = args.check - if not path.is_dir(): parser.error(f"path is not a directory: {path}") - extra_excludes = [".cache", ".uv-cache", *(args.exclude or [])] - open_mode = "rb" if check else "rb+" - - retv = 0 - for relative_path in iter_text_files(path, extra_excludes): - with (path / relative_path).open(open_mode) as f: - ret_for_file = _fix_file(f, check=check) - if ret_for_file: - sys.stdout.write(f"Fixing {relative_path}\n") - retv |= ret_for_file - - return retv + fixed = fix_directory(path, check=args.check, extra_excludes=args.exclude or []) + for relative_path in fixed: + sys.stdout.write(f"Fixing {relative_path}\n") + return 1 if fixed else 0 diff --git a/tests/test_fixer.py b/tests/test_fixer.py new file mode 100644 index 0000000..222ebf4 --- /dev/null +++ b/tests/test_fixer.py @@ -0,0 +1,161 @@ +import io +import pathlib +import stat + +import pytest + +from eof_fixer.fixer import fix_directory, fix_file + + +# ---- content layer: fix_file on BytesIO (no filesystem) ---- + + +def _run(content: bytes, *, check: bool = False) -> tuple[bool, bytes]: + buffer = io.BytesIO(content) + changed = fix_file(buffer, check=check) + return changed, buffer.getvalue() + + +def test_no_trailing_newline_appends_lf() -> None: + assert _run(b"abc") == (True, b"abc\n") + + +def test_single_newline_unchanged() -> None: + assert _run(b"abc\n") == (False, b"abc\n") + + +def test_multiple_newlines_truncated_to_one() -> None: + assert _run(b"abc\n\n\n") == (True, b"abc\n") + + +def test_newlines_only_truncated_to_empty() -> None: + assert _run(b"\n\n") == (True, b"") + + +def test_empty_unchanged() -> None: + assert _run(b"") == (False, b"") + + +def test_crlf_no_trailing_appends_lf() -> None: + assert _run(b"a\r\nb") == (True, b"a\r\nb\n") + + +def test_crlf_perfect_unchanged() -> None: + assert _run(b"a\r\nb\r\n") == (False, b"a\r\nb\r\n") + + +def test_crlf_multiple_truncated_to_one() -> None: + assert _run(b"a\r\nb\r\n\r\n") == (True, b"a\r\nb\r\n") + + +def test_cr_only_multiple_truncated_to_one() -> None: + assert _run(b"a\rb\r\r") == (True, b"a\rb\r") + + +def test_bom_prefixed_is_treated_as_text() -> None: + assert _run(b"\xef\xbb\xbfabc") == (True, b"\xef\xbb\xbfabc\n") + + +def test_null_byte_in_first_1024_is_binary_skipped() -> None: + assert _run(b"a\x00b") == (False, b"a\x00b") + + +def test_null_byte_beyond_first_1024_is_treated_as_text() -> None: + content = b"a" * 1100 + b"\x00data" + changed, result = _run(content) + assert changed is True + assert result == content + b"\n" + + +def test_check_mode_returns_true_but_does_not_write() -> None: + assert _run(b"abc", check=True) == (True, b"abc") + + +# ---- orchestration layer: fix_directory on a temp tree ---- + + +def _names(paths: list[pathlib.Path]) -> set[str]: + return {str(p) for p in paths} + + +def test_fixes_files_and_returns_list(tmp_path: pathlib.Path) -> None: + (tmp_path / "a.txt").write_bytes(b"no nl") + (tmp_path / "b.txt").write_bytes(b"ok\n") + fixed = fix_directory(tmp_path) + assert _names(fixed) == {"a.txt"} + assert (tmp_path / "a.txt").read_bytes() == b"no nl\n" + assert (tmp_path / "b.txt").read_bytes() == b"ok\n" + + +def test_check_mode_writes_nothing(tmp_path: pathlib.Path) -> None: + (tmp_path / "a.txt").write_bytes(b"no nl") + fixed = fix_directory(tmp_path, check=True) + assert _names(fixed) == {"a.txt"} + assert (tmp_path / "a.txt").read_bytes() == b"no nl" + + +def test_respects_nested_gitignore(tmp_path: pathlib.Path) -> None: + sub = tmp_path / "sub" + sub.mkdir() + (sub / ".gitignore").write_bytes(b"*.log\n") + (sub / "x.log").write_bytes(b"no nl") + (sub / "x.txt").write_bytes(b"no nl") + fixed = _names(fix_directory(tmp_path)) + assert "sub/x.txt" in fixed + assert "sub/x.log" not in fixed + + +def test_skips_binary(tmp_path: pathlib.Path) -> None: + (tmp_path / "bin").write_bytes(b"a\x00b") + assert fix_directory(tmp_path) == [] + assert (tmp_path / "bin").read_bytes() == b"a\x00b" + + +def test_default_excludes_cache_dirs(tmp_path: pathlib.Path) -> None: + cache = tmp_path / ".cache" + cache.mkdir() + (cache / "blob").write_bytes(b"no nl") + (tmp_path / "keep.txt").write_bytes(b"no nl") + fixed = _names(fix_directory(tmp_path)) + assert "keep.txt" in fixed + assert not any(name.startswith(".cache") for name in fixed) + + +def test_extra_excludes_augment_defaults(tmp_path: pathlib.Path) -> None: + vendor = tmp_path / "vendor" + vendor.mkdir() + (vendor / "lib.txt").write_bytes(b"no nl") + (tmp_path / "keep.txt").write_bytes(b"no nl") + fixed = _names(fix_directory(tmp_path, extra_excludes=["vendor"])) + assert "keep.txt" in fixed + assert not any(name.startswith("vendor") for name in fixed) + + +def test_symlink_does_not_crash(tmp_path: pathlib.Path) -> None: + target = tmp_path / "t.txt" + target.write_bytes(b"no nl") + link = tmp_path / "l.txt" + try: + link.symlink_to(target) + except (OSError, NotImplementedError): # pragma: no cover + pytest.skip("symlinks not available on this platform") + fixed = _names(fix_directory(tmp_path)) + assert "t.txt" in fixed + assert "l.txt" not in fixed + + +def test_runs_with_absolute_path_independent_of_cwd(tmp_path: pathlib.Path) -> None: + (tmp_path / "a.txt").write_bytes(b"no nl") + assert _names(fix_directory(tmp_path.resolve())) == {"a.txt"} + + +def test_check_mode_opens_readonly_file_without_error(tmp_path: pathlib.Path) -> None: + readonly = tmp_path / "ro.txt" + readonly.write_bytes(b"no nl") + readonly.chmod(stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH) + try: + fixed = _names(fix_directory(tmp_path, check=True)) + assert "ro.txt" in fixed + assert readonly.read_bytes() == b"no nl" + finally: + readonly.chmod(stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP | stat.S_IROTH) From ba0824f44d223c8e09d0fad027d9f993ebbf14b4 Mon Sep 17 00:00:00 2001 From: Artur Shiriev Date: Fri, 26 Jun 2026 22:04:03 +0300 Subject: [PATCH 4/6] test: move fix behavior off the process harness to the fixer interface Also adds test_empty_real_file_unchanged to test_fixer.py to cover the OSError branch in _detect_trailing (seek(-1, SEEK_END) on a real empty file), which was previously masked by the deleted harness tests. Co-Authored-By: Claude Opus 4.8 (1M context) --- tests/test_end_of_file_fixer.py | 403 ++------------------------------ tests/test_fixer.py | 10 + 2 files changed, 23 insertions(+), 390 deletions(-) diff --git a/tests/test_end_of_file_fixer.py b/tests/test_end_of_file_fixer.py index 9dd7f4b..de89c5a 100644 --- a/tests/test_end_of_file_fixer.py +++ b/tests/test_end_of_file_fixer.py @@ -1,6 +1,4 @@ import os -import shutil -import stat import sys import tempfile from collections.abc import Iterator @@ -35,351 +33,6 @@ def _run_main_in(temp_dir: Path, argv: list[str]) -> Iterator[tuple[StringIO, St sys.stderr = original_stderr -def test_end_of_file_fixer_command_with_check_false() -> None: - # Create a temporary directory with test files - with tempfile.TemporaryDirectory() as temp_dir: - temp_path = Path(temp_dir) - - # Copy fixture files to temp directory - fixtures_dir = Path(__file__).parent / "fixtures" - for fixture_file in fixtures_dir.glob("*.txt"): - shutil.copy(fixture_file, temp_path / fixture_file.name) - - # Change to the temp directory and capture stdout - original_cwd = Path.cwd() - original_argv = sys.argv - original_stdout = sys.stdout - captured_output = StringIO() - - try: - os.chdir(temp_dir) - sys.argv = ["eof-fixer", "."] - sys.stdout = captured_output - - # Run eof-fixer . command - result = main() - - # Should exit with code 1 (since some files needed fixing) - assert result == 1 - - # Should output which files are being fixed - output = captured_output.getvalue() - assert "Fixing no_newline.txt\n" in output - assert "Fixing multiple_newlines.txt\n" in output - assert "Fixing newlines_only.txt\n" in output - - finally: - os.chdir(original_cwd) - sys.argv = original_argv - sys.stdout = original_stdout - - # Check that files were fixed correctly - # File with no newline should now have one - no_newline_content = (temp_path / "no_newline.txt").read_text() - assert no_newline_content == "This file has no newline at the end\n" - - # File with one newline should be unchanged - one_newline_content = (temp_path / "one_newline.txt").read_text() - assert one_newline_content == "This file has exactly one newline at the end\n" - - # File with multiple newlines should be truncated to one - multiple_newlines_content = (temp_path / "multiple_newlines.txt").read_text() - assert multiple_newlines_content == "This file has multiple newlines at the end\n" - - # File with only newlines should be made empty - newlines_only_content = (temp_path / "newlines_only.txt").read_text() - assert newlines_only_content == "" - - # Perfect file should be unchanged - perfect_content = (temp_path / "perfect.txt").read_text() - assert perfect_content == "This file already has exactly one newline at the end\n" - - # Empty file should remain empty - empty_content = (temp_path / "empty.txt").read_text() - assert empty_content == "" - - -def test_end_of_file_fixer_command_with_check_true() -> None: - # Create a temporary directory with test files - with tempfile.TemporaryDirectory() as temp_dir: - temp_path = Path(temp_dir) - - # Copy fixture files to temp directory - fixtures_dir = Path(__file__).parent / "fixtures" - for fixture_file in fixtures_dir.glob("*.txt"): - shutil.copy(fixture_file, temp_path / fixture_file.name) - - # Change to the temp directory and capture stdout - original_cwd = Path.cwd() - original_argv = sys.argv - original_stdout = sys.stdout - captured_output = StringIO() - - try: - os.chdir(temp_dir) - sys.argv = ["eof-fixer", ".", "--check"] - sys.stdout = captured_output - - # Run eof-fixer . --check command - result = main() - - # Should exit with code 1 (since some files need fixing) - assert result == 1 - - # Should output which files need fixing - output = captured_output.getvalue() - assert "Fixing no_newline.txt\n" in output - assert "Fixing multiple_newlines.txt\n" in output - assert "Fixing newlines_only.txt\n" in output - - finally: - os.chdir(original_cwd) - sys.argv = original_argv - sys.stdout = original_stdout - - # Files should be unchanged in check mode - no_newline_content = (temp_path / "no_newline.txt").read_text() - assert no_newline_content == "This file has no newline at the end" # Unchanged - - one_newline_content = (temp_path / "one_newline.txt").read_text() - assert one_newline_content == "This file has exactly one newline at the end\n" # Unchanged - - multiple_newlines_content = (temp_path / "multiple_newlines.txt").read_text() - assert multiple_newlines_content == "This file has multiple newlines at the end\n\n" # Unchanged - - newlines_only_content = (temp_path / "newlines_only.txt").read_text() - assert newlines_only_content == "\n\n\n" # Unchanged - - perfect_content = (temp_path / "perfect.txt").read_text() - assert perfect_content == "This file already has exactly one newline at the end\n" # Unchanged - - empty_content = (temp_path / "empty.txt").read_text() - assert empty_content == "" # Unchanged - - -def test_end_of_file_fixer_with_gitignore() -> None: - # Create a temporary directory with test files including .gitignore - with tempfile.TemporaryDirectory() as temp_dir: - temp_path = Path(temp_dir) - - # Copy all fixture files to temp directory - fixtures_dir = Path(__file__).parent / "fixtures" - for fixture_file in fixtures_dir.glob("*"): - assert shutil.copy(fixture_file, temp_path / fixture_file.name) - - Path(temp_path / ".gitignore").write_text("*.tmp") - - # Change to the temp directory and capture stdout - original_cwd = Path.cwd() - original_argv = sys.argv - original_stdout = sys.stdout - captured_output = StringIO() - - try: - os.chdir(temp_dir) - sys.argv = ["eof-fixer", "."] - sys.stdout = captured_output - - # Run eof-fixer . command - result = main() - - # Should exit with code 1 (since some files needed fixing) - assert result == 1 - - # Should output which files are being fixed - output = captured_output.getvalue() - assert "Fixing no_newline.txt\n" in output - assert "Fixing multiple_newlines.txt\n" in output - assert "Fixing newlines_only.txt\n" in output - # Should NOT mention the .tmp file since it's ignored by .gitignore - assert "Fixing temp_file.tmp\n" not in output - - finally: - os.chdir(original_cwd) - sys.argv = original_argv - sys.stdout = original_stdout - - # Check that files were fixed correctly - # File with no newline should now have one - no_newline_content = (temp_path / "no_newline.txt").read_text() - assert no_newline_content == "This file has no newline at the end\n" - - # File with multiple newlines should be truncated to one - multiple_newlines_content = (temp_path / "multiple_newlines.txt").read_text() - assert multiple_newlines_content == "This file has multiple newlines at the end\n" - - # File with only newlines should be made empty - newlines_only_content = (temp_path / "newlines_only.txt").read_text() - assert newlines_only_content == "" - - # The .tmp file should be unchanged since it's ignored - temp_file_content = (temp_path / "temp_file.tmp").read_text() - assert temp_file_content == "This is a temporary file that should be ignored" - - -def test_end_of_file_fixer_skips_binary_files() -> None: - # Create a temporary directory with test files including a binary file - with tempfile.TemporaryDirectory() as temp_dir: - temp_path = Path(temp_dir) - - # Copy all fixture files to temp directory - fixtures_dir = Path(__file__).parent / "fixtures" - for fixture_file in fixtures_dir.glob("*"): - assert shutil.copy(fixture_file, temp_path / fixture_file.name) - - # Create a binary file with null bytes - binary_file = temp_path / "binary_file.bin" - binary_file.write_bytes(b"\x00\x01\x02\x03\x04\x05") - - # Also create a text file without newline that should be fixed - text_file = temp_path / "text_no_newline.txt" - text_file.write_text("This is a text file without newline") - - # Change to the temp directory and capture stdout - original_cwd = Path.cwd() - original_argv = sys.argv - original_stdout = sys.stdout - captured_output = StringIO() - - try: - os.chdir(temp_dir) - sys.argv = ["eof-fixer", "."] - sys.stdout = captured_output - - # Run eof-fixer . command - result = main() - - # Should exit with code 1 (since the text file needed fixing) - assert result == 1 - - # Should output that the text file is being fixed - output = captured_output.getvalue() - assert "Fixing text_no_newline.txt\n" in output - # Should NOT mention the binary file - assert "Fixing binary_file.bin\n" not in output - - finally: - os.chdir(original_cwd) - sys.argv = original_argv - sys.stdout = original_stdout - - # Check that text file was fixed - text_content = (temp_path / "text_no_newline.txt").read_text() - assert text_content == "This is a text file without newline\n" - - # Check that binary file was not modified - binary_content = (temp_path / "binary_file.bin").read_bytes() - assert binary_content == b"\x00\x01\x02\x03\x04\x05" - - -def test_crlf_no_trailing_newline_appends_lf() -> None: - with tempfile.TemporaryDirectory() as temp_dir: - temp_path = Path(temp_dir) - target = temp_path / "crlf_no_newline.txt" - target.write_bytes(b"a\r\nb") - - with _run_main_in(temp_path, ["eof-fixer", "."]) as (stdout, _stderr): - result = main() - - assert result == 1 - assert "Fixing crlf_no_newline.txt\n" in stdout.getvalue() - # LF is appended even though the file uses CRLF — documented behavior. - assert target.read_bytes() == b"a\r\nb\n" - - -def test_crlf_perfect_unchanged() -> None: - with tempfile.TemporaryDirectory() as temp_dir: - temp_path = Path(temp_dir) - target = temp_path / "crlf_perfect.txt" - target.write_bytes(b"a\r\n") - - with _run_main_in(temp_path, ["eof-fixer", "."]) as (stdout, _stderr): - result = main() - - assert result == 0 - assert stdout.getvalue() == "" - assert target.read_bytes() == b"a\r\n" - - -def test_crlf_multiple_trailing_truncated_to_one() -> None: - with tempfile.TemporaryDirectory() as temp_dir: - temp_path = Path(temp_dir) - target = temp_path / "crlf_multiple.txt" - target.write_bytes(b"a\r\n\r\n\r\n") - - with _run_main_in(temp_path, ["eof-fixer", "."]) as (stdout, _stderr): - result = main() - - assert result == 1 - assert "Fixing crlf_multiple.txt\n" in stdout.getvalue() - assert target.read_bytes() == b"a\r\n" - - -def test_cr_only_multiple_trailing_truncated_to_one() -> None: - with tempfile.TemporaryDirectory() as temp_dir: - temp_path = Path(temp_dir) - target = temp_path / "cr_only.txt" - target.write_bytes(b"a\r\r\r") - - with _run_main_in(temp_path, ["eof-fixer", "."]) as (stdout, _stderr): - result = main() - - assert result == 1 - assert "Fixing cr_only.txt\n" in stdout.getvalue() - assert target.read_bytes() == b"a\r" - - -def test_bom_prefixed_file_is_treated_as_text() -> None: - with tempfile.TemporaryDirectory() as temp_dir: - temp_path = Path(temp_dir) - target = temp_path / "bom_no_newline.txt" - target.write_bytes(b"\xef\xbb\xbfhello") - - with _run_main_in(temp_path, ["eof-fixer", "."]) as (stdout, _stderr): - result = main() - - assert result == 1 - assert "Fixing bom_no_newline.txt\n" in stdout.getvalue() - assert target.read_bytes() == b"\xef\xbb\xbfhello\n" - - -def test_null_byte_beyond_first_1024_bytes_is_treated_as_text() -> None: - # _is_binary inspects only the first 1024 bytes, so a null byte past that - # boundary does NOT mark the file as binary. Document the current behavior. - with tempfile.TemporaryDirectory() as temp_dir: - temp_path = Path(temp_dir) - target = temp_path / "late_null.txt" - payload = b"a" * 2000 + b"\x00" + b"b" - target.write_bytes(payload) - - with _run_main_in(temp_path, ["eof-fixer", "."]) as (stdout, _stderr): - result = main() - - assert result == 1 - assert "Fixing late_null.txt\n" in stdout.getvalue() - assert target.read_bytes() == payload + b"\n" - - -def test_symlink_in_tree_does_not_crash() -> None: - with tempfile.TemporaryDirectory() as temp_dir: - temp_path = Path(temp_dir) - target = temp_path / "target.txt" - target.write_bytes(b"contents without newline") - link = temp_path / "link.txt" - try: - link.symlink_to(target) - except (OSError, NotImplementedError): # pragma: no cover - pytest.skip("symlinks not available on this platform") - - with _run_main_in(temp_path, ["eof-fixer", "."]) as (_stdout, _stderr): - result = main() - - # Either path (link followed or skipped) is acceptable — but the - # walked target itself must end up fixed and the tool must not crash. - assert result == 1 - assert target.read_bytes().endswith(b"\n") - - def test_path_arg_rejects_file() -> None: with tempfile.TemporaryDirectory() as temp_dir: temp_path = Path(temp_dir) @@ -412,60 +65,30 @@ def test_path_arg_rejects_nonexistent_path() -> None: assert "is not a directory" in stderr.getvalue() -def test_readonly_file_in_check_mode_does_not_raise() -> None: - # In --check mode no writes occur, so the file should be opened read-only - # and not raise on read-only files. - with tempfile.TemporaryDirectory() as temp_dir: - temp_path = Path(temp_dir) - readonly = temp_path / "readonly.txt" - readonly.write_text("no newline") - readonly.chmod(stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH) - - try: - with _run_main_in(temp_path, ["eof-fixer", ".", "--check"]) as (stdout, _stderr): - result = main() - - assert result == 1 - assert "Fixing readonly.txt\n" in stdout.getvalue() - # File must still be unchanged in check mode. - assert readonly.read_text() == "no newline" - finally: - # Restore writable bits so the tempdir can be cleaned up. - readonly.chmod(stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP | stat.S_IROTH) - - -def test_exclude_flag_augments_default_skips() -> None: +def test_main_renders_fixing_lines_and_exit_code() -> None: with tempfile.TemporaryDirectory() as temp_dir: temp_path = Path(temp_dir) - (temp_path / "keep.txt").write_bytes(b"no newline") - vendor = temp_path / "vendor" - vendor.mkdir() - (vendor / "lib.txt").write_bytes(b"no newline") + (temp_path / "a.txt").write_bytes(b"no nl") + skipped = temp_path / "vendor" + skipped.mkdir() + (skipped / "lib.txt").write_bytes(b"no nl") with _run_main_in(temp_path, ["eof-fixer", ".", "--exclude", "vendor"]) as (stdout, _stderr): result = main() output = stdout.getvalue() assert result == 1 - assert "Fixing keep.txt\n" in output + assert "Fixing a.txt\n" in output assert "Fixing vendor/lib.txt\n" not in output - # vendor file left untouched because it was excluded - assert (vendor / "lib.txt").read_bytes() == b"no newline" -def test_runs_from_unrelated_cwd_with_absolute_path() -> None: - # Regression test: the tool should work when invoked with an absolute path - # that is not the caller's cwd. Previously yielded relative names were - # opened against cwd, causing FileNotFoundError. - with tempfile.TemporaryDirectory() as work_dir, tempfile.TemporaryDirectory() as target_dir: - work_path = Path(work_dir) - target_path = Path(target_dir) - target_file = target_path / "needs_fix.txt" - target_file.write_text("no newline") +def test_main_returns_zero_when_nothing_to_fix() -> None: + with tempfile.TemporaryDirectory() as temp_dir: + temp_path = Path(temp_dir) + (temp_path / "ok.txt").write_bytes(b"already fine\n") - with _run_main_in(work_path, ["eof-fixer", str(target_path)]) as (stdout, _stderr): + with _run_main_in(temp_path, ["eof-fixer", "."]) as (stdout, _stderr): result = main() - assert result == 1 - assert "Fixing needs_fix.txt\n" in stdout.getvalue() - assert target_file.read_text() == "no newline\n" + assert result == 0 + assert stdout.getvalue() == "" diff --git a/tests/test_fixer.py b/tests/test_fixer.py index 222ebf4..22a6f9e 100644 --- a/tests/test_fixer.py +++ b/tests/test_fixer.py @@ -36,6 +36,16 @@ def test_empty_unchanged() -> None: assert _run(b"") == (False, b"") +def test_empty_real_file_unchanged(tmp_path: pathlib.Path) -> None: + # BytesIO(b"") handles seek(-1, SEEK_END) without error; a real empty file + # raises OSError, exercising the except branch in _detect_trailing. + empty = tmp_path / "empty.txt" + empty.write_bytes(b"") + with empty.open("rb+") as f: + assert fix_file(f, check=False) is False + assert empty.read_bytes() == b"" + + def test_crlf_no_trailing_appends_lf() -> None: assert _run(b"a\r\nb") == (True, b"a\r\nb\n") From 257159db37bd45fdacb4f94fbb52f8dd8c82f120 Mon Sep 17 00:00:00 2001 From: Artur Shiriev Date: Fri, 26 Jun 2026 22:09:45 +0300 Subject: [PATCH 5/6] docs: promote the fixer deepening to architecture Co-Authored-By: Claude Opus 4.8 (1M context) --- architecture/cli.md | 14 +++++++-- architecture/eof-normalization.md | 29 ++++++++++++++----- .../2026-06-26.02-deepen-fixer/design.md | 2 +- 3 files changed, 34 insertions(+), 11 deletions(-) diff --git a/architecture/cli.md b/architecture/cli.md index 5937614..94c1b42 100644 --- a/architecture/cli.md +++ b/architecture/cli.md @@ -2,6 +2,12 @@ The command-line surface and its contract with callers (shells, CI, pre-commit). +`main()` is a thin adapter over `fix_directory`: it parses args, guards +`is_dir` (exit 2), delegates the walk to `fix_directory`, renders one +`Fixing ` line per returned path, and exits `1`/`0`. It holds no +file-level logic. See [eof-normalization](eof-normalization.md) for how +`fix_directory` and `fix_file` do the actual work. + ## Interface - `eof-fixer ` — fix every non-ignored text file under `` in place. @@ -19,12 +25,14 @@ visited file. ## Output For each file that needs (or would need) fixing, the tool writes `Fixing -\n` to stdout — the relative path as discovery yielded it. Output goes -through `sys.stdout.write`, not `print`. +\n` to stdout — the relative path as discovery yielded it. Output +goes through `sys.stdout.write`, not `print`. Output is batch-rendered: +all `Fixing` lines are written after `fix_directory` returns, not streamed +during the walk. ## Exit code -`main()` accumulates a result by OR-ing each file's outcome: +`main()` derives its result from the list returned by `fix_directory`: - **0** — every file already ended correctly (nothing changed / nothing would change). diff --git a/architecture/eof-normalization.md b/architecture/eof-normalization.md index f40817a..2c08502 100644 --- a/architecture/eof-normalization.md +++ b/architecture/eof-normalization.md @@ -1,9 +1,24 @@ # EOF normalization -The core capability: given a single open file, decide whether its end-of-file -terminator is correct and, in fix mode, make it so. "Correct" means the file -ends with **exactly one** line terminator — no missing newline, no trailing -blank lines. +The core capability lives in `eof_fixer/fixer.py` and is exposed through two +public functions: + +- **`fix_file(file_obj, *, check) -> bool`** — given a single open binary + file, decide whether its end-of-file terminator is correct and, in fix mode, + make it so. Returns `True` if the file was (or, under `check`, would be) + fixed. +- **`fix_directory(root, *, check=False, extra_excludes=()) -> list[pathlib.Path]`** + — drives `discovery.iter_text_files`, owns the default-skip policy + (`DEFAULT_EXCLUDES = (".cache", ".uv-cache")`), opens each file `rb` under + check else `rb+`, calls `fix_file`, and returns the relative paths + fixed (or would-be-fixed) in walk order. + +`main()` no longer holds the file-level logic — it delegates entirely to +`fix_directory` and renders the result. See [cli](cli.md) for the adapter +layer. + +"Correct" means the file ends with **exactly one** line terminator — no +missing newline, no trailing blank lines. ## Binary skip @@ -50,6 +65,6 @@ In fix mode: In check mode no bytes are written; the action is computed and reported only. -Either way, a file that needed changing contributes a non-zero result to the -caller — see [cli](cli.md) for how that becomes the process exit code and the -`Fixing ` line. +Either way, `fix_file` returns `True` if the file needed changing; +`fix_directory` collects those into the returned path list. See [cli](cli.md) +for how that list becomes the process exit code and the `Fixing ` lines. diff --git a/planning/changes/2026-06-26.02-deepen-fixer/design.md b/planning/changes/2026-06-26.02-deepen-fixer/design.md index 6b0ea83..95fc31a 100644 --- a/planning/changes/2026-06-26.02-deepen-fixer/design.md +++ b/planning/changes/2026-06-26.02-deepen-fixer/design.md @@ -1,5 +1,5 @@ --- -summary: Deepen the fix behavior into eof_fixer/fixer.py (fix_file + fix_directory) so main() becomes a thin CLI adapter and the test surface moves off the process/global-state harness. +summary: Fix behavior now lives behind eof_fixer/fixer.py (fix_file + fix_directory); main() is a thin adapter and tests target the interface instead of the argv/stdout/cwd harness. --- # Design: Deepen the fix into `fixer.py`; `main()` becomes an adapter From 6aa7fe89aeb52749db97c50030cc0c9080fc64ea Mon Sep 17 00:00:00 2001 From: Artur Shiriev Date: Fri, 26 Jun 2026 22:15:20 +0300 Subject: [PATCH 6/6] refactor: replace unreachable return with AssertionError in _detect_trailing Swap the final `return ("none", 0) # pragma: no cover` for `raise AssertionError("unreachable")` so the invariant fails loud instead of silently returning a wrong value. With only 6 return paths remaining, the PLR0911 suppression is no longer needed and has been removed. EM101 is suppressed inline on the unreachable line only. Co-Authored-By: Claude Opus 4.8 (1M context) --- eof_fixer/fixer.py | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/eof_fixer/fixer.py b/eof_fixer/fixer.py index d509250..8222312 100644 --- a/eof_fixer/fixer.py +++ b/eof_fixer/fixer.py @@ -18,7 +18,7 @@ def _is_binary(file_obj: IO[bytes]) -> bool: return b"\x00" in sample -def _detect_trailing(file_obj: IO[bytes]) -> tuple[str, int]: # noqa: PLR0911 +def _detect_trailing(file_obj: IO[bytes]) -> tuple[str, int]: """Inspect the end of `file_obj` and return the action needed. Returns one of: @@ -53,7 +53,7 @@ def _detect_trailing(file_obj: IO[bytes]) -> tuple[str, int]: # noqa: PLR0911 if remaining.startswith(sequence): return ("truncate", position + len(sequence)) - return ("none", 0) # pragma: no cover + raise AssertionError("unreachable") # pragma: no cover # noqa: EM101 def fix_file(file_obj: IO[bytes], *, check: bool) -> bool: