Add FlashMLA flashmla artifact manifest#24
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces a new utility script, tools/artifact_manifest.py, designed to generate reproducible artifact manifests containing file sizes and SHA-256 hashes. The feedback suggests two key improvements: first, isolating the self-test execution using a temporary directory to prevent performance overhead and file overwrite risks in the current working directory; second, validating that the specified --root directory exists and is a directory to avoid silent failures in CI/CD pipelines.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| def self_test() -> None: | ||
| sample = Path("_artifact_manifest_sample.txt") | ||
| sample.write_text("maca artifact\n", encoding="utf-8") | ||
| try: | ||
| data = collect(Path.cwd()) | ||
| assert any(item["path"] == sample.name for item in data["artifacts"]) | ||
| print(json.dumps({"ok": True, "count": data["count"]}, ensure_ascii=False)) | ||
| finally: | ||
| sample.unlink(missing_ok=True) |
There was a problem hiding this comment.
Running self_test on the current working directory (Path.cwd()) can be extremely inefficient and risky:
- Performance Overhead: If the workspace already contains large build artifacts (e.g.,
.sofiles inbuild/or wheels indist/),collect(Path.cwd())will scan and compute SHA-256 hashes for all of them. This makes a simple self-test very slow and resource-intensive. - File Overwrite/Deletion Risk: Creating
_artifact_manifest_sample.txtdirectly in the current working directory risks overwriting or deleting an existing file of the same name.
Using a temporary directory via tempfile.TemporaryDirectory avoids both issues by isolating the self-test execution and ensuring it only scans the temporary directory.
| def self_test() -> None: | |
| sample = Path("_artifact_manifest_sample.txt") | |
| sample.write_text("maca artifact\n", encoding="utf-8") | |
| try: | |
| data = collect(Path.cwd()) | |
| assert any(item["path"] == sample.name for item in data["artifacts"]) | |
| print(json.dumps({"ok": True, "count": data["count"]}, ensure_ascii=False)) | |
| finally: | |
| sample.unlink(missing_ok=True) | |
| def self_test() -> None: | |
| import tempfile | |
| with tempfile.TemporaryDirectory() as tmpdir: | |
| tmp_path = Path(tmpdir) | |
| sample = tmp_path / "_artifact_manifest_sample.txt" | |
| sample.write_text("maca artifact\n", encoding="utf-8") | |
| data = collect(tmp_path) | |
| assert any(item["path"] == sample.name for item in data["artifacts"]) | |
| print(json.dumps({"ok": True, "count": data["count"]}, ensure_ascii=False)) |
| def main() -> int: | ||
| parser = argparse.ArgumentParser(description=__doc__) | ||
| parser.add_argument("--root", default=".") | ||
| parser.add_argument("--self-test", action="store_true") | ||
| args = parser.parse_args() | ||
| if args.self_test: | ||
| self_test() | ||
| return 0 | ||
| print(json.dumps(collect(Path(args.root)), ensure_ascii=False, indent=2)) | ||
| return 0 |
There was a problem hiding this comment.
If the specified --root directory does not exist or is not a directory, Path.glob will silently yield no files, and the script will exit with status 0 while producing an empty manifest. This can lead to silent failures in CI/CD pipelines (e.g., due to a typo in the path or a build step that failed to produce the directory).
It is safer to validate that the --root path exists and is a directory, raising an error via parser.error if it is not.
| def main() -> int: | |
| parser = argparse.ArgumentParser(description=__doc__) | |
| parser.add_argument("--root", default=".") | |
| parser.add_argument("--self-test", action="store_true") | |
| args = parser.parse_args() | |
| if args.self_test: | |
| self_test() | |
| return 0 | |
| print(json.dumps(collect(Path(args.root)), ensure_ascii=False, indent=2)) | |
| return 0 | |
| def main() -> int: | |
| parser = argparse.ArgumentParser(description=__doc__) | |
| parser.add_argument("--root", default=".") | |
| parser.add_argument("--self-test", action="store_true") | |
| args = parser.parse_args() | |
| if args.self_test: | |
| self_test() | |
| return 0 | |
| root = Path(args.root) | |
| if not root.is_dir(): | |
| parser.error(f"Root path '{args.root}' is not a directory or does not exist.") | |
| print(json.dumps(collect(root), ensure_ascii=False, indent=2)) | |
| return 0 |
- Add FlashMLA artifact manifest - Harden artifact manifest self test
Summary
Validation
Review notes