Add FlashMLA flashmla artifact manifest #24

Open
ghangz wants to merge 2 commits into MetaX-MACA:main from ghangz:mengz/flashmla-artifact-manifest

Conversation

@ghangz ghangz commented Jul 1, 2026

Summary

  • Adds tools/artifact_manifest.py, a helper that generates an artifact manifest for MetaX-MACA/FlashMLA.
  • The change targets MetaX MACA development and validation workflows, with emphasis on reproducible logs.
  • Existing default behavior is unchanged; the new logic is confined to a standalone helper tool that runs only when invoked explicitly.

Validation

  • Verified on Gitee.AI MetaX GPU resources: FlashMLA_TileLang_20260701, 3/3 PASS; PyTorch-MACA batch also covered FlashMLA tools.
  • Branch validation command: python tools/artifact_manifest.py --self-test
  • Pull request text is intentionally ASCII-only to avoid encoding issues on web forms and API clients.

Review notes

  • Source branch: ghangz:mengz/flashmla-artifact-manifest
  • Target branch: MetaX-MACA/FlashMLA:main
  • Maintainers can modify this branch if follow-up adjustments are needed.

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a new utility script, tools/artifact_manifest.py, designed to generate reproducible artifact manifests containing file sizes and SHA-256 hashes. The feedback suggests two key improvements: first, isolating the self-test execution using a temporary directory to prevent performance overhead and file overwrite risks in the current working directory; second, validating that the specified --root directory exists and is a directory to avoid silent failures in CI/CD pipelines.
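The body of `collect` is not shown in this thread. As a rough illustration of what a manifest of file sizes and SHA-256 hashes could look like, here is a minimal sketch. Everything in it is an assumption: the `sha256_of` helper, the `size` and `sha256` field names, and the sorted recursive walk are hypothetical, and only the `count`, `artifacts`, and `path` keys are taken from the snippets quoted in the review.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large artifacts are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def collect(root: Path) -> dict:
    """Hypothetical stand-in for the PR's collect(); the real one is not shown here."""
    artifacts = []
    # Sort the paths so repeated runs over the same tree give the same manifest.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        artifacts.append({
            "path": path.relative_to(root).as_posix(),
            "size": path.stat().st_size,
            "sha256": sha256_of(path),
        })
    return {"count": len(artifacts), "artifacts": artifacts}


if __name__ == "__main__":
    # Scan an isolated temporary directory rather than the working tree.
    with tempfile.TemporaryDirectory() as tmpdir:
        root = Path(tmpdir)
        (root / "sample.txt").write_text("maca artifact\n", encoding="utf-8")
        print(json.dumps(collect(root), indent=2))
```

Sorting the entries and storing paths relative to the root are what would make such a manifest comparable across machines; whether the PR does either is not visible from this conversation.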

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread tools/artifact_manifest.py Outdated
Comment on lines +37 to +45
def self_test() -> None:
    sample = Path("_artifact_manifest_sample.txt")
    sample.write_text("maca artifact\n", encoding="utf-8")
    try:
        data = collect(Path.cwd())
        assert any(item["path"] == sample.name for item in data["artifacts"])
        print(json.dumps({"ok": True, "count": data["count"]}, ensure_ascii=False))
    finally:
        sample.unlink(missing_ok=True)

high

Running self_test on the current working directory (Path.cwd()) can be extremely inefficient and risky:

  1. Performance Overhead: If the workspace already contains large build artifacts (e.g., .so files in build/ or wheels in dist/), collect(Path.cwd()) will scan and compute SHA-256 hashes for all of them. This makes a simple self-test very slow and resource-intensive.
  2. File Overwrite/Deletion Risk: Creating _artifact_manifest_sample.txt directly in the current working directory risks overwriting or deleting an existing file of the same name.

Using a temporary directory via tempfile.TemporaryDirectory avoids both issues by isolating the self-test execution and ensuring it only scans the temporary directory.

Suggested change
-def self_test() -> None:
-    sample = Path("_artifact_manifest_sample.txt")
-    sample.write_text("maca artifact\n", encoding="utf-8")
-    try:
-        data = collect(Path.cwd())
-        assert any(item["path"] == sample.name for item in data["artifacts"])
-        print(json.dumps({"ok": True, "count": data["count"]}, ensure_ascii=False))
-    finally:
-        sample.unlink(missing_ok=True)
+def self_test() -> None:
+    import tempfile
+    with tempfile.TemporaryDirectory() as tmpdir:
+        tmp_path = Path(tmpdir)
+        sample = tmp_path / "_artifact_manifest_sample.txt"
+        sample.write_text("maca artifact\n", encoding="utf-8")
+        data = collect(tmp_path)
+        assert any(item["path"] == sample.name for item in data["artifacts"])
+        print(json.dumps({"ok": True, "count": data["count"]}, ensure_ascii=False))

Comment on lines +48 to +57
def main() -> int:
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument("--root", default=".")
    parser.add_argument("--self-test", action="store_true")
    args = parser.parse_args()
    if args.self_test:
        self_test()
        return 0
    print(json.dumps(collect(Path(args.root)), ensure_ascii=False, indent=2))
    return 0

medium

If the specified --root directory does not exist or is not a directory, Path.glob will silently yield no files, and the script will exit with status 0 while producing an empty manifest. This can lead to silent failures in CI/CD pipelines (e.g., due to a typo in the path or a build step that failed to produce the directory).

It is safer to validate that the --root path exists and is a directory, raising an error via parser.error if it is not.

Suggested change
-def main() -> int:
-    parser = argparse.ArgumentParser(description=__doc__)
-    parser.add_argument("--root", default=".")
-    parser.add_argument("--self-test", action="store_true")
-    args = parser.parse_args()
-    if args.self_test:
-        self_test()
-        return 0
-    print(json.dumps(collect(Path(args.root)), ensure_ascii=False, indent=2))
-    return 0
+def main() -> int:
+    parser = argparse.ArgumentParser(description=__doc__)
+    parser.add_argument("--root", default=".")
+    parser.add_argument("--self-test", action="store_true")
+    args = parser.parse_args()
+    if args.self_test:
+        self_test()
+        return 0
+    root = Path(args.root)
+    if not root.is_dir():
+        parser.error(f"Root path '{args.root}' is not a directory or does not exist.")
+    print(json.dumps(collect(root), ensure_ascii=False, indent=2))
+    return 0
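
To make the effect of the `--root` check concrete, the sketch below isolates just the validation step. `argparse`'s `parser.error()` prints the usage line and message to stderr and then exits with status 2, so a mistyped path would fail a CI job instead of producing an empty manifest. The `parse_root` helper and the example path are hypothetical and are not part of the PR.

```python
import argparse
from pathlib import Path


def parse_root(argv: list) -> Path:
    """Parse --root and reject anything that is not an existing directory."""
    parser = argparse.ArgumentParser(description="artifact manifest (sketch)")
    parser.add_argument("--root", default=".")
    args = parser.parse_args(argv)
    root = Path(args.root)
    if not root.is_dir():
        # parser.error() writes usage plus the message to stderr and
        # raises SystemExit with status 2, so the caller sees a failure.
        parser.error(f"Root path '{args.root}' is not a directory or does not exist.")
    return root


if __name__ == "__main__":
    try:
        parse_root(["--root", "definitely/not/a/real/dir"])
    except SystemExit as exc:
        print("exit status:", exc.code)
```

Because `is_dir()` returns False both for missing paths and for regular files, the single check covers both cases named in the review comment.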

- Add FlashMLA artifact manifest
- Harden artifact manifest self test