manuelsampedro1/README.md

Manuel Sampedro

I build agentic engineering tools: repo readiness checks, review handoffs, verification gates, run ledgers, and local-first product prototypes.

My focus is the practical layer around coding agents: the prompts, scripts, workflows, and small products that keep AI-generated work scoped, inspectable, and easier to trust.

If you are building with Codex or evaluating how AI changes software work, find me on X at @manuelsampedrop.

Current Focus

  • Agent reliability: repo setup checks, context contracts, review packets, and repeatable handoffs.
  • Verification discipline: change-aware test plans, honest closeout notes, and evidence before claims.
  • Agent auditability: run ledgers, decisions, file changes, command history, and blockers a reviewer can inspect.
  • Agent safety: permission gates, MCP tool controls, and receipt-based authorization for sensitive actions.
  • Product judgment: small local-first prototypes that test the workflow before adding backend weight.
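As a rough illustration of the run-ledger idea in the list above, here is a minimal sketch of an append-only JSONL ledger. The function names and the field names (`run_id`, `decision`, `files_changed`, `commands`, `blockers`) are assumptions made for this example, not the schema of any repo listed here.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def append_run(ledger_path, run_id, decision, files_changed, commands, blockers):
    """Append one agent run as a single JSON line (hypothetical field names)."""
    entry = {
        "run_id": run_id,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "decision": decision,
        "files_changed": sorted(files_changed),
        "commands": list(commands),
        "blockers": list(blockers),
    }
    with Path(ledger_path).open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry, sort_keys=True) + "\n")
    return entry


def open_blockers(ledger_path):
    """Return (run_id, blocker) pairs so a reviewer can see what is unresolved."""
    pairs = []
    with Path(ledger_path).open(encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                entry = json.loads(line)
                pairs.extend((entry["run_id"], blocker) for blocker in entry["blockers"])
    return pairs


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        ledger = Path(tmp) / "runs.jsonl"
        append_run(ledger, "run-1", "merged", ["a.py"], ["make test"], [])
        append_run(ledger, "run-2", "paused", ["b.py"], ["make lint"], ["lint fails on b.py"])
        print(open_blockers(ledger))
```

One line per run keeps the file easy to diff and to inspect with ordinary text tools, which is the point of a ledger a reviewer can read without extra tooling.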

Reviewer Path

If you have five minutes, start with the workflow rather than the full repo list:

Selected Work

| Repo | What it proves | Why it matters |
|------|----------------|----------------|
| agent-request-brief | Raw request clarification | Audits raw coding-agent requests for objective, scope, acceptance criteria, constraints, context, verification, risks, and next actions before broad edits start. |
| agent-task-contract | Task scope readiness | Validates Markdown task briefs before coding-agent work starts, including stable acceptance-criteria IDs for traceable review packets, closeouts, ledgers, and profile proof. |
| agent-repo-map | Repository context mapping | Generates compact pre-run maps of docs, languages, entrypoints, commands, verification signals, CI, Git state, and risk paths before a coding-agent handoff. |
| agent-handoff-brief | Pre-run agent handoffs | Turns task contracts and repo context into compact coding-agent briefs with required reading, commands, verification, risk paths, gaps, and a ready-to-use prompt. |
| agent-continuation-brief | Long-run continuation handoffs | Audits continuation notes for original objective, current state, completed work, blockers, changed files, commands, next actions, risks, and next-agent instructions before another coding agent resumes. |
| agent-handoff-drift | Handoff state drift | Checks handoff notes against the live repo for missing files, stale HEAD or branch claims, false clean-tree claims, and weak command-success evidence before another agent continues. |
| agent-start-gate | Pre-run start gate | Gates coding-agent starts from a Markdown packet with objective, scope, inputs, traceable evidence pointers, worktree state, context screening, verification commands, and stop conditions before the first edit. |
| agent-context-sentinel | Context injection preflight | Audits untrusted context for prompt-injection language, hidden authority claims, secret exfiltration requests, dangerous commands, unattended sensitive actions, local paths, and missing source metadata before agent handoff. |
| agent-context-budget | Context budget planning | Estimates token pressure, detects oversized or duplicate context, and produces keep/summarize/drop plans before coding-agent handoffs. |
| agent-tool-schema-lint | Tool schema quality | Lints JSON tool definitions for strong descriptions, object schemas, required fields, extra-property control, parameter guidance, enum clarity, schema-aligned input examples, and safety language before agents can call them. |
| agent-tool-call-replay | Tool-call schema replay | Replays captured agent tool calls against current schemas to catch unknown tools, invalid JSON arguments, missing required fields, type mismatches, enum drift, extra arguments, and missing or duplicate stable call IDs before reruns or proof packets reuse them. |
| agent-output-contract | Structured output evidence | Validates coding-agent JSON outputs for schema version, outcome, score, issue lists, blocker consistency, structured check evidence, and obvious secret or local-path leakage before CI, ledgers, or review packets trust them. |
| agent-evidence-chain | Cross-artifact proof consistency | Validates that multiple JSON evidence artifacts from one coding-agent run share task, run, repo, and commit identity before review packets, ledgers, or closeouts treat them as one proof chain. |
| agent-source-grounding | Source-grounded agent claims | Audits agent-written Markdown and JSON artifacts for explicit sources, concrete evidence pointers, claim grounding, placeholder citations, and optional HTTP link checks before public proof or decisions reuse them. |
| agent-acceptance-trace | Acceptance criteria traceability | Turns task acceptance criteria, diffs, and closeout evidence into a criterion-by-criterion matrix with covered, partial, or missing status before a final answer is accepted. |
| agent-scope-guard | Scope boundary enforcement | Fails coding-agent diffs when changed paths fall outside an explicit file or glob allowlist, with text and JSON output, tests, CI, repo readiness proof, and validated proof-packet evidence. |
| agent-worktree-guard | Dirty worktree protection | Snapshots pre-existing user edits before a coding-agent run, emits reusable snapshot hashes, and blocks tampered baselines, protected-file drift, or unexpected dirty paths outside the task allowlist. |
| agent-instruction-audit | Agent instruction readiness | Audits AGENTS.md, CLAUDE.md, GEMINI.md, CURSOR.md, and .cursorrules for actionable scope, constraints, verification, safety, closeout guidance, and risky commands. |
| agent-decision-guard | Decision documentation gate | Blocks decision-worthy diffs when CI, automation, config, security, product scope, or agent-instruction changes lack DECISIONS.md, TODO.md, or an explicit no-follow-up waiver. |
| agent-diff-budget | Diff size and risk budget | Fails broad coding-agent diffs when changed files, line volume, or high-risk files exceed explicit review budgets, with path-level risk tags, reviewer questions, and validated proof-packet evidence. |
| agent-diff-splitter | Oversized diff split planning | Turns broad coding-agent diffs into ordered review slices by security, data, release, automation, agent instructions, tests, application, and product/docs with files, line counts, rationale, reviewer questions, and validated proof-packet evidence. |
| agent-review-map | Review lane routing | Maps mixed coding-agent diffs into security, data, release, automation, agent-instruction, product/docs, tests, and application lanes with owners, handoff order, reviewer questions, and validated proof-packet evidence. |
| agent-review-finding-check | Review finding quality | Audits coding-agent review findings for severity, concrete file lines, impact, actionable fixes, vague language, diff membership, and validated proof-packet evidence. |
| agent-run-ledger | Agent audit trails | Records AI agent runs as JSONL, imports review packets with embedded task contracts, CI, published-HEAD proof, readiness, sensitive-change blockers, rendered verification-envelope metadata, task-contract metadata from JSON envelopes and readiness reports, repo readiness reports and contracts, Markdown or JSON-envelope verification plans, and GitHub Actions run evidence, gates unresolved evidence with strict doctor mode, and renders static review reports. |
| agent-tool-call-audit | Tool-call history review | Audits coding-agent tool-call logs for destructive commands, sensitive tool actions, repeated failures, secret markers, missing working directories, skipped safety hooks, and missing approval evidence. |
| agent-memory-audit | Agent memory hygiene | Audits local agent memory files for stale current-state claims, missing or weak concrete source evidence, public-action risk, local-path exposure, weak memory policy, and secret-material markers before context is reused. |
| agent-pr-brief | PR description quality | Audits pull-request descriptions against real diffs for required sections, unmentioned changed files, risky paths, weak verification evidence, vague language, and large-diff scope notes. |
| agent-plan-trace | Plan execution traceability | Audits coding-agent plans against diffs, command logs, and closeouts so completed steps stay tied to evidence and pending work cannot hide behind a confident final answer. |
| repo-flightcheck | Pre-agent readiness | Audits whether a repository is ready for Codex, Claude Code, and human reviewers, including agent-readiness contract output, structured task-contract metadata, optional task-contract validation, CI/local verification coverage, local tool availability, remote publication failure classification, published-HEAD readiness, Python unittest detection, Python and Node CLI entrypoint readiness, GitHub Action repos, and stale documented commands. |
| codex-review-packet | Review context quality | Packages diffs, repo rules, local context, task contracts, review lanes, sensitive-change checks, repo readiness reports or contracts, generated Markdown or JSON-envelope verification plans with task-contract summaries, published-HEAD proof, and GitHub Actions CI evidence into a sharper handoff for Codex or Claude Code. |
| verify-by-change | Evidence-based closeout | Suggests honest checks from committed diffs, working-tree changes, and generated review packets, with JSON envelope metadata, packet readiness context, task-contract metadata, CI-local command parity, GitHub Action/workflow guidance, secret-material and security-sensitive path checks, Python and Node CLI context, and a repo readiness contract. |
| diff-to-eval | Agent learning loops | Turns real unified diffs into reusable JSON evaluation cases with changed files, risk tags, suggested checks, expected outcomes, tests, CI, and repo readiness proof. |
| agent-eval-runner | Eval case scoring | Scores proof packets, closeouts, or review notes against saved diff-to-eval cases for changed files, suggested checks, risk tags, expected outcomes, and pass/fail thresholds. |
| agent-bug-repro | Reproducible bug handoffs | Audits bug reports for summary, repro steps, expected and actual behavior, environment, evidence, regression context, vague language, and error-log signals before a debugging agent starts guessing. |
| agent-retry-guard | Retry-loop control | Detects repeated failed commands, unchanged error signatures, consecutive blind retries, missing strategy shifts, and verified failed command receipts before another run wastes context. |
| agent-ci-failure-packet | Focused CI retries | Turns noisy CI logs or verified failed command receipts into Markdown or JSON packets with failing commands, error signals, referenced files, suggested checks, and a scoped next-agent prompt. |
| agent-release-note-check | Release note accuracy | Audits release notes against real diffs so maintainers do not miss breaking, security, dependency, CI, test, code changes, or unsupported verification claims before publishing a release. |
| agent-rollback-plan | Operational rollback review | Turns risky agent diffs into rollback packets with changed files, risk tags, rollback steps, post-rollback checks, and reviewer questions before changes ship. |
| agent-closeout-check | Evidence-backed final answers | Lints coding-agent closeouts for summaries, changed-file evidence, exact verification commands, residual risks, vague claims, tests, CI, and repo readiness proof. |
| agent-claim-check | Closeout claim verification | Checks coding-agent closeout claims against changed files, exact claimed commands, explicit command evidence, risky paths, and no-risk language before PR comments or proof packets reuse the final answer. |
| agent-command-receipt | Command evidence receipts | Creates and verifies hashed command-outcome receipts with evidence file hashes before closeouts, proof packets, or ledgers reuse test and verification claims. |
| agent-test-impact | Test-impact evidence | Maps coding-agent diffs to direct, partial, or missing test evidence for each changed source file before a broad test pass is treated as enough. |
| agent-change-risk | Review-gate routing | Classifies coding-agent diffs into risk tags and required gates for scope, secrets, runbook drift, CI failure packets, rollback, eval cases, closeout checks, and change-aware verification. |
| agent-dependency-guard | Dependency-surface review | Classifies dependency manifest, lockfile, package spec, and install-script changes in coding-agent diffs before tests or closeouts treat the change as safe. |
| agent-merge-readiness | Merge verdict gate | Turns diff risk, explicit check results, and closeout evidence into strict ready, needs-review, or blocked verdicts with non-ready exit codes for automation. |
| agent-proof-packet | Review proof packets | Packages coding-agent diffs, explicit checks, evidence files, risks, decisions, open questions, and missing evidence into Markdown or JSON proof packets. |
| agent-publish-queue | Publication queue audit | Audits local proof repos for branch, HEAD, dirty state, GitHub remote, optional public HTTP status, blockers, and next actions before profile promotion. |
| profile-proof-audit | Profile claim audit | Audits a GitHub profile README for required proof sections, Selected Work shape, Latest Proof items, relative links, optional public HTTP status, and unsupported claim language. |
| runbook-drift-check | Operational doc drift | Checks README, AGENTS.md, and runbooks for missing local links, stale path references, script command drift, optional bash syntax checks, tests, CI, and repo readiness proof. |
| briefboard-local | Product scoping taste | Turns messy kickoff notes into a structured build brief, flags missing essentials and weak handoff scope, and generates a Codex-ready prompt with no backend, importable examples, and CI-local checks. |

These are small on purpose. I prefer tools a reviewer can clone, inspect, run, and challenge over larger demos with less operational signal.
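To make one of the gates above concrete, here is a minimal sketch of a scope check in the spirit of agent-scope-guard: it reports changed paths that match no pattern in an allowlist and returns a non-zero exit code when any exist. The function names and the use of `fnmatchcase` are assumptions for illustration; the actual tool may match paths differently.

```python
import sys
from fnmatch import fnmatchcase


def out_of_scope(changed_paths, allowlist):
    """Return the changed paths that match no allowlist pattern.

    Note: in fnmatch-style patterns, `*` also matches `/`, so `src/*`
    covers nested files such as `src/pkg/mod.py`.
    """
    return sorted(
        path
        for path in changed_paths
        if not any(fnmatchcase(path, pattern) for pattern in allowlist)
    )


def exit_code(changed_paths, allowlist):
    """Return 0 when every changed path is in scope, 1 otherwise."""
    return 1 if out_of_scope(changed_paths, allowlist) else 0


if __name__ == "__main__":
    # Hypothetical inputs: in practice the changed paths could come from
    # `git diff --name-only` and the allowlist from the task brief.
    changed = ["src/app.py", "README.md", ".github/workflows/ci.yml"]
    allowed = ["src/*", "README.md"]
    for path in out_of_scope(changed, allowed):
        print(f"out of scope: {path}")
    sys.exit(exit_code(changed, allowed))
```

Returning an exit code rather than only printing lets the same check run locally and in CI without extra wiring.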

Agent Safety Layer

| Repo | What it proves | Why it matters |
|------|----------------|----------------|
| agent-secret-sentinel | Secret leak preflight | Scans agent-generated diffs for likely private keys, provider tokens, suspicious assignments, and high-entropy values before commits, pull requests, or public examples. |
| agent-artifact-redactor | Artifact redaction before public proof | Redacts logs, transcripts, proof packets, and command artifacts for sensitive markers, then writes hash manifests that tie redacted copies to source artifacts before publication. |
| deploy-gate | Human authorization for AI-driven deploys | Blocks sensitive PRs until a named human approves the exact action with a signed receipt. |
| mcp-guard | Tool-call control for MCP agents | Enforces allow, block, or approval rules before dangerous MCP tool calls execute. |
| pp-cli | Local receipt verification | Verifies Permission Protocol receipts with local Ed25519 signature checks. |
| python-sdk | Approval workflow integration | Lets Python workflows request and verify authority receipts around sensitive actions. |
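As an illustration of the secret-leak preflight idea, the sketch below scans the added lines of a unified diff for private-key headers and long high-entropy tokens. The regexes, the 20-character minimum, and the 4.0-bit threshold are assumptions chosen for this example, not the rules agent-secret-sentinel actually applies.

```python
import math
import re
from collections import Counter

PRIVATE_KEY = re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----")
TOKEN = re.compile(r"[A-Za-z0-9+/=_-]{20,}")  # assumed shape of a token


def shannon_entropy(text):
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def scan_added_lines(diff_text, threshold=4.0):
    """Return (line_number, reason) findings for added lines in a diff."""
    findings = []
    for number, line in enumerate(diff_text.splitlines(), start=1):
        # Only inspect additions, and skip the "+++ b/file" header lines.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        if PRIVATE_KEY.search(line):
            findings.append((number, "private key header"))
            continue
        if any(shannon_entropy(tok) >= threshold for tok in TOKEN.findall(line)):
            findings.append((number, "high-entropy token"))
    return findings


if __name__ == "__main__":
    sample = "\n".join([
        "+++ b/config.py",
        "+API_TOKEN = 'q7Zx2LmP9vKd4RtY8wNb3HsJ6cFg1AeU'",
        "+NAME = 'aaaaaaaaaaaaaaaaaaaaaaaa'",
    ])
    print(scan_added_lines(sample))
```

Entropy alone produces false positives (hashes, base64 fixtures) and misses short or structured secrets, so a check like this works best as a preflight that prompts a human look rather than as a final verdict.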

How I Work With Codex

Public Workbench

  • AI lab notes: build notes, decisions, and launch logs tied to real repos or workflows.
  • Recipes: reusable prompts, checklists, and implementation patterns that came from actual work.
  • Examples: concrete proof-packet shapes for verifying profile and agent-workbench claims.
  • Tooling radar: short research only when it changes a build or tooling decision.
  • Docs: public operating docs, including the automation runbook and profile strategy.

Verify This Repo

This profile repo has a small local check so maintenance changes are not just copy edits:

make test
make lint
make build

The check validates shell scripts, compiles local Python audit tools, runs Python unit tests, runs the commit-script shell fixture, executes the profile quality audit, regenerates public indexes, checks latest-proof freshness, refreshes latest-proof links, and fails if generated files drift.
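The last step, failing when generated files drift, can be sketched as a hash comparison around a regeneration step. This is a minimal illustration with hypothetical names (`digest`, `drifted`, and the `regenerate` callable); it is not the repo's actual check, which runs through the `make` targets above.

```python
import hashlib
import tempfile
from pathlib import Path


def digest(path):
    """Return the SHA-256 hex digest of a file, or None if it is missing."""
    path = Path(path)
    if not path.exists():
        return None
    return hashlib.sha256(path.read_bytes()).hexdigest()


def drifted(paths, regenerate):
    """Run `regenerate()` and return the paths whose content changed.

    A non-empty result means the committed files were out of date with
    respect to the generator.
    """
    before = {str(p): digest(p) for p in paths}
    regenerate()
    after = {str(p): digest(p) for p in paths}
    return sorted(p for p in before if before[p] != after[p])


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        index = Path(tmp) / "INDEX.md"
        index.write_text("stale index\n", encoding="utf-8")

        def regenerate():
            # Stand-in for the real index generator.
            index.write_text("fresh index\n", encoding="utf-8")

        changed = drifted([index], regenerate)
        print("drift detected:" if changed else "no drift", changed)
```

A second run after regeneration reports no drift, which is what makes the check usable as a gate: it fails only when someone forgot to commit regenerated output.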

Latest Proof

Principles

  • Ship useful proof, not activity theater.
  • Optimize for reviewability: strong AI workflows should leave evidence.
  • Prefer own repos and working artifacts over meta commentary.
  • Keep claims honest: what exists, what was tested, and what is still limited.
  • Use the workbench as supporting evidence, not as a substitute for real projects.

Pinned

  1. briefboard-local (Public)

    Local-first client brief builder that outputs project briefs and Codex prompts.

    JavaScript

  2. codex-review-packet (Public)

    Generate sharper repo review packets for Codex and Claude Code from local git context.

    Python

  3. manuelsampedro1 (Public)

    AI-native builder workbench: Codex, Claude Code, agents, recipes, and daily lab notes.

    Python

  4. verify-by-change (Public)

    Suggest honest verification steps from changed files in local repos.

    Python

  5. repo-flightcheck (Public)

    Audit whether a repository is truly ready for Codex, Claude Code, and human reviewers.

    JavaScript

  6. agent-run-ledger (Public)

    Audit AI agent runs with a local JSONL ledger and static HTML review reports.

    JavaScript