defensible-cv

A profile-driven pipeline that turns one input file into a recruiter-ready, 2-page developer CV PDF — where every number is verified live (publication downloads, citations, GitHub stars/forks) and committed as proof.

What this is

You fill in profile.yaml (identity + research sources + experience). A Python crawler verifies the metrics against live public APIs and writes data/research.{json,md}. Then the CV is authored into cv_data.yaml (rendercv) and rendered to a clean 2-page PDF.

profile.yaml  ──►  scripts/research.py  ──►  data/research.json + research.md
                                                      │
                                                      ▼
                                  cv_data.yaml  ──►  rendercv  ──►  PDF (2 pages)

Sample output: assets/Nguyen_Phuong_Anh_Tu_CV.pdf.

Use it with an AI agent (recommended)

This repo is built around an agent skill. Open it in any AI coding agent that reads AGENTS.md / SKILL.md files (Claude, Cursor, …). The agent reads AGENTS.md → agents/skills/defensible-cv/SKILL.md, and stops to ask for profile.yaml if it's missing — it never invents your data.

Then just ask, in plain language:

Build a CV for a new person (no profile yet):

I want a developer CV. Here's my info — name: …, location: …, email: …, GitHub: …, publications (DOIs): …, experience: …, education: …, skills: …. Follow AGENTS.md + the defensible-cv skill: write profile.yaml, refresh the metrics, author cv_data.yaml, and render the PDF.

Rebuild from an existing profile.yaml:

Build the CV: follow AGENTS.md + the defensible-cv skill. Refresh metrics from profile.yaml, then render the PDF.

Refresh the numbers only:

Refresh the CV metrics and re-render the PDF.

Do it manually (no agent)

# 1. Create your profile from the template, then fill it in
cp profile.example.yaml profile.yaml

# 2. Verify metrics (reads profile.yaml's research_sources; --no-scholar = CI-safe)
uv run --with requests --with pyyaml --with pypdf \
  python scripts/research.py --no-scholar      # --refresh bypasses 24h cache

# 3. Render the CV to PDF (no venv needed)
uv run --with "rendercv[full]>=2.8" rendercv render cv_data.yaml

scripts/research.py reads only the research_sources section of profile.yaml. If profile.yaml is missing, it stops with a clear message instead of inventing data. The committed profile.yaml and cv_data.yaml are a complete worked example.
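The "stop instead of inventing data" behaviour can be pictured with a small guard like the one below. This is a minimal sketch, not the code in scripts/research.py: the function name `require_profile` and the wording of the message are assumptions made for illustration.

```python
import sys
from pathlib import Path


def require_profile(path: str = "profile.yaml") -> Path:
    """Return the profile path, or exit with a clear message if it is missing.

    Hypothetical helper: the name and message are illustrative only.
    """
    profile = Path(path)
    if not profile.is_file():
        # Stop here rather than falling back to placeholder or guessed data.
        sys.exit(
            f"{profile} not found. Copy profile.example.yaml to {profile} "
            "and fill it in before running the crawler."
        )
    return profile
```

Calling `require_profile()` at the top of a script makes the missing-input case fail loudly before any network request is made.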

How it stays defensible

Every figure on the CV is verified live, then committed to data/research.json so it survives a recruiter's spot-check:

| Source | What it returns |
| --- | --- |
| OJS (e.g. jte.edu.vn) | Per-article download total from the inline `pkpUsageStats` payload. |
| Semantic Scholar Graph API | Paper metadata + paginated citing-paper list (venue, year, authors, DOI). |
| OpenCitations COCI/Meta | Independent citation count + citing DOIs (cross-checks Semantic Scholar). |
| Crossref Event Data | Twitter / Wikipedia / news mentions (best-effort). |
| Google Scholar (`scholarly`) | Optional, soft-fails on CAPTCHA. `--no-scholar` skips it. |
| GitHub Public API | Per-repo stars/forks, languages, totals across non-fork repos. |
| Citing-paper PDFs (Unpaywall → arXiv → `pypdf`) | Downloads each open-access citing paper, finds the reference label that points back to the work, and extracts the verbatim sentence + page where the author invokes it. |
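To illustrate what "cross-checks Semantic Scholar" can mean in practice, here is a minimal sketch that compares the citing DOIs reported by two sources. It assumes both lists have already been fetched. The function names, the returned keys, and the DOIs in the usage example are made up for illustration and are not taken from scripts/research.py.

```python
def normalize_doi(doi: str) -> str:
    """Lowercase a DOI and strip common URL or 'doi:' prefixes."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def cross_check(source_a: list[str], source_b: list[str]) -> dict[str, list[str]]:
    """Split citing DOIs into those both sources report and those only one does."""
    a = {normalize_doi(d) for d in source_a if d}
    b = {normalize_doi(d) for d in source_b if d}
    return {
        "confirmed": sorted(a & b),  # reported by both sources
        "only_a": sorted(a - b),     # e.g. only in the first source
        "only_b": sorted(b - a),     # e.g. only in the second source
    }


# Usage with placeholder DOIs (not real citing papers):
result = cross_check(
    ["10.1000/Example.1", "https://doi.org/10.1000/example.2"],
    ["10.1000/example.2", "10.1000/example.3"],
)
print(result["confirmed"])  # ['10.1000/example.2']
```

Keeping the "only one source" lists, rather than a single merged count, is what lets a disagreement between sources stay visible in the snapshot.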

Concrete citation contexts

For every open-access citing paper, data/research.md records the reference label, the reference-list page, and the in-body sentence(s) that cite the work. Example from the committed run for 10.54644/jte.2024.1514:

[ACM] Integrating Expert Knowledge With Automated Knowledge Component Extraction for Student Modeling — ACM UMAP 2025

Our paper is reference [19] (their p.6).

p.2 — "…ASTs have been widely used in automated code analysis efforts [19]. For example, Rivers used ASTs to identify each syntax structure in a student's submission…"

Paywalled papers (most IEEE, parts of ACM, MDPI behind Akamai) are kept with status not_oa / fetch_failed and clearly flagged — nothing is fabricated.
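The sentence-extraction step described above can be sketched as follows. This is a simplified illustration that assumes the page text has already been extracted from the PDF (for example with pypdf) into one string per page. The function name `find_citation_contexts` is hypothetical, and the real crawler may split sentences and match labels differently.

```python
import re


def find_citation_contexts(pages: list[str], label: str) -> list[tuple[int, str]]:
    """Return (page_number, sentence) pairs whose sentence contains e.g. "[19]".

    pages: extracted text, one string per PDF page (page 1 first).
    label: the reference number that points back to the cited work.
    """
    marker = f"[{label}]"
    hits = []
    for page_number, text in enumerate(pages, start=1):
        # Collapse line breaks left over from PDF text extraction.
        flat = " ".join(text.split())
        # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", flat):
            if marker in sentence:
                hits.append((page_number, sentence))
    return hits
```

This sketch only matches a standalone label such as `[19]`. Grouped citations like `[18, 19]` and the reference-list entry itself would need extra handling.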

Files

profile.yaml                 INPUT: identity + research sources + narrative
profile.example.yaml         documented template — copy to profile.yaml
cv_data.yaml                 rendercv source (authored from profile + research)
scripts/research.py          CLI crawler, single file, urllib + pypdf
data/research.json           machine-readable verified snapshot (regenerated)
data/research.md             recruiter-friendly summary (regenerated)
AGENTS.md                    entry point for AI agents
agents/skills/defensible-cv/SKILL.md   the agent playbook
assets/cv-preview.png        preview banner shown at the top of this README
assets/Nguyen_Phuong_Anh_Tu_CV.pdf     sample rendered CV
requirements.txt             pinned deps

License

MIT.
