A profile-driven pipeline that turns one input file into a recruiter-ready, 2-page developer CV PDF — where every number is verified live (publication downloads, citations, GitHub stars/forks) and committed as proof.
You fill in profile.yaml (identity + research sources + experience). A
Python crawler verifies the metrics against live public APIs and writes
data/research.{json,md}. Then the CV is authored into cv_data.yaml
(rendercv) and rendered to a clean 2-page PDF.
    profile.yaml ──► scripts/research.py ──► data/research.json + research.md
                                                      │
                                                      ▼
                                cv_data.yaml ──► rendercv ──► PDF (2 pages)
Sample output: assets/Nguyen_Phuong_Anh_Tu_CV.pdf.
This repo is built around an agent skill. Open it in any AI coding agent
that reads AGENTS.md / SKILL.md files (Claude, Cursor, …). The agent reads
AGENTS.md → agents/skills/defensible-cv/SKILL.md,
and stops to ask for profile.yaml if it's missing — it never invents your data.
Then just ask, in plain language:
Build a CV for a new person (no profile yet):
I want a developer CV. Here's my info. Name: …, location: …, email: …, GitHub: …, publications (DOIs): …, experience: …, education: …, skills: …. Follow AGENTS.md + the defensible-cv skill: write profile.yaml, refresh the metrics, author cv_data.yaml, and render the PDF.
Rebuild from an existing profile.yaml:
Build the CV: follow AGENTS.md + the defensible-cv skill. Refresh metrics from profile.yaml, then render the PDF.
Refresh the numbers only:
Refresh the CV metrics and re-render the PDF.
    # 1. Create your profile from the template, then fill it in
    cp profile.example.yaml profile.yaml

    # 2. Verify metrics (reads profile.yaml's research_sources; --no-scholar = CI-safe)
    uv run --with requests --with pyyaml --with pypdf \
      python scripts/research.py --no-scholar   # --refresh bypasses 24h cache

    # 3. Render the CV to PDF (no venv needed)
    uv run --with "rendercv[full]>=2.8" rendercv render cv_data.yaml

scripts/research.py reads only profile.yaml's research_sources. If
profile.yaml is missing, it stops with a clear message instead of inventing
data. The committed profile.yaml / cv_data.yaml are a complete worked example.
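Two behaviors from the steps above, stopping when profile.yaml is missing and bypassing the 24h cache with --refresh, could be sketched as follows. This is a minimal illustration, not the code in scripts/research.py: the helper names `require_profile` and `should_refetch`, the wording of the message, and the use of file modification time to measure cache age are all assumptions.

```python
import os
import time

CACHE_TTL_SECONDS = 24 * 60 * 60  # the 24h cache window mentioned in step 2


def require_profile(path="profile.yaml"):
    """Stop with a clear message when the profile is missing (hypothetical helper)."""
    if not os.path.isfile(path):
        # SystemExit with a string prints it to stderr and exits with status 1.
        raise SystemExit(
            f"{path} not found: copy profile.example.yaml to {path} and fill it in."
        )
    return path


def should_refetch(cache_path, refresh=False, now=None):
    """Return True when a cached snapshot should be fetched again.

    Assumption: freshness is judged by the cache file's modification time.
    """
    if refresh:
        return True  # --refresh always bypasses the cache
    if not os.path.exists(cache_path):
        return True  # nothing cached yet
    now = time.time() if now is None else now
    age = now - os.path.getmtime(cache_path)
    return age >= CACHE_TTL_SECONDS
```

Passing `now` explicitly keeps the freshness check easy to test without waiting for the cache to expire.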
Every figure on the CV is verified live, then committed to data/research.json
so it survives a recruiter's spot-check:
| Source | What it returns |
|---|---|
| OJS (e.g. jte.edu.vn) | Per-article download total from the inline pkpUsageStats payload. |
| Semantic Scholar Graph API | Paper metadata + paginated citing-paper list (venue, year, authors, DOI). |
| OpenCitations COCI/Meta | Independent citation count + citing DOIs (cross-checks Semantic Scholar). |
| Crossref Event Data | Twitter / Wikipedia / news mentions (best-effort). |
| Google Scholar (scholarly) | Optional, soft-fails on CAPTCHA. --no-scholar skips it. |
| GitHub Public API | Per-repo stars/forks, languages, totals across non-fork repos. |
| Citing-paper PDFs (Unpaywall → arXiv → pypdf) | Downloads each open-access citing paper, finds the reference label that points back to the work, and extracts the verbatim sentence + page where the author invokes it. |
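The cross-check between Semantic Scholar and OpenCitations listed in the table could look roughly like the sketch below. It assumes both sources have already been reduced to plain lists of citing DOIs; the function names and the keys of the returned dictionary are hypothetical, and the DOIs in the usage example are placeholders rather than real papers.

```python
DOI_PREFIXES = (
    "https://doi.org/",
    "http://doi.org/",
    "https://dx.doi.org/",
    "http://dx.doi.org/",
    "doi:",
)


def normalize_doi(doi):
    """Lowercase a DOI and strip a leading resolver URL or 'doi:' prefix.

    DOIs are case-insensitive, so lowercasing makes the two sources comparable.
    """
    doi = doi.strip().lower()
    for prefix in DOI_PREFIXES:
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi


def cross_check(semantic_scholar_dois, opencitations_dois):
    """Compare two lists of citing DOIs and report where they agree or differ."""
    a = {normalize_doi(d) for d in semantic_scholar_dois if d}
    b = {normalize_doi(d) for d in opencitations_dois if d}
    return {
        "confirmed_by_both": sorted(a & b),
        "only_semantic_scholar": sorted(a - b),
        "only_opencitations": sorted(b - a),
    }


# Placeholder DOIs, for illustration only.
report = cross_check(
    ["10.1000/ABC", "https://doi.org/10.1000/xyz", None],
    ["10.1000/abc", "doi:10.1000/new"],
)
print(report["confirmed_by_both"])
```

Entries that appear in only one source are kept in the report rather than discarded, so a disagreement stays visible.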
For every open-access citing paper, data/research.md records the reference
label, the reference-list page, and the in-body sentence(s) that cite the work.
Example from the committed run for 10.54644/jte.2024.1514:
[ACM] Integrating Expert Knowledge With Automated Knowledge Component Extraction for Student Modeling — ACM UMAP 2025
- Our paper is reference [19] (their p.6).
- p.2 — "…ASTs have been widely used in automated code analysis efforts [19]. For example, Rivers used ASTs to identify each syntax structure in a student's submission…"
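Once the reference label is known (here `[19]`), locating the in-body sentences can be approximated with a regular expression over the per-page text. The sketch below assumes the page texts have already been extracted (for example with pypdf) into a list of strings. The function name `find_citing_sentences` is hypothetical, and this simplified version handles numeric labels such as `[19]` or `[3, 19]` but not ranges such as `[17-20]`.

```python
import re

BRACKET = re.compile(r"\[([^\]]+)\]")
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def find_citing_sentences(pages, label):
    """Return (page_number, sentence) pairs whose bracketed citation includes label.

    pages: list of page texts in reading order (page 1 first).
    label: reference number as a string, e.g. "19".
    """
    hits = []
    for page_number, text in enumerate(pages, start=1):
        flat = " ".join(text.split())  # collapse PDF line breaks into spaces
        for sentence in SENTENCE_END.split(flat):
            for match in BRACKET.finditer(sentence):
                numbers = [part.strip() for part in re.split(r"[,;]", match.group(1))]
                if label in numbers:
                    hits.append((page_number, sentence))
                    break  # record each sentence once
    return hits


pages = [
    "Introductory text. Earlier work is surveyed in [3].",
    "ASTs have been widely used in automated code analysis\nefforts [19]. For example, Rivers used ASTs.",
]
for page_number, sentence in find_citing_sentences(pages, "19"):
    print(f"p.{page_number}: {sentence}")
```

Matching on the exact number inside the brackets avoids false hits such as `[190]` when searching for `19`.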
Paywalled papers (most IEEE, parts of ACM, MDPI behind Akamai) are kept with
status not_oa / fetch_failed and clearly flagged — nothing is fabricated.
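Keeping a paper in the output with a status, rather than dropping it, might be implemented along these lines. The status values `not_oa` and `fetch_failed` come from the text above; the record shape (a dict with an optional `pdf_url` key), the `ok` status for the success case, and the injected `download` callable are assumptions made for this sketch.

```python
def fetch_citing_pdf(record, download):
    """Attach a status to one citing-paper record instead of dropping it.

    record: dict with an optional "pdf_url" key (assumed shape).
    download: callable taking a URL and returning PDF bytes; may raise OSError
              (urllib.error.URLError is a subclass of OSError).
    """
    result = dict(record)  # copy so the caller's record is not mutated
    url = record.get("pdf_url")
    if not url:
        result["status"] = "not_oa"  # no open-access copy is known
        return result
    try:
        result["pdf_bytes"] = download(url)
        result["status"] = "ok"  # assumed name for the success case
    except OSError:
        result["status"] = "fetch_failed"  # e.g. blocked by a CDN or timed out
    return result
```

Because every record leaves the function with a status, a later report step can flag the papers it could not read instead of silently omitting them.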
    profile.yaml                          INPUT: identity + research sources + narrative
    profile.example.yaml                  documented template; copy to profile.yaml
    cv_data.yaml                          rendercv source (authored from profile + research)
    scripts/research.py                   CLI crawler, single file, urllib + pypdf
    data/research.json                    machine-readable verified snapshot (regenerated)
    data/research.md                      recruiter-friendly summary (regenerated)
    AGENTS.md                             entry point for AI agents
    agents/skills/defensible-cv/SKILL.md  the agent playbook
    assets/cv-preview.png                 the preview banner above
    assets/Nguyen_Phuong_Anh_Tu_CV.pdf    sample rendered CV
    requirements.txt                      pinned deps
MIT.
