noodlebindev/marker-pdf

marker-pdf skill

A Claude Code skill that teaches the agent to drive marker — the PDF→Markdown converter — and pick the right command for the job instead of fumbling flags.

Marker turns PDFs (and images, docx, pptx) into clean Markdown, JSON, HTML, or RAG-ready chunks, preserving tables, headings, reading order, and equations, with OCR for scanned pages. This skill encodes the operator knowledge: when to skip OCR, which output format to pick, when to escalate to LLM mode, how to verify on a small page range first, and how to get past the common install/first-run traps.

What it gives the agent

  • A decision flow — preflight (detect, don't ask) → verify on 2 pages → OCR on/off → output format → escalate to LLM mode only if needed.
  • Verified flags — every flag and LLM-service path checked against marker 1.10.x, not guessed.
  • Real provider setup — Gemini, Claude, OpenAI, Ollama, Vertex, Azure for --use_llm.
  • Troubleshooting — the Python 3.10+ trap, ~3 GB first-run model download, GPU contention, OOM on large PDFs, and the timing reality (~1–2 min/page on dense docs).
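
The decision flow above can be pictured as a small command builder. This is a minimal sketch, not the skill's own logic: `build_marker_command` and `report.pdf` are hypothetical names, and the flags used (`--output_format`, `--page_range`, `--force_ocr`, `--use_llm`) are the ones documented upstream as far as I know, so confirm them with `marker_single --help` on your installed version. It also assumes page ranges are zero-based.

```python
from typing import List, Optional

# Output formats named in this README.
OUTPUT_FORMATS = {"markdown", "json", "html", "chunks"}


def build_marker_command(
    pdf_path: str,
    output_format: str = "markdown",
    page_range: Optional[str] = None,
    force_ocr: bool = False,
    use_llm: bool = False,
) -> List[str]:
    """Assemble a marker_single invocation as an argument list (hypothetical helper)."""
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported output format: {output_format}")
    cmd = ["marker_single", pdf_path, "--output_format", output_format]
    if page_range is not None:
        # Restrict the run to a few pages to verify quality cheaply first.
        cmd += ["--page_range", page_range]
    if force_ocr:
        # Assumed flag: re-run OCR even when the PDF already has a text layer.
        cmd.append("--force_ocr")
    if use_llm:
        # Escalate to LLM mode only after the plain run falls short.
        cmd.append("--use_llm")
    return cmd


# Step 1: trial run on the first two pages (assumed zero-based: "0-1").
trial = build_marker_command("report.pdf", page_range="0-1")
print(" ".join(trial))

# Step 2: full run, escalated to LLM mode only if the trial looked poor.
full = build_marker_command("report.pdf", use_llm=True)
print(" ".join(full))
```

Returning an argument list rather than a shell string means the result can be passed straight to `subprocess.run` without quoting problems for paths that contain spaces.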

Triggering, measured

The skill description, which determines when Claude Code triggers the skill, is validated by a reproducible trigger eval (evals/) that tests the real installed skill across 33 labelled queries (3 runs each):

accuracy   precision   recall
97%        100%        94%

100% precision = zero false triggers across 17 near-miss negatives (summarize, Word→PDF, merge, form-fill, xlsx→csv, markdown-README, JSON→markdown, …). Reproduce with:

python evals/run_trigger_eval.py --eval-set evals/trigger-evals.json \
  --skill-name marker-pdf --model <your-model-id> --runs 3

Note: this harness exists because the official skill-creator trigger eval can't measure an already-installed skill — it matches a synthetic command name while the model fires the real skill, yielding a false 0% recall.
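
For readers who want to check the arithmetic behind the reported figures, here is a minimal sketch. The counts are an assumption: 33 queries with 17 negatives implies 16 positives, and one missed positive with no false triggers is a split consistent with the rounded percentages. How the harness aggregates the 3 runs per query is not shown here, so treat `trigger_metrics` as a hypothetical helper rather than the harness's own scoring code.

```python
def trigger_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision and recall from confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Assumed split: 16 positives (15 triggered, 1 missed), 17 negatives (none triggered).
metrics = trigger_metrics(tp=15, fp=0, fn=1, tn=17)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

With these assumed counts the script prints 97% accuracy, 100% precision, and 94% recall.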

Install the skill

git clone https://github.com/<you>/marker-pdf-skill ~/.claude/skills/marker-pdf

Claude Code discovers it automatically. It triggers when you ask to convert/extract a document to markdown/JSON/chunks, OCR a scanned or image-based document, batch-convert a folder, or mention marker / marker_single.

Install marker itself

The skill assumes the marker CLI is available, though it can also install or repair marker for you. Marker needs Python 3.10+:

uv tool install --python 3.12 marker-pdf

This installs six commands to ~/.local/bin: marker_single, marker, marker_gui, marker_server, marker_chunk_convert, marker_extract.
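
A quick way to confirm the install landed on your PATH is sketched below. It is an illustration only: `missing_commands` is a hypothetical helper, and the check looks only at whether each of the six command names resolves, not whether the tool actually runs.

```python
import shutil
from typing import Callable, List, Optional

# The six commands this README says the install provides.
MARKER_COMMANDS = (
    "marker_single",
    "marker",
    "marker_gui",
    "marker_server",
    "marker_chunk_convert",
    "marker_extract",
)


def missing_commands(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the marker commands that cannot be found on PATH."""
    return [name for name in MARKER_COMMANDS if which(name) is None]


missing = missing_commands()
if missing:
    print("Not on PATH:", ", ".join(missing))
else:
    print("All six marker commands found.")
```

If commands are reported missing right after installing, a likely cause is that `~/.local/bin` is not on your PATH.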

Files

Path          Contents
SKILL.md      Decision flow, quick start, recipes, troubleshooting. The agent loads this.
REFERENCE.md  Full flag tables, output-format guide, LLM-mode provider setup, deeper troubleshooting.
evals/        Trigger eval set + the installed-skill eval harness + last results.

Credits

A wrapper around datalab-to/marker by Datalab — all the heavy lifting is theirs. This repo only teaches an agent to use it well.

License

MIT — see LICENSE.
