Runpod-Worker-Marker

A RunPod serverless worker that converts PDFs and images to Markdown, HTML, JSON, or chunks using Marker.


Features

  • Converts PDF, PNG, JPEG, TIFF, and BMP files.
  • Output formats: markdown, html, json, chunks.
  • Optional LLM-assisted conversion with your choice of LLM service and model.
  • GPU-accelerated via CUDA.
  • Dependency management via UV.

CI/CD — Build & Push to GitHub Container Registry

A GitHub Actions workflow (.github/workflows/docker-build-push.yml) automatically builds and pushes the Docker image to the GitHub Container Registry (ghcr.io) on every push to main and on semver tags (v*.*.*).

Authentication uses the built-in GITHUB_TOKEN; no extra secrets or variables are required. The image is published under ghcr.io/<owner>/<repo> (e.g. ghcr.io/appsflare/runpod-worker-marker).

Tags produced

| Trigger | Tags pushed |
| --- | --- |
| Push to `main` | `:latest`, `:sha-<short-sha>` |
| Tag `v1.2.3` | `:v1.2.3`, `:1.2`, `:sha-<short-sha>` |

Local Development

Prerequisites

  • UV installed.
  • Python 3.11+.
  • (Optional) CUDA-capable GPU.

Setup

```shell
uv sync
```

This reads pyproject.toml and installs all dependencies (including PyTorch with CUDA 12.1 wheels).


Building the Docker Image

```shell
docker build -t runpod-worker-marker:latest .
```

Deploying on RunPod

  1. Push the image to a container registry (Docker Hub, GHCR, etc.).
  2. Create a new Serverless endpoint in the RunPod console.
  3. Set the container image to your pushed image.
  4. Configure the following environment variables as needed:
| Variable | Default | Description |
| --- | --- | --- |
| `TORCH_DEVICE` | `cuda` | Inference device (`cuda` or `cpu`). |
| `MODEL_CACHE_DIR` | `/models` | Directory where Marker/Surya models are downloaded and loaded from. Set this to a persistent volume mount path (e.g. `/runpod-volume/models`) so models are reused across cold starts instead of being re-downloaded each time. |

Using a persistent volume for models

In the RunPod console, attach a Network Volume to your serverless endpoint and set MODEL_CACHE_DIR to its mount path (e.g. /runpod-volume/models). On first run the models will be downloaded there; all subsequent cold starts will load from the volume, eliminating download time.
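Inside the worker, the two environment variables above can be read with plain `os.environ` lookups. The sketch below is illustrative only (the function name `resolve_runtime_config` is not part of the worker); the defaults mirror the table above.

```python
import os

def resolve_runtime_config() -> dict:
    """Read the worker's runtime settings from the environment.

    TORCH_DEVICE and MODEL_CACHE_DIR are the variables documented above;
    the fallbacks match the documented defaults (cuda, /models).
    """
    return {
        "device": os.environ.get("TORCH_DEVICE", "cuda"),
        "model_cache_dir": os.environ.get("MODEL_CACHE_DIR", "/models"),
    }

config = resolve_runtime_config()
print(config)
```

With no variables set, this yields `{"device": "cuda", "model_cache_dir": "/models"}`, matching the defaults table.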


Input Schema

Send a JSON payload to your endpoint:

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| `pdf` | string | yes | | Base64-encoded file bytes or a public URL to download the file from. |
| `filename` | string | no | `document.pdf` | Original filename; used for file-type detection. |
| `page_range` | string | no | all pages | Page range, e.g. `"0-5"`. |
| `force_ocr` | boolean | no | `false` | Force OCR even when a text layer is present. |
| `paginate_output` | boolean | no | `false` | Insert page delimiters into the output. |
| `output_format` | string | no | `"markdown"` | One of `"markdown"`, `"html"`, `"json"`, `"chunks"`. |
| `use_llm` | boolean | no | `false` | Enable LLM-assisted conversion. |
| `llm_service` | string | no | `"marker.services.ollama.OllamaService"` | Fully-qualified LLM service class. Requires `use_llm=true`. |
| `llm_config` | object | no | | Service-specific config dict passed to the service constructor. Requires `use_llm=true`. See examples below. |
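A payload matching this schema can be assembled from a local file with a few lines of Python. The helper name `build_payload` and the keyword passthrough are illustrative, not part of the worker; only the resulting JSON shape is defined by the schema above.

```python
import base64
import json

def build_payload(file_bytes: bytes, filename: str = "document.pdf",
                  output_format: str = "markdown", **options) -> dict:
    """Assemble a request body matching the input schema above.

    Extra keyword arguments (force_ocr, use_llm, ...) are passed through
    into the "input" object unchanged.
    """
    return {
        "input": {
            # The worker accepts either base64-encoded bytes or a URL here.
            "pdf": base64.b64encode(file_bytes).decode("ascii"),
            "filename": filename,
            "output_format": output_format,
            **options,
        }
    }

body = build_payload(b"%PDF-1.4 ...", filename="report.pdf", force_ocr=True)
print(json.dumps(body)[:80])
```

Send the resulting dict as the JSON body of a request to your RunPod endpoint.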

Example — Markdown (base64 input)

```json
{
  "input": {
    "pdf": "<base64-encoded PDF bytes>",
    "filename": "report.pdf",
    "output_format": "markdown"
  }
}
```

Example — HTML via URL

```json
{
  "input": {
    "pdf": "https://example.com/document.pdf",
    "filename": "document.pdf",
    "output_format": "html"
  }
}
```
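Because the `pdf` field accepts either base64 bytes or a URL, one plausible client-side check is to look for an http(s) scheme. This is a sketch of that heuristic only; the worker's actual detection logic may differ.

```python
def looks_like_url(pdf_field: str) -> bool:
    """Treat the field as a download URL when it starts with an http(s)
    scheme; otherwise assume it holds base64-encoded file bytes."""
    return pdf_field.startswith(("http://", "https://"))

print(looks_like_url("https://example.com/document.pdf"))  # True
print(looks_like_url("JVBERi0xLjQK"))                      # False (base64)
```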

Example — LLM-assisted with Ollama (qwen3-vl:8b)

```json
{
  "input": {
    "pdf": "<base64-encoded PDF bytes>",
    "filename": "report.pdf",
    "output_format": "markdown",
    "use_llm": true,
    "llm_service": "marker.services.ollama.OllamaService",
    "llm_config": {
      "ollama_model": "qwen3-vl:8b",
      "ollama_base_url": "http://localhost:11434"
    }
  }
}
```

Output Schema

| Field | Type | Description |
| --- | --- | --- |
| `success` | boolean | `true` on successful conversion. |
| `filename` | string | Original filename. |
| `output_format` | string | Format used. |
| `markdown` | string \| null | Markdown text (when `output_format="markdown"`). |
| `html` | string \| null | HTML text (when `output_format="html"`). |
| `json` | object \| null | Structured data (when `output_format="json"`). |
| `chunks` | string \| null | Chunks text (when `output_format="chunks"`). |
| `images` | object | Map of image name → base64-encoded PNG string. |
| `metadata` | object | Marker metadata (language, page stats, etc.). |
| `page_count` | integer | Number of pages processed. |
| `error` | string | Present only on failure; describes what went wrong. |
