Merged
10 changes: 6 additions & 4 deletions README.md
@@ -34,8 +34,8 @@
| Project | Responsibility | Status |
| --------------- | ---------------------------------------------- | ------------ |
| `light-runner` | Spawn one container, return exit code + files | **this repo** |
-| `light-run` | CLI + HTTP wrapper around `light-runner` | planned |
-| `light-process` | DAG orchestration, retries, fan-out | planned |
+| `light-run` | CLI + HTTP wrapper around `light-runner` | [shipped](https://enixcode.github.io/light-run/) |
+| `light-process` | DAG orchestration, retries, fan-out | [shipped](https://enixcode.github.io/light-process/) |

Each layer is an independent npm package. Use `light-runner` alone when you just need to run code in a sandbox; pick the higher layers when you want scheduling or an HTTP surface.

@@ -264,6 +264,8 @@ interface RunState {
image: string;
workdir: string;
entrypoint?: string;
+timeout?: number;
+extract?: ExtractSpec[];
startedAt: string;
finishedAt?: string;
status: 'running' | 'exited' | 'cancelled' | 'failed';
@@ -333,13 +335,13 @@ Failures talk to you in a structured way, and stale resources do not pile up.
- `VOLUME_CREATE_FAILED` - per-run volume could not be created
- `CONTAINER_START_FAILED` - container could not be created or started
- `SEED_FAILED` - host folder could not be streamed into the volume
-- `EXTRACT_FAILED` - artefact streaming out of the container failed
+- `EXTRACT_FAILED` - artifact streaming out of the container failed
- `BUILD_FAILED` - `run[]` setup build failed (transport or non-zero RUN step)
- `INVALID_RUN_STEP` - a `run[]` entry violated the validation rules
- `NETWORK_CONNECT_FAILED` - an extra `networks[]` entry could not be attached (e.g. the network does not exist)

Each error carries an optional `dockerOp` (which docker call was in flight) and `containerId` (when known) for log correlation.
-- **Orphan reaping.** `DockerRunner.reapOrphans()` lists every `light-runner-*` container and volume tagged with the library's label, then removes any that have been idle or exited longer than `LIGHT_RUNNER_REAP_AGE_MS` (default 5 min). Returns `{ containers, volumes }` counts. Safe to call from a cron, a process-shutdown hook, or a sibling watchdog.
+- **Orphan reaping.** `DockerRunner.reapOrphans()` lists every `light-runner-*` container and volume tagged with the library's label. Containers idle or exited longer than `LIGHT_RUNNER_REAP_AGE_MS` (default 5 min) are removed; their associated volumes are removed unconditionally (the daemon refuses to remove a volume that is still in use by a running container). Returns `{ containers, volumes }` counts. Safe to call from a cron, a process-shutdown hook, or a sibling watchdog.
- **Volume sweeping.** `DockerRunner.cleanupOrphanVolumes()` removes `light-runner-*` volumes, skipping any created within the last `LIGHT_RUNNER_REAP_AGE_MS` (default 5 min) so it cannot yank a volume out from under a concurrent run that has not yet mounted it. Returns the count removed.
- **State-file GC.** Every run writes one JSON file per run id under the state dir (both attached and detached since v0.15). `DockerRunner.cleanupOldStates(maxBytes?)` caps that directory: once its total size exceeds `maxBytes` (default `LIGHT_RUNNER_STATE_MAX_BYTES`, or 50 MiB) it deletes terminal states (`exited` / `cancelled` / `failed`) oldest-first until back under budget. `running` states are never removed. Returns the count deleted. This is distinct from `cleanupOrphanStates()`, which only reconciles `running` -> `failed` and deletes nothing. Synchronous (pure filesystem); call it on boot or a schedule.
- **Attached vs detached state.** Attached (synchronous) runs are persisted purely for observability (`listStates()`) and post-mortem reconciliation: they cannot be resumed live (`AutoRemove` tears the container down on exit, and `onLog` output is not replayed on `attach`). If an attached launcher dies mid-run, the container is auto-removed and the state stays `running` until `cleanupOrphanStates()` marks it `failed`.
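
Reviewer sketch (not part of the diff): the state-file GC bullet above describes a size-capped, oldest-first eviction that never touches `running` states. The snippet below illustrates that policy as a pure function. It is a minimal sketch under stated assumptions, not the library's implementation: `StateFile`, `selectStatesToDelete`, and the `finishedAt` ordering key are hypothetical names chosen for illustration; only the 50 MiB default and the terminal-status rule come from the README text.

```typescript
// Hypothetical metadata for one state file; not the library's RunState type.
type Status = 'running' | 'exited' | 'cancelled' | 'failed';

interface StateFile {
  id: string;
  status: Status;
  bytes: number;      // size of the JSON file on disk
  finishedAt: number; // epoch ms; assumed ordering key for "oldest-first"
}

// 50 MiB, the default budget documented in the README.
const DEFAULT_MAX_BYTES = 50 * 1024 * 1024;

// Returns the ids to delete: terminal states, oldest first, until the
// directory total is back within maxBytes. Running states are never chosen.
function selectStatesToDelete(
  states: StateFile[],
  maxBytes: number = DEFAULT_MAX_BYTES,
): string[] {
  let total = states.reduce((sum, s) => sum + s.bytes, 0);

  // filter() returns a new array, so sorting does not mutate the input.
  const candidates = states
    .filter((s) => s.status !== 'running')
    .sort((a, b) => a.finishedAt - b.finishedAt);

  const toDelete: string[] = [];
  for (const s of candidates) {
    if (total <= maxBytes) break; // already within budget, stop evicting
    toDelete.push(s.id);
    total -= s.bytes;
  }
  return toDelete;
}
```

One consequence of this policy is that the directory can remain over budget when the remaining files all belong to `running` states, which is consistent with the README's statement that those are never removed.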
4 changes: 2 additions & 2 deletions website/content/docs/detached.mdx
@@ -19,7 +19,7 @@ const runner = new DockerRunner();

const execution = runner.run({
image: 'python:3.12-alpine',
-command: 'python long_job.py',
+entrypoint: 'python long_job.py',
dir: './job',
timeout: 6 * 60 * 60 * 1000, // 6 hours
extract: [{ from: '/app/result.json', to: './out' }],
@@ -74,7 +74,7 @@ Every detached run writes one JSON file under `~/.light-runner/state/<id>.json`
"volume": "light-runner-abc123def456",
"image": "python:3.12-alpine",
"workdir": "/app",
-"command": "python long_job.py",
+"entrypoint": "python long_job.py",
"timeout": 21600000,
"extract": [{"from": "/app/result.json", "to": "./out"}],
"startedAt": "2026-04-27T10:15:00.000Z",
6 changes: 3 additions & 3 deletions website/content/docs/extract.mdx
@@ -9,7 +9,7 @@ title: "Extract files"
```ts
const execution = runner.run({
image: 'node:lts-alpine',
-command: 'node build.js',
+entrypoint: 'node build.js',
dir: './project',
extract: [
{ from: '/app/dist', to: './out' }, // folder, recursive
@@ -87,15 +87,15 @@ Have the container write a JSON file, extract it, parse it on the host:

const result = await runner.run({
image: 'python:3.12-alpine',
-command: 'python main.py',
+entrypoint: 'python main.py',
dir: './solver',
extract: [{ from: '/app/result.json', to: './out' }],
}).result;

const data = JSON.parse(fs.readFileSync('./out/result.json', 'utf8'));
```

-### Pull a build artefact
+### Pull a build artifact

```ts
extract: [
6 changes: 3 additions & 3 deletions website/content/docs/gvisor-kata.mdx
@@ -108,7 +108,7 @@ A quick sanity check that the runtime actually changed:
```ts
const result = await runner.run({
image: 'alpine:3.19',
-command: 'cat /proc/self/maps | head -5',
+entrypoint: 'cat /proc/self/maps | head -5',
// gVisor-mapped binaries look very different from runc-mapped ones
}).result;
```
@@ -118,7 +118,7 @@ Or look for the gVisor signature:
```ts
const result = await runner.run({
image: 'alpine:3.19',
-command: 'dmesg | grep -i gvisor || echo "not gvisor"',
+entrypoint: 'dmesg | grep -i gvisor || echo "not gvisor"',
}).result;
```

@@ -131,4 +131,4 @@ Under `runsc` you will see `gVisor` strings in `dmesg`. Under `runc`, you will s
| Maximum speed, you trust the code | `runc` (default) |
| Hard barrier against kernel exploits, willing to pay 10-30% I/O | `runsc` |
| Full VM isolation, separate kernel | `kata` (un-validated, at your own risk) |
-| Air-gapped network on top of any of the above | `network: 'none'` on the request |
+| Air-gapped network on top of any of the above | `networks: ['none']` on the request |
2 changes: 2 additions & 0 deletions website/content/docs/index.mdx
@@ -34,3 +34,5 @@ console.log(result.exitCode, result.extracted);
```

Start with the [Quick start](/docs/quickstart), then explore [Extract files](/docs/extract), [Detached runs](/docs/detached), the [Security model](/docs/security), and [gVisor & Kata](/docs/gvisor-kata).

+**See also:** [light-run](https://enixcode.github.io/light-run/) — HTTP wrapper around light-runner, and [light-process](https://enixcode.github.io/light-process/) — DAG orchestration layer.
4 changes: 2 additions & 2 deletions website/content/docs/quickstart.mdx
@@ -25,7 +25,7 @@ const runner = new DockerRunner({ memory: '512m', cpus: '1' });

const execution = runner.run({
image: 'python:3.12-alpine',
-command: 'python main.py',
+entrypoint: 'python main.py',
dir: './my-project',
input: { task: 'compute', n: 20 },
timeout: 30_000,
@@ -51,7 +51,7 @@ result.extracted // [{ from, to, status, bytes? }, ...] if extract was set

## Where to go next

-- [Extract files](./extract) — how to pull artefacts out of the container after a run
+- [Extract files](./extract) — how to pull artifacts out of the container after a run
- [Detached runs](./detached) — long-running jobs that survive a host restart
- [Security model](./security) — what the sandbox protects against and what it does not
- [gVisor & Kata](./gvisor-kata) — adding a stronger runtime for hostile code
8 changes: 4 additions & 4 deletions website/content/docs/security.mdx
@@ -52,7 +52,7 @@ A `setuid` binary inside the container **cannot elevate above the user it starts

Default network: a **dedicated isolated bridge** (`light-runner-isolated`) with **inter-container traffic disabled** (`com.docker.network.bridge.enable_icc: false`). Outbound internet works; sibling runs on the same bridge cannot see each other.

-For air-gapped runs, set `network: 'none'` in the request — the container has no network interface at all.
+For air-gapped runs, set `networks: ['none']` in the request — the container has no network interface at all.

### Filesystem protections

@@ -77,7 +77,7 @@ For genuinely hostile code (anonymous user-submitted source, AI-agent-generated
- **`input`** (stdin). Ephemeral. Not in metadata, not in `docker inspect`. Your container reads it via `sys.stdin.read()` / `process.stdin`.
- **A bind mount to `/run/secrets/<name>`**. Docker-native file-based secrets pattern (compose has a `secrets:` block for this). Not managed by light-runner — the consumer wires it via the host config or via a parent compose definition.

-Avoid putting API keys in `env`. Avoid putting them in `command`. Avoid putting them in `dir` (they would be tarred into the seed archive and visible to anyone with access to the volume).
+Avoid putting API keys in `env`. Avoid putting them in `entrypoint`. Avoid putting them in `dir` (they would be tarred into the seed archive and visible to anyone with access to the volume).

## Hardening recipes

@@ -87,9 +87,9 @@ Avoid putting API keys in `env`. Avoid putting them in `command`. Avoid putting
const runner = new DockerRunner();
runner.run({
image: 'python:3.12-alpine',
-command: 'python untrusted.py',
+entrypoint: 'python untrusted.py',
dir: './sandbox',
-network: 'none',
+networks: ['none'],
timeout: 30_000,
extract: [{ from: '/app/output.json', to: './out' }],
});