feat(download): bulk workspace export — session + project zip endpoints#87

Open
lucastononro wants to merge 3 commits into release/v0.0.5 from feat/workspace-download

@lucastononro lucastononro commented May 12, 2026

Owner

Summary

Closes #79.

Adds two streaming endpoints and matching UI so users can leave a session with the agent's work on their laptop:

  • GET /api/sessions/{session_id}/download — zip of /sessions/{sid}/
  • GET /api/projects/{project_id}/download — zip of every session in a project, namespaced under sessions/{slug}/

Each zip ships three synthetic files at the root:

  • trainable_local.py: local SDK shim. Agent scripts that run from trainable import log, log_image, ... now resolve against the local filesystem; output lands in ./trainable_out/ instead of on the Modal volume.
  • requirements.txt: a pinned subset of the sandbox image (pandas/numpy/sklearn/etc.) that skips server-only deps (FastAPI, modal-client, anthropic, OTel).
  • README.md: a runbook (python -m venv .venv && pip install -r requirements.txt && python -c "import trainable_local"), plus notes on what's not in the zip (raw datasets, GPU parity, round-trip telemetry).
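
To make the shim's role concrete, here is a minimal sketch of what a local `log` could look like. It is not the shipped `trainable_local.py`: the record layout (`step`, `metrics`, `ts`) and the `out_dir` parameter are assumptions for illustration; only the `./trainable_out/metrics.jsonl` destination comes from this PR.

```python
import json
import time
from pathlib import Path

# Directory name taken from the PR text. The record schema below is an
# assumption for illustration, not the shim's actual format.
OUT_DIR = Path("trainable_out")


def log(step, metrics, out_dir=OUT_DIR):
    """Append one metrics record as a JSON line to <out_dir>/metrics.jsonl."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    record = {"step": step, "metrics": dict(metrics), "ts": time.time()}
    # Append mode, so repeated calls build up a JSONL history.
    with (out_dir / "metrics.jsonl").open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record
```

Calling `log(1, {"loss": 0.5})` from a downloaded script would then append a line locally instead of touching the Modal volume.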

Design

Streaming, not materialized. A _StreamingZipBuffer (BytesIO-shaped sink) is drained after every ZipFile.writestr, so the response body trickles out to the client without buffering the full archive. Verified on a 500 KB workspace in tests; the design extends to multi-hundred-MB sessions without holding the bytes in memory.
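
A rough sketch of the drain-per-write idea, using only the standard library. The class and function names here are hypothetical stand-ins, not the PR's `_StreamingZipBuffer`; the point is that `zipfile.ZipFile` accepts a write-only sink, so each `writestr` can be flushed to the caller immediately.

```python
import zipfile


class StreamingSink:
    """Write-only file-like object; zipfile treats it as non-seekable."""

    def __init__(self):
        self._chunks = []

    def write(self, data):
        self._chunks.append(bytes(data))
        return len(data)

    def flush(self):
        pass  # nothing buffered below this layer

    def drain(self):
        """Return everything written since the last drain and reset."""
        out = b"".join(self._chunks)
        self._chunks.clear()
        return out


def stream_zip(files):
    """Yield zip bytes incrementally for an iterable of (name, bytes) pairs."""
    sink = StreamingSink()
    with zipfile.ZipFile(sink, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, data in files:
            zf.writestr(name, data)
            chunk = sink.drain()
            if chunk:
                yield chunk
    # Closing the ZipFile writes the central directory into the sink.
    tail = sink.drain()
    if tail:
        yield tail
```

In this sketch each member is still held in memory while it is written; only the archive as a whole is never materialized. A generator like this could be handed to a streaming HTTP response.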

Cap + safety net. Default 2 GB uncompressed cap per export. When hit, the walk continues so the trailing __truncated.txt lists every omitted path; the zip itself closes cleanly so the browser finishes the download. Cap is overridable per-call for a future opt-in larger-export endpoint.
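
The cap behaviour can be sketched as a pure planning step. This is an illustration under assumptions, not the code in `workspace_export.py`: the function names are hypothetical, and it assumes a file is skipped whenever adding it would exceed the cap while smaller later files may still fit.

```python
DEFAULT_CAP_BYTES = 2 * 1024**3  # 2 GB uncompressed, per the description above


def plan_export(entries, cap_bytes=DEFAULT_CAP_BYTES):
    """Split (path, size) entries into included and omitted paths.

    The walk never stops early, so every omitted path is recorded.
    """
    included, omitted = [], []
    total = 0
    for path, size in entries:
        if total + size > cap_bytes:
            omitted.append(path)
            continue
        included.append(path)
        total += size
    return included, omitted


def render_truncated_note(omitted, cap_bytes=DEFAULT_CAP_BYTES):
    """Build the body of a __truncated.txt style marker file."""
    lines = [f"Export truncated at {cap_bytes} uncompressed bytes.", "Omitted paths:"]
    lines += [f"  {path}" for path in omitted]
    return "\n".join(lines) + "\n"
```

Keeping the plan separate from the zip writer makes the truncation rule easy to test without touching the volume.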

Storage untouched. Reads straight from the Modal volume (/sessions/...) via the existing listdir_async / read_volume_file_async helpers — no S3 mirror, no new storage path, no auth surface change. Inherits the auth posture of routers/files.py.

Single-source SDK surface. services/trainable_sdk.py is the home of the local shim's content. When services/sandbox.py grows a new helper, mirror it here or downloaded scripts will AttributeError.

UI

  • Workspace sidebar header → Download icon next to the existing Metrics icon. Anchor with href="/api/sessions/{sid}/download", browser handles the save (filename from Content-Disposition, no JS buffering).
  • Project rows in the left sidebar → Download icon revealed on row hover, sits between Settings and Trash. Same pattern: anchor → /api/projects/{pid}/download.

Tests

backend/tests/test_workspace_export.py (9 tests):

  • Workspace files + synthetic files present in session zip
  • __pycache__, .DS_Store correctly skipped
  • Empty session still ships the three synthetic files
  • Project zip namespaces sessions under sessions/{slug}/
  • Duplicate session labels disambiguated by short-id suffix
  • Truncation cap emits __truncated.txt with omitted paths
  • Subprocess imports the shipped trainable_local.py on vanilla Python, calls trainable.log(...), asserts metrics.jsonl written
  • Session endpoint 200 + zip content via FastAPI test client
  • 404 paths for missing session / project with no sessions
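
The namespacing and duplicate-label cases tested above could be implemented along these lines. This is a sketch only: the slug rule, the 8-character suffix length, and the function names are assumptions, not taken from the PR's code.

```python
import re
from collections import Counter


def slugify(label):
    """Lowercase the label and collapse non-alphanumerics to single dashes."""
    slug = re.sub(r"[^a-z0-9]+", "-", label.lower()).strip("-")
    return slug or "session"


def session_dirs(sessions):
    """Map session_id -> 'sessions/<slug>' for (session_id, label) pairs.

    Labels that collide after slugification get a short-id suffix.
    """
    slugs = {sid: slugify(label) for sid, label in sessions}
    counts = Counter(slugs.values())
    dirs = {}
    for sid, slug in slugs.items():
        if counts[slug] > 1:
            slug = f"{slug}-{sid[:8]}"  # suffix length is an assumption
        dirs[sid] = f"sessions/{slug}"
    return dirs
```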

Suite total: 305 passed, 8 skipped (pre-existing). Ruff clean, TS clean, eslint clean.

What's NOT in v1 (per issue #79)

  • Re-downloading raw datasets (README explains why and where to drop locally).
  • Background-job + signed-URL flow for huge exports (the streaming design holds for typical sizes).
  • Auth hardening — inherits routers/files.py posture; tracked separately.

Test plan

  • curl -o session.zip http://localhost:8000/api/sessions/<sid>/download returns a valid zip
  • unzip -l session.zip shows README.md, requirements.txt, trainable_local.py, plus the agent's src/ / notebooks/ / figures/
  • python -m venv .venv && pip install -r requirements.txt && python -c "import trainable_local; import trainable; trainable.log(1, {'loss': 0.5})" works and writes ./trainable_out/metrics.jsonl
  • Download icon in WorkspaceSidebar header triggers the session zip via the browser
  • Download icon on a project row triggers the project zip with every session namespaced
  • curl http://localhost:8000/api/sessions/nope/download → 404

Greptile Summary

Adds two streaming zip-download endpoints (GET /api/sessions/{id}/download and GET /api/projects/{id}/download) that walk the Modal volume, stream the archive to the browser without materialising it in memory, and bundle a local SDK shim (trainable_runtime.py), a filtered requirements.txt, and a README so downloaded scripts work on a vanilla Python install. The PR also refactors the sandbox preamble to share the same runtime source file instead of maintaining a duplicate inline string.

  • Backend: workspace_export.py streams DEFLATE-compressed zips via a custom _StreamingZipBuffer, with a 2 GB uncompressed cap and a __truncated.txt sentinel. The new trainable_runtime.py consolidates sandbox and local logging paths behind environment-variable mode selection.
  • Frontend: workspace sidebar and project rows each gain a browser-native download anchor with hover-reveal and disabled-state handling.

Confidence Score: 4/5

Safe to merge after fixing the confusion-matrix iterator bug; the streaming zip and SDK refactor are otherwise well-constructed.

log_confusion_matrix materialises y_true and y_pred to compute labels when labels=None, then passes the already-exhausted iterators to sklearn.confusion_matrix (or the hand-rolled fallback loop), silently producing an all-zero matrix. The analogous bug in log_table was fixed in this same PR; log_confusion_matrix was overlooked.

backend/services/trainable_runtime.py — log_confusion_matrix needs the same iterator-materialisation fix applied to log_table in this PR.

Important Files Changed

| Filename | Overview |
| --- | --- |
| backend/services/trainable_runtime.py | New shared SDK runtime (replaces the inline preamble string). Has an iterator-exhaustion bug in log_confusion_matrix that was fixed for log_table but missed here. |
| backend/services/workspace_export.py | Core streaming zip exporter; well-structured with cap/truncation logic. Previously reviewed issues (yield-in-finally, cap bypass reads, READ_CHUNK_BYTES) are tracked upstream. |
| backend/routers/download.py | Two thin download endpoints; session 404 and project 422 are correct. Returns StreamingResponse backed by async generators from the service layer. |
| backend/services/trainable_sdk.py | Synthetic-file factory (shim, requirements, README). Module-level constants load from package resources at import time; README generation is correct. |
| backend/services/volume.py | Adds iter_volume_file_chunks_async, which wraps the Modal Volume iterator with executor calls and re-chunks at 1 MB boundaries. Async-generator idiom is correct. |
| backend/services/sandbox.py | SDK preamble refactored to load trainable_runtime.py at import time and inject it into the sandbox try/finally template; back-compat alias preserved. |
| backend/tests/test_workspace_export.py | Nine tests covering happy path, empty session, project namespacing, duplicate labels, cap truncation, oversized-file skip, mid-read error, client-disconnect, and subprocess import. |
| frontend/src/app/page.tsx | Adds download anchor to WorkspaceSidebar header; disabled state is handled visually but the element remains keyboard-focusable when sessionId is null. |
| frontend/src/components/Sidebar.tsx | Adds project-row download anchor with hover reveal; stopPropagation prevents unintended row selection. Pattern matches the existing Settings/Trash icons. |
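
The re-chunking described for `backend/services/volume.py` can be illustrated with a small async generator. This is a sketch, not the PR's `iter_volume_file_chunks_async`: the name `rechunk` is hypothetical, and the executor wrapping around the Modal Volume iterator is left out.

```python
CHUNK_BYTES = 1024 * 1024  # 1 MB boundary, per the overview above


async def rechunk(source, chunk_bytes=CHUNK_BYTES):
    """Re-slice an async iterator of byte pieces into fixed-size chunks.

    Every yielded chunk is exactly chunk_bytes long except possibly the last.
    """
    buf = bytearray()
    async for piece in source:
        buf.extend(piece)
        while len(buf) >= chunk_bytes:
            yield bytes(buf[:chunk_bytes])
            del buf[:chunk_bytes]
    if buf:
        yield bytes(buf)
```

Fixed-size chunks keep peak memory predictable regardless of how the upstream iterator happens to slice the file.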

Sequence Diagram

sequenceDiagram
    participant Browser
    participant DownloadRouter as routers/download.py
    participant DB as Database
    participant WorkspaceExport as workspace_export.py
    participant Volume as Modal Volume
    participant TrainableSDK as trainable_sdk.py

    Browser->>DownloadRouter: "GET /api/sessions/{id}/download"
    DownloadRouter->>DB: "SELECT session WHERE id=?"
    DB-->>DownloadRouter: "session | None"
    alt session not found
        DownloadRouter-->>Browser: 404
    else session exists
        DownloadRouter->>WorkspaceExport: stream_session_zip(session_id)
        WorkspaceExport->>Volume: reload_volume_async()
        WorkspaceExport->>Volume: "listdir_async(/sessions/{id}, recursive=True)"
        Volume-->>WorkspaceExport: [FileEntry...]
        loop each file (under cap)
            WorkspaceExport->>Volume: iter_volume_file_chunks_async(path)
            Volume-->>WorkspaceExport: bytes chunks
            WorkspaceExport-->>Browser: zip chunk (streamed)
        end
        WorkspaceExport->>TrainableSDK: render_readme() + LOCAL_SHIM + LOCAL_REQUIREMENTS
        TrainableSDK-->>WorkspaceExport: synthetic file bytes
        WorkspaceExport-->>Browser: zip central directory (close)
    end

Reviews (3): Last reviewed commit: "fix: mark partial workspace export reads"

Greptile also left 1 inline comment on this PR.

Adds `GET /api/sessions/{sid}/download` and `GET /api/projects/{pid}/download`
that stream a self-contained zip of the agent's workspace: every file under
`/sessions/{sid}/` plus a synthetic `trainable_local.py` shim, a filtered
`requirements.txt`, and a runbook README. The shim lets downloaded scripts
that import `from trainable import log, log_image, ...` run on a vanilla
Python install — calls land in `./trainable_out/` instead of the Modal
volume.

The zip is streamed via `StreamingResponse` over an in-memory buffer
drained per write, so multi-hundred-MB sessions don't materialize on
disk or in RAM. A 2 GB uncompressed cap with a trailing `__truncated.txt`
marker keeps a runaway walk bounded.

Frontend ships two entry points:
- Download icon in the WorkspaceSidebar header → session zip
- Download icon on every Project row in the sidebar → project zip

Closes #79.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment on lines +254 to +274
def log_confusion_matrix(step, key, y_true, y_pred, labels=None, run=None):
    try:
        from sklearn.metrics import confusion_matrix as _cm  # type: ignore

        labs = (
            list(labels)
            if labels is not None
            else sorted(set(list(y_true) + list(y_pred)))
        )
        matrix = _cm(y_true, y_pred, labels=labs).tolist()
    except Exception:
        labs = (
            list(labels)
            if labels is not None
            else sorted(set(list(y_true) + list(y_pred)))
        )
        idx = {lab: i for i, lab in enumerate(labs)}
        matrix = [[0] * len(labs) for _ in labs]
        for t, p in zip(y_true, y_pred):
            if t in idx and p in idx:
                matrix[idx[t]][idx[p]] += 1

P1 log_confusion_matrix exhausts both y_true and y_pred iterators when computing labels (if labels=None), so both _cm(y_true, y_pred, ...) (sklearn path) and zip(y_true, y_pred) (fallback path) receive empty sequences and produce a silent all-zero matrix. This is the same iterator-exhaustion pattern that was fixed for log_table via all_rows = list(rows) in this PR, but log_confusion_matrix was overlooked.

Suggested change
def log_confusion_matrix(step, key, y_true, y_pred, labels=None, run=None):
    y_true_list = list(y_true)
    y_pred_list = list(y_pred)
    try:
        from sklearn.metrics import confusion_matrix as _cm  # type: ignore
        labs = (
            list(labels)
            if labels is not None
            else sorted(set(y_true_list + y_pred_list))
        )
        matrix = _cm(y_true_list, y_pred_list, labels=labs).tolist()
    except Exception:
        labs = (
            list(labels)
            if labels is not None
            else sorted(set(y_true_list + y_pred_list))
        )
        idx = {lab: i for i, lab in enumerate(labs)}
        matrix = [[0] * len(labs) for _ in labs]
        for t, p in zip(y_true_list, y_pred_list):
            if t in idx and p in idx:
                matrix[idx[t]][idx[p]] += 1
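
The exhaustion described in this comment can be reproduced in isolation, without sklearn. The two helper names below are hypothetical and exist only to contrast the patterns; they are not part of `trainable_runtime.py`.

```python
def pairs_buggy(y_true, y_pred):
    """Derive labels first, then pair the inputs (the pattern flagged above)."""
    labels = sorted(set(list(y_true) + list(y_pred)))  # consumes one-shot iterators
    pairs = list(zip(y_true, y_pred))  # empty when the inputs were iterators
    return labels, pairs


def pairs_fixed(y_true, y_pred):
    """Materialise once up front, then reuse the lists everywhere."""
    y_true_list = list(y_true)
    y_pred_list = list(y_pred)
    labels = sorted(set(y_true_list + y_pred_list))
    pairs = list(zip(y_true_list, y_pred_list))
    return labels, pairs
```

With list inputs both helpers agree; with generators or other one-shot iterators only the second one still sees the data when pairing, which is why the buggy version yields an all-zero matrix without raising.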
