feat(download): bulk workspace export — session + project zip endpoints#87
Open
lucastononro wants to merge 3 commits into
Adds `GET /api/sessions/{sid}/download` and `GET /api/projects/{pid}/download`
that stream a self-contained zip of the agent's workspace: every file under
`/sessions/{sid}/` plus a synthetic `trainable_local.py` shim, a filtered
`requirements.txt`, and a runbook README. The shim lets downloaded scripts
that import `from trainable import log, log_image, ...` run on a vanilla
Python install — calls land in `./trainable_out/` instead of the Modal
volume.
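A minimal sketch of how such a local shim could behave, assuming `log(step, metrics)` appends one JSON object per call to a `metrics.jsonl` file under the output directory. The helper name `make_log` and the record layout are hypothetical; the real `trainable_local.py` is not shown in this PR page.

```python
import json
import tempfile
from pathlib import Path


def make_log(out_dir):
    """Return a log(step, metrics) function that appends JSON lines under out_dir."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    metrics_file = out_path / "metrics.jsonl"

    def log(step, metrics):
        # One JSON object per call; the {"step": ..., **metrics} layout is an assumption.
        record = {"step": step, **metrics}
        with metrics_file.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(record) + "\n")

    return log


# Usage: write into a temporary directory rather than ./trainable_out/.
demo_dir = Path(tempfile.mkdtemp()) / "trainable_out"
log = make_log(demo_dir)
log(1, {"loss": 0.5})
log(2, {"loss": 0.25})
lines = (demo_dir / "metrics.jsonl").read_text(encoding="utf-8").splitlines()
print(lines)
```

Appending one line per call keeps the file readable even if a script crashes midway, which is one plausible reason to prefer JSON Lines for local metrics.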
The zip is streamed via `StreamingResponse` over an in-memory buffer
drained per write, so multi-hundred-MB sessions don't materialize on
disk or in RAM. A 2 GB uncompressed cap with a trailing `__truncated.txt`
marker keeps a runaway walk bounded.
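The drain-per-write idea and the cap can be sketched with the standard `zipfile` module. This is an illustration under assumptions, not the PR's `_StreamingZipBuffer`: the names `DrainableSink` and `stream_zip` are hypothetical, payloads are passed in as bytes rather than read from a volume, and skipping only the files that would exceed the cap is a guess at the policy.

```python
import io
import zipfile


class DrainableSink:
    """Write-only, non-seekable sink; zipfile falls back to data descriptors for it."""

    def __init__(self):
        self._chunks = []

    def write(self, data):
        self._chunks.append(bytes(data))
        return len(data)

    def flush(self):
        pass

    def drain(self):
        """Return and clear everything written since the last drain."""
        out = b"".join(self._chunks)
        self._chunks.clear()
        return out


def stream_zip(files, cap_bytes):
    """Yield zip bytes for (arcname, payload) pairs, omitting files past cap_bytes."""
    sink = DrainableSink()
    used = 0
    omitted = []
    with zipfile.ZipFile(sink, "w", zipfile.ZIP_DEFLATED) as zf:
        for arcname, payload in files:
            if used + len(payload) > cap_bytes:
                omitted.append(arcname)  # keep walking so every omitted path is listed
                continue
            zf.writestr(arcname, payload)
            used += len(payload)
            yield sink.drain()  # hand the bytes on before touching the next file
        if omitted:
            zf.writestr("__truncated.txt", "\n".join(omitted) + "\n")
            yield sink.drain()
    # Leaving the with-block writes the central directory; emit it last.
    yield sink.drain()


files = [
    ("src/train.py", b"print('hi')\n"),
    ("big.bin", b"x" * 1000),
    ("notes.md", b"# notes\n"),
]
chunks = list(stream_zip(files, cap_bytes=100))
archive = zipfile.ZipFile(io.BytesIO(b"".join(chunks)))
print(archive.namelist())
```

A generator like this could be handed to a `StreamingResponse`, so only the bytes produced since the previous drain are held at any time.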
Frontend ships two entry points:
- Download icon in the WorkspaceSidebar header → session zip
- Download icon on every Project row in the sidebar → project zip
Closes #79.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment on lines +254 to +274
    def log_confusion_matrix(step, key, y_true, y_pred, labels=None, run=None):
        try:
            from sklearn.metrics import confusion_matrix as _cm  # type: ignore

            labs = (
                list(labels)
                if labels is not None
                else sorted(set(list(y_true) + list(y_pred)))
            )
            matrix = _cm(y_true, y_pred, labels=labs).tolist()
        except Exception:
            labs = (
                list(labels)
                if labels is not None
                else sorted(set(list(y_true) + list(y_pred)))
            )
            idx = {lab: i for i, lab in enumerate(labs)}
            matrix = [[0] * len(labs) for _ in labs]
            for t, p in zip(y_true, y_pred):
                if t in idx and p in idx:
                    matrix[idx[t]][idx[p]] += 1
When `labels=None`, `log_confusion_matrix` consumes both `y_true` and `y_pred` while computing the label set. If either argument is a one-shot iterator (for example a generator), it is already exhausted by the time `_cm(y_true, y_pred, ...)` (sklearn path) or `zip(y_true, y_pred)` (fallback path) runs, so both paths receive empty sequences and silently produce an all-zero matrix. This is the same iterator-exhaustion pattern that this PR fixed for `log_table` via `all_rows = list(rows)`; `log_confusion_matrix` was overlooked.
Suggested change
    def log_confusion_matrix(step, key, y_true, y_pred, labels=None, run=None):
        y_true_list = list(y_true)
        y_pred_list = list(y_pred)
        try:
            from sklearn.metrics import confusion_matrix as _cm  # type: ignore

            labs = (
                list(labels)
                if labels is not None
                else sorted(set(y_true_list + y_pred_list))
            )
            matrix = _cm(y_true_list, y_pred_list, labels=labs).tolist()
        except Exception:
            labs = (
                list(labels)
                if labels is not None
                else sorted(set(y_true_list + y_pred_list))
            )
            idx = {lab: i for i, lab in enumerate(labs)}
            matrix = [[0] * len(labs) for _ in labs]
            for t, p in zip(y_true_list, y_pred_list):
                if t in idx and p in idx:
                    matrix[idx[t]][idx[p]] += 1
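The failure mode is easy to reproduce without sklearn. The sketch below shows that a generator yields nothing on a second pass, then applies the materialise-first pattern to a hand-rolled counter. The helper name `count_pairs` is hypothetical and mirrors only the fallback branch of the suggestion.

```python
def count_pairs(y_true, y_pred, labels=None):
    """Hand-rolled confusion matrix that materialises its inputs first."""
    y_true_list = list(y_true)  # consume each one-shot iterator exactly once
    y_pred_list = list(y_pred)
    labs = (
        list(labels)
        if labels is not None
        else sorted(set(y_true_list + y_pred_list))
    )
    idx = {lab: i for i, lab in enumerate(labs)}
    matrix = [[0] * len(labs) for _ in labs]
    for t, p in zip(y_true_list, y_pred_list):
        if t in idx and p in idx:
            matrix[idx[t]][idx[p]] += 1
    return labs, matrix


# A generator can be consumed only once: the second list() sees nothing.
gen = (x for x in [0, 1, 1])
first_pass = list(gen)
second_pass = list(gen)
print(first_pass, second_pass)

# With materialised inputs, generators are counted correctly.
labs, matrix = count_pairs(
    (t for t in [0, 1, 1, 0]),
    (p for p in [0, 1, 0, 0]),
)
print(labs, matrix)
```

Lists and arrays are unaffected either way, so the extra `list(...)` calls only cost one copy of each input.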
Summary
Closes #79.
Adds two streaming endpoints and matching UI so users can leave a session with the agent's work on their laptop:
- `GET /api/sessions/{session_id}/download`: zip of `/sessions/{sid}/`
- `GET /api/projects/{project_id}/download`: zip of every session in a project, namespaced under `sessions/{slug}/`

Each zip ships three synthetic files at the root:
- `trainable_local.py`: local SDK shim. Agent scripts that `from trainable import log, log_image, ...` now resolve against the filesystem; output lands in `./trainable_out/` instead of the Modal volume.
- `requirements.txt`: pinned subset of the sandbox image (pandas/numpy/sklearn/etc.), skips server-only deps (FastAPI, modal-client, anthropic, OTel).
- `README.md`: `python -m venv .venv && pip install -r requirements.txt && python -c "import trainable_local"` runbook, plus notes on what's not in the zip (raw datasets, GPU parity, round-trip telemetry).

Design
Streaming, not materialized. A `_StreamingZipBuffer` (BytesIO-shaped sink) is drained after every `ZipFile.writestr`, so the response body trickles out to the client without buffering the full archive. Verified on a 500 KB workspace in tests; the design extends to multi-hundred-MB sessions without holding the bytes in memory.

Cap + safety net. Default 2 GB uncompressed cap per export. When hit, the walk continues so the trailing `__truncated.txt` lists every omitted path; the zip itself closes cleanly so the browser finishes the download. Cap is overridable per-call for a future opt-in larger-export endpoint.

Storage untouched. Reads straight from the Modal volume (`/sessions/...`) via the existing `listdir_async` / `read_volume_file_async` helpers: no S3 mirror, no new storage path, no auth surface change. Inherits the auth posture of `routers/files.py`.

Single-source SDK surface. `services/trainable_sdk.py` is the home of the local shim's content. When `services/sandbox.py` grows a new helper, mirror it here or downloaded scripts will `AttributeError`.

UI
- `Download` icon next to the existing `Metrics` icon. Anchor with `href="/api/sessions/{sid}/download"`, browser handles the save (filename from `Content-Disposition`, no JS buffering).
- `Download` icon revealed on row hover, sits between `Settings` and `Trash`. Same pattern: anchor → `/api/projects/{pid}/download`.

Tests
`backend/tests/test_workspace_export.py` (9 tests):
- `__pycache__`, `.DS_Store` correctly skipped
- `sessions/{slug}/`
- `__truncated.txt` with omitted paths
- `trainable_local.py` on vanilla Python, calls `trainable.log(...)`, asserts `metrics.jsonl` written

Suite total: 305 passed, 8 skipped (pre-existing). Ruff clean, TS clean, eslint clean.
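A skip rule of the kind these tests exercise might look like the sketch below. The function name `should_skip` and the exact name set are assumptions for illustration; only `__pycache__` and `.DS_Store` are named in this PR.

```python
from pathlib import PurePosixPath

# Names taken from the test list above; the real exporter may skip more.
SKIP_NAMES = {"__pycache__", ".DS_Store"}


def should_skip(path):
    """Return True if any component of a volume-relative path is a skipped name."""
    return any(part in SKIP_NAMES for part in PurePosixPath(path).parts)


paths = [
    "src/train.py",
    "src/__pycache__/train.cpython-311.pyc",
    ".DS_Store",
    "figures/.DS_Store",
    "notebooks/eda.ipynb",
]
kept = [p for p in paths if not should_skip(p)]
print(kept)
```

Matching on path components rather than substrings avoids dropping a legitimate file whose name merely contains one of the skipped strings.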
What's NOT in v1 (per issue #79)
`routers/files.py` posture; tracked separately.

Test plan
- `curl -o session.zip http://localhost:8000/api/sessions/<sid>/download` returns a valid zip
- `unzip -l session.zip` shows `README.md`, `requirements.txt`, `trainable_local.py`, plus the agent's `src/` / `notebooks/` / `figures/`
- `python -m venv .venv && pip install -r requirements.txt && python -c "import trainable_local; import trainable; trainable.log(1, {'loss': 0.5})"` works and writes `./trainable_out/metrics.jsonl`
- `curl http://localhost:8000/api/sessions/nope/download` → 404

Greptile Summary
Adds two streaming zip-download endpoints (`GET /api/sessions/{id}/download` and `GET /api/projects/{id}/download`) that walk the Modal volume, stream the archive to the browser without materialising it in memory, and bundle a local SDK shim (`trainable_runtime.py`), a filtered `requirements.txt`, and a README so downloaded scripts work on a vanilla Python install. The PR also refactors the sandbox preamble to share the same runtime source file instead of maintaining a duplicate inline string.

`workspace_export.py` streams DEFLATE-compressed zips via a custom `_StreamingZipBuffer`, with a 2 GB uncompressed cap and a `__truncated.txt` sentinel. The new `trainable_runtime.py` consolidates sandbox and local logging paths behind environment-variable mode selection.

Confidence Score: 4/5
Safe to merge after fixing the confusion-matrix iterator bug; the streaming zip and SDK refactor are otherwise well-constructed.
`log_confusion_matrix` materialises `y_true` and `y_pred` to compute labels when `labels=None`, then passes the already-exhausted iterators to `sklearn.metrics.confusion_matrix` (or the hand-rolled fallback loop), silently producing an all-zero matrix. The analogous bug in `log_table` was fixed in this same PR; `log_confusion_matrix` was overlooked.

`backend/services/trainable_runtime.py`: `log_confusion_matrix` needs the same iterator-materialisation fix applied to `log_table` in this PR.

Important Files Changed
`log_confusion_matrix` that was fixed for `log_table` but missed here.

Sequence Diagram
    sequenceDiagram
        participant Browser
        participant DownloadRouter as routers/download.py
        participant DB as Database
        participant WorkspaceExport as workspace_export.py
        participant Volume as Modal Volume
        participant TrainableSDK as trainable_sdk.py
        Browser->>DownloadRouter: "GET /api/sessions/{id}/download"
        DownloadRouter->>DB: "SELECT session WHERE id=?"
        DB-->>DownloadRouter: "session | None"
        alt session not found
            DownloadRouter-->>Browser: 404
        else session exists
            DownloadRouter->>WorkspaceExport: stream_session_zip(session_id)
            WorkspaceExport->>Volume: reload_volume_async()
            WorkspaceExport->>Volume: "listdir_async(/sessions/{id}, recursive=True)"
            Volume-->>WorkspaceExport: [FileEntry...]
            loop each file (under cap)
                WorkspaceExport->>Volume: iter_volume_file_chunks_async(path)
                Volume-->>WorkspaceExport: bytes chunks
                WorkspaceExport-->>Browser: zip chunk (streamed)
            end
            WorkspaceExport->>TrainableSDK: render_readme() + LOCAL_SHIM + LOCAL_REQUIREMENTS
            TrainableSDK-->>WorkspaceExport: synthetic file bytes
            WorkspaceExport-->>Browser: zip central directory (close)
        end

Reviews (3): Last reviewed commit: "fix: mark partial workspace export reads"