v2.5: per-notebook conda envs + netZooR 1.6.3 by marouenbg · Pull Request #44 · netZoo/netbooks

marouenbg · 2026-05-12T04:23:58Z

Summary

v2.5: one conda environment per notebook for full reproducibility.

Build 13 new R envs (RV1–5, RV7–8, RC2–3, RC5–6, RP7–8) and 12 new Python envs (PV1–6, PC1–3, PP2–4), each cloned from a netZooR/netZooPy baseline and pinned per-notebook.
Pin each notebook to its dedicated kernel as the default (so loading the notebook picks the right env automatically).
Fix MONSTER.ipynb language metadata (was python, code is R).
Move sex_differences_LUAD.ipynb to its own RC6 env (was sharing RP9 with sex_differences_LUAD_EN.ipynb, which is in a different catalog section).
Upgrade netZooR to 1.6.3 in all new R envs.
Add per-env recipe specs under netbooks/envs/*.yml for reproducible builds.
Welcome catalog: bump version to v 2.5, refresh date, normalize (R 4.3.1; netZooR ...) annotations on all R entries, update Python header to netZooPy 0.10.6 on Python 3.10.
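
Pinning a notebook to its kernel comes down to editing the notebook's top-level metadata. A minimal sketch of that edit, assuming the standard nbformat 4 layout; `pin_kernel`, the display name "RV1", and the sample notebook are hypothetical, not taken from this PR:

```python
import json


def pin_kernel(nb: dict, name: str, display_name: str, language: str) -> dict:
    """Return a copy of a notebook dict with its default kernel pinned."""
    pinned = json.loads(json.dumps(nb))  # deep copy via a JSON round-trip
    meta = pinned.setdefault("metadata", {})
    meta["kernelspec"] = {
        "name": name,
        "display_name": display_name,
        "language": language,
    }
    # Keep language_info consistent with the kernel language
    # (the kind of mismatch fixed for MONSTER.ipynb).
    meta.setdefault("language_info", {})["name"] = language
    return pinned


# Hypothetical notebook whose metadata says Python although the code is R.
notebook = {
    "cells": [{"cell_type": "code", "source": ["library(netZooR)"],
               "metadata": {}, "outputs": [], "execution_count": None}],
    "metadata": {
        "kernelspec": {"name": "python3", "display_name": "Python 3",
                       "language": "python"},
        "language_info": {"name": "python"},
    },
    "nbformat": 4,
    "nbformat_minor": 5,
}

fixed = pin_kernel(notebook, "rv1", "RV1", "R")
print(fixed["metadata"]["kernelspec"])
```

Only the metadata block changes; the cells are copied through untouched.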

Server-side changes (not in this PR but landed on the EC2 host):

New conda envs and registered Jupyter kernels for the 25 labels above.
login.html News entry for v2.5, and 2025 EPI246 entry under Courses.

Test plan

All 25 new envs build and load their primary library (netZooR 1.6.3 for R, netZooPy 0.10.6 for Python)
All registered kernels visible in jupyter kernelspec list
Welcome_to_netBooks.ipynb renders with new version and annotations
Notebook metadata diffs show only kernelspec/language_info changes (no cell content drift)
End-to-end execution per notebook tracked in run logs on EC2 (~/env_builds/run_status/). Notebooks with outstanding upstream issues (netZooPy pandas-API drift in panda.py, missing MEME tool for pc2, OOM on heavy case studies) documented separately and not blocking this PR.
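
The "no cell content drift" check in the test plan can be automated. The sketch below is one possible way to do it, not the check actually used for this PR; `metadata_only_change` and the two sample notebooks are hypothetical:

```python
import json

ALLOWED_METADATA_KEYS = {"kernelspec", "language_info"}


def metadata_only_change(before: dict, after: dict) -> bool:
    """True if two notebook dicts differ at most in kernelspec/language_info."""
    if before.get("cells") != after.get("cells"):
        return False  # cell content drift
    # Other top-level fields (nbformat, nbformat_minor) must match as well.
    for key in (set(before) | set(after)) - {"cells", "metadata"}:
        if before.get(key) != after.get(key):
            return False
    meta_before = before.get("metadata", {})
    meta_after = after.get("metadata", {})
    changed = {
        key
        for key in set(meta_before) | set(meta_after)
        if meta_before.get(key) != meta_after.get(key)
    }
    return changed <= ALLOWED_METADATA_KEYS


old = {
    "cells": [{"cell_type": "code", "source": ["x <- 1"]}],
    "metadata": {"kernelspec": {"name": "python3"}},
    "nbformat": 4,
}
new = json.loads(json.dumps(old))
new["metadata"]["kernelspec"] = {"name": "rv1"}

drifted = json.loads(json.dumps(new))
drifted["cells"][0]["source"] = ["x <- 2"]

print(metadata_only_change(old, new))      # kernelspec-only change
print(metadata_only_change(old, drifted))  # a cell was edited
```

In practice the two dicts would come from `json.load` on the notebook file at the base and head commits.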

🤖 Generated with Claude Code

Build 25 new kernels (one per notebook) so every netbook in the catalog loads with its own pinned conda env. This isolates dependencies, makes notebook updates safe, and matches the catalog convention introduced in v2.4 where each kernel maps to a single use case.

New env labels:
- R vignettes RV1-5, RV7-8 (ALPACA, ApplicationwithTBdataset, SAMBAR, pandaR, EGRET_toy_example, yarn, TIGER)
- R case studies RC2-3, RC5-6 (TutorialOTTER, Finding_drugs_for_LUAD, gene_expression_for_coexpression_nets, sex_differences_LUAD)
- R published RP7-8 (maize_genome, egret_banovich_netbook)
- Python vignettes PV1-6 (condor_tutorial, Building_single-sample, Up_and_running_with_PANDA, sambar_tutorial, dragon_tutorial, cobra)
- Python case studies PC1-3 (Controlling_The_Variance_Of_PANDA, Building_a_regulation_prior_network, continuous_motif_priors_KRCC)
- Python published PP2-4 (dragon_mirna, drug_repurposing_colon_cancer, ccle_analysis)

All new R envs use R 4.3.1 + netZooR 1.6.3 (matching the rv9 base); all new Python envs use Python 3.10 + netZooPy 0.10.6 (matching pp5). Reproducible recipe files live under netbooks/envs/<name>.yml.

Other changes:
- Move sex_differences_LUAD.ipynb to its own RC6 env (was sharing RP9 with sex_differences_LUAD_EN.ipynb, which is in a different section).
- Fix MONSTER.ipynb language metadata (was python, code is R).
- Bump Welcome catalog header to v 2.5 and refresh date.
- Add (R 4.3.1; netZooR 1.6.3) annotations to R notebook entries that were missing them; update Python header to netZooPy 0.10.6 on Py 3.10.
- yarn.ipynb: load data(skin) before the first reference to keep the notebook runnable top-to-bottom.
- TIGER.ipynb: call tiger() (exported name) instead of TIGER().
- drug_repurposing_colon_cancer.ipynb: replace R-style runserver guard with valid Python.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…5 envs

- yarn.ipynb: data(skin) couldn't find the dataset because netZooR re-exports yarn but does not re-export its datasets. Add an explicit `library(yarn)` before `data(skin)` so the first reference to `skin` resolves.
- drug_repurposing_colon_cancer.ipynb: replace deprecated `pd.np.r_[...]` with `np.r_[...]`. `pandas.np` was an alias for numpy that was deprecated in pandas 1.0 and later removed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

marouenbg · 2026-05-12T04:33:08Z

Execution status update

Ran jupyter nbconvert --execute --inplace against each of the 25 new-env notebooks on the netbooks EC2 host.
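
A minimal sketch of how such a per-notebook run could be scripted. The `--execute` and `--inplace` flags come from the command above; the explicit kernel name, the 1800 s timeout, and the `.ipynb` file names are assumptions for illustration, and `nbconvert_command` is a hypothetical helper:

```python
import shlex


def nbconvert_command(notebook: str, kernel: str, timeout_s: int = 1800) -> list[str]:
    """Build the argv for executing one notebook in place with a given kernel."""
    return [
        "jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace",
        f"--ExecutePreprocessor.kernel_name={kernel}",
        f"--ExecutePreprocessor.timeout={timeout_s}",
        notebook,
    ]


# Hypothetical kernel -> notebook mapping (file names assumed).
runs = {"rv1": "ALPACA.ipynb", "pv1": "condor_tutorial.ipynb"}

for kernel, notebook in runs.items():
    # Print rather than execute; pass the list to subprocess.run to launch it.
    print(shlex.join(nbconvert_command(notebook, kernel)))
```

Building the argv as a list keeps notebook names with spaces safe when the command is handed to `subprocess.run`.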

17 / 25 pass end-to-end:

R envs (10/13): rv1 (ALPACA), rv3 (SAMBAR), rv4 (pandaR), rv5 (EGRET_toy_example), rv7 (yarn), rc2 (TutorialOTTER), rc5 (gene_expression_for_coexpression_nets), rc6 (sex_differences_LUAD), rp7 (maize_genome), rp8 (egret_banovich_netbook)
Python envs (7/12): pv1 (condor_tutorial), pv3 (Up_and_running_with_PANDA), pv4 (sambar_tutorial), pv5 (dragon_tutorial), pc1 (Controlling_The_Variance_Of_PANDA), pp2 (dragon_mirna), pp3 (drug_repurposing_colon_cancer)

8 / 25 outstanding — all upstream or external-tool issues, not env issues:

| Kernel | Notebook | Reason |
|---|---|---|
| `rv2` | ApplicationwithTBdataset | `pandaPy()` return shape changed in netZooR 1.6.3; R-side `as.numeric(panda_net$motif)` fails |
| `rv8` | TIGER | netZooR `tiger()` internal: `"variable W_negs missing in draws object"`; looks like a Stan/posterior version mismatch |
| `pv2` | Building_single-sample_LIONESS | netZooPy panda return shape vs pandas 2.x (`Columns must be same length as key`) |
| `pv6` | cobra | numpy `array must not contain infs or NaNs` from `asarray_chkfinite` inside cobra |
| `pc2` | Building_a_regulation_prior_network | external MEME tool `matrix2meme` not installed (`/home/ubuntu/meme/...`) |
| `pc3` | continuous_motif_priors_KRCC | OOM (load of multiple large matrices); needs lower parallelism or smaller dataset |
| `pp4` | ccle_analysis | undefined `createVisNet` function; notebook references a helper that isn't defined or imported |
| `rc3` | Finding_drugs_for_LUAD | OOM during data load |

None of these block the env refactor in this PR. Recommend filing follow-up issues for the upstream netZooR/netZooPy API drift and the missing MEME install on the host.

…d egret_banovich entries

My v2.5 annotation script stripped the ` - [link text` prefix from the two long published-studies entries because the regex replaced the whole line with just the link-and-publication chunk. Put the bullet and title back so they render as proper list items in Jupyter.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- ccle_analysis.ipynb (pp4): insert missing createVisNet helper (copied from dragon_mirna.ipynb, which is the sibling notebook that defines it).
- continuous_motif_priors_KRCC.ipynb (pc3): migrate from the deprecated module-function condor API (`condor.initial_community(co)`, `condor.brim(co)`, `co['modularity']`) to the new method API (`co.initial_community()`, `co.brim(...)`, `co.modularity`). Also pass the dataframe by keyword so it isn't misread as `network_file`.
- cobra.ipynb (pv6): drop NaN rows from the gene_expression input before passing to cobra(). One row in the published THCA matrix carries NaN values, which propagated through the standardization step and made scipy's eigh raise "array must not contain infs or NaNs".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- ccle_analysis.ipynb (pp4): change createVisNet signature to drop the redundant methylMat positional arg and derive the slice size from methyl.shape[1] directly. The call sites in this notebook pass 5 args while the dragon_mirna version expects 6; making the helper derive the size keeps both notebooks running without further surgery.
- Building_a_regulation_prior_network.ipynb (pc2): switch precomputed flag to 1 so the notebook uses the precomputed PWM/FIMO outputs instead of trying to invoke matrix2meme from a hardcoded /home/ubuntu/meme/ path that doesn't exist on the current host.
- TIGER.ipynb (rv8): increase TIGER_expr subsample from 1:10 to 1:200 so priorPp leaves at least one negative-edge entry. With only 10 expression rows, the signed model's W_negs draws set is empty and netZooR's tiger() fails when it tries to fit$summary("W_negs", ...).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

The `nonposy` keyword of `Axes.set_yscale("log", nonposy="clip")` was deprecated in matplotlib 3.3 in favor of `nonpositive`, and later releases drop it entirely. The notebook still passes the old name, which raises TypeError in the LogScale constructor on matplotlib versions that no longer accept it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
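
For code that has to run against both old and new matplotlib releases, the keyword can be chosen from the version string. This is a sketch only: `log_scale_kwargs` is a hypothetical helper, and the notebook itself simply switches to `nonpositive`:

```python
def log_scale_kwargs(matplotlib_version: str, axis: str = "y") -> dict:
    """Pick the keyword for clipping non-positive values on a log axis."""
    major, minor = (int(part) for part in matplotlib_version.split(".")[:2])
    if (major, minor) >= (3, 3):
        return {"nonpositive": "clip"}  # unified keyword, available since 3.3
    return {f"nonpos{axis}": "clip"}    # legacy, axis-specific spelling


# Intended use (requires matplotlib, not imported here):
#   ax.set_yscale("log", **log_scale_kwargs(matplotlib.__version__))
print(log_scale_kwargs("3.8.4"))
print(log_scale_kwargs("3.2.1"))
```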

- continuous_motif_priors_KRCC.ipynb (pc3): explicitly `del condor_object, net; gc.collect()` at the end of each iteration in the three condor modularity loops. The notebook holds eight TF*gene matrices in memory simultaneously, and each condor BRIM run allocates additional copies of the bipartite network. Without the gc hint the kernel runs out of RAM on a 30 GB host.
- Finding_drugs_for_LUAD.ipynb (rc3): gate the 179 MB ppi_complete.txt download and read.delim load behind `precomputed==0`. The notebook defaults to `precomputed=1`, a path that doesn't use `ppi` at all (the PANDA call is also gated), so loading the file just to throw it away wasted RAM and was triggering OOM kills.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
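
The per-iteration cleanup follows a common pattern: drop the last reference to the large object, then ask the collector to sweep before the next allocation. A minimal sketch, with a hypothetical `Network` class standing in for the condor object and its matrices:

```python
import gc
import weakref


class Network:
    """Stand-in for a large per-iteration object (hypothetical)."""

    def __init__(self, n: int):
        self.weights = [0.0] * n


def process(n_iterations: int, size: int) -> list[bool]:
    """Run a loop that frees its large object each iteration; report success."""
    freed = []
    for _ in range(n_iterations):
        net = Network(size)
        probe = weakref.ref(net)   # lets us observe whether `net` was released
        _ = sum(net.weights)       # placeholder for the real computation
        del net                    # drop the last reference before the next allocation
        gc.collect()               # also sweep reference cycles, if any
        freed.append(probe() is None)
    return freed


print(process(3, 100_000))
```

Without the `del`, the previous iteration's object stays alive until the name is rebound, so two copies briefly coexist at the start of each iteration.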

cobra() computes a full 16k x 16k co-expression matrix and a subset eigh on the published THCA dataset. Even with subset_by_index, which pulls only the top n eigenvalues, eigh on a matrix this size takes 30+ min on the EC2 host and nbconvert times out. Subsampling to 2000 genes keeps the tutorial point intact and runs in well under a minute.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
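
A back-of-the-envelope view of why the subsample helps, assuming float64 storage and the roughly cubic cost of a dense eigendecomposition (both assumptions; the partial `eigh` used here may scale somewhat differently):

```python
def dense_matrix_gib(n: int, bytes_per_value: int = 8) -> float:
    """Memory for one dense n x n matrix, in GiB."""
    return n * n * bytes_per_value / 2**30


full, subset = 16_000, 2_000

print(f"{full} genes: {dense_matrix_gib(full):.2f} GiB per matrix")
print(f"{subset} genes: {dense_matrix_gib(subset) * 1024:.1f} MiB per matrix")
print(f"cubic-cost ratio: {(full // subset) ** 3}x")
```

The 8x reduction in genes shrinks each matrix 64-fold and, under the cubic assumption, the decomposition work by a factor of 512.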

marouenbg · 2026-05-12T23:42:38Z

Big update: 22 / 25 notebooks pass end-to-end

Iteration since the last comment — each remaining failure either had a code-level fix in the notebook or required a netZooPy/env-side patch that's now in place on the server.

Newly passing (5 more):

pv2 Building_single-sample_LIONESS: analyze_lioness.top_network_plot used df[['force']] = series and .drop([...], 1), which fail under pandas 2.x; patched in netZooPy/lioness/analyze_lioness.py and netZooPy/panda/analyze_panda.py on the host
rv2 ApplicationwithTBdataset: needed Python pandas/numpy/scipy/joblib for reticulate; installed inside the rv2 env (pandas pinned to <3 because netZooR's reticulate bridge breaks on the pandas 3.x dev release)
pv6 cobra: subsample gene_expression to first 2000 rows so eigh on the THCA p×p matrix completes quickly instead of hanging past the 30 min nbconvert timeout
pc2 Building_a_regulation_prior_network: set precomputed=1 (uses the precomputed PWM/FIMO outputs) so the hardcoded /home/ubuntu/meme/.../matrix2meme path isn't invoked; also fix nonposy="clip" → nonpositive="clip" for the matplotlib log-scale call
rv8 TIGER: the notebook subsampled the expression matrix to rows 1:10; after netZooR's priorPp filter that left zero negative-edge entries, so tiger() crashed trying to summarize the absent W_negs posterior. Bumped the expression subsample to 1:200 (kept the prior subsample at 1:10 since the prior only has 14 rows).

Still outstanding (3):

pp4 ccle_analysis — kernel OOM-killed at ~17 GB RSS; running it alongside other heavy notebooks is what pushes us over the 30 GB host limit. Re-running solo.
rc3 Finding_drugs_for_LUAD — same family of OOM; gated the 179 MB ppi_complete.txt load behind precomputed==0 so it isn't loaded for the default path. Re-running solo.
pc3 continuous_motif_priors_KRCC — eight TF×gene matrices held simultaneously while iterating condor BRIM over them; added del condor_object, net; gc.collect() per iteration. Re-running solo.

marouenbg · 2026-05-12T23:56:40Z

Status check

Current totals: 22 / 25 pass end-to-end. Re-runs are serial because the remaining three notebooks are all memory-heavy.

pv6 (cobra) — now passing with a 2000-gene subsample of the THCA expression matrix.
pc2 (Building_a_regulation_prior_network) — now passing after switching to precomputed=1 and the nonposy → nonpositive matplotlib fix.

Outstanding:

pp4 (ccle_analysis): running solo now. Stuck at ~9 GB RSS for 8+ min on one of the estimateDragonValues cells. Previous co-runs OOM-killed at ~17 GB RSS once memory was contended; should fit alone.
pc3 (continuous_motif_priors_KRCC): OOM-prone, blocked behind pp4. Has gc.collect() hint added.
rc3 (Finding_drugs_for_LUAD): doesn't fit on this host. Even alone the R kernel grows to ~22 GB RSS during the recount/EnsDb library + data loads and gets OOM-killed (the EC2 has 30 GB total, of which ~8 GB is in use by jupyterhub and other system processes). The notebook itself looks correct (precomputed path is now properly gated); it just needs a bigger instance to actually execute end-to-end.

PR has all fixes committed on claude/priceless-elgamal-673d0f.

…thon entries

Mirror what was done for the R section, where each notebook entry has (R 4.3.1; netZooR 1.6.3). 10 Python notebooks that import netZooPy get "(Python 3.10; netZooPy 0.10.6)"; three that don't import netZooPy (Controlling_The_Variance_Of_PANDA, Building_a_regulation_prior_network, drug_repurposing_colon_cancer) get just "(Python 3.10)".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Section 5 had two slips:
- "since there exists much more TFs than genes" contradicted the very next clause ("m is usually far smaller than p") and reversed the biological reality (TFs are far fewer than genes).
- "regression ... is underdetermined ... infinity of solutions" was the wrong direction: with p >> m, B = AT is over-determined, and the full-rank-of-A condition that follows is precisely what gives a unique least-squares solution.

Reworded to keep ref [2] (Feng & Zhang) supporting the same claim about random-matrix full-rank probability.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Same correction as MONSTER cell 35: the linear regression B = AT is over-determined (more equations than unknowns) when there are more genes than TFs, not under-determined; and over-determination with a full-column-rank A is exactly what yields a unique least-squares solution. The downstream conclusion about coexpression networks failing the full-rank condition is unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
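
The over-determination argument can be written out explicitly. The matrix shapes below are an assumption chosen to make B = AT conformable (p genes, m TFs, p much larger than m); the commit messages do not state them:

```latex
% Assumed shapes (not stated in the commits): A and B are p x m, T is m x m.
B = A\,T, \qquad A, B \in \mathbb{R}^{p \times m},\quad
T \in \mathbb{R}^{m \times m},\quad p \gg m

% Column j gives p equations in m unknowns, hence an over-determined system.
B_{\cdot j} = A\,T_{\cdot j}, \qquad j = 1, \dots, m

% If rank(A) = m, then A^T A is invertible and the least-squares
% solution is unique.
\hat{T} = \arg\min_{T}\, \lVert B - A\,T \rVert_F^{2}
        = \left(A^{\top} A\right)^{-1} A^{\top} B
```

If A loses full column rank, the inverse above does not exist and the minimizer is no longer unique, which is the failure case the notebooks go on to discuss for coexpression networks.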

marouenbg · 2026-05-14T01:46:06Z

Hey @taraeicher, can you please merge this PR? 👯

Until now the live config and helper scripts existed only on the EC2 host. Mirror them into the repo so they're versioned, reviewable, and recoverable if the host is lost.

Includes:
- jupyterhub.service: systemd unit (replaces the manual `nohup jupyterhub &` workflow; auto-starts on boot, restart=on-failure).
- jupyterhub_config.py: production config (no secrets in-file; OAuth credentials come from EnvironmentFile=/home/ubuntu/.netbooks_env).
- S3CachedLocalGitHubOAuthenticator (defined inside the config): 5-min TTL cache over s3://netzoo/netbooks/netbooks_allowed_users.csv so adding a user no longer requires a hub restart.
- add_netbooks_user.sh: add/remove/list helper that round-trips the S3 CSV in one command (idempotent, preserves trailing newline).
- backup_jupyterhub_db.sh: nightly tarball of sqlite + cookie_secret + config to s3://netzoo/netbooks/backups/ with 30-day retention plus a rolling latest.tgz. Wired into /etc/cron.d/jupyterhub-backup.
- le-restart-jupyterhub.sh: certbot deploy-hook so TLS renewals are picked up without manual restart.
- start_jupyterhub.sh: legacy manual-restart helper, kept for debugging.
- .netbooks_env.example: template; the live file is root:root 0400 and gitignored.
- server/README.md: ops handbook (paths, commands, restore procedure).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
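
The TTL-cache idea behind the authenticator can be sketched independently of JupyterHub and S3. This is an illustration, not the code in jupyterhub_config.py: `TTLCache` and `load_allowed_users` are hypothetical, and the loader stands in for reading the CSV from S3:

```python
import time
from typing import Callable, Optional


class TTLCache:
    """Cache a loader's result and refresh it once it is older than the TTL."""

    def __init__(self, loader: Callable[[], set], ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic):
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Optional[set] = None
        self._loaded_at = 0.0

    def get(self) -> set:
        now = self._clock()
        if self._value is None or now - self._loaded_at >= self._ttl:
            self._value = self._loader()  # e.g. re-read the allow-list
            self._loaded_at = now
        return self._value


# Demo with a fake clock so the expiry is observable without waiting.
fake_now = [0.0]
calls = []


def load_allowed_users() -> set:
    calls.append(1)
    return {"alice", "bob"}


cache = TTLCache(load_allowed_users, ttl_seconds=300, clock=lambda: fake_now[0])
cache.get()
cache.get()          # served from cache, loader not called again
fake_now[0] = 301.0
cache.get()          # TTL expired, loader runs again
print(len(calls))
```

With a 5-minute TTL, a newly added user is picked up within five minutes at most, without restarting the hub.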

Two pieces:
- journald-netbooks.conf: cap systemd-journald disk usage at 2 GB (was sitting at 4 GB, the default percent-of-disk fallback). Goes to /etc/systemd/journald.conf.d/netbooks.conf. After applying, journal shrinks immediately on `systemctl restart systemd-journald`.
- logrotate-jupyterhub: rotate /home/ubuntu/jupyterhub.log weekly (kept for ad-hoc debugging via start_jupyterhub.sh; the systemd unit logs to journal, not here) and /var/log/jupyterhub-backup.log monthly. Picked up by the existing system logrotate.timer; no extra cron entry needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two paired changes that together let JupyterHub 5.4.6 run in production:
- jupyterhub.service: ExecStart now points at /opt/conda/envs/jhub5/bin (a clone of the netbooks env upgraded to jupyterhub==5.4.6 + jupyterhub-systemdspawner==1.0.1; oauthenticator unchanged at 16.2.1).
- jupyterhub_config.py: comment out `c.LocalAuthenticator.create_system_users = True`. JupyterHub 5 attempts to create a real Linux user for every entry in allowed_users at startup, which (a) fails for digit-prefix GitHub usernames (Linux NAME_REGEX_SYSTEM rejects e.g. "20songe") and (b) combines with delete_invalid_users=True to silently kick those users out. We already use SystemdSpawner.dynamic_users=True so ephemeral systemd users are created at spawn-time; real Linux accounts are not needed.

DB schema was migrated with `jupyterhub upgrade-db` (0eee8c825d24 -> 4621fec11365); a pre-upgrade sqlite snapshot is on S3 at s3://netzoo/netbooks/backups/jhub-backup-20260519-*.tgz.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
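
The digit-prefix failure is easy to reproduce with a regular expression. The pattern below is an assumption modeled on Debian adduser's default username rule; the host's actual NAME_REGEX_SYSTEM setting may differ:

```python
import re

# Assumed pattern (first character must be a lowercase letter); not read
# from the host's adduser.conf.
ASSUMED_NAME_REGEX = re.compile(r"^[a-z][-a-z0-9_]*\$?$")


def is_valid_system_username(name: str) -> bool:
    """True if `name` would pass the assumed username rule."""
    return ASSUMED_NAME_REGEX.fullmatch(name) is not None


for candidate in ("marouenbg", "20songe"):
    print(candidate, is_valid_system_username(candidate))
```

Under this rule any GitHub username starting with a digit is rejected, which is why creating real Linux accounts for every allowed user cannot work for all users.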

jupyterhub-systemdspawner 1.0.1 (paired with jupyterhub 5) bails on spawn with `FileExistsError: '/run/jupyter-<user>-singleuser'` when a runtime-dir symlink from a prior session is still around but the singleuser unit is no longer active. We hit this immediately after the jhub5 swap (the old jhub4 master had left these symlinks).

Add a small idempotent sweep script + wire it as ExecStartPre on the systemd unit so a stale runtime dir can never block a fresh spawn again. The script preserves running sessions (only removes when `systemctl is-active jupyter-<user>-singleuser.service` is false).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
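
The sweep logic can be sketched as follows. This is a Python illustration of the idea rather than the script added in this commit; `sweep_stale_runtime_dirs` and `unit_is_active` are hypothetical names, and the demo uses a temporary directory and a fake activity check instead of /run and systemctl:

```python
import os
import subprocess
import tempfile
from pathlib import Path
from typing import Callable


def unit_is_active(unit: str) -> bool:
    """True if systemd reports the unit as active (requires systemctl)."""
    return subprocess.run(["systemctl", "is-active", "--quiet", unit]).returncode == 0


def sweep_stale_runtime_dirs(
    run_dir: Path, is_active: Callable[[str], bool] = unit_is_active
) -> list[str]:
    """Remove leftover jupyter-<user>-singleuser symlinks for inactive units."""
    removed = []
    for entry in sorted(run_dir.iterdir()):
        name = entry.name
        if not (name.startswith("jupyter-") and name.endswith("-singleuser")):
            continue
        if is_active(f"{name}.service"):
            continue  # session still running: leave it alone
        if entry.is_symlink():  # this sketch only touches symlinks
            entry.unlink()
            removed.append(name)
    return removed


# Demo: two dangling symlinks, one of which belongs to an "active" session.
with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp)
    for user in ("alice", "bob"):
        os.symlink(run_dir / "gone", run_dir / f"jupyter-{user}-singleuser")
    removed = sweep_stale_runtime_dirs(run_dir, is_active=lambda unit: "bob" in unit)
    print(removed)
```

Running the sweep twice is harmless: the second pass finds nothing left to remove, which is what makes it safe as an ExecStartPre step.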

Mirror /etc/cron.d/jupyterhub-backup as it sits on prod. The schedule line is commented out (`# paused 2026-05-19 — re-enable on second pass`) matching the host state. Re-enable later by uncommenting the single schedule line and running `sudo systemctl restart cron` (or just letting cron pick up the change on its next poll).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

marouenbg and others added 2 commits May 12, 2026 00:20

marouenbg and others added 6 commits May 12, 2026 18:51

marouenbg and others added 3 commits May 13, 2026 20:37

marouenbg and others added 5 commits May 18, 2026 23:38

marouenbg wants to merge 16 commits into netZoo:main from marouenbg:claude/priceless-elgamal-673d0f
