BNFO262 notebook environments

Modifying an environment

First, clone this repository.
Create a new branch off of the master branch. Give it an informative name.
Add the new software to the conda environment used by that module, following the best practices in the section below. Note: Never use one conda environment .yml file for more than one module. Each module should have its own .yml file. If modules are mixed into the same environment, future TAs won't be able to tell which packages to add or remove as the notebooks change, which makes the environment difficult to maintain.
Check that the conda environment can still be solved:
```
conda env create --dry-run --file spatial-tx.yml
```
Commit and push your changes.
Once you're ready, create a pull request to merge your branch back into the master branch.
Wait for the image to be built and for the checks to pass. This can take up to 40 minutes.
You should see a green check mark if all of the checks pass. If not, click on the red X and then "Details" to view the error message. Add additional commit(s) to fix the issue.
Test your changes (see the section below) and add any commits as needed.
Once all checks and tests pass, merge your pull request!
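
The command-line part of the steps above can be sketched as two small shell helpers. This is only an illustration, not part of the repository: the function names are hypothetical, and it assumes you run them from inside your clone and that the remote is named origin.

```shell
# Hypothetical helpers; run them from inside your clone of this repository.

# usage: start_env_branch <branch-name>
# Update master, then create and switch to a new branch off of it.
start_env_branch() {
  git checkout master && git pull && git checkout -b "$1"
}

# usage: submit_env_change <env-file> <commit-message>
# Check that the environment still solves, then commit and push the file.
# Assumes the remote is named "origin".
submit_env_change() {
  conda env create --dry-run --file "$1" || return 1
  git add "$1" && git commit -m "$2" && git push --set-upstream origin HEAD
}

# Example (the branch name and commit message are placeholders):
#   start_env_branch update-spatial-tx-env
#   ... edit spatial-tx.yml ...
#   submit_env_change spatial-tx.yml "Update spatial-tx environment"
```

Creating the pull request, waiting for the checks, and merging still happen on GitHub.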

Testing a new environment

Note: This section is now outdated. There used to be a way to test changes before they went live, but now any successful change to the environments (even on an unmerged pull request) immediately goes live on DataHub! This can be dangerous. Use with caution.

After you create a pull request that changes our Dockerfile or a conda environment within our notebook repository, GitHub Actions will automatically build an updated Docker image. The image will be tagged with the number assigned to your pull request.

If you are off campus, connect to the UCSD VPN. Then log in to DataHub via ssh from your terminal:
```
ssh username@dsmlp-login.ucsd.edu
```
Run your container on DataHub:
```
launch-scipy-ml.sh -W BNFO262_WIXX_A00 -P Always -i ghcr.io/biom262/cmm262-notebook:pr-#
```
Replace # with the number of the pull request and XX with the last two digits of the current year. For example, the number for this pull request is 11 and XX would be 24 for 2024.
Executing that command will generate a URL to a DataHub environment that uses your updated changes. Open the URL in your browser, and use that notebook environment to test whether your changes work as expected. Rerun your notebooks one more time here: they may fail in this environment even if they worked earlier!
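
As a small illustration of the substitution described above, the sketch below assembles the launch command from the two values. The variable names are made up for this example, and the values 11 and 24 are simply the ones used in the text. It prints the command for review rather than running it.

```shell
pr_number=11     # number of your pull request (11 is the example from the text)
year_suffix=24   # last two digits of the current year; `date +%y` would also work

# Fill in the placeholders: XX in the course ID and # in the image tag.
course="BNFO262_WI${year_suffix}_A00"
image="ghcr.io/biom262/cmm262-notebook:pr-${pr_number}"

# Print the command instead of executing it, so it can be checked first.
echo "launch-scipy-ml.sh -W ${course} -P Always -i ${image}"
```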

If the URL isn’t working, make sure you are connected to the UCSD VPN.

Note: If DataHub gives you "Error: ImagePullBackOff", it probably means that your container image has not yet been pushed to the image repository. You can check the list of available images that have been pushed to the image repository here. If the tag does not appear there, wait a bit and check back later.

Best practices for conda environments

Write your environment file manually. Don't create an environment and then export it to a .yml file using conda env export. An exported file lists every package in the environment, so it cannot be easily updated in future years. It is also unlikely to work outside your own environment (or with base Docker images other than the one from which you exported it). If you absolutely must use conda env export, it is best to do it with the --from-history flag.
Always specify conda-forge before bioconda in the channels list if both of them are needed. (Note that conda-forge is needed whenever bioconda is needed, but the opposite is not true.)
Avoid using packages from any channels other than conda-forge and bioconda. Other channels (like anaconda and r) have been known to eventually purge old packages.
You should also specify nodefaults as a channel in the channels list, since the defaults channel conflicts with conda-forge.
When possible, you should specify exact package versions and channels to reduce the amount of time it takes for conda to find the correct versions and channels to use (aka "solve the environment"). This also makes the .yml file much more reproducible and less likely to break in the future. Here's an example where we specify the channel name (conda-forge), the package name (r-base), and the package version (3.6.3):
```
dependencies:
- conda-forge::r-base==3.6.3
```
To pin an exact package version, use a double equals sign (==) instead of a single equals sign (=).
If a package can be installed via conda, do not specify it as a pip dependency in your environment file. Avoid pip dependencies if possible.
Do not include dependencies of any packages already listed in your environment file unless you import or use those dependencies in your own code. For example, if you use scanpy and it imports pytables, you shouldn't add pytables to your conda environment file unless you directly import and use pytables in your code. This rule helps ensure that the .yml file can be easily updated in future years.
When checking whether your environment file will solve, make sure the --strict-channel-priority setting is turned on. (See here for more info.)
If you are creating an environment that should be used from an R notebook, you must also specify the r-irkernel package as a dependency. This allows the environment to be detected by DataHub's nb_conda_kernels.
```
dependencies:
- conda-forge::r-irkernel==1.3.1
```
Otherwise, if you are creating an environment that should be used from a Python notebook, specify the ipykernel package. In either case, be sure to specify a version. I would recommend using the most recent version unless the other packages in your environment are too old to support it.
```
dependencies:
- conda-forge::ipykernel==6.20.1
```
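
To tie several of these practices together, here is a rough sketch that writes a small example environment file and runs a few text-based checks on it. The file name, environment name, and variable names are hypothetical, and the checks are simple grep/awk heuristics rather than a real YAML parser. The file only illustrates the layout; it has not been verified to solve.

```shell
# Write a hypothetical environment file to a temporary directory.
env_file="$(mktemp -d)/example-module.yml"
cat > "$env_file" <<'EOF'
name: example-module
channels:
- conda-forge
- bioconda
- nodefaults
dependencies:
- conda-forge::ipykernel==6.20.1
EOF

# Check 1: conda-forge must be listed before bioconda when both are present.
forge_line=$(grep -n -- '- conda-forge$' "$env_file" | head -n 1 | cut -d: -f1)
bioconda_line=$(grep -n -- '- bioconda$' "$env_file" | head -n 1 | cut -d: -f1)
channel_order_ok=yes
if [ -n "$bioconda_line" ] && { [ -z "$forge_line" ] || [ "$forge_line" -gt "$bioconda_line" ]; }; then
  channel_order_ok=no
fi

# Check 2: nodefaults must appear in the channels list.
if grep -q -- '- nodefaults$' "$env_file"; then nodefaults_ok=yes; else nodefaults_ok=no; fi

# Check 3: collect dependency entries that are not pinned with ==.
# A "- pip:" entry would also be reported here, which matches the advice to avoid pip.
unpinned=$(awk '/^dependencies:/ {deps=1; next} deps && /^[[:space:]]*-/ && !/==/ {print}' "$env_file")

echo "conda-forge before bioconda: $channel_order_ok"
echo "nodefaults listed:           $nodefaults_ok"
echo "unpinned dependencies:       ${unpinned:-none}"
```

These checks do not replace the dry-run solve with conda. They only catch layout mistakes before you spend time waiting on the solver.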
