stDyer

Description

stDyer is a spatial domain clustering method for spatially resolved transcriptomic data.

How to run

Install dependencies

# clone project
git clone https://github.com/ericcombiolab/stDyer.git
cd stDyer

# create conda environment
conda env create -f stdyer.yml
conda activate stdyer

Tutorial

The notebook tutorial.ipynb demonstrates how to train the model on a single-slice dataset. For more advanced command-line usage, refer to the following sections:

For the dataset with a single slice

Train the model with a chosen experiment configuration from configs/experiment/:

python run.py experiment=example.yaml

The predicted spatial domain labels will be saved to AnnData (.h5ad) files in the logs/logger_logs folder. The raw predicted spatial domain labels are in adata.obs["pred_labels"]. The autoencoder-refined labels are in adata.obs["mlp_fit"].

The detected spatially variable genes will be saved in adata.uns["svg_dict"].
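To see how much the refinement step changed the raw prediction, the two label columns can be compared directly. The sketch below is a minimal illustration and not part of stDyer: compare_labels is a hypothetical helper, and the toy table stands in for the obs table of a real result. Only the column names "pred_labels" and "mlp_fit" come from the description above.

```python
import pandas as pd


def compare_labels(obs: pd.DataFrame) -> pd.DataFrame:
    """Cross-tabulate raw labels (rows) against refined labels (columns).

    Hypothetical helper; only the two column names are taken from the README.
    """
    return pd.crosstab(obs["pred_labels"], obs["mlp_fit"])


# Toy stand-in for adata.obs of a saved result (values are made up).
obs = pd.DataFrame(
    {
        "pred_labels": [0, 0, 1, 1, 2],
        "mlp_fit": [0, 0, 1, 2, 2],
    }
)

table = compare_labels(obs)
print(table)

# Fraction of spots whose label is identical in both columns.
# Casting to str keeps the comparison valid for categorical columns too.
unchanged = (obs["pred_labels"].astype(str) == obs["mlp_fit"].astype(str)).mean()
print(f"unchanged fraction: {unchanged:.2f}")
```

With a real run, load the saved file with anndata.read_h5ad and pass adata.obs to compare_labels instead of the toy table.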

You can override any parameter from the command line like this:

python run.py trainer.max_epochs=20

For the large dataset (multiple GPUs)

Train the model with a chosen experiment configuration from configs/experiment/ on multiple GPUs:

CUDA_VISIBLE_DEVICES=0,1 python run.py experiment=example_ddp.yaml trainer.devices=2

To train the model with your own dataset, copy configs/experiment/example_ddp.yaml to configs/experiment/your_experiment.yaml and modify it to your needs. The required data format is h5ad, which can be created with AnnData. The "spatial" key in the obsm attribute of the AnnData object (adata.obsm["spatial"]) holds the spatial coordinates and is required for constructing the spatial adjacency graph. The full path to the h5ad file is data_dir/dataset_dir/data_file_name. You can specify the required number of spatial domains with the num_classes parameter in your_experiment.yaml. The config file has detailed comments explaining the parameters.

cp configs/experiment/example_ddp.yaml configs/experiment/your_experiment.yaml
python run.py experiment=your_experiment.yaml

For the dataset with multiple slices

To train with a dataset with multiple slices, first align the slices with paste2. Refer to align_multiple_slides_with_paste2.ipynb for the preprocessing steps. You can then train with configs/experiment/example_multi_slices.yaml. For your own dataset, make sure the obs attribute of the AnnData object has a "batch" column (adata.obs["batch"]) that indicates the slice index. Set z_scale to a meaningful value (refer to the config file for details), because adata.obs["batch"] * z_scale * min_two_units_xy_distance is used as the third coordinate, alongside the two coordinates in adata.obsm["spatial"], when constructing the spatial adjacency graph.

python run.py experiment=example_multi_slices.yaml
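The effect of z_scale on the third coordinate can be illustrated with a small NumPy sketch. This is not stDyer's code: the helper names are hypothetical, and it assumes that min_two_units_xy_distance is the smallest non-zero distance between two spots in the xy plane, which may differ from how stDyer actually computes it.

```python
import numpy as np


def min_pairwise_distance(xy: np.ndarray) -> float:
    """Smallest non-zero Euclidean distance between any two points.

    Builds the full distance matrix, so it is only suitable for small inputs.
    """
    diff = xy[:, None, :] - xy[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return float(dist[dist > 0].min())


def stack_slices(xy: np.ndarray, batch: np.ndarray, z_scale: float) -> np.ndarray:
    """Append z = batch * z_scale * (assumed) min xy distance as a third column."""
    unit = min_pairwise_distance(xy)
    z = batch.astype(float) * z_scale * unit
    return np.column_stack([xy, z])


# Two aligned slices, each a 2x2 grid with spacing 1 between neighbouring spots.
grid = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
xy = np.vstack([grid, grid])
batch = np.array([0, 0, 0, 0, 1, 1, 1, 1])  # slice index per spot

xyz = stack_slices(xy, batch, z_scale=2.0)
print(xyz)
```

In this toy case the spot spacing is 1, so z_scale=2.0 places the second slice two spot spacings above the first. A larger z_scale pushes slices further apart relative to the in-plane spacing, which makes neighbours in adjacent slices less likely to be connected in a distance-based graph.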

For reproducing the results in the paper

The processed data and reproducible Jupyter notebooks can be downloaded from https://doi.org/10.5281/zenodo.11315101. Please read the README.md inside the zip file for details.
