Galah

Galah - Scalable dereplication and MIMAG calculation for metagenome assembled genomes.

Documentation can be found at https://wwood.github.io/galah/.

Galah aims to be a scalable method for dereplicating metagenome assembled genomes (MAGs) and assessing their quality. Dereplication clusters genomes by average nucleotide identity (ANI) and chooses a single member of each cluster as its representative. Quality assessment assigns each genome a MIMAG quality score based on its completeness, its contamination, and the presence of rRNA and tRNA genes.
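To make the quality tiers concrete, here is a minimal shell sketch of how completeness, contamination, rRNA presence, and tRNA count could be combined into a tier. The thresholds are those of the published MIMAG standard (Bowers et al., 2017). The function name `mimag_tier` and its argument layout are hypothetical, and this is not necessarily how Galah implements the calculation.

```shell
# Hypothetical helper (not part of Galah): assign a MIMAG tier.
# Usage: mimag_tier COMPLETENESS CONTAMINATION HAS_ALL_RRNA(yes|no) TRNA_COUNT
# Thresholds follow the MIMAG standard:
#   high:   >90% complete, <5% contaminated, 5S/16S/23S rRNA present, >=18 tRNAs
#   medium: >=50% complete, <10% contaminated
# Anything else falls through to "low" in this sketch.
mimag_tier() {
  awk -v comp="$1" -v cont="$2" -v rrna="$3" -v trna="$4" 'BEGIN {
    if (comp > 90 && cont < 5 && rrna == "yes" && trna >= 18) print "high"
    else if (comp >= 50 && cont < 10) print "medium"
    else print "low"
  }'
}

mimag_tier 95.2 1.3 yes 20   # prints "high"
mimag_tier 95.2 1.3 no 20    # prints "medium" (rRNA genes missing)
mimag_tier 42.0 3.0 no 5     # prints "low"
```

The second call shows why the rRNA and tRNA checks matter: a genome that is nearly complete and clean still drops to medium quality without them.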

Quick install

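# Install the latest release via conda into a new environment named "galah".
conda create -n galah -c bioconda -c conda-forge galah
# Activate the environment before running galah.
conda activate galah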

Example usage

For clustering and determining MIMAG quality scores:

galah process --genome-fasta-files /path/to/genome1.fna /path/to/genome2.fna \
  --output-cluster-definition clusters.tsv \
  --output-mimag-summary mimag.tsv

For clustering a set of genomes at 95% ANI (no threshold is passed here, so this is the default):

galah cluster --genome-fasta-files /path/to/genome1.fna /path/to/genome2.fna \
  --output-cluster-definition clusters.tsv
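Once clustering has finished, the cluster definition file can be summarised with standard shell tools. The sketch below assumes that `clusters.tsv` holds one tab-separated `representative<TAB>member` pair per line. That layout is an assumption not stated in this README, so check it against the documentation before relying on it. The function name `cluster_sizes` and the example file contents are made up for illustration.

```shell
# Hypothetical helper (not part of Galah): count members per representative.
# Assumed input: one "representative<TAB>member" pair per line.
# Output: "count<TAB>representative", largest cluster first.
cluster_sizes() {
  awk -F '\t' '{ n[$1]++ } END { for (r in n) print n[r] "\t" r }' "$1" |
    sort -k1,1nr -k2,2
}

# Tiny made-up example file so the sketch runs on its own.
example=$(mktemp)
printf 'a.fna\ta.fna\na.fna\tb.fna\nc.fna\tc.fna\n' > "$example"

cluster_sizes "$example"
```

On the example file this lists `a.fna` with two members and `c.fna` as a singleton. For real output, pass `clusters.tsv` in place of the temporary file.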

For clustering a set of contigs at 95% ANI:

galah cluster --cluster-contigs --small-genomes --genome-fasta-files /path/to/contigs.fna \
  --output-cluster-definition clusters.tsv

For determining MIMAG quality scores for a set of genomes with CheckM2, Barrnap, and tRNAscan-SE:

galah analyse --genome-fasta-files /path/to/genome1.fna /path/to/genome2.fna \
  --output-mimag-summary mimag.tsv

Help

If you have any questions or need help, please open an issue.

License

Galah is developed by the Woodcroft lab at the Centre for Microbiome Research, School of Biomedical Sciences, QUT, with contributions from Samuel Aroney, Antônio Camargo, and Rhys Newell. It is licensed under GPL3 or later.

The source code is available at https://github.com/wwood/galah.

Citation

Aroney, S.T.N., Camargo, A.P., Tyson, G.W. and Woodcroft, B.J. Galah: More scalable dereplication for metagenome assembled genomes. Zenodo (2024). https://doi.org/10.5281/zenodo.13637856
