Document how sample IDs are determined from fastq file names #9

Description

@stenglein-lab

Documentation should describe how fastq file names are converted to sample IDs (meta.id). Code is in marshal_fastq.nf:

         // This map strips the following suffixes from the end of sample IDs,
         // in the order listed:
         // .gz
         // .fastq
         // .fq
         // .uniq
         // .trim
         // _001
         // _R[12]
         // _S\d+
         // E.g. strip _S1 from the end of a sample ID.
         // These are typically the sample numbers assigned by Illumina basecalling.
         // This could cause a problem if the same sample was sequenced with
         // multiple barcodes and so was repeated on a sample sheet.
         meta.id         = name.replaceAll( /.gz$/ ,"")
         meta.id         = meta.id.replaceAll( /.fastq$/ ,"")
         meta.id         = meta.id.replaceAll( /.fq$/ ,"")
         meta.id         = meta.id.replaceAll( /.uniq$/ ,"")
         meta.id         = meta.id.replaceAll( /.trim$/ ,"")
         meta.id         = meta.id.replaceFirst( /_001$/ ,"")
         meta.id         = meta.id.replaceFirst( /_R[12]$/ ,"")
         meta.id         = meta.id.replaceFirst( /_S\d+$/ ,"")
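
For the documentation, the replacement chain above could be illustrated with a small standalone Groovy sketch. The helper name sampleIdFromName and the example file names are hypothetical and do not come from marshal_fastq.nf. Only the patterns and their order are copied from the snippet in this issue.

```groovy
// Hypothetical helper wrapping the replacement chain quoted in this issue.
// `sampleIdFromName` is an illustrative name, not a function in marshal_fastq.nf.
String sampleIdFromName(String name) {
    String id = name
    // Each pattern is anchored with $ and applied once, in this fixed order.
    id = id.replaceAll(/.gz$/, "")
    id = id.replaceAll(/.fastq$/, "")
    id = id.replaceAll(/.fq$/, "")
    id = id.replaceAll(/.uniq$/, "")
    id = id.replaceAll(/.trim$/, "")
    id = id.replaceFirst(/_001$/, "")
    id = id.replaceFirst(/_R[12]$/, "")
    id = id.replaceFirst(/_S\d+$/, "")
    return id
}

// Illustrative (made-up) file names and the IDs the chain produces for them.
assert sampleIdFromName("sampleA_S1_R1_001.fastq.gz") == "sampleA"
assert sampleIdFromName("sampleB_R2.fq") == "sampleB"

// A lane token between _S<n> and _R<n> blocks the final _S\d+ removal.
assert sampleIdFromName("sampleA_S1_L001_R1_001.fastq.gz") == "sampleA_S1_L001"

// The leading dot is unescaped, so it matches any character, e.g. "_trim".
assert sampleIdFromName("sampleC_trim.fastq.gz") == "sampleC"

// Order matters: .uniq is tested before .trim, so ".uniq.trim" loses only ".trim".
assert sampleIdFromName("sampleD.uniq.trim.fastq") == "sampleD.uniq"
```

Two behaviors are worth calling out in the docs. The suffixes are removed in a fixed order, so a name that stacks them differently is only partly stripped. The dots in the patterns are not escaped, so a pattern such as /.trim$/ also removes "_trim".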
