DSL2 modules

### Description of feature

A breakdown of the workflow at the process level, listing the modules needed to port it to DSL2. Once the modules have been created, I can flesh this out in terms of subworkflows.

N.B.: please check out new branches for individual features and push to the `DSL2` branch, not `dev`.

# Input files
Currently, circRNA takes as input a `samplesheet.csv` file and a `phenotype.csv` file. Functions already exist to check these files; all that is needed is to place them in an `input_check.nf` local subworkflow.
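
To make the kind of check meant here concrete, below is a minimal Python sketch of a samplesheet validator. It is not the existing check function: the column names (`sample`, `fastq_1`, `fastq_2`, `bam`) and the rules (unique sample names, at least one of `fastq_1` or `bam` per row) are assumptions chosen for illustration.

```python
import csv

# Hypothetical column names; the real samplesheet layout may differ.
REQUIRED_COLUMNS = ("sample", "fastq_1", "fastq_2", "bam")


def check_samplesheet(handle):
    """Parse an open samplesheet and return its rows.

    Raises ValueError on the first problem found.
    """
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"samplesheet is missing columns: {', '.join(missing)}")

    rows, seen = [], set()
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        sample = (row["sample"] or "").strip()
        if not sample:
            raise ValueError(f"line {line_no}: empty sample name")
        if sample in seen:
            raise ValueError(f"line {line_no}: duplicate sample '{sample}'")
        has_fastq = bool((row["fastq_1"] or "").strip())
        has_bam = bool((row["bam"] or "").strip())
        if not (has_fastq or has_bam):
            raise ValueError(f"line {line_no}: provide fastq_1 or bam")
        seen.add(sample)
        rows.append(row)
    return rows
```

A script like this could be called from a process inside the `input_check.nf` subworkflow, for example with `open("samplesheet.csv", newline="")` as the handle.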

I would like to incorporate `strandedness` like other nf-core workflows. I will check which circRNA quantification tools have a parameter denoting strandedness.

# Pre-processing
The workflow takes as input `fastq` or `bam` files (which are converted to `fastq` using `picard SamToFastq`) and performs `FastQC` on the raw reads prior to trimming using `BBDUK`. The trimmed reads are then checked using `FastQC` again and placed in channels for downstream analyses. 

- [x] `FastQC`
- [x] `MultiQC`
- [x] `BBDUK` 
- [ ] `picard/SamToFastq`  (I don't care if we drop this functionality.)

# circRNA Discovery 
Several tools use the same aligner, so there will be duplicate modules here.

## CIRIquant
- [x] `bwa index`
- [x] `hisat build`
- [x] `ciriquant`

## CIRCexplorer2
- [x] `STAR genomegenerate`
- [x] `STAR align` (2 Pass mode)
- [x] `circexplorer2 parse`
- [x] `circexplorer2 annotate` 

## circRNA_finder
- [x] `star genomegenerate`
- [x] `star align` (2 Pass mode)
- [x] `circRNA_Finder` (`postProcessStarAlignment.pl` script)

## DCC
DCC maps paired-end reads jointly and separately using STAR 2-pass mode. The goal is to generate `Chimeric.out.junction` files from the joint STAR mapping and from the individual read 1 and read 2 STAR mappings.
- [x] `star genomegenerate`
- [x] `star align` (2 Pass)
- [x] `dcc`

## find_circ
- [x] `bowtie2 build`
- [x] `bowtie2 align`
- [x] `find_circ find_anchors`
- [x] `find_circ find_circ`

## Mapsplice
- [x] `bowtie build`
- [x] `mapsplice align`
- [x] `circexplorer2 parse`
- [x] `circexplorer2 annotate`

## Segemehl
- [x] `segemehl align`

Custom scripts parse the `segemehl` output, so there is no need to create a module.

## circRNA annotation
A custom bash script standardises the annotation outputs from the seven quantification tools.

## circRNA FASTA sequence
A custom bash script generates the mature spliced sequence in FASTA format and appends the back-splice junction sequence for miRNA target prediction.
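
How the existing bash script builds the junction sequence is not spelled out here. One common approach, assumed in this sketch, is to copy the first few bases of the mature sequence onto its 3' end so that a linear scanner can see sites spanning the back-splice junction. The function names and the default `flank` of 20 bases are hypothetical.

```python
def append_bsj(mature_seq: str, flank: int = 20) -> str:
    """Return the mature sequence with its first `flank` bases appended to the 3' end.

    Assumption: repeating the 5' start after the 3' end approximates the
    circular junction for tools that only accept linear input.
    """
    if flank < 0:
        raise ValueError("flank must be non-negative")
    return mature_seq + mature_seq[:flank]


def to_fasta(name: str, seq: str, width: int = 60) -> str:
    """Format one record as FASTA, wrapping the sequence at `width` characters."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return f">{name}\n" + "\n".join(lines) + "\n"
```

For example, `to_fasta("circ_1", append_bsj(seq))` would give one record per circRNA that can be passed to the miRNA target prediction step.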

## circRNA count matrix
Consolidate the circRNAs called by multiple tools on a per-sample basis, then generate the count matrix.
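
The consolidation step could look roughly like the Python sketch below. It is an illustration under stated assumptions, not the pipeline's actual logic: a circRNA is kept when at least `min_tools` tools report it in a sample, and its count is summarised as the median across those tools. Both the threshold and the choice of median are assumptions, as are the function names and the ID format.

```python
from collections import defaultdict
from statistics import median


def consolidate_sample(tool_counts, min_tools=2):
    """Merge per-tool counts for ONE sample.

    tool_counts maps tool name -> {circRNA id -> count}.
    Keeps circRNAs reported by at least `min_tools` tools and summarises
    each with the median count (an assumption made for this sketch).
    """
    per_circ = defaultdict(list)
    for counts in tool_counts.values():
        for circ_id, count in counts.items():
            per_circ[circ_id].append(count)
    return {c: median(v) for c, v in per_circ.items() if len(v) >= min_tools}


def count_matrix(samples, min_tools=2):
    """Build a circRNA x sample matrix.

    samples maps sample name -> tool_counts (see consolidate_sample).
    Returns (circRNA ids, sample names, rows); missing entries are 0.
    """
    consolidated = {s: consolidate_sample(t, min_tools) for s, t in samples.items()}
    circ_ids = sorted({c for counts in consolidated.values() for c in counts})
    sample_names = sorted(consolidated)
    rows = [[consolidated[s].get(c, 0) for s in sample_names] for c in circ_ids]
    return circ_ids, sample_names, rows
```

Rows are circRNAs and columns are samples, which matches the orientation that count-based differential expression tools typically expect.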

# miRNA target prediction

## miRanda
- [x] `miranda`

## TargetScan
- [x] `targetscan`. [biocontainers #475](https://github.com/BioContainers/containers/pull/475)

A custom script amalgamates the results from both tools.

# Differential expression
- [x] `hisat build`
- [x] `hisat align`
- [x] `stringtie`

Custom R scripts are used for DESeq2 and CircTest, so there is no need to create modules.
