This repository documents the full data-processing and quality-control pipeline applied to 72 MeRIP-seq samples, of which 29 high-quality samples were retained for downstream analyses.
The scripts and configuration files here reproduce the computational workflow from raw public SRA accessions to final, QC-passed methylation peak calls.
Note: Raw sequencing data are not stored in this repository because of their size.
All source data are publicly available through the NCBI SRA and can be re-downloaded using the scripts provided.
Each biological sample includes a pair of sequencing runs:
- Input sample (background control)
- IP sample (immunoprecipitated m⁶A-enriched fraction)
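The Input/IP pairing described above can be checked before any downstream step runs. The following is a minimal sketch, not the repository's own code: the `SamplePair` class, the `pair_runs` function, and the `(sample, role, run)` record layout are all hypothetical names chosen for illustration, and the run identifiers in the usage example are placeholders rather than real accessions.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SamplePair:
    """One biological sample with its two sequencing runs (hypothetical layout)."""
    sample: str
    input_run: str  # background control
    ip_run: str     # immunoprecipitated, m6A-enriched fraction


def pair_runs(records):
    """Group (sample, role, run) records into validated Input/IP pairs.

    Raises ValueError if a role is unknown, duplicated, or missing for a sample.
    """
    by_sample = defaultdict(dict)
    for sample, role, run in records:
        role = role.strip().lower()
        if role not in ("input", "ip"):
            raise ValueError(f"unknown role {role!r} for sample {sample!r}")
        if role in by_sample[sample]:
            raise ValueError(f"duplicate {role} run for sample {sample!r}")
        by_sample[sample][role] = run

    pairs = []
    for sample, runs in sorted(by_sample.items()):
        missing = {"input", "ip"} - runs.keys()
        if missing:
            raise ValueError(f"sample {sample!r} is missing: {sorted(missing)}")
        pairs.append(SamplePair(sample, runs["input"], runs["ip"]))
    return pairs


if __name__ == "__main__":
    # Placeholder identifiers, not real SRA accessions.
    example = [
        ("sample_A", "Input", "RUN_A_INPUT"),
        ("sample_A", "IP", "RUN_A_IP"),
    ]
    print(pair_runs(example))
```

Failing early on an unpaired sample avoids calling peaks for an IP run that has no matching background control.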
The processing pipeline performs:
- Download of raw `.sra` files
- Conversion to FASTQ format
- Alignment to the GRCh38 reference transcriptome
- Conversion to sorted and indexed BAM files
- Annotation database construction
- Peak calling using TRESS
- Quality control using GC-bias assessment and DRACH motif enrichment, resulting in 29 retained samples.
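To illustrate the DRACH part of the quality-control step, here is a minimal sketch of how motif occurrence in peak sequences might be measured. DRACH is the m⁶A consensus motif (D = A/G/U, R = A/G, H = A/C/U). The function names, the use of plain sequence strings as input, and the "fraction of peaks with at least one motif" summary are assumptions made for this example; the repository's actual enrichment statistic and pass/fail threshold are not specified here.

```python
import re

# DRACH on the DNA alphabet (T in place of U):
# D = A/G/T, R = A/G, then A, C, and H = A/C/T.
# The lookahead lets overlapping occurrences be counted.
DRACH_PATTERN = re.compile(r"(?=[AGT][AG]AC[ACT])")


def count_drach(sequence):
    """Count DRACH occurrences (overlaps included) in one sequence."""
    normalized = sequence.upper().replace("U", "T")
    return sum(1 for _ in DRACH_PATTERN.finditer(normalized))


def fraction_with_drach(peak_sequences):
    """Fraction of peak sequences that contain at least one DRACH motif."""
    if not peak_sequences:
        return 0.0
    hits = sum(1 for seq in peak_sequences if count_drach(seq) > 0)
    return hits / len(peak_sequences)


if __name__ == "__main__":
    # Toy sequences for illustration only, not real peak data.
    peaks = ["GGACT", "CCCCC", "ttggacaa", "AGACU"]
    print(fraction_with_drach(peaks))
```

A meaningful enrichment test would compare this fraction against a background set, such as matched non-peak regions, rather than interpreting the raw fraction alone.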