Details
-
Type: Task
-
Status: Closed
-
Priority: Major
-
Resolution: Done
-
Affects Version/s: None
-
Fix Version/s: None
-
Labels: None
-
Story Points: 1.5
-
Epic Link:
-
Sprint: Fall 3 2021 Sep 13 - Sep 24
Description
To make our HPC data processing easier and more robust, we are exploring the Nextflow workflow management system in conjunction with Singularity containers.
For this task, develop a Nextflow script that runs Trimmomatic on all FASTQ files in a directory.
Attachments
Issue Links
- relates to
-
IGBF-2909 Determine if and how we can use Nextflow on UNCC HPC cluster
- Closed
Activity
Ann Loraine
created issue -
Ann Loraine
made changes -
Field | Original Value | New Value |
---|---|---|
Epic Link | IGBF-2323 [ 18477 ] |
Ann Loraine
made changes -
Ann Loraine
made changes -
Rank | Ranked higher |
Ann Loraine
made changes -
Status | To-Do [ 10305 ] | In Progress [ 3 ] |
Ann Loraine
made changes -
Comment |
[ Retrieving the Trimmomatic Singularity image:
{code}
singularity pull trimmomatic_v0.39.sif oras://registry.forgemia.inra.fr/gafl/singularity/trimmomatic/trimmomatic:latest
{code}
This worked fine on a head node, but failed with a network error of some kind when run from an interactive session on the Andromeda partition. ] |
Ann Loraine
made changes -
Status | In Progress [ 3 ] | Needs 1st Level Review [ 10005 ] |
Ann Loraine
made changes -
Comment |
[ Final version of the Nextflow script:
{code}
#!/usr/bin/env nextflow

// test on one sample: nextflow run trim2.nf --dev -with-singularity trimmomatic_v0.39.sif
// run all: nextflow run trim2.nf -with-singularity trimmomatic_v0.39.sif

params.dev = false
params.number_of_inputs = 1
params.saveMode = 'copy'
//params.filePattern = "/projects/tomato_genome/rnaseq/phase2-rnaseq-Sep2021/*_{R1,R2}_001.fastq.gz"
params.filePattern = "fastq/*_{R1,R2}_001.fastq.gz"
params.outdir = 'results'

// Emit (prefix, [R1, R2]) tuples; with --dev, keep only the first few pairs
Channel
    .fromFilePairs( params.filePattern )
    .ifEmpty { error "Cannot find any reads matching: ${params.filePattern}" }
    .take( params.dev ? params.number_of_inputs : -1 )
    .set { read_pairs_ch }

process trim {
    time '2h'
    publishDir "$params.outdir", pattern: '*.fq.gz', mode: 'copy'

    input:
    tuple val(prefix), file(reads) from read_pairs_ch

    output:
    file '*.fq.gz'

    script:
    fq_1_paired = prefix + '_R1.p.fq'
    fq_1_unpaired = prefix + '_R1.u.fq'
    fq_2_paired = prefix + '_R2.p.fq'
    fq_2_unpaired = prefix + '_R2.u.fq'
    """
    trimmomatic \
        PE -phred33 \
        ${reads[0]} \
        ${reads[1]} \
        $fq_1_paired \
        $fq_1_unpaired \
        $fq_2_paired \
        $fq_2_unpaired \
        ILLUMINACLIP:TruSeq2-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:50
    gzip $fq_1_paired
    gzip $fq_1_unpaired
    gzip $fq_2_paired
    gzip $fq_2_unpaired
    """
}
{code}
Outputs are copied to a results directory. ] |
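Note on the file pairing used by the script: the pattern `*_{R1,R2}_001.fastq.gz` groups files by the text before `_R1`/`_R2`, and the process then derives four output names from that prefix. The Python sketch below only approximates that behavior for checking a directory listing by hand. It is not Nextflow's implementation; the names `pair_reads` and `trimmed_names` are hypothetical, and it assumes that samples missing one mate are skipped.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumption: file names follow the pattern from the script,
# <prefix>_R1_001.fastq.gz and <prefix>_R2_001.fastq.gz.
PAIR_RE = re.compile(r"^(?P<prefix>.+)_(?P<read>R[12])_001\.fastq\.gz$")


def pair_reads(filenames):
    """Group file names by sample prefix, keeping only complete R1/R2 pairs."""
    groups = defaultdict(dict)
    for name in filenames:
        match = PAIR_RE.match(Path(name).name)
        if match:
            groups[match.group("prefix")][match.group("read")] = name
    return {
        prefix: [reads["R1"], reads["R2"]]
        for prefix, reads in sorted(groups.items())
        if "R1" in reads and "R2" in reads
    }


def trimmed_names(prefix):
    """Final output names per sample, after gzip has appended '.gz'."""
    return [f"{prefix}_R{n}.{kind}.fq.gz" for n in (1, 2) for kind in ("p", "u")]


if __name__ == "__main__":
    # Hypothetical listing; replace with the real contents of fastq/.
    listing = ["fastq/A_R2_001.fastq.gz", "fastq/A_R1_001.fastq.gz"]
    for prefix, reads in pair_reads(listing).items():
        print(prefix, reads, trimmed_names(prefix))
```

Comparing the expected names from `trimmed_names` with the contents of the results directory is one quick way to confirm that every sample produced its paired (`.p`) and unpaired (`.u`) outputs.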
Ann Loraine
made changes -
Status | Needs 1st Level Review [ 10005 ] | First Level Review in Progress [ 10301 ] |
Ann Loraine
made changes -
Status | First Level Review in Progress [ 10301 ] | Needs 1st Level Review [ 10005 ] |
Ann Loraine
made changes -
Status | Needs 1st Level Review [ 10005 ] | First Level Review in Progress [ 10301 ] |
Ann Loraine
made changes -
Status | First Level Review in Progress [ 10301 ] | Ready for Pull Request [ 10304 ] |
Ann Loraine
made changes -
Status | Ready for Pull Request [ 10304 ] | Pull Request Submitted [ 10101 ] |
Ann Loraine
made changes -
Status | Pull Request Submitted [ 10101 ] | Reviewing Pull Request [ 10303 ] |
Ann Loraine
made changes -
Status | Reviewing Pull Request [ 10303 ] | Merged Needs Testing [ 10002 ] |
Ann Loraine
made changes -
Status | Merged Needs Testing [ 10002 ] | Post-merge Testing In Progress [ 10003 ] |
Ann Loraine
made changes -
Resolution | Done [ 10000 ] | |
Status | Post-merge Testing In Progress [ 10003 ] | Closed [ 6 ] |
Ann Loraine
made changes -
Link | This issue blocks IGBF-2949 [ IGBF-2949 ] |
From RR: