  1. IGB
  2. IGBF-2945

Run trimmomatic on HPC system using nextflow and Singularity

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:
      None

      Description

      To make our HPC data processing easier and more robust, we are exploring the nextflow workflow management system in conjunction with Singularity containers.

      For this task, develop a nextflow script that runs "trimmomatic" on all fastq files in a directory.

            Activity

            ann.loraine Ann Loraine created issue -
            ann.loraine Ann Loraine made changes -
            Field Original Value New Value
            Epic Link IGBF-2323 [ 18477 ]
            ann.loraine Ann Loraine made changes -
            Link This issue relates to IGBF-2909 [ IGBF-2909 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            ann.loraine Ann Loraine made changes -
            Comment [ Retrieving trimmomatic singularity image:

            {code}
            singularity pull trimmomatic_v0.39.sif oras://registry.forgemia.inra.fr/gafl/singularity/trimmomatic/trimmomatic:latest
            {code}

            Worked fine on a head node, but failed due to a network error of some kind when run from an interactive session on an Andromeda partition.
            ]
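
Since the pull only succeeded where outbound network access was available, one option is to fetch the image once and have later runs reuse the local file. The following is a minimal sketch of that idea, not the procedure actually used for this task; `ensure_image` is a hypothetical helper name, and the image file name and URI are the ones from the comment above.

```shell
#!/usr/bin/env bash
# Sketch: pull the Singularity image only if a local copy is missing.
# ensure_image is a hypothetical helper, not part of singularity or nextflow.
ensure_image() {
    local sif="$1" uri="$2"
    if [ -s "$sif" ]; then
        # A non-empty image file already exists; no network access needed.
        echo "using existing image: $sif"
        return 0
    fi
    if ! command -v singularity >/dev/null 2>&1; then
        echo "singularity not found on PATH" >&2
        return 1
    fi
    # Form used in the comment above: singularity pull <output file> <URI>
    singularity pull "$sif" "$uri"
}

# Usage (run on a node where the registry is reachable, e.g. a head node):
# ensure_image trimmomatic_v0.39.sif \
#   oras://registry.forgemia.inra.fr/gafl/singularity/trimmomatic/trimmomatic:latest
```

Jobs started from a compute partition can then be pointed at the existing `.sif` file instead of contacting the registry.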
            ann.loraine Ann Loraine made changes -
            Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
            ann.loraine Ann Loraine made changes -
            Comment [ Final version of the nextflow script:

            {code}
            #!/usr/bin/env nextflow

            // test on one sample: nextflow run trim2.nf --dev -with-singularity trimmomatic_v0.39.sif
            // run all: nextflow run trim2.nf -with-singularity trimmomatic_v0.39.sif
            params.dev = false
            params.number_of_inputs = 1
            params.saveMode = 'copy'
            //params.filePattern = "/projects/tomato_genome/rnaseq/phase2-rnaseq-Sep2021/*_{R1,R2}_001.fastq.gz"
            params.filePattern = "fastq/*_{R1,R2}_001.fastq.gz"
            params.outdir = 'results'

            Channel
                .fromFilePairs( params.filePattern )
                .ifEmpty { error "Cannot find any reads matching: ${params.filePattern}" }
                .take( params.dev ? params.number_of_inputs : -1 )
                .set { read_pairs_ch }

                
            process trim {
                time '2h'

                publishDir "$params.outdir", pattern: '*.fq.gz', mode: params.saveMode
                
                input:
                tuple val(prefix), file(reads) from read_pairs_ch

                output:
                file '*.fq.gz'

                script:
                fq_1_paired = prefix + '_R1.p.fq'
                fq_1_unpaired = prefix + '_R1.u.fq'
                fq_2_paired = prefix + '_R2.p.fq'
                fq_2_unpaired = prefix + '_R2.u.fq'
                """
                trimmomatic \
                PE -phred33 \
                ${reads[0]} \
                ${reads[1]} \
                $fq_1_paired \
                $fq_1_unpaired \
                $fq_2_paired \
                $fq_2_unpaired \
                ILLUMINACLIP:TruSeq2-PE.fa:2:30:10 LEADING:3 TRAILING:3 SLIDINGWINDOW:4:15 MINLEN:50
                gzip $fq_1_paired
                gzip $fq_1_unpaired
                gzip $fq_2_paired
                gzip $fq_2_unpaired
                """
            }
            {code}

            Outputs copied to a results directory. ]
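
Before launching the script on a full directory, it can help to check which samples actually have both mates and which output files to expect. The sketch below assumes the file naming from `params.filePattern` (`*_{R1,R2}_001.fastq.gz`) and the output names built in the `trim` process. `list_pairs` and `expected_outputs` are hypothetical helpers, and the sample names in the demo are made up.

```shell
#!/usr/bin/env bash
# Sketch: pre-flight check for paired fastq files and expected output names.
# list_pairs and expected_outputs are hypothetical helpers for illustration.

# Print the prefix of every sample that has both an R1 and an R2 file.
list_pairs() {
    local dir="$1" r1 r2 prefix
    for r1 in "$dir"/*_R1_001.fastq.gz; do
        [ -e "$r1" ] || continue                 # glob matched nothing
        r2="${r1%_R1_001.fastq.gz}_R2_001.fastq.gz"
        prefix="$(basename "${r1%_R1_001.fastq.gz}")"
        if [ -e "$r2" ]; then
            echo "$prefix"
        else
            echo "warning: no R2 mate for $r1" >&2
        fi
    done
}

# Print the four gzipped files the trim process writes for one prefix:
# paired (p) and unpaired (u) reads for R1 and R2.
expected_outputs() {
    local prefix="$1" mate kind
    for mate in R1 R2; do
        for kind in p u; do
            echo "${prefix}_${mate}.${kind}.fq.gz"
        done
    done
}

# Demo in a scratch directory with empty placeholder files.
demo_dir="$(mktemp -d)"
touch "$demo_dir/sampleA_R1_001.fastq.gz" "$demo_dir/sampleA_R2_001.fastq.gz" \
      "$demo_dir/sampleB_R1_001.fastq.gz"
pairs="$(list_pairs "$demo_dir" 2>/dev/null)"
echo "$pairs"
expected_outputs "$pairs"
rm -rf "$demo_dir"
```

In the demo only `sampleA` is reported, because `sampleB` has no R2 file. Running `list_pairs fastq` against the real input directory gives a quick count to compare with the number of files that end up in `results`.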
            ann.loraine Ann Loraine made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            ann.loraine Ann Loraine made changes -
            Status First Level Review in Progress [ 10301 ] Needs 1st Level Review [ 10005 ]
            ann.loraine Ann Loraine made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            ann.loraine Ann Loraine made changes -
            Status First Level Review in Progress [ 10301 ] Ready for Pull Request [ 10304 ]
            ann.loraine Ann Loraine made changes -
            Status Ready for Pull Request [ 10304 ] Pull Request Submitted [ 10101 ]
            ann.loraine Ann Loraine made changes -
            Status Pull Request Submitted [ 10101 ] Reviewing Pull Request [ 10303 ]
            ann.loraine Ann Loraine made changes -
            Status Reviewing Pull Request [ 10303 ] Merged Needs Testing [ 10002 ]
            ann.loraine Ann Loraine made changes -
            Status Merged Needs Testing [ 10002 ] Post-merge Testing In Progress [ 10003 ]
            ann.loraine Ann Loraine made changes -
            Resolution Done [ 10000 ]
            Status Post-merge Testing In Progress [ 10003 ] Closed [ 6 ]
            ann.loraine Ann Loraine made changes -
            Link This issue blocks IGBF-2949 [ IGBF-2949 ]

              People

              • Assignee:
                ann.loraine Ann Loraine
                Reporter:
                ann.loraine Ann Loraine
              • Votes:
                0
                Watchers:
                1