  1. IGB
  2. IGBF-3043

Align and process rice RNA-Seq data using nf-core/rnaseq pipeline

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:
      None
    • Story Points:
      2
    • Sprint:
      Spring 1 2022 Jan 3 - Jan 14, Spring 2 2022 Jan 18 - Jan 28, Spring 3 2022 Jan 31 - Feb 11, Spring 4 2022 Feb 14 - Feb 25, Spring 5 2022 Feb 28 - Mar 11

      Description

      Align and process RNA-Seq data using nf-core/rnaseq pipeline

        Attachments

          Issue Links

            Activity

            ann.loraine Ann Loraine created issue -
            ann.loraine Ann Loraine made changes -
            Field Original Value New Value
            Epic Link IGBF-3039 [ 21553 ]
            ann.loraine Ann Loraine made changes -
            Sprint Spring 1 2022 Jan 3 - Jan 14 [ 136 ] Spring 1 2022 Jan 3 - Jan 14, Spring 2 2022 Jan 18 - Jan 28 [ 136, 137 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine added a comment -

            Added a config file to the repository: nfcore-rnaseq.config. The file is required to ensure that the correct max intron size parameter is used.

            ann.loraine Ann Loraine made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            ann.loraine Ann Loraine made changes -
            Assignee Ann Loraine [ aloraine ]
            ann.loraine Ann Loraine added a comment -

            Downloading the latest nf-core/rnaseq pipeline code into /nobackup/lorainelab/salty_rice/rna-seq.

            ann.loraine Ann Loraine added a comment -

            Version 3.4 of the rnaseq pipeline is available. Downloading the pipeline for offline use, since the cluster nodes cannot connect to the internet.

            Invoked:

            module load nf-core
            

            Then:

            nf-core download rnaseq
            

            However, when I selected the "singularity" option, I got an error about a directory not being available. Re-ran the download command without selecting that option.

            ann.loraine Ann Loraine added a comment -

            There is now a new rnaseq pipeline directory: /nobackup/lorainelab/salty_rice/rna-seq/nf-core-rnaseq-3.5

            ann.loraine Ann Loraine added a comment -

            Made symbolic links to the sample sheet samplesheet_RNA-Seq.csv, which is in my cloned copy of the bseq_rice repository in my home directory.

            ann.loraine Ann Loraine added a comment -

            Tried running version 3.5 and got this error:

            Nextflow version 21.04.0 does not match workflow required version: >=21.10.3

            Installing previous version 3.4.

            ann.loraine Ann Loraine added a comment -

            None of the errors seen when installing version 3.5 occurred with 3.4. Not sure why. Proceeding with version 3.4 instead of 3.5.

            ann.loraine Ann Loraine added a comment - - edited

            Testing with:

            nextflow run nf-core-rnaseq-3.4/workflow -profile test,singularity -c nfcore-rnaseq.config
            
            ann.loraine Ann Loraine added a comment -

            Forgot to include the genomic sequence data in the local directory.

            ann.loraine Ann Loraine added a comment -

            Need to make GTF and FASTA (.fa) files.

            ann.loraine Ann Loraine added a comment -

            Edited the sample sheet. Columns 2 and 3 need to contain the file names, so added the extension fastq.gz to each value.

            When I tried to run the pipeline, got this error:

            Command output:
              ERROR: Please check samplesheet header -> SRA run identifier,fastq_1,fastq_2,strandedness,genotype,treatment,5-Azacytidine,tissue,replicate,read length != sample,fastq_1,fastq_2,strandedness
            

            Fixing the header to match spec: https://nf-co.re/rnaseq/usage#samplesheet-input
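
A header check like the one that failed above can be run before launching the pipeline. The following is a minimal Python sketch, not the pipeline's own validation code. The required column names are copied from the error message. The function name `header_problems` is hypothetical. Whether extra trailing columns are tolerated may depend on the pipeline version, so the sketch only inspects the first four columns.

```python
import csv
import io

# Required leading columns, copied from the pipeline's error message.
REQUIRED = ["sample", "fastq_1", "fastq_2", "strandedness"]


def header_problems(handle):
    """Return a list of mismatches between the first CSV row and REQUIRED."""
    header = next(csv.reader(handle), [])
    problems = []
    for position, name in enumerate(REQUIRED):
        found = header[position] if position < len(header) else None
        if found != name:
            problems.append(
                f"column {position + 1}: expected {name!r}, found {found!r}"
            )
    return problems


# Illustrative headers: the first mimics the rejected sheet, the second the spec.
rejected = io.StringIO("SRA run identifier,fastq_1,fastq_2,strandedness,genotype\n")
accepted = io.StringIO("sample,fastq_1,fastq_2,strandedness\n")
print(header_problems(rejected))
print(header_problems(accepted))
```

An empty list means the first four column names match the expected header.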

            ann.loraine Ann Loraine added a comment -

            New error:

            WARN: Process 'NFCORE_RNASEQ:RNASEQ:MULTIQC_TSV_FAIL_MAPPED' cannot be executed by 'slurm' executor -- Using 'local' executor instead
            WARN: Process 'NFCORE_RNASEQ:RNASEQ:MULTIQC_TSV_STRAND_CHECK' cannot be executed by 'slurm' executor -- Using 'local' executor instead
            ERROR: Please check input samplesheet -> Read 1 FastQ file does not exist!
            SRR7591232_1.fastq.gz
            

            Sample sheet includes _1 extension even for samples that are not paired-end. Fixing this.
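
Missing-file errors like this one can be found in one pass instead of one pipeline launch at a time. The following Python sketch assumes the sample sheet stores bare file names and that all FASTQ files sit in a single directory. The function name `missing_fastq` and the demo file names are hypothetical.

```python
import csv
import os
import tempfile


def missing_fastq(samplesheet_path, fastq_dir):
    """Return (sample, column, filename) for each listed file that is absent."""
    missing = []
    with open(samplesheet_path, newline="") as handle:
        for row in csv.DictReader(handle):
            for column in ("fastq_1", "fastq_2"):
                name = (row.get(column) or "").strip()
                # An empty fastq_2 marks a single-end sample, so skip it.
                if name and not os.path.exists(os.path.join(fastq_dir, name)):
                    missing.append((row["sample"], column, name))
    return missing


def demo():
    """Build a throwaway sheet where one entry wrongly carries a _1 suffix."""
    with tempfile.TemporaryDirectory() as tmp:
        open(os.path.join(tmp, "runA.fastq.gz"), "wb").close()
        sheet = os.path.join(tmp, "samplesheet.csv")
        with open(sheet, "w", newline="") as handle:
            handle.write("sample,fastq_1,fastq_2,strandedness\n")
            handle.write("s1,runA_1.fastq.gz,,unstranded\n")
            handle.write("s2,runA.fastq.gz,,unstranded\n")
        return missing_fastq(sheet, tmp)


print(demo())
```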

            ann.loraine Ann Loraine added a comment -

            nf-core/rnaseq is now running.

            ann.loraine Ann Loraine added a comment -

            Edits to sample sheet checked in. Script doIt.sh used to run the pipeline also added to the repo.

            ann.loraine Ann Loraine added a comment -

            New error:

            Command exit status:
              255
            
            Command output:
              rsem-extract-reference-transcripts rsem/genome 0 O_sativa_japonica_Oct_2011_genes.gtf None 0 rsem/O_sativa_japonica_Oct_2011.fa
              "rsem-extract-reference-transcripts rsem/genome 0 O_sativa_japonica_Oct_2011_genes.gtf None 0 rsem/O_sativa_japonica_Oct_2011.fa" failed! Plase check if you provide correct parameters/options for the pipeline!
            
            Command error:
              INFO:    Converting SIF file to temporary sandbox...
              WARNING: Skipping mount /usr/local/singularity/var/singularity/mnt/session/etc/resolv.conf [files]: /etc/resolv.conf doesn't exist in container
              The GTF file might be corrupted!
              Stop at line : Chr1   NA      exon    2903    3268    .       +       .       transcript_id "LOC_Os01g01010.1";
              Error Message: Cannot find gene_id!
              INFO:    Cleaning up image...
            
            Work dir:
              /nobackup/lorainelab/salty_rice/rna-seq/work/51/a375c11f7ab0a69457e646f29f0f25
            
            Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
            
            ann.loraine Ann Loraine made changes -
            Status In Progress [ 3 ] To-Do [ 10305 ]
            ann.loraine Ann Loraine made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            ann.loraine Ann Loraine added a comment - - edited

            Re-running bed2gtf.py with gene_id option:

            bed2gtf.py -g 13 O_sativa_japonica_Oct_2011.bed | grep -v ChrUn | grep -v ChrSy > O_sativa_japonica_Oct_2011.gtf
            

            bed2gtf.py version:

            34eac2e (HEAD -> master, upstream/master) Improve GTF to/from BED conversion

            Added output to bseq_rice repo in ExternalDataSets/for_nf-core
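
Before relaunching, the regenerated GTF can be scanned for records that still lack a gene_id attribute, which is the condition RSEM reported. The following is a minimal Python sketch. It assumes a tab-separated GTF with the attributes in column 9. The function name `lines_missing_gene_id` and the gene_id value in the second demo line are illustrative only.

```python
import io
import re

# Matches an attribute of the form: gene_id "some value"
GENE_ID = re.compile(r'\bgene_id\s+"[^"]+"')


def lines_missing_gene_id(handle, limit=5):
    """Return up to `limit` (line_number, line) pairs whose attributes lack gene_id."""
    hits = []
    for number, line in enumerate(handle, start=1):
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        fields = line.rstrip("\n").split("\t")
        attributes = fields[8] if len(fields) >= 9 else ""
        if not GENE_ID.search(attributes):
            hits.append((number, line.rstrip("\n")))
            if len(hits) >= limit:
                break
    return hits


# First line mirrors the record RSEM stopped at; the second adds a gene_id.
example = io.StringIO(
    'Chr1\tNA\texon\t2903\t3268\t.\t+\t.\ttranscript_id "LOC_Os01g01010.1";\n'
    'Chr1\tNA\texon\t2903\t3268\t.\t+\t.\t'
    'gene_id "LOC_Os01g01010"; transcript_id "LOC_Os01g01010.1";\n'
)
for number, line in lines_missing_gene_id(example):
    print(number, line)
```

To check a real file, pass an open file handle for the GTF instead of the in-memory example.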

            ann.loraine Ann Loraine added a comment -

            Asked NF to modify column 1 in the nf-core/rnaseq samples file, using concatenated sample type codes from the subsequent columns.

            ann.loraine Ann Loraine added a comment -

            Relaunched the nf-core/rnaseq version 3.4 pipeline using the new GTF file and new sample sheet file.

            ann.loraine Ann Loraine added a comment - - edited

            Got new errors affecting A.C.Y.S.3

              
              >>>>> Now validing the length of the 2 paired-end infiles: A.C.Y.S.3_1_trimmed.fq.gz and A.C.Y.S.3_2_trimmed.fq.gz <<<<<
              Writing validated paired-end Read 1 reads to A.C.Y.S.3_1_val_1.fq.gz
              Writing validated paired-end Read 2 reads to A.C.Y.S.3_2_val_2.fq.gz
              
              Read 2 output is truncated at sequence count: 23469680, please check your paired-end input files! Terminating...
              
              INFO:    Cleaning up image...
            
            Work dir:
              /nobackup/lorainelab/salty_rice/rna-seq/work/c7/25844b7fa5600e4c60222407461493
            
            Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`
            

            and

              >>>>> Now validing the length of the 2 paired-end infiles: A.C.Y.S.3_1_trimmed.fq.gz and A.C.Y.S.3_2_trimmed.fq.gz <<<<<
              Writing validated paired-end Read 1 reads to A.C.Y.S.3_1_val_1.fq.gz
              Writing validated paired-end Read 2 reads to A.C.Y.S.3_2_val_2.fq.gz
              
              Read 2 output is truncated at sequence count: 23469680, please check your paired-end input files! Terminating...
              
              INFO:    Cleaning up image...
            
            Work dir:
              /nobackup/lorainelab/salty_rice/rna-seq/work/c7/25844b7fa5600e4c60222407461493
            
            Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`
            
            ann.loraine Ann Loraine made changes -
            Description Align and process RNA-Seq data using nf-core/rnaseq pipeline Align and process RNA-Seq data using nf-core/rnaseq pipeline
            ann.loraine Ann Loraine added a comment -

            NF recommends discarding R2 for the problem sample.

            ann.loraine Ann Loraine made changes -
            Assignee Ann Loraine [ aloraine ]
            ann.loraine Ann Loraine made changes -
            Assignee Ann Loraine [ aloraine ]
            ann.loraine Ann Loraine added a comment -

            Deleted R2 file for A.C.Y.S.3. Re-running the pipeline.

            ann.loraine Ann Loraine added a comment - - edited

            Still running. Waiting for pipeline to finish.

            ann.loraine Ann Loraine made changes -
            Status In Progress [ 3 ] To-Do [ 10305 ]
            ann.loraine Ann Loraine made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            ann.loraine Ann Loraine added a comment -

            The Nextflow job was killed when its 30-hour time limit expired while items were still in the queue. Afterwards no items remained in the queue, but the files had not finished processing. It seems that when the Nextflow job manager dies, all of its jobs die too? Confusing. Anyway, started a new interactive job running Nextflow, scheduled for 60 hours instead of the previous 30, and relaunched the rnaseq pipeline to hopefully finish the jobs.

            ann.loraine Ann Loraine made changes -
            Status In Progress [ 3 ] To-Do [ 10305 ]
            ann.loraine Ann Loraine added a comment -

            Shortly after re-running the rnaseq pipeline, 72 jobs were added to the queue.

            ann.loraine Ann Loraine made changes -
            Sprint Spring 1 2022 Jan 3 - Jan 14, Spring 2 2022 Jan 18 - Jan 28 [ 136, 137 ] Spring 1 2022 Jan 3 - Jan 14, Spring 2 2022 Jan 18 - Jan 28, Spring 3 2022 Jan 31 - Feb 11 [ 136, 137, 138 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            ann.loraine Ann Loraine made changes -
            Status In Progress [ 3 ] To-Do [ 10305 ]
            ann.loraine Ann Loraine made changes -
            Sprint Spring 1 2022 Jan 3 - Jan 14, Spring 2 2022 Jan 18 - Jan 28, Spring 3 2022 Jan 31 - Feb 11 [ 136, 137, 138 ] Spring 1 2022 Jan 3 - Jan 14, Spring 2 2022 Jan 18 - Jan 28, Spring 3 2022 Jan 31 - Feb 11, Spring 4 2022 Feb 14 - Feb 25 [ 136, 137, 138, 139 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            ann.loraine Ann Loraine added a comment - - edited

            Pipeline crashed again. Restarted with:

            • Logged onto cluster.
            • Re-attached to existing tmux session with "tmux attach" command
            • Launched interactive job with:
            srun --partition Andromeda --job-name kitten --cpus-per-task 5 --mem-per-cpu 12000 --time 60:00:00 --pty bash
            
            • Re-started pipeline with:
            [aloraine@str-ac9 rna-seq]$ doIt.sh samplesheet_RNA-Seq.csv O_sativa_japonica_Oct_2011.fa O_sativa_japonica_Oct_2011.gtf O_sativa_japonica_Oct_2011.bed 2>3.err 1>3.out
            
            ann.loraine Ann Loraine made changes -
            Status In Progress [ 3 ] To-Do [ 10305 ]
            ann.loraine Ann Loraine made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            ann.loraine Ann Loraine made changes -
            Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
            ann.loraine Ann Loraine made changes -
            Assignee Ann Loraine [ aloraine ] Nowlan Freese [ nfreese ]
            ann.loraine Ann Loraine added a comment - - edited

            The pipeline appears to have completed. To test / review:

            • Dr. Freese please check that you have full access to all the files in the "results" directory
            • Nextflow runs a data QC sub-pipeline using "multiQC"; please check the output (probably need to download it)

            You could also use 'rclone' to migrate all or some of the data into Google Drive. If you search Jira for 'rclone', you will probably find pretty good instructions on how to do this.

            nfreese Nowlan Freese made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            nfreese Nowlan Freese added a comment - - edited

            I can access the files in the results directory.

            The following processed bam files appear to be missing:

            • A.C.Y.S.1
            • A.C.Y.S.3
            • A.E.Y.S.2
            • A.E.Y.S.3

            I cannot find a multiqc_report.html but there does appear to be a complete folder of all of the samples run through fastqc here:

            /nobackup/lorainelab/salty_rice/rna-seq/results/fastqc
            

            Since there appear to be missing files I am moving back to To Do.

            nfreese Nowlan Freese made changes -
            Assignee Nowlan Freese [ nfreese ] Ann Loraine [ aloraine ]
            nfreese Nowlan Freese made changes -
            Status First Level Review in Progress [ 10301 ] To-Do [ 10305 ]
            ann.loraine Ann Loraine made changes -
            Sprint Spring 1 2022 Jan 3 - Jan 14, Spring 2 2022 Jan 18 - Jan 28, Spring 3 2022 Jan 31 - Feb 11, Spring 4 2022 Feb 14 - Feb 25 [ 136, 137, 138, 139 ] Spring 1 2022 Jan 3 - Jan 14, Spring 2 2022 Jan 18 - Jan 28, Spring 3 2022 Jan 31 - Feb 11, Spring 4 2022 Feb 14 - Feb 25, Spring 5 2022 Feb 28 - Mar 11 [ 136, 137, 138, 139, 140 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            ann.loraine Ann Loraine made changes -
            Status In Progress [ 3 ] To-Do [ 10305 ]
            ann.loraine Ann Loraine made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            ann.loraine Ann Loraine added a comment - - edited

            I think I found the problem:

            RUN STATISTICS FOR INPUT FILE: A.C.Y.S.3_2.fastq.gz
            =============================================
            23469680 sequences processed in total
            The length threshold of paired-end sequences gets evaluated later on (in the validation step)

            Validate paired-end files A.C.Y.S.3_1_trimmed.fq.gz and A.C.Y.S.3_2_trimmed.fq.gz
            file_1: A.C.Y.S.3_1_trimmed.fq.gz, file_2: A.C.Y.S.3_2_trimmed.fq.gz

            >>>>> Now validing the length of the 2 paired-end infiles: A.C.Y.S.3_1_trimmed.fq.gz and A.C.Y.S.3_2_trimmed.fq.gz <<<<<
            Writing validated paired-end Read 1 reads to A.C.Y.S.3_1_val_1.fq.gz
            Writing validated paired-end Read 2 reads to A.C.Y.S.3_2_val_2.fq.gz

            Read 2 output is truncated at sequence count: 23469680, please check your paired-end input files! Terminating...

            To deal with this, let's only use read 1 for sample A.C.Y.S.3.
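
The message suggests the Read 2 file ends before Read 1 does. The raw input files can be checked directly by counting records in each mate file and noting whether the gzip stream ends cleanly. The following is a minimal Python sketch under those assumptions. The function names and demo file names are hypothetical, the demo data are synthetic, and only premature end-of-stream (`EOFError`) is treated as truncation.

```python
import gzip
import os
import tempfile


def count_reads(path):
    """Return (records, complete) for a gzip-compressed FASTQ file."""
    lines = 0
    complete = True
    try:
        with gzip.open(path, "rt") as handle:
            for _ in handle:
                lines += 1
    except EOFError:
        # Raised by gzip when the compressed stream stops before its end marker.
        complete = False
    # A FASTQ record is four lines; a remainder means a partial final record.
    return lines // 4, complete and lines % 4 == 0


def compare_pair(read1_path, read2_path):
    """Report record counts and whether both files are intact and equal in size."""
    n1, ok1 = count_reads(read1_path)
    n2, ok2 = count_reads(read2_path)
    return {"read1": n1, "read2": n2, "consistent": ok1 and ok2 and n1 == n2}


def write_fastq(path, n_records):
    """Write a small synthetic FASTQ file (demo data only)."""
    with gzip.open(path, "wt") as handle:
        for i in range(n_records):
            handle.write(f"@read{i}\nACGTACGTAC\n+\nIIIIIIIIII\n")


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        r1 = os.path.join(tmp, "demo_1.fastq.gz")
        r2 = os.path.join(tmp, "demo_2.fastq.gz")
        write_fastq(r1, 200)
        write_fastq(r2, 200)
        intact = compare_pair(r1, r2)
        # Simulate an interrupted transfer by cutting bytes off the end of Read 2.
        with open(r2, "rb") as handle:
            data = handle.read()
        with open(r2, "wb") as handle:
            handle.write(data[:-20])
        truncated = compare_pair(r1, r2)
    return intact, truncated


intact, truncated = demo()
print(intact)
print(truncated)
```

On full-size files this reads every record, so it is best run as a cluster job rather than on a login node.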

            ann.loraine Ann Loraine added a comment - - edited

            Error reports:

            • A.C.Y.S.1 - doIt.out: "Read 2 output is truncated at sequence count: 40774078, please check your paired-end input files! Terminating..."
            • A.C.Y.S.3 - 3.out: "Read 2 output is truncated at sequence count: 23469680, please check your paired-end input files! Terminating..."
            • A.E.Y.S.2 - ?
            • A.E.Y.S.3 - ?

            Edited sample sheet to exclude R2 for A.C.Y.S.1 and A.C.Y.S.3 as per error message above.
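
The same edit can be scripted so it is repeatable if more samples turn out to have the same problem. The following Python sketch blanks the fastq_2 column for a chosen set of samples, relying on the nf-core/rnaseq convention that an empty fastq_2 means single-end. The function name `drop_read2`, the sample names, and the file names in the demo are hypothetical.

```python
import csv
import io


def drop_read2(rows, samples):
    """Return copies of samplesheet rows with fastq_2 blanked for `samples`."""
    edited = []
    for row in rows:
        row = dict(row)  # copy so the caller's rows are left unchanged
        if row["sample"] in samples:
            row["fastq_2"] = ""
        edited.append(row)
    return edited


# Illustrative two-row sheet; real sheets would be read from the CSV file.
sheet = io.StringIO(
    "sample,fastq_1,fastq_2,strandedness\n"
    "demo.1,d1_1.fastq.gz,d1_2.fastq.gz,unstranded\n"
    "demo.2,d2_1.fastq.gz,d2_2.fastq.gz,unstranded\n"
)
reader = csv.DictReader(sheet)
edited = drop_read2(reader, {"demo.1"})

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
writer.writeheader()
writer.writerows(edited)
print(out.getvalue())
```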

            Re-ran pipeline after launching 60-hour interactive session on Andromeda cluster:

            (nf-core) [aloraine@str-abm1 rna-seq]$ doIt.sh samplesheet_RNA-Seq.csv O_sativa_japonica_Oct_2011.fa O_sativa_japonica_Oct_2011.gtf O_sativa_japonica_Oct_2011.bed 2>4.err 1>4.out
            
            ann.loraine Ann Loraine added a comment - - edited

            Pipeline halted very quickly after starting, choking on sample A.E.Y.S.2. The error appeared to be with pigz, a parallel implementation of gzip. Error message was:

            pigz: error while loading shared libraries: libz.so.1: cannot open shared object file: No such file or directory
            

            Trying again:

            (nf-core) [aloraine@str-abm1 rna-seq]$ doIt.sh samplesheet_RNA-Seq.csv O_sativa_japonica_Oct_2011.fa O_sativa_japonica_Oct_2011.gtf O_sativa_japonica_Oct_2011.bed 2>5.err 1>5.out
            
            ann.loraine Ann Loraine added a comment - - edited

            New error on sample A.E.Y.S.3:

            Read 2 output is truncated at sequence count: 25069958, please check your paired-end input files! Terminating...
            

            Edited samples file to exclude R2 for A.E.Y.S.3 and tried again with:

            (nf-core) [aloraine@str-abm1 rna-seq]$ doIt.sh samplesheet_RNA-Seq.csv O_sativa_japonica_Oct_2011.fa O_sativa_japonica_Oct_2011.gtf O_sativa_japonica_Oct_2011.bed 2>6.err 1>6.out
            

            New error on A.E.Y.S.2:

            Read 2 output is truncated at sequence count: 28905992, please check your paired-end input files! Terminating...
            

            Edited samples file to exclude R2 for A.E.Y.S.2 and tried again with:

            (nf-core) [aloraine@str-abm1 rna-seq]$ doIt.sh samplesheet_RNA-Seq.csv O_sativa_japonica_Oct_2011.fa O_sativa_japonica_Oct_2011.gtf O_sativa_japonica_Oct_2011.bed 2>7.err 1>7.out 
            
            ann.loraine Ann Loraine added a comment - - edited

            New error:

              yaml.scanner.ScannerError: mapping values are not allowed here
                in "versions.yml", line 23, column 21
            

            Tried again and got this result:

            -[nf-core/rnaseq] Pipeline completed successfully-
            Completed at: 03-Mar-2022 14:11:45
            Duration    : 1m 2s
            CPU hours   : 280.3 (100% cached)
            Succeeded   : 2
            Cached      : 534
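
            The ScannerError above reports a line and column in versions.yml. A common cause of "mapping values are not allowed here" is a second unquoted `: ` on one line. The contents of versions.yml are not shown in this ticket, so the following is only a heuristic sketch for finding candidate lines; `suspect_yaml_lines` is a hypothetical helper, not something the pipeline provides.

```python
def suspect_yaml_lines(text):
    """Return (line_number, column, line) for lines with a second unquoted ': '.

    Heuristic only: block scalars, flow mappings, and trailing comments are
    not handled, so treat hits as candidates to inspect, not as verdicts.
    Line numbers and columns are 1-based; the column is that of the second colon.
    """
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        key, sep, raw_value = line.partition(": ")
        if not sep:
            continue
        lead = len(raw_value) - len(raw_value.lstrip())
        value = raw_value.strip()
        if value[:1] in ("'", '"'):
            continue  # quoted values may legally contain ': '
        offset = value.find(": ")
        if offset != -1:
            start = len(key) + len(sep) + lead  # 0-based index where value begins
            hits.append((number, start + offset + 1, line))
    return hits
```

            Passing the text of the failing versions.yml to this function and comparing the hits with the reported line 23, column 21 would be one way to see which entry needs quoting.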
            
            ann.loraine Ann Loraine added a comment -

            Checked the BAM output files in /nobackup/lorainelab/salty_rice/rna-seq/results/star_salmon.
            They are:

            A.C.N.R.1.sorted.bam  A.C.N.S.3.sorted.bam  A.E.N.R.2.sorted.bam  A.E.Y.S.2.sorted.bam  M.C.N.S.1.sorted.bam  M.E.N.R.3.sorted.bam
            A.C.N.R.2.sorted.bam  A.C.Y.S.1.sorted.bam  A.E.N.R.3.sorted.bam  A.E.Y.S.3.sorted.bam  M.C.N.S.2.sorted.bam  M.E.N.S.1.sorted.bam
            A.C.N.R.3.sorted.bam  A.C.Y.S.2.sorted.bam  A.E.N.S.1.sorted.bam  M.C.N.R.1.sorted.bam  M.C.N.S.3.sorted.bam  M.E.N.S.2.sorted.bam
            A.C.N.S.1.sorted.bam  A.C.Y.S.3.sorted.bam  A.E.N.S.2.sorted.bam  M.C.N.R.2.sorted.bam  M.E.N.R.1.sorted.bam  M.E.N.S.3.sorted.bam
            A.C.N.S.2.sorted.bam  A.E.N.R.1.sorted.bam  A.E.N.S.3.sorted.bam  M.C.N.R.3.sorted.bam  M.E.N.R.2.sorted.bam
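
            A listing like the one above can also be checked against the samplesheet programmatically. This is a minimal sketch, assuming the samplesheet is a CSV with a `sample` column and that each sample yields one `<sample>.sorted.bam` file as in the listing; `samples_missing_bam` is an illustrative name, not a pipeline function.

```python
import csv
from pathlib import Path


def samples_missing_bam(samplesheet, bam_dir, suffix=".sorted.bam"):
    """Return sorted sample names from the samplesheet with no matching BAM."""
    with open(samplesheet, newline="") as handle:
        # A set collapses samples that appear on more than one row.
        samples = {row["sample"] for row in csv.DictReader(handle)}
    bam_dir = Path(bam_dir)
    return sorted(s for s in samples if not (bam_dir / f"{s}{suffix}").is_file())
```

            An empty list would mean every sample named in the samplesheet has a sorted BAM file in the directory.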
            

            The MultiQC directory is also available.

            Requesting new first level review by Nowlan Freese.

            ann.loraine Ann Loraine made changes -
            Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
            ann.loraine Ann Loraine made changes -
            Assignee Ann Loraine [ aloraine ]
            nfreese Nowlan Freese made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            nfreese Nowlan Freese made changes -
            Assignee Nowlan Freese [ nfreese ]
            nfreese Nowlan Freese added a comment - - edited

            All sorted BAM files are present, with file sizes in the range I would expect. I was able to download multiqc_report.html and view it in a web browser.

            nfreese Nowlan Freese made changes -
            Assignee Nowlan Freese [ nfreese ]
            nfreese Nowlan Freese made changes -
            Status First Level Review in Progress [ 10301 ] Ready for Pull Request [ 10304 ]
            ann.loraine Ann Loraine made changes -
            Status Ready for Pull Request [ 10304 ] Pull Request Submitted [ 10101 ]
            ann.loraine Ann Loraine made changes -
            Status Pull Request Submitted [ 10101 ] Reviewing Pull Request [ 10303 ]
            ann.loraine Ann Loraine made changes -
            Status Reviewing Pull Request [ 10303 ] Merged Needs Testing [ 10002 ]
            ann.loraine Ann Loraine made changes -
            Status Merged Needs Testing [ 10002 ] Post-merge Testing In Progress [ 10003 ]
            ann.loraine Ann Loraine made changes -
            Resolution Done [ 10000 ]
            Status Post-merge Testing In Progress [ 10003 ] Closed [ 6 ]
            ann.loraine Ann Loraine made changes -
            Assignee Ann Loraine [ aloraine ]
            nfreese Nowlan Freese made changes -
            Link This issue relates to IGBF-3099 [ IGBF-3099 ]
            ann.loraine Ann Loraine made changes -
            Summary Align and process RNA-Seq data using nf-core/rnaseq pipeline Align and process rice RNA-Seq data using nf-core/rnaseq pipeline

              People

              • Assignee:
                ann.loraine Ann Loraine
                Reporter:
                ann.loraine Ann Loraine