IGB / IGBF-3688

Rerun nextflow with ARE 120 minute Muday data

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:
      None

      Description

      SRP499796

      Directory: /projects/tomato_genome/fnb/dataprocessing/SRP499796

      SL4 and SL5 need to be run with this data set.

      For this task, we need to confirm and sanity-check the ARE 120 minute flavonoid data that Rob recently uploaded and submitted to the Sequence Read Archive.
      If the data are good, we will replace all the existing files (BAM, junctions, etc.) deployed in the "hotpollen" quickload site with newly processed data.
      For this task:

      • Check SRP on NCBI and review submission
      • Download the data onto the cluster by using the SRP name
      • Run nf-core/rnaseq pipeline
      • Run our coverage graph and junctions scripts on the data

      Note that all files should now use their "SRR" names instead of the existing file names.
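
The naming rule in the note above could be spot-checked with a few lines of Python. This is only a sketch: it assumes run accessions have the form "SRR" followed by digits and that each renamed file begins with its accession. The file names in the example are made up for illustration and are not taken from the data set.

```python
import re

# Assumption: a run accession is "SRR" followed by digits, and a renamed
# file starts with that accession (for example "SRR0000001.bam").
SRR_PREFIX = re.compile(r"^SRR\d+")


def files_not_using_srr_names(file_names):
    """Return the file names that do not start with an SRR accession."""
    return [name for name in file_names if not SRR_PREFIX.match(name)]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    example = ["SRR0000001.bam", "SRR0000001.FJ.bed.gz", "old_sample_name.bam"]
    for name in files_not_using_srr_names(example):
        print("not renamed:", name)
```

An empty result means every listed file already follows the SRR naming convention.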

            Activity

            ann.loraine Ann Loraine added a comment -

            I reviewed the files as follows:

            • Checked that I could open the MultiQC files in my Web browser. I was able to open and review both.
            • Checked that the data are reported as "unstranded" in the run configuration file SRP499796.csv. They were.
            • Checked that RSeQC reported roughly equal numbers of sense and antisense reads (with respect to the gene models provided to the pipeline). It did.
            • However, I noticed there is no sample sheet available in the repository for this data set. We will need to provide one to set up the data in IGB quickload. I made a ticket for it with more details.
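
The two strandedness checks in the list above could be partly automated. The sketch below is not the procedure used in this review. It assumes the run configuration CSV has "sample" and "strandedness" columns, as in an nf-core/rnaseq sample sheet, and that the RSeQC result is the plain-text report printed by infer_experiment.py, with two "Fraction of reads explained by" lines. The 0.1 tolerance is an arbitrary illustrative threshold.

```python
import csv
import io
import re


def rows_not_unstranded(csv_text):
    """Return sample names whose 'strandedness' value is not 'unstranded'.

    Assumption: the CSV has 'sample' and 'strandedness' columns.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row["sample"] for row in reader
            if row["strandedness"].strip().lower() != "unstranded"]


# Assumption: the report contains lines such as
#   Fraction of reads explained by "1++,1--,2+-,2-+": 0.4903
FRACTION = re.compile(r'Fraction of reads explained by "([^"]+)":\s*([0-9.]+)')


def strand_fractions(report_text):
    """Extract the strand-configuration fractions from the report as floats."""
    return [float(value) for _, value in FRACTION.findall(report_text)]


def looks_unstranded(fractions, tolerance=0.1):
    """True when exactly two fractions are present and they are close."""
    return len(fractions) == 2 and abs(fractions[0] - fractions[1]) <= tolerance
```

To use it, read SRP499796.csv into a string and pass it to rows_not_unstranded. An empty list means every sample is declared unstranded.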

            Testing passes. Moving to DONE.

            ann.loraine Ann Loraine added a comment -

            PR is merged. Moving to "ready for testing."

            To test, review the files. If no problems are observed, move forward to "Done."

            Mdavis4290 Molly Davis added a comment (edited)

            Branch: https://bitbucket.org/mdavis4290/molly5-flavonoid-rnaseq/branch/IGBF-3688
            PR: https://bitbucket.org/hotpollen/flavonoid-rnaseq/pull-requests/44

            • ARE-120min-analysis/SRP499796.csv
            • ARE-120min-analysis/SRP499796_SL4_multiqc_report.html
            • ARE-120min-analysis/SRP499796_SL5_multiqc_report.html
            robofjoy Robert Reid added a comment -

            The SL4 Folder
            The TSV files look complete.
            All the bedgraphs are about 45MB in size and there are 24 files.
            All the bed files are about 4.5MB in size and there are 24 files.
            All tbi files are about 70 KB in size and there are 24 bedgraph versions and 24 bed versions.
            The BAM files are 2.8 GB in size and there are 24.

            The SL5 Folder
            The TSV files look complete. The line count of about 36K looks correct for SL5.
            All the bedgraphs are about 45MB in size and there are 24 files.
            All the bed files are about 4.5MB in size and there are 24 files.
            All tbi files are about 70 KB in size and there are 24 bedgraph versions and 24 bed versions.
            The BAM files are 2.8 GB in size and there are 24.

            This looks correct! Passing it back to Molly.
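
A count-and-size review like the one above could also be tallied with a short script. This is a rough sketch, not the method used in this review. It assumes the file types can be recognized by the name endings mentioned elsewhere in this ticket ("FJ.bed.gz", "scaled.bedgraph.gz", their ".tbi" indexes, and ".bam"). The actual file names in the results directories may differ.

```python
from collections import defaultdict
from pathlib import Path

# Assumption: these name endings identify the file types under review.
SUFFIXES = [
    "scaled.bedgraph.gz.tbi",
    "scaled.bedgraph.gz",
    "FJ.bed.gz.tbi",
    "FJ.bed.gz",
    ".bam",
]


def classify(name):
    """Return the first matching suffix for a file name, or None."""
    for suffix in SUFFIXES:
        if name.endswith(suffix):
            return suffix
    return None


def summarize(directory):
    """Map each suffix found to (file count, smallest size, largest size).

    Sizes are in bytes. Suffixes with no matching files are left out.
    """
    sizes = defaultdict(list)
    for path in Path(directory).iterdir():
        suffix = classify(path.name) if path.is_file() else None
        if suffix is not None:
            sizes[suffix].append(path.stat().st_size)
    return {s: (len(v), min(v), max(v)) for s, v in sizes.items()}
```

Calling summarize on one of the star_salmon directories returns one entry per file type, which can be compared against the expected count of 24 files each.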

            Mdavis4290 Molly Davis added a comment (edited)

            Directories:
            /projects/tomato_genome/fnb/dataprocessing/SRP499796/nfcore-SL4/results/star_salmon
            /projects/tomato_genome/fnb/dataprocessing/SRP499796/nfcore-SL5/results/star_salmon
            Reviewer:
            Check that files have reasonable sizes (no "zero" size files, for example)
            Check that every "FJ.bed.gz" file has a corresponding "FJ.bed.gz.tbi" index file
            Check that every bam file has a corresponding "FJ.bed.gz" file
            Check that every bam file has a corresponding "scaled.bedgraph.gz" file
            Check that every "scaled.bedgraph.gz" has a corresponding "scaled.bedgraph.gz.tbi"
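
The reviewer checks listed above could be sketched in Python as follows. The naming pattern is an assumption: the code treats the text before the first dot of each BAM file name as the sample prefix and expects companion files named "<prefix>.FJ.bed.gz" and "<prefix>.scaled.bedgraph.gz". If the pipeline output uses a different pattern, the prefix logic would need to change.

```python
from pathlib import Path


def check_outputs(directory):
    """Return a list of human-readable problems found in `directory`."""
    root = Path(directory)
    files = sorted(p for p in root.iterdir() if p.is_file())
    names = {p.name for p in files}

    # Check 1: no zero-size files.
    problems = [f"zero-size file: {p.name}" for p in files
                if p.stat().st_size == 0]

    for name in sorted(names):
        if name.endswith(".bam"):
            # Assumption: the sample prefix is the text before the first dot.
            prefix = name.split(".", 1)[0]
            # Checks 2 and 3: every BAM has junction and coverage files.
            for suffix in ("FJ.bed.gz", "scaled.bedgraph.gz"):
                if f"{prefix}.{suffix}" not in names:
                    problems.append(f"{name} has no {prefix}.{suffix}")
        # Checks 4 and 5: every bed/bedgraph file has a .tbi index.
        if name.endswith(("FJ.bed.gz", "scaled.bedgraph.gz")):
            if name + ".tbi" not in names:
                problems.append(f"{name} has no .tbi index")
    return problems
```

Running check_outputs on each of the two star_salmon directories and getting an empty list back would mean all five checks passed under these naming assumptions.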


              People

              • Assignee:
                Mdavis4290 Molly Davis
                Reporter:
                Mdavis4290 Molly Davis