  1. IGB
  2. IGBF-3720

Re-run Nextflow Muday time course data again with SL5 and data downloaded from SRA

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:
      None

      Description

      SRP460750

      Directory: /projects/tomato_genome/fnb/dataprocessing/SRP460750/nfcore-SL5-2024-05-07

      Previously, we noticed that SRA had mismatched some of the data, leaving 16 samples mislabeled. Dr. Reid contacted SRA and had the sample names corrected. We now need to re-run the Muday SRA data on the cluster with Nextflow and confirm that the data are correctly labeled.

      For this task, we need to confirm and sanity-check the Muday time course data that Rob recently uploaded and submitted to the Sequence Read Archive.
      If the data are good, we will replace all existing files (BAM, junctions, etc.) deployed in the "hotpollen" quickload site with newly processed data.
      For this task:

      • Check SRP on NCBI and review submission
      • Download the data onto the cluster by using the SRP name
      • Run nf-core/rnaseq pipeline
      • Run our coverage graph and junctions scripts on the data

      Note that all files should now use their "SRR" names instead of the existing file names.
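One way to carry the SRR naming through the nf-core/rnaseq step is to build the pipeline samplesheet directly from the accession list. The following is a minimal sketch, assuming an accession list named srr.txt with one SRR name per line and gzipped paired-end files named <SRR>_1.fastq.gz and <SRR>_2.fastq.gz in a single directory. The function name make_samplesheet and the strandedness value "auto" are illustrative choices, not taken from this issue.

```shell
#!/bin/bash
# Hypothetical helper: write an nf-core/rnaseq samplesheet to stdout.
# $1 = file with one SRR accession per line (assumed: srr.txt)
# $2 = directory holding <SRR>_1.fastq.gz and <SRR>_2.fastq.gz (assumed layout)
make_samplesheet() {
  local list=$1 dir=$2 srr
  # Header expected by nf-core/rnaseq samplesheets
  echo "sample,fastq_1,fastq_2,strandedness"
  while IFS= read -r srr || [ -n "$srr" ]; do
    [ -z "$srr" ] && continue   # skip blank lines
    # Use the SRR accession itself as the sample name
    echo "${srr},${dir}/${srr}_1.fastq.gz,${dir}/${srr}_2.fastq.gz,auto"
  done < "$list"
}
```

For example, `make_samplesheet srr.txt "$PWD" > samplesheet.csv` would produce a file that could be passed to the pipeline with `--input samplesheet.csv`. Because the sample column is the accession, output files should be named after the SRR names rather than the earlier file names.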

        Attachments

          Issue Links

            Activity

            Mdavis4290 Molly Davis created issue -
            Mdavis4290 Molly Davis made changes -
            Field Original Value New Value
            Epic Link IGBF-2993 [ 21429 ]
            Mdavis4290 Molly Davis made changes -
            Link This issue relates to IGBF-3507 [ IGBF-3507 ]
            Mdavis4290 Molly Davis made changes -
            Link This issue relates to IGBF-3683 [ IGBF-3683 ]
            Mdavis4290 Molly Davis made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            Mdavis4290 Molly Davis made changes -
            Rank Ranked higher
            Mdavis4290 Molly Davis made changes -
            Description Added the following paragraph after the directory line; the rest of the description is unchanged:
            "Previously we noticed that SRA had mismatched some of the data incorrectly and 16 of the sample-names were incorrect. Dr. Reid reached out and had SRA change everything to the correct sample names. Now we must rerun the muday SRA data again on the cluster with nextflow and make sure the data is correctly labeled."
            Mdavis4290 Molly Davis made changes -
            Description Changed "16 of the sample-names were incorrect" to "16 of the sample-names were mislabeled"; the rest of the description is unchanged.
            Mdavis4290 Molly Davis made changes -
            Summary Re-run Nextflow Muday time course data with SL5 and data downloaded from SRA Re-run Nextflow Muday time course data again with SL5 and data downloaded from SRA
            Mdavis4290 Molly Davis made changes -
            Description Changed the directory from /projects/tomato_genome/fnb/dataprocessing/SRP460750/ to /projects/tomato_genome/fnb/dataprocessing/SRP460750/nfcore-SL5-2024-05-07; the rest of the description is unchanged.
            Mdavis4290 Molly Davis made changes -
            Comment [ *Zip the fastq files*:


            {code:java}
            #!/bin/bash

            #SBATCH --job-name=zip_fastq
            #SBATCH --partition=Orion
            #SBATCH --nodes=1
            #SBATCH --ntasks-per-node=1
            #SBATCH --mem=40gb
            #SBATCH --output=%x_%j.out
            #SBATCH --time=24:00:00
            #SBATCH --array=1-72

            # Work in the directory the job was submitted from
            cd "$SLURM_SUBMIT_DIR" || exit 1

            # Each array task reads the SRR name on its own line of srr.txt
            # (line number = array task ID)
            fastq=$(sed -n "${SLURM_ARRAY_TASK_ID}p" srr.txt)

            # Compress both mates of the read pair
            gzip "${fastq}_1.fastq"
            gzip "${fastq}_2.fastq"
            {code}

            *Run Script*:


            {code:java}
            sbatch fastq.slurm
            {code}
            ]
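After the array job finishes, it can help to confirm that every accession in srr.txt has both gzipped mates and that each archive is intact before starting the pipeline. The following is a minimal sketch, assuming the same srr.txt list and the <SRR>_1.fastq.gz / <SRR>_2.fastq.gz names that the gzip step above produces. The function name check_fastq_pairs is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: report missing or corrupt gzipped FASTQ files.
# $1 = accession list (one SRR per line), $2 = directory with the .fastq.gz files
# Returns 0 if every file is present and passes `gzip -t`, 1 otherwise.
check_fastq_pairs() {
  local list=$1 dir=$2 srr mate file status=0
  while IFS= read -r srr || [ -n "$srr" ]; do
    [ -z "$srr" ] && continue   # skip blank lines
    for mate in 1 2; do
      file="${dir}/${srr}_${mate}.fastq.gz"
      if [ ! -f "$file" ]; then
        echo "MISSING ${file}"
        status=1
      elif ! gzip -t "$file" 2>/dev/null; then
        # gzip -t tests archive integrity without writing any output
        echo "CORRUPT ${file}"
        status=1
      fi
    done
  done < "$list"
  return "$status"
}
```

Running `check_fastq_pairs srr.txt .` in the submit directory prints one line per problem file and prints nothing when all files pass.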
            Mdavis4290 Molly Davis made changes -
            Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
            Mdavis4290 Molly Davis made changes -
            Assignee Molly Davis [ molly ] Ann Loraine [ aloraine ]
            Mdavis4290 Molly Davis made changes -
            Link This issue relates to IGBF-3727 [ IGBF-3727 ]
            Mdavis4290 Molly Davis made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            Mdavis4290 Molly Davis made changes -
            Status First Level Review in Progress [ 10301 ] Ready for Pull Request [ 10304 ]
            Mdavis4290 Molly Davis made changes -
            Status Ready for Pull Request [ 10304 ] Pull Request Submitted [ 10101 ]
            ann.loraine Ann Loraine made changes -
            Status Pull Request Submitted [ 10101 ] Reviewing Pull Request [ 10303 ]
            ann.loraine Ann Loraine made changes -
            Status Reviewing Pull Request [ 10303 ] Merged Needs Testing [ 10002 ]
            ann.loraine Ann Loraine made changes -
            Status Merged Needs Testing [ 10002 ] Post-merge Testing In Progress [ 10003 ]
            ann.loraine Ann Loraine made changes -
            Resolution Done [ 10000 ]
            Status Post-merge Testing In Progress [ 10003 ] Closed [ 6 ]
            ann.loraine Ann Loraine made changes -
            Assignee Ann Loraine [ aloraine ] Molly Davis [ molly ]
            Mdavis4290 Molly Davis made changes -
            Link This issue relates to IGBF-3739 [ IGBF-3739 ]

              People

              • Assignee:
                Mdavis4290 Molly Davis
                Reporter:
                Mdavis4290 Molly Davis
              • Votes:
                0
                Watchers:

                Dates

                • Created:
                  Updated:
                  Resolved: