IGB / IGBF-3709

Edit Draft 2 of the pipeline document

    Details

    • Type: Task
    • Status: To-Do
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:
      None

      Description

      Google Document: https://docs.google.com/document/d/1ig9ET-ykXF5nAX3P487cXWmZDGUlQpcwrvFXpbyP5vw/edit?usp=sharing

      Section Edits:

      • Abstract: Add more detail to the abstract and state the goal of this specific pipeline clearly.
      • SRA Submission: Waiting on a citation; also need more information about preparing metadata and submitting to SRA.
      • Download data from SRA: Edit the prefetch script so that it can launch all of the SRR names at once, and describe exactly what prefetch does and what it downloads. Remove the 'vdb-config --interactive' line and use the GitHub link to discuss configuration and online vs. offline work on the cluster. Do not hard-code directories in the document; use "$SLURM_SUBMIT_DIR" instead where possible. Lastly, we might not need the Perl validate script, because the SRA Toolkit has a validate option that we can use instead.
      • Run data with Nextflow nf-core rnaseq pipeline: An updated version of Nextflow may be used next, so the version given in the document will need to be updated. The newest version identifies strandedness automatically, so we will no longer need to check the MultiQC report afterwards. Depending on the HPC cluster, a user might be able to run Nextflow online or offline, and that will need to be discussed in the paper. We might also need to describe what the BAM, find_junction, and coverage graph outputs can be used for after they are created.
      • Sanity check SRA Data with Original data: This section will need to be entirely rewritten, because a markdown document has been created that checks the SRA submission against the original data. That markdown can be linked from Bitbucket or hard-coded into the document; we can decide which is better when we get to this section.
      • Submit data to IGB Quickload: This section is a work in progress because the XML Python script is specific to the environment settings on each user's computer. We will need to troubleshoot the script further and finalize this section with the correct code for uploading data to IGB.
      • References: We might want to start adding more references inside the text, but for now references might be a draft 3 task.
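
      As a starting point for the "Download data from SRA" item, the following is a minimal Python sketch of how a single prefetch call could cover every SRR name while taking the working directory from SLURM_SUBMIT_DIR rather than a hard-coded path. The function name build_sra_commands and the sample accessions are hypothetical and are not taken from the pipeline document. The sketch assumes prefetch's -O (output directory) option and its default layout of one directory per accession, and it assumes that the validate option mentioned above refers to the SRA Toolkit's vdb-validate tool.

```python
import os
import shlex


def build_sra_commands(accessions, out_dir=None):
    """Return (prefetch_cmd, validate_cmds) as argument lists.

    Hypothetical helper for illustration only.
    """
    # Use the SLURM submit directory when available instead of a hard-coded
    # path; fall back to the current directory outside of SLURM.
    base = out_dir or os.environ.get("SLURM_SUBMIT_DIR", os.getcwd())
    # One prefetch call that covers every accession at once.
    prefetch_cmd = ["prefetch", "-O", base, *accessions]
    # One vdb-validate call per accession directory (assumed default layout).
    validate_cmds = [["vdb-validate", os.path.join(base, acc)] for acc in accessions]
    return prefetch_cmd, validate_cmds


if __name__ == "__main__":
    # Placeholder accessions, not real run names from the project.
    prefetch_cmd, validate_cmds = build_sra_commands(["SRR0000001", "SRR0000002"])
    print(shlex.join(prefetch_cmd))
    for cmd in validate_cmds:
        print(shlex.join(cmd))
```

      The sketch only prints the command lines, so they can be reviewed or pasted into a batch script before anything is downloaded.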

        Attachments

          Issue Links

            Activity

            Mdavis4290 Molly Davis created issue -
            Mdavis4290 Molly Davis made changes -
            Field Original Value New Value
            Epic Link IGBF-2993 [ 21429 ]
            Mdavis4290 Molly Davis made changes -
            Link This issue relates to IGBF-3622 [ IGBF-3622 ]
            Mdavis4290 Molly Davis made changes -
            Rank Ranked higher
            Mdavis4290 Molly Davis made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            Mdavis4290 Molly Davis made changes -
            Status In Progress [ 3 ] To-Do [ 10305 ]
            Mdavis4290 Molly Davis made changes -
            Rank Ranked higher
            Mdavis4290 Molly Davis made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            Mdavis4290 Molly Davis made changes -
            Status In Progress [ 3 ] To-Do [ 10305 ]
            Mdavis4290 Molly Davis made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            Mdavis4290 Molly Davis made changes -
            Description SRA Submission item changed from "Waiting on a citation" to "Waiting on a citation and need more information about prepping meta data and submitting to SRA"; all other description text unchanged
            Mdavis4290 Molly Davis made changes -
            Status In Progress [ 3 ] To-Do [ 10305 ]
            ann.loraine Ann Loraine made changes -
            Sprint Spring 9 [ 193 ] Spring 9, Spring 10 [ 193, 194 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            Mdavis4290 Molly Davis made changes -
            Rank Ranked higher
            Mdavis4290 Molly Davis made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            Mdavis4290 Molly Davis made changes -
            Status In Progress [ 3 ] To-Do [ 10305 ]
            Mdavis4290 Molly Davis made changes -
            Sprint Spring 9, Spring 10 [ 193, 194 ] Spring 9 [ 193 ]

              People

              • Assignee:
                Mdavis4290 Molly Davis
                Reporter:
                Mdavis4290 Molly Davis
              • Votes:
                0
                Watchers:
                1

                Dates

                • Created:
                  Updated: