  1. IGB
  2. IGBF-3721

Document Tardigrade RNA-Seq Pipeline

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:
      None

      Description

      The genomic pipeline for downloading and processing tomato data is well documented, but it uses a version of Nextflow that has recently been superseded. Now that we're setting up tardigrade quickloads, we should adapt this pipeline to use the newest version of Nextflow. This pipeline will likely be shared with the Goldstein lab as part of our outreach efforts.

      Task: Document the genomic pipeline for downloading and processing tardigrade RNA-Seq data with the newest version of Nextflow. Ensure that it is easy to understand (i.e., keep the Goldstein lab in mind as a target audience).

            Activity

            pkulzer Paige Kulzer created issue -
            pkulzer Paige Kulzer made changes -
            Field Original Value New Value
            Epic Link IGBF-1395 [ 17470 ]
            pkulzer Paige Kulzer made changes -
            Link This issue relates to IGBF-3708 [ IGBF-3708 ]
            pkulzer Paige Kulzer added a comment - Link to the Google Doc version of the pipeline: https://docs.google.com/document/d/1o5iAcs4Bk6hNrprGu31-JPdNrSWn1hSetQbGrz1v_SM/edit?usp=sharing
            pkulzer Paige Kulzer made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            pkulzer Paige Kulzer made changes -
            Status In Progress [ 3 ] To-Do [ 10305 ]
            ann.loraine Ann Loraine made changes -
            Sprint Spring 9 [ 193 ] Spring 9, Spring 10 [ 193, 194 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            pkulzer Paige Kulzer made changes -
            Summary Document Genomic Pipeline for Tardigrades Document Tardigrade RNA-Seq Pipeline
            pkulzer Paige Kulzer made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            pkulzer Paige Kulzer added a comment -

            Moving this to "Needs Review" alongside the linked ticket (IGBF-3708) so that the pipeline documentation can be reviewed at the same time. Please note that I've kept track of encountered errors and their fixes on the linked ticket, not in the pipeline documentation.

            pkulzer Paige Kulzer made changes -
            Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
            pkulzer Paige Kulzer made changes -
            Assignee Paige Kulzer [ pkulzer ] Ann Loraine [ aloraine ]
            pkulzer Paige Kulzer made changes -
            Link This issue relates to IGBF-3735 [ IGBF-3735 ]
            ann.loraine Ann Loraine added a comment - - edited

            Thanks for the draft!

            Change requests:

            • Please move Table 1 to the end of the document.
            • Make a new section "Introduction" and add some filler text "explain what this document is for" in the Introduction. We'll add more later!
            • Please remove version-controlled code from the document. Instead, link to the file in bitbucket, e.g., https://bitbucket.org/lorainelab/tardigrade/src/main/src/prefetch.sh
            • Include some text explaining what the user will see in their file system after the "prefetch.sh" command has run. What should they observe after running it?
            • Instead of running "prefetch" in "fastq" run it inside another directory called "sra" (you make it). This will ensure that the "sra" files get downloaded into a single location where no other important files will reside. Then, after the ".sra" files get downloaded and used to create the "fastq" files, we can delete the "sra" directory and all its big files in a single command.
            • Run the "fasterq-dump" step inside the "sra" directory after running the "prefetch" step.
            • After that, we use "gzip.sh" to compress all the fastq files.
            • After the "fasterq-dump" and "gzip" steps finish, we'll move the compressed files to a new directory called "fastq" using the Unix "mv" command.
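
            The directory layout requested above can be sketched as follows. This is only an illustration, not the contents of "prefetch.sh" or "gzip.sh" in the repository: the function names are hypothetical, the commands are listed but not run, and compression uses Python's gzip module as a stand-in for "gzip.sh".

```python
import gzip
import shutil
from pathlib import Path


def planned_commands(accessions):
    """List the SRA Toolkit commands to run, in order, from inside sra/.

    Assumption: each tool is called with only the run accession.
    """
    commands = [["prefetch", acc] for acc in accessions]       # download .sra files
    commands += [["fasterq-dump", acc] for acc in accessions]  # write .fastq files
    return commands


def compress_fastq(sra_dir):
    """Gzip every .fastq file in sra_dir and remove the uncompressed copy."""
    for fq in sorted(Path(sra_dir).glob("*.fastq")):
        with open(fq, "rb") as src, gzip.open(f"{fq}.gz", "wb") as dst:
            shutil.copyfileobj(src, dst)
        fq.unlink()


def collect_fastq(sra_dir, fastq_dir):
    """Move compressed reads into fastq_dir, then delete sra_dir entirely."""
    sra_dir, fastq_dir = Path(sra_dir), Path(fastq_dir)
    fastq_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for gz in sorted(sra_dir.glob("*.fastq.gz")):
        shutil.move(str(gz), str(fastq_dir / gz.name))
        moved.append(gz.name)
    shutil.rmtree(sra_dir)  # removes the large .sra downloads in one step
    return moved
```

            Because everything that "prefetch" downloads lives under "sra", deleting that one directory at the end cannot touch the compressed reads that were already moved to "fastq".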

            Also, a note:

            The "fasterq-dump" script is going to break when it encounters non-paired-end data. That is, if there is no read 2 file, it will fail. We need to develop a new version that is more able to handle non-paired-end data.
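
            One possible way to handle this, sketched under assumptions: check which files "fasterq-dump" produced for each run before deciding how to process it. The helper name read_layout is hypothetical, and the file naming ("_1.fastq" and "_2.fastq" for paired reads, a single ".fastq" otherwise) assumes the tool's default split behavior and should be confirmed against real output.

```python
from pathlib import Path


def read_layout(directory, accession):
    """Classify a run as 'paired', 'single', or 'missing' from its fastq files."""
    d = Path(directory)
    read1 = d / f"{accession}_1.fastq"
    read2 = d / f"{accession}_2.fastq"
    unpaired = d / f"{accession}.fastq"
    if read1.exists() and read2.exists():
        return "paired"
    if unpaired.exists() or read1.exists():
        # No read 2 file: treat the run as single-end instead of failing.
        return "single"
    return "missing"
```

            A later step could branch on the returned value instead of assuming that a read 2 file always exists.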

            ann.loraine Ann Loraine made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            ann.loraine Ann Loraine made changes -
            Status First Level Review in Progress [ 10301 ] To-Do [ 10305 ]
            ann.loraine Ann Loraine made changes -
            Assignee Ann Loraine [ aloraine ] Paige Kulzer [ pkulzer ]
            pkulzer Paige Kulzer made changes -
            Sprint Spring 9, Spring 10 [ 193, 194 ] Spring 9, Summer 1 [ 193, 195 ]
            pkulzer Paige Kulzer made changes -
            Sprint Spring 9, Summer 1 [ 193, 195 ] Spring 9 [ 193 ]
            pkulzer Paige Kulzer made changes -
            Sprint Spring 9 [ 193 ] Spring 9, Summer 1 [ 193, 195 ]
            nfreese Nowlan Freese made changes -
            Sprint Spring 9, Summer 1 [ 193, 195 ] Spring 9 [ 193 ]
            nfreese Nowlan Freese made changes -
            Sprint Spring 9 [ 193 ] Spring 9, Summer 2 [ 193, 196 ]
            nfreese Nowlan Freese made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Epic Link IGBF-1395 [ 17470 ] IGBF-3778 [ 22997 ]
            ann.loraine Ann Loraine made changes -
            Sprint Spring 9, Summer 2 [ 193, 196 ] Spring 9, Summer 2, Summer 3 [ 193, 196, 197 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Sprint Spring 9, Summer 2, Summer 3 [ 193, 196, 197 ] Spring 9, Summer 2, Summer 3, Summer 4 [ 193, 196, 197, 198 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Sprint Spring 9, Summer 2, Summer 3, Summer 4 [ 193, 196, 197, 198 ] Spring 9, Summer 2, Summer 3, Summer 4, Summer 5 [ 193, 196, 197, 198, 199 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            pkulzer Paige Kulzer made changes -
            Sprint Spring 9, Summer 2, Summer 3, Summer 4, Summer 5 [ 193, 196, 197, 198, 199 ] Spring 9, Summer 2, Summer 3, Summer 4, Summer 6 [ 193, 196, 197, 198, 200 ]
            ann.loraine Ann Loraine added a comment - - edited

            See linked ticket IGBF-3790 for the most recent trial run of the tardigrade RNA-Seq pipeline in the Loraine Lab git repository called "tardigrade."

            ann.loraine Ann Loraine made changes -
            Link This issue relates to IGBF-3790 [ IGBF-3790 ]
            ann.loraine Ann Loraine made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            ann.loraine Ann Loraine made changes -
            Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
            ann.loraine Ann Loraine made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            ann.loraine Ann Loraine made changes -
            Status First Level Review in Progress [ 10301 ] Ready for Pull Request [ 10304 ]
            ann.loraine Ann Loraine made changes -
            Status Ready for Pull Request [ 10304 ] Pull Request Submitted [ 10101 ]
            ann.loraine Ann Loraine made changes -
            Status Pull Request Submitted [ 10101 ] Reviewing Pull Request [ 10303 ]
            ann.loraine Ann Loraine made changes -
            Status Reviewing Pull Request [ 10303 ] Merged Needs Testing [ 10002 ]
            ann.loraine Ann Loraine made changes -
            Status Merged Needs Testing [ 10002 ] Post-merge Testing In Progress [ 10003 ]
            ann.loraine Ann Loraine made changes -
            Resolution Done [ 10000 ]
            Status Post-merge Testing In Progress [ 10003 ] Closed [ 6 ]

              People

              • Assignee:
                pkulzer Paige Kulzer
                Reporter:
                pkulzer Paige Kulzer