[IGBF-3709] Edit Draft 2 of the pipeline document - JIRA UNCC

Details

Type: Task
Status: To-Do
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Labels:
None

Story Points:
4
Epic Link:
Support NSF pollen grant
Sprint:
Spring 9

Description

Google Document: https://docs.google.com/document/d/1ig9ET-ykXF5nAX3P487cXWmZDGUlQpcwrvFXpbyP5vw/edit?usp=sharing

Section Edits:

Abstract: Add more detail to the abstract and state the goal of this specific pipeline clearly.
SRA Submission: Waiting on a citation. We also need more information about preparing metadata and submitting to SRA.
Download data from SRA: Edit the prefetch script so that it can launch all of the SRR names at once. Describe specifically what prefetch does and what it downloads. Remove the 'vdb-config --interactive' line and use the GitHub link to discuss configuration and online vs. offline work on the cluster. Make sure the directories aren't hard-coded in the document; use "$SLURM_SUBMIT_DIR" instead when possible. Lastly, we might not need the Perl validate script, because the SRA Toolkit has a validate option that we can use instead.
Run data with Nextflow nf-core rnaseq pipeline: An updated version of Nextflow may be used next, so we will need to update the version given in the document. The newest version identifies strandedness automatically, so we won't need to check the MultiQC report afterwards anymore. Depending on the HPC cluster, someone might be able to run Nextflow online or offline, and that will need to be discussed in the paper. We might also need to describe what the bam, find_junction, and coverage graphs can be used for after creating them.
Sanity check SRA Data with Original data: This section will need to be entirely rewritten because a Markdown document is being created that checks the SRA submission against the original data. That Markdown document can be linked from Bitbucket or hard-coded into the document; we can decide which is better when we get to this section.
Submit data to IGB Quickload: This section is a work in progress because the XML Python script depends on environment options specific to each user's computer. We will need to troubleshoot the script further and finalize this section with the correct code for uploading data to IGB.
References: We might want to start adding more references inside the text, but for now references might be a Draft 3 task.
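
For the "Download data from SRA" item, here is a minimal Python sketch of how a single prefetch call could cover every SRR name, and how the SRA Toolkit's vdb-validate could stand in for the Perl validate script. It only builds and prints the command lines; it does not run them. The accession values, the `sra` subdirectory, and all function names are hypothetical, and the sketch assumes prefetch's default layout of one directory per accession under the output directory.

```python
import os
import shlex
from pathlib import Path


def submit_dir():
    """Return $SLURM_SUBMIT_DIR when running under Slurm, else the current directory."""
    return Path(os.environ.get("SLURM_SUBMIT_DIR", os.getcwd()))


def read_accessions(text):
    """Parse one SRR name per line, skipping blank lines and '#' comments."""
    accessions = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            accessions.append(line)
    return accessions


def prefetch_command(accessions, out_dir):
    """Build one prefetch call that downloads all accessions into out_dir."""
    return ["prefetch", "--output-directory", str(out_dir), *accessions]


def validate_commands(accessions, out_dir):
    """Build one vdb-validate call per downloaded accession directory."""
    return [["vdb-validate", str(Path(out_dir) / acc)] for acc in accessions]


if __name__ == "__main__":
    # Hypothetical accession list; in practice this would come from a file.
    runs = read_accessions("SRR0000001\nSRR0000002\n")
    out_dir = submit_dir() / "sra"  # hypothetical subdirectory name
    print(shlex.join(prefetch_command(runs, out_dir)))
    for cmd in validate_commands(runs, out_dir):
        print(shlex.join(cmd))
```

Resolving paths from `SLURM_SUBMIT_DIR` keeps the script free of hard-coded directories, and the fallback to the current directory lets the same script be tried outside a Slurm job.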
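
For the "Run data with Nextflow nf-core rnaseq pipeline" item, the following Python sketch shows one way the run could be described once strandedness is detected automatically: a samplesheet whose strandedness column is set to `auto`, and a command line with an optional offline switch. It assumes the nf-core/rnaseq samplesheet columns `sample,fastq_1,fastq_2,strandedness` and Nextflow's `-offline` run option. The sample names, file paths, profile, and the `X.Y.Z` revision are placeholders, and reference genome options are left out on purpose.

```python
import csv
import io


def samplesheet_csv(samples):
    """Return samplesheet text for (sample, fastq_1, fastq_2) tuples.

    fastq_2 may be an empty string for single-end data. Strandedness is
    set to 'auto' so the pipeline infers it instead of the user.
    """
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["sample", "fastq_1", "fastq_2", "strandedness"])
    for sample, fastq_1, fastq_2 in samples:
        writer.writerow([sample, fastq_1, fastq_2, "auto"])
    return buf.getvalue()


def nextflow_command(revision, profile, samplesheet, outdir, offline=False):
    """Build the nextflow command line as a list; nothing is executed."""
    cmd = [
        "nextflow", "run", "nf-core/rnaseq",
        "-r", revision,
        "-profile", profile,
        "--input", samplesheet,
        "--outdir", outdir,
    ]
    if offline:
        # For clusters whose compute nodes have no internet access.
        cmd.append("-offline")
    return cmd


if __name__ == "__main__":
    # Hypothetical sample and paths, for illustration only.
    print(samplesheet_csv([("sample1", "s1_R1.fastq.gz", "s1_R2.fastq.gz")]), end="")
    print(" ".join(nextflow_command("X.Y.Z", "singularity", "samplesheet.csv", "results", offline=True)))
```

Keeping the revision as an argument means a later version bump only touches one value, which matches the note that the version in the document will need updating.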

Issue Links

relates to

IGBF-3622 Create a Draft 1 Pipeline Document

Closed

People

Assignee:

Molly Davis

Reporter:

Molly Davis

Votes:

0

Watchers:

1

Dates

Created:

26/Apr/24 11:18 AM

Updated:

21/May/24 10:05 AM