Details
- Type: Task
- Status: Closed
- Priority: Major
- Resolution: Done
- Affects Version/s: None
- Fix Version/s: None
- Labels: None
- Story Points: 3
- Epic Link:
- Sprint: Spring 9
Description
SRP460750
Directory: /projects/tomato_genome/fnb/dataprocessing/SRP460750/nfcore-SL5-2024-05-07
Previously, we noticed that SRA had mismatched some of the data: 16 of the sample names were mislabeled. Dr. Reid reached out and had SRA change everything to the correct sample names. Now we must rerun the Muday SRA data on the cluster with Nextflow and make sure the data are correctly labeled.
For this task, we need to confirm and sanity-check the Muday time course data that Rob recently uploaded and submitted to the Sequence Read Archive.
If the data are good, we will replace all the existing BAM, junctions, etc. files deployed in the "hotpollen" quickload site with newly processed data.
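One way to sanity-check the labels is to compare the run metadata exported from SRA against the sample names the lab expects. The sketch below is only an illustration: it assumes a CSV export with `Run` and `Sample Name` columns, and the accessions and sample names in the example are made up, not taken from SRP460750.

```python
import csv
import io


def find_label_mismatches(run_table_csv, expected):
    """Return {run: (expected_name, sra_name)} for runs whose label differs.

    run_table_csv: text of a CSV with 'Run' and 'Sample Name' columns
                   (assumed column names).
    expected: dict mapping SRR accession -> sample name the lab expects.
    """
    mismatches = {}
    seen = set()
    for row in csv.DictReader(io.StringIO(run_table_csv)):
        run = row["Run"].strip()
        sra_name = row["Sample Name"].strip()
        seen.add(run)
        if expected.get(run) != sra_name:
            mismatches[run] = (expected.get(run), sra_name)
    # Runs the lab expects but that are missing from the SRA table.
    for run in expected.keys() - seen:
        mismatches[run] = (expected[run], None)
    return mismatches


if __name__ == "__main__":
    # Hypothetical accessions and sample names, for illustration only.
    table = "Run,Sample Name\nSRR0000001,sampleA\nSRR0000002,sampleC\n"
    expected = {"SRR0000001": "sampleA", "SRR0000002": "sampleB"}
    print(find_label_mismatches(table, expected))
```

An empty result would mean every expected run is present in the SRA table with the expected sample name.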
For this task:
- Check SRP on NCBI and review submission
- Download the data onto the cluster by using the SRP name
- Run nf-core/rnaseq pipeline
- Run our coverage graph and junctions scripts on the data
Note that all files should now use their "SRR" names instead of the existing file names.
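To keep the "SRR" names through the pipeline, the SRR accession can be used as the sample ID in the nf-core/rnaseq samplesheet. The following is a minimal sketch, not the script used for this task. It assumes paired-end reads stored as gzipped `<SRR>_1.fastq.gz` and `<SRR>_2.fastq.gz` files, and the function name and directory are hypothetical.

```python
import csv
import io


def build_samplesheet(srr_ids, fastq_dir, strandedness="auto"):
    """Return samplesheet CSV text with one row per SRR accession.

    Assumes paired-end, gzipped FASTQ files named <SRR>_1.fastq.gz and
    <SRR>_2.fastq.gz inside fastq_dir (an assumption, not from the ticket).
    """
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["sample", "fastq_1", "fastq_2", "strandedness"])
    for srr in srr_ids:
        writer.writerow([
            srr,  # sample ID is the SRR accession itself
            f"{fastq_dir}/{srr}_1.fastq.gz",
            f"{fastq_dir}/{srr}_2.fastq.gz",
            strandedness,
        ])
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical accession and directory, for illustration only.
    print(build_samplesheet(["SRR0000001"], "/path/to/fastq"), end="")
```

Because the sample column holds the SRR accession, the output files that the pipeline names after each sample should carry the SRR name as well.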
Issue Links
- relates to
  - IGBF-3727 Create new input file for checking SRA code (Closed)
  - IGBF-3739 Re-run SRA muday 2022 timeseries data again with SL4 (Closed)
  - IGBF-3507 Re-run Nextflow Muday time course data with SL4 and data downloaded from SRA (Closed)
  - IGBF-3683 Update SRA to use the correct sample codes for Muday lab time course data (Closed)
Re-run Directory: /projects/tomato_genome/fnb/dataprocessing/SRP460750/nfcore-SL5-2024-05-07
Prefetch Script:
Note: Make sure to create the srr.txt file with the list of SRR names. Use the Accession list from NCBI.
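Before prefetching, it can help to validate the accession list so that a stray header line or typo does not end up in srr.txt. This is a hedged sketch: it assumes the NCBI accession list is plain text with one accession per line, and the function name and example accessions are hypothetical.

```python
import re

# Run accessions from SRA are expected to look like "SRR" followed by digits.
SRR_PATTERN = re.compile(r"^SRR\d+$")


def parse_accession_list(text):
    """Return unique SRR accessions in input order.

    Blank lines are skipped; any other line that is not an SRR accession
    raises ValueError with its line number.
    """
    accessions = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        acc = line.strip()
        if not acc:
            continue
        if not SRR_PATTERN.match(acc):
            raise ValueError(f"line {lineno}: not an SRR accession: {acc!r}")
        if acc not in accessions:
            accessions.append(acc)
    return accessions


if __name__ == "__main__":
    # Hypothetical accessions, for illustration only.
    raw = "SRR0000001\nSRR0000002\n\nSRR0000001\n"
    with open("srr.txt", "w") as handle:
        handle.write("\n".join(parse_accession_list(raw)) + "\n")
```

The resulting srr.txt has one SRR name per line, with duplicates removed.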
Run Script:
Faster Dump Script:
Run Script: