Details
- Type: Task
- Status: Closed
- Priority: Major
- Resolution: Done
- Affects Version/s: None
- Fix Version/s: None
- Labels: None
- Story Points: 1
- Epic Link:
- Sprint: Spring 10, Summer 1, Summer 2
Description
Situation: The RNA-Seq Quickload needs reorganizing before the ASPB workshop. Specifically, the current organization of the Muday Lab folder makes sense, but I think we could optimize the folder for the workshop by organizing it a bit differently.
Tasks:
1. Reorganize the Muday Lab folder:
- List all of the A (are) samples first, then V (VF36), then F (VF36-F3H-T3).
- Within each of these subgroups, keep them listed in order first by time point, then by temperature, then by replicate.
- Do this for all three subfolders (Reads, Coverage Graphs, and Junctions).
2. Give the last folder in the tomato RNA-Seq Quickload ("SRP328042 - anther development under simulated drought stress") a name that matches the naming of the other folders. I believe the PI associated with that SRP data is Gang Lu. Here's a link to their paper: https://www.mdpi.com/2073-4409/10/7/1809.
With all of this information, the title of the folder after renaming should be something like "Lu Lab - anther, # varieties, drought stress, # minutes (SRP328042)".
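The ordering requested in task 1 can be sketched as a multi-level sort key. This is only an illustration: the `Sample` record, its field names, and the integer encoding of time point and temperature are hypothetical, since the ticket does not specify how the sample names encode these values.

```python
from typing import List, NamedTuple


class Sample(NamedTuple):
    """Hypothetical record for one sample; field names are assumptions."""
    genotype: str      # "A" (are), "V" (VF36), or "F" (VF36-F3H-T3)
    time_point: int    # assumed numeric time point
    temperature: int   # assumed numeric temperature
    replicate: int     # replicate number


# Genotype groups in the order requested in the ticket: A, then V, then F.
GENOTYPE_ORDER = {"A": 0, "V": 1, "F": 2}


def sort_key(sample: Sample):
    """Order by genotype group, then time point, temperature, replicate."""
    return (
        GENOTYPE_ORDER[sample.genotype],
        sample.time_point,
        sample.temperature,
        sample.replicate,
    )


def order_samples(samples: List[Sample]) -> List[Sample]:
    """Return the samples in the listing order described in task 1."""
    return sorted(samples, key=sort_key)
```

The same key would apply to each of the three subfolders (Reads, Coverage Graphs, and Junctions), assuming the samples in each can be described by the same four fields.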
Issue Links
- relates to: IGBF-3672 Draft a workshop outline for ASPB 2024 (Closed)
I have made many changes to the organization and naming schemes for the data in the IGB RNA-Seq Quickload.
The first main change is that I sorted the "are" data from the Muday Lab as requested.
The second main change is that all the data sets created for the NSF project, including data harvested from the SRA, now reside in a folder named for the project, with subfolders named for the individual laboratories and tissue types investigated.
I did this because I think it is useful to separate different labs' datasets into folders: where a dataset was produced (the laboratory location) is a confounder in data analysis.
Similarly, some experiments were done at different times, by different people. Those datasets are also separated into folders.
Thus, I am separating data collections into folders by confounder. The idea is to make it very clear to users which samples can be compared to each other within the same collection using standard methods, and which comparisons would need to take the known confounders into account.
Please review by visiting tomato genome assemblies from June 2022 (SL5) and Sept. 2019 (SL4).