Details
Type: Task
Status: Closed
Priority: Major
Resolution: Done
Affects Version/s: None
Fix Version/s: None
Labels: None
Story Points: 1
Epic Link:
Sprint: Summer 4 2023 June 26
Description
Anthony Postiglione request:
"Hi Molly,
Thank you so much for your help with getting all of the RNA-Seq data visualized for us! I have just one more favor to ask of you when it comes to my thesis chapter, I need a brief explanation of how the RNA-Seq analysis was handled on your end. I am out of my depth here for sure, so would you mind adding some brief detail to the section I am pasting below? Thank you so much again!
Analysis of RNA Seq samples
PCA plots were generated by XXX, EdgeR was used to identify DE genes, Volcano plots were generated to show temperature effects within genotypes, and lists of DE genes were compared across genotypes..."
Email back a response or post in thesis document:
https://docs.google.com/document/d/1iJd3wUgBSY8lXlIdPls6_4TU0T2h5Ecb5twQkB7dFg0/edit
Response: The RNA sequencing data were processed with the nf-core/rnaseq bioinformatics pipeline. Reads were aligned with STAR to the SL5 tomato genome, and the output included a Salmon gene expression counts file. Each column of the counts file corresponds to one experiment, defined by its time, temperature, and genotype. Time points were 15, 30, 45, or 75 minutes. Temperatures were 28 °C or 34 °C. Genotypes were VF36, OE3, or ARE.
DESeq2 was used to identify differentially expressed (DE) genes. DESeq2 expects a matrix of un-normalized counts or estimated counts of sequencing reads as input. Because the DESeq2 model corrects for library size internally, transformed or normalized values, such as counts scaled by library size, should not be used as input.
For the PCA plots, the design included the factors 'time' and 'temperature'. The PCA plots were created in RStudio with the plotPCA() and ggplot() functions (DESeq2 and ggplot2 packages). The volcano plots were also produced from DESeq2 output, but with a design that included only 'temperature', because the goal was to view only the differentially expressed genes affected by heat stress. The volcano plots were created in RStudio with the EnhancedVolcano package.
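To illustrate why raw counts are the expected input, here is a minimal Python sketch of the median-of-ratios idea that DESeq2 uses to correct for library size. It is not DESeq2's implementation and not the code in the repository; the function name size_factors, the sample names, and the toy counts are hypothetical.

```python
import math
from statistics import median


def size_factors(counts):
    """Estimate one library-size factor per sample (median-of-ratios sketch).

    counts: dict mapping sample name -> list of raw counts, same gene order.
    """
    samples = list(counts)
    n_genes = len(counts[samples[0]])
    # Use only genes with nonzero counts in every sample (log(0) is undefined).
    usable = [g for g in range(n_genes)
              if all(counts[s][g] > 0 for s in samples)]
    # Log of the geometric mean of each usable gene across samples.
    log_geo_mean = {
        g: sum(math.log(counts[s][g]) for s in samples) / len(samples)
        for g in usable
    }
    # Size factor = median ratio of a sample's counts to the geometric means.
    return {
        s: math.exp(median(math.log(counts[s][g]) - log_geo_mean[g]
                           for g in usable))
        for s in samples
    }


# Hypothetical toy data: sample_b was sequenced twice as deeply as sample_a.
toy_counts = {
    "sample_a": [10, 20, 30, 40],
    "sample_b": [20, 40, 60, 80],
}
factors = size_factors(toy_counts)
normalized = {
    s: [c / factors[s] for c in toy_counts[s]] for s in toy_counts
}
```

In this toy case the two samples differ only in sequencing depth, so dividing by the size factors makes their counts agree. Supplying counts that were already scaled would apply a library-size correction twice, which is why un-normalized counts are required.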
Code Resources and Packages:
Data and Markdown location: https://bitbucket.org/hotpollen/flavonoid-rnaseq/src/main/
Counts file: muday-144-SL5_counts-salmon.txt
PCA plot Markdown: Muday-DESeq-PCA-Plots.Rmd
Volcano Plot Markdown: FindControlVsStressDEGenes-DESeq.Rmd