Details
Type: Task
Status: Closed
Priority: Major
Resolution: Done
Affects Version/s: None
Fix Version/s: None
Labels: None
Story Points: 1
Epic Link:
Sprint: Summer 4 2023 June 26
Description
Anthony Postiglione request:
"Hi Molly,
Thank you so much for your help with getting all of the RNA-Seq data visualized for us! I have just one more favor to ask of you when it comes to my thesis chapter, I need a brief explanation of how the RNA-Seq analysis was handled on your end. I am out of my depth here for sure, so would you mind adding some brief detail to the section I am pasting below? Thank you so much again!
Analysis of RNA Seq samples
PCA plots were generated by XXX, EdgeR was used to identify DE genes, Volcano plots were generated to show temperature effects within genotypes, and lists of DE genes were compared across genotypes..."
Email back a response or post in thesis document:
https://docs.google.com/document/d/1iJd3wUgBSY8lXlIdPls6_4TU0T2h5Ecb5twQkB7dFg0/edit
Final response:
The RNA sequencing data were processed with the nf-core pipeline (nf-core ref), in which all sequences were aligned to the most recent tomato genome (SL5 ref) using STAR (star ref). This produced a Salmon gene expression counts file in which each column represents one experimental sample, defined by its time, temperature, genotype, and replicate. The time course consisted of 15, 30, 45, or 75 minutes. Temperature was controlled at 28 or 34 degrees Celsius. The genotypes sequenced were VF36, OE3, and ARE. The un-normalized counts file was processed with DESeq2 (deseq2 ref). The DESeq2 model corrects for library size internally, so transformed or normalized values, such as counts scaled by library size, are not necessary as input. DESeq2 identified differentially expressed (DE) genes; for the PCA plots, the experimental design included the factors 'time' and 'temperature'. The PCA plots were created in RStudio with the DESeq2 plotPCA() function and the ggplot2 package. Volcano plots were produced from the DESeq2 output based on 'temperature' to explore the DE genes that are affected by heat stress. The volcano plots were created in RStudio with the EnhancedVolcano package.
nf-core References:
SL5 Reference:
STAR Reference:
DESeq2 References:
plotPCA() Reference:
ggplot() Reference:
EnhancedVolcano() Reference:
Data and Markdown File names: https://bitbucket.org/hotpollen/flavonoid-rnaseq/src/main/
Counts file: muday-144-SL5_counts-salmon.txt
PCA plot Markdown: Muday-DESeq-PCA-Plots.Rmd
Volcano Plot Markdown: FindControlVsStressDEGenes-DESeq.Rmd
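Note on the library-size correction mentioned in the final response: DESeq2's default size-factor estimate is the median-of-ratios method. The Python below is only a minimal sketch of that idea, not DESeq2's implementation (DESeq2 is an R package) and not code from the repository above. The function name `size_factors` and the toy counts are hypothetical and do not come from muday-144-SL5_counts-salmon.txt.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: list of gene rows, each row a list of raw counts, one per sample.
    Returns one size factor per sample.
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts:
        # Genes with a zero count in any sample have no finite log geometric
        # mean, so they are left out of the estimate.
        if any(c <= 0 for c in row):
            continue
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    if not log_ratios[0]:
        raise ValueError("no gene has a nonzero count in every sample")
    # Size factor = median ratio of a sample's counts to the per-gene
    # geometric mean across samples.
    return [math.exp(median(r)) for r in log_ratios]


# Hypothetical toy data: sample B was sequenced twice as deeply as sample A.
toy_counts = [
    [10, 20],
    [100, 200],
    [50, 100],
    [0, 3],  # skipped: zero count in sample A
]
factors = size_factors(toy_counts)
# Dividing raw counts by the size factors puts both samples on one scale.
normalized = [[c / f for c, f in zip(row, factors)] for row in toy_counts]
print(factors)
```

In this toy case the size factor for sample B is twice that of sample A, so the first three genes end up with equal normalized counts in both samples. This is why raw counts, rather than pre-scaled values, are the appropriate input.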