I made some edits:
The RNA sequencing data were processed with the nf-core/rnaseq pipeline (nf-core ref), in which all sequences were aligned to the most recent tomato genome assembly (SL5 ref) using STAR (star ref). The pipeline produced a Salmon gene expression counts file in which each column represents one experimental sample, defined by its time point, temperature, genotype, and replicate. The time points were 15, 30, 45, and 75 minutes. Temperature was controlled at 28 or 34 degrees Celsius. The genotypes sequenced were VF36, OE3, and ARE. The un-normalized counts file was processed with DESeq2 (deseq2 ref). The DESeq2 model internally corrects for library size, so transformed or normalized values, such as counts scaled by library size, should not be used as input. DESeq2 was used to identify differentially expressed (DE) genes. For the PCA plots, the experimental design included the factors 'time' and 'temperature'. The PCA plots were created in RStudio using the plotPCA() function from DESeq2 and the ggplot2 package. Volcano plots were produced from DESeq2 output using a design based on 'temperature' alone, in order to explore the differentially expressed genes that are affected by heat stress. The volcano plots were created in RStudio using the EnhancedVolcano package.
SL5 genome reference:
https://www.nature.com/articles/s41586-022-04808-9
DESeq2 reference:
Love, Michael I., Wolfgang Huber, and Simon Anders. "Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2." Genome Biology 15.12 (2014): 550.
Response: The RNA sequencing data were processed through the nf-core/rnaseq bioinformatics pipeline. Reads were aligned to the SL5 tomato genome with STAR. The output included a Salmon gene expression counts file in which each column corresponds to a sample from a different combination of time, temperature, and genotype. The time durations were 15, 30, 45, or 75 minutes. The temperatures were 28 or 34 degrees Celsius. The genotypes were VF36, OE3, or ARE. The values in the count matrix given to DESeq2 should be un-normalized counts or estimated counts of sequencing reads. The DESeq2 model internally corrects for library size, so transformed or normalized values such as counts scaled by library size should not be used as input. DESeq2 was used to identify differentially expressed (DE) genes. For the PCA plots, the design included the factors 'time' and 'temperature'. The PCA plots were created in RStudio using the plotPCA() function from DESeq2 and the ggplot2 package. Volcano plots were also produced from DESeq2 output, but with a design that included only 'temperature', because the aim was to view only the differentially expressed genes that are affected by heat stress. The volcano plots were created in RStudio using the EnhancedVolcano package.
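To illustrate the library-size correction mentioned above, here is a minimal sketch of the median-of-ratios size factor calculation that DESeq2 applies by default in its estimateSizeFactors() step. It is written in plain Python only so that it is self-contained; it is not the analysis code from the repository, and the function name, gene names, and counts are made-up toy values chosen for illustration.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors (illustrative sketch, not DESeq2 itself).

    counts: dict mapping a gene name to a list of raw counts, one per sample.
    Returns one size factor per sample.
    """
    n_samples = len(next(iter(counts.values())))
    log_ratios = [[] for _ in range(n_samples)]

    for row in counts.values():
        # Genes with a zero in any sample have no finite geometric mean
        # and are left out of the calculation.
        if any(c <= 0 for c in row):
            continue
        # Log of the gene's geometric mean across all samples.
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)

    if not log_ratios[0]:
        raise ValueError("no gene has non-zero counts in every sample")

    # The size factor of a sample is the median ratio over the usable genes.
    return [math.exp(median(r)) for r in log_ratios]


# Hypothetical toy data: sample 2 was sequenced exactly twice as deeply as sample 1.
toy_counts = {
    "geneA": [10, 20],
    "geneB": [100, 200],
    "geneC": [50, 100],
    "geneD": [0, 4],  # skipped because of the zero count
}

factors = size_factors(toy_counts)
normalized = {
    gene: [c / f for c, f in zip(row, factors)]
    for gene, row in toy_counts.items()
}
print(factors)
print(normalized["geneA"])
```

In this toy case the second sample's size factor is twice the first, so dividing by the factors puts geneA on the same scale in both samples. Because DESeq2 estimates these factors from the raw counts itself, giving it counts that were already scaled would distort the model, which is why the un-normalized Salmon counts are used as input.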
Code Resources and Packages:
Data and Markdown location: https://bitbucket.org/hotpollen/flavonoid-rnaseq/src/main/
Counts file: muday-144-SL5_counts-salmon.txt
PCA plot Markdown: Muday-DESeq-PCA-Plots.Rmd
Volcano Plot Markdown: FindControlVsStressDEGenes-DESeq.Rmd