IGBF-3399

Plan Muday time points and genotypes analysis

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:
      None

      Description

      Talked with Gloria on Tuesday.

      She is delighted with all that has been accomplished.
      But to get to publication she needs two more pieces of the puzzle:

      1. Analysis comparing genes expressed at different time points (within each plant variety)
      2. Analysis comparing genes expressed between genotypes

      She has requested we tackle this!!!
      Anthony has defended and is finishing the details. But he will focus on publication soon after that is done.

      Step 1 in this should be to set up a chat with Gloria and better define the scope of what this task should be!

        Attachments

          Issue Links

            Activity

            robofjoy Robert Reid created issue -
            robofjoy Robert Reid made changes -
            Field Original Value New Value
            Epic Link IGBF-3277 [ 22158 ]
            robofjoy Robert Reid made changes -
            Link This issue blocks IGBF-3373 [ IGBF-3373 ]
            robofjoy Robert Reid made changes -
            Link This issue blocks IGBF-3344 [ IGBF-3344 ]
            robofjoy Robert Reid made changes -
            Assignee Ann Loraine [ aloraine ]
            robofjoy Robert Reid added a comment -

            I have sent an email out to Gloria to arrange a meeting to discuss what products need to be produced for this task.
            We can then break it out into more tasks once we have buy-in from Gloria.

            Rob

            Mdavis4290 Molly Davis added a comment - - edited

            Meeting Prep notes:

            DESeq performs its analysis based on the design formula you choose. For example, a time + temperature design includes both factors in the model, so DE genes are identified for one factor while accounting for the other. Note that this is an additive design; testing whether the temperature effect changes across time points would need an explicit interaction term (time:temperature). A temperature-only design focuses just on comparing DE genes between 28C and 34C. So we can choose many different designs for DESeq, either single-factor or multi-factor.

            Time Design =

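            library(DESeq2)
            # cts: gene-by-sample count matrix; coldata: sample table with time and temperature columns
            dds <- DESeqDataSetFromMatrix(countData = round(cts),
                                          colData = coldata,
                                          design = ~ time + temperature)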
            

            Genotype Design =

            # cts: gene-by-sample count matrix; coldata: sample table with genotype and temperature columns
            dds <- DESeqDataSetFromMatrix(countData = round(cts),
                                          colData = coldata,
                                          design = ~ genotype + temperature)
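
As a rough illustration of how a design formula turns into model terms, the base R sketch below builds the model matrix for a toy sample table. The time levels (t1, t2) and the replicate column are made up for illustration; only the 28C and 34C temperatures come from these notes, and this is not the Muday-144 sample sheet. It only shows which coefficients each formula would ask the model to estimate.

```r
# Toy sample table: time levels and replicates are hypothetical.
coldata <- expand.grid(
  time        = factor(c("t1", "t2")),
  temperature = factor(c("28C", "34C")),
  replicate   = 1:2
)

# Additive design: a single temperature coefficient shared by all time points.
additive <- model.matrix(~ time + temperature, data = coldata)
colnames(additive)
# "(Intercept)" "timet2" "temperature34C"

# Adding an interaction term gives one more coefficient that captures how
# the temperature effect at t2 differs from the temperature effect at t1.
with_interaction <- model.matrix(~ time + temperature + time:temperature,
                                 data = coldata)
colnames(with_interaction)
# "(Intercept)" "timet2" "temperature34C" "timet2:temperature34C"
```

With the additive formula there is no coefficient for a time-specific temperature response, so a question such as "does heat affect expression differently at later time points?" would need the interaction form or a separate comparison per time point.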
            

            Design video tutorial: https://youtu.be/X6p3E-QTcUc?t=645

            Notes about previous Designs:

            • PCA plots: time + temperature Design
            • Volcano Plots: temperature Design

            Questions:

            • How would you like to visualize these comparisons? Options: heatmaps, volcano plot, PCA plot, MA plot, a plot of the top 20 significant genes, or a plot of the expression of a single gene.

            Code Resources:

            • https://hbctraining.github.io/DGE_workshop/lessons/06_DGE_visualizing_results.html
            • http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html
            Mdavis4290 Molly Davis made changes -
            Attachment SL4_SL5_Description.R [ 17922 ]
            Mdavis4290 Molly Davis made changes -
            Attachment SL4_SL5_description_tomato.csv [ 17923 ]
            Mdavis4290 Molly Davis made changes -
            Attachment  SL4_SL5_description_tomato.csv [ 17923 ]
            Mdavis4290 Molly Davis made changes -
            Attachment SL4_SL5_description_tomato.csv [ 17924 ]
            Mdavis4290 Molly Davis made changes -
            Attachment  SL4_SL5_description_tomato.csv [ 17924 ]
            Mdavis4290 Molly Davis made changes -
            Attachment SL4_SL5_Description.R [ 17922 ]
            Mdavis4290 Molly Davis made changes -
            Attachment SL4_SL5_Description.R [ 17925 ]
            Mdavis4290 Molly Davis made changes -
            Attachment SL4_SL5_description_tomato.csv [ 17926 ]
            Mdavis4290 Molly Davis made changes -
            Summary Time course Analysis between genotypes. And analysis comparing time points Analyze Muday time points and genotypes
            Mdavis4290 Molly Davis made changes -
            Sprint Summer 6 2023 July 24 [ 175 ]
            Mdavis4290 Molly Davis made changes -
            Attachment SL4_SL5_counts_muday-144.csv [ 17927 ]
            robofjoy Robert Reid added a comment -

            Gloria's request from her email:

            Also, we have a small problem with the excel files on the DE genes within genotype and between treatments, that I think you can help us with easily.
            We are trying to run an enrichment analysis and the software does not recognize the newest V5 genome. You gave us the S4 IDs, but they are part of the description, rather than a separate column. Is there a way to add this to the spreadsheet as a column? That way we can cut and paste a group without manually extracting each gene ID from that text heavy gene description list.
            Thanks
            Gloria
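
One possible way to pull the IDs into their own column is sketched below in base R. It assumes the SL4 IDs follow the usual tomato "Solyc" pattern (for example Solyc01g005000.3) and that at most one ID appears in each description. The function name, the example descriptions, and the column names are invented for illustration and are not taken from the actual spreadsheet.

```r
# Hypothetical helper: returns the first Solyc-style ID found in each
# description, or NA when a description contains none.
extract_sl4 <- function(description) {
  pattern <- "Solyc[0-9]{2}g[0-9]{6}(\\.[0-9]+)*"
  hits <- regexpr(pattern, description)
  ids <- rep(NA_character_, length(description))
  ids[hits != -1] <- regmatches(description, hits)
  ids
}

# Invented example rows standing in for the text-heavy description column.
genes <- data.frame(
  description = c("protein kinase (Solyc01g005000.3)",
                  "unknown protein, no SL4 match"),
  stringsAsFactors = FALSE
)
genes$SL4 <- extract_sl4(genes$description)
```

Rows without a recognizable ID get NA, which keeps the new column aligned with the rest of the table so that a group of IDs can be copied directly.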

            ann.loraine Ann Loraine made changes -
            Summary Analyze Muday time points and genotypes Plan Muday time points and genotypes analysis
            ann.loraine Ann Loraine made changes -
            Sprint Summer 6 2023 July 24 [ 175 ] Summer 6 2023 July 24, Summer 7 2023 Aug 7 [ 175, 176 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            Mdavis4290 Molly Davis made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            Mdavis4290 Molly Davis made changes -
            Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
            ann.loraine Ann Loraine made changes -
            Assignee Ann Loraine [ aloraine ]
            ann.loraine Ann Loraine added a comment -

            Change request for [~molly]: the work you did creating the new table mapping SL4 to SL5 names is out of scope for the task of this ticket, which is to plan a new analysis. It would be better to create a new ticket for just that one task and track it there, instead of here.

            Another change request: Please look at the other tickets associated with this epic. Some of them may be related. There may already have been some work done on this.

            ann.loraine Ann Loraine made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            ann.loraine Ann Loraine made changes -
            Status First Level Review in Progress [ 10301 ] To-Do [ 10305 ]
            ann.loraine Ann Loraine made changes -
            Assignee Ann Loraine [ aloraine ] Molly Davis [ molly ]
            Mdavis4290 Molly Davis made changes -
            Attachment SL4_SL5_Description.R [ 17925 ]
            Mdavis4290 Molly Davis made changes -
            Attachment SL4_SL5_description_tomato.csv [ 17926 ]
            Mdavis4290 Molly Davis made changes -
            Attachment SL4_SL5_counts_muday-144.csv [ 17927 ]
            Mdavis4290 Molly Davis made changes -
            Comment [ Dr. Muday also asked for the SL4 gene names to be their own column with the description. So I made an R script and created an output file with the columns "SL4, SL5, Description". I also just added the new SL4 column to the original counts file.

            Script:
             [^SL4_SL5_Description.R]

            File with just gene names and description:
             [^SL4_SL5_description_tomato.csv]

            File with SL4 column added on to the end of original counts file:

             [^SL4_SL5_counts_muday-144.csv] ]
            Mdavis4290 Molly Davis added a comment -

            New ticket for SL4 names: IGBF-3407

            Mdavis4290 Molly Davis made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            Mdavis4290 Molly Davis made changes -
            Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
            Mdavis4290 Molly Davis made changes -
            Assignee Molly Davis [ molly ]
            Mdavis4290 Molly Davis added a comment -

            Next step: Create tickets for genotype + temperature and time + temperature analysis.

            Notes from meeting: Also spoke about possibly using heat maps later to see gene expression over time, perhaps for the top 20 significant genes. The Muday lab was going to look into Blast2GO and see if they want to use it for annotations.
            Questions that were asked: The group was curious why there were fewer down-regulated genes than up-regulated genes on the temperature-design volcano plots. They were also curious why there were more DE genes for ARE compared to the other genotypes.

            Mdavis4290 Molly Davis made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            Mdavis4290 Molly Davis made changes -
            Status First Level Review in Progress [ 10301 ] To-Do [ 10305 ]
            ann.loraine Ann Loraine added a comment -

            I think we should use edgeR for this because the documentation is a lot better.

            See: https://bioconductor.org/packages/release/bioc/vignettes/edgeR/inst/doc/edgeRUsersGuide.pdf.

            ann.loraine Ann Loraine made changes -
            Assignee Ann Loraine [ aloraine ]
            ann.loraine Ann Loraine made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            Mdavis4290 Molly Davis added a comment - - edited

            Next Step for tickets:
            1. Create the designs for the new DESeq and edgeR analyses

            • review design creation: https://youtu.be/X6p3E-QTcUc?t=646
            • ensure the designs answer the Muday lab's questions about genotype and time
            • make sure to use the same designs for DESeq and edgeR
            • run the analysis with both statistical methods twice: once for genotype and once for time

            2. Compare DESeq and edgeR results and perform a sanity check
            3. Create top 25-100 DE genes output files
            4. Use the top 25 DE genes and visualize them with the original scaled counts file to make line graphs and/or heat maps

            Question: Do we want a separate ticket for edgeR and DESeq for each step?
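
Steps 2 and 3 of the plan above could look roughly like the base R sketch below. The result tables are toy data with invented gene IDs and values. The column names follow what DESeq2 `results()` (`padj`, `log2FoldChange`) and edgeR `topTags()` (`FDR`, `logFC`) report, while the 0.05 cutoff and the helper name `top_n` are assumptions for illustration only.

```r
# Toy result tables; gene IDs and values are invented.
deseq_res <- data.frame(
  gene = c("geneA", "geneB", "geneC", "geneD"),
  log2FoldChange = c(2.1, -1.4, 0.2, 3.0),
  padj = c(0.001, 0.020, 0.600, 0.004),
  stringsAsFactors = FALSE
)
edger_res <- data.frame(
  gene = c("geneA", "geneB", "geneC", "geneD"),
  logFC = c(2.0, -1.5, 0.1, 2.8),
  FDR = c(0.002, 0.060, 0.700, 0.003),
  stringsAsFactors = FALSE
)

alpha <- 0.05  # hypothetical significance cutoff

# Sanity check: genes called DE by both methods, or by only one of them.
deseq_de   <- deseq_res$gene[deseq_res$padj < alpha]
edger_de   <- edger_res$gene[edger_res$FDR < alpha]
both       <- intersect(deseq_de, edger_de)
deseq_only <- setdiff(deseq_de, edger_de)
edger_only <- setdiff(edger_de, deseq_de)

# Top-N table: sort by adjusted p-value and keep the first n rows.
top_n <- function(res, p_col, n) {
  head(res[order(res[[p_col]]), ], n)
}
top_deseq <- top_n(deseq_res, "padj", 2)
```

Real DESeq2 output can contain NA values in `padj`, so the filtering step would need something like `which(deseq_res$padj < alpha)` before it is applied to actual results.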

            ann.loraine Ann Loraine made changes -
            Sprint Summer 6 2023 July 24, Summer 7 2023 Aug 7 [ 175, 176 ] Summer 6 2023 July 24, Summer 7 2023 Aug 7, Summer 8 2023 Aug 21 [ 175, 176, 177 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Status In Progress [ 3 ] To-Do [ 10305 ]
            ann.loraine Ann Loraine made changes -
            Sprint Summer 6 2023 July 24, Summer 7 2023 Aug 7, Summer 8 2023 Aug 21 [ 175, 176, 177 ] Testing 2 : 4 Nov - 15 Nov, Summer 6 2023 July 24, Summer 7 2023 Aug 7, Summer 8 2023 Aug 21 [ 82, 175, 176, 177 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Assignee Ann Loraine [ aloraine ]
            Mdavis4290 Molly Davis made changes -
            Assignee Molly Davis [ molly ]
            Mdavis4290 Molly Davis made changes -
            Link This issue relates to IGBF-3436 [ IGBF-3436 ]
            Mdavis4290 Molly Davis made changes -
            Link This issue relates to IGBF-3436 [ IGBF-3436 ]
            Mdavis4290 Molly Davis made changes -
            Link This issue relates to IGBF-3436 [ IGBF-3436 ]
            Mdavis4290 Molly Davis made changes -
            Link This issue relates to IGBF-3437 [ IGBF-3437 ]
            Mdavis4290 Molly Davis made changes -
            Comment [ *EdgeR comment*:

            I don't see much reason to switch to edgeR. I think it is best to stick to our analysis method to keep results consistent across the dataset. DESeq and edgeR are very similar, and both normalisation methods assume that most genes are not differentially expressed. DESeq uses a "geometric" normalisation strategy, whereas edgeR uses a weighted trimmed mean of log ratios (TMM). Both normalise data initially by calculating size / normalisation factors. So I think changing now would be unnecessary, as DESeq is already used in the volcano plots and PCA plots that they want to publish. If we changed analysis methods now, we would have to redo the previous plots. For example, we would want to compare the volcano plots or previous results to the new ones, and that would be more difficult when comparing DESeq results to edgeR results. Overall, I think it would be best to stick to our DESeq method for now, only because we want to stay consistent.

            But I understand wanting to see different plots possibly from edgeR, and we could look at basic results from edgeR, but for continuing the genotype and time analyses it is probably best to stick to DESeq.

            DESeq documentation I use: http://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html

            Let me know what you think! :) [~aloraine] ]
            Mdavis4290 Molly Davis made changes -
            Link This issue relates to IGBF-3438 [ IGBF-3438 ]
            Mdavis4290 Molly Davis made changes -
            Link This issue relates to IGBF-3439 [ IGBF-3439 ]
            Mdavis4290 Molly Davis made changes -
            Link This issue relates to IGBF-3440 [ IGBF-3440 ]
            Mdavis4290 Molly Davis added a comment -

            Created the following tickets to move forward with the Muday plans:

            IGBF-3436 Create DESeq Genotype and Time Designs for Muday-144
            IGBF-3437 Create EdgeR Genotype and Time Designs for Muday-144
            IGBF-3438 Compare Muday-144 Deseq and EdgeR results and perform a sanity check
            IGBF-3439 Create top 25-100 DE genes output files for Muday-144
            IGBF-3440 Use top 25 DE genes from Muday-144 Analysis and visualize with original scaled counts file to make lines graphs and/or heat maps

            Moving this ticket to done!

            Mdavis4290 Molly Davis made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            Mdavis4290 Molly Davis made changes -
            Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
            Mdavis4290 Molly Davis made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            Mdavis4290 Molly Davis made changes -
            Status First Level Review in Progress [ 10301 ] Ready for Pull Request [ 10304 ]
            Mdavis4290 Molly Davis made changes -
            Status Ready for Pull Request [ 10304 ] Pull Request Submitted [ 10101 ]
            Mdavis4290 Molly Davis made changes -
            Status Pull Request Submitted [ 10101 ] Reviewing Pull Request [ 10303 ]
            Mdavis4290 Molly Davis made changes -
            Status Reviewing Pull Request [ 10303 ] Merged Needs Testing [ 10002 ]
            Mdavis4290 Molly Davis made changes -
            Status Merged Needs Testing [ 10002 ] Post-merge Testing In Progress [ 10003 ]
            Mdavis4290 Molly Davis made changes -
            Resolution Done [ 10000 ]
            Status Post-merge Testing In Progress [ 10003 ] Closed [ 6 ]
            Mdavis4290 Molly Davis made changes -
            Epic Link IGBF-3277 [ 22158 ] IGBF-3446 [ 22548 ]
            ann.loraine Ann Loraine made changes -
            Epic Link IGBF-3446 [ 22548 ] IGBF-3277 [ 22158 ]

              People

              • Assignee:
                Mdavis4290 Molly Davis
                Reporter:
                robofjoy Robert Reid
              • Votes:
                0
                Watchers:
                3

                Dates

                • Created:
                  Updated:
                  Resolved: