Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels: None

      Description

      1. Make PCA plots for all 3 genotypes at control and stress temperatures, one plot per time point (4 PCA plots in total). Rob had sent graphs combining all 4 time points, 3 genotypes, and 2 temperatures, and they were hard to parse. Rasha showed her data with individual time points, and it was much easier to follow.
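One way to organize the per-time-point analysis requested above is sketched here in base R. This is only an illustration under stated assumptions: the data are simulated, the genotype and time point labels (`G1`, `T1`, and so on) are placeholders, and there is a single sample per genotype, temperature, and time point, which the real experiment may not match.

```r
set.seed(42)

# Simulated sample sheet: 3 genotypes x 2 temperatures x 4 time points.
# All labels are placeholders, not the names used in the real data.
samples <- expand.grid(
  genotype = c("G1", "G2", "G3"),
  temperature = c("control", "stress"),
  timepoint = c("T1", "T2", "T3", "T4"),
  stringsAsFactors = FALSE
)

# Simulated expression values: one row per sample, 100 genes.
expr <- matrix(rnorm(nrow(samples) * 100), nrow = nrow(samples))

# Run one PCA per time point, each on the samples from that time point only.
pca_by_timepoint <- lapply(
  split(seq_len(nrow(samples)), samples$timepoint),
  function(rows) prcomp(expr[rows, ], center = TRUE)
)

# One PCA result per time point.
print(names(pca_by_timepoint))
```

Each element of `pca_by_timepoint` can then be plotted separately, giving the four plots the description asks for.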

            Activity

            Mdavis4290 Molly Davis added a comment - - edited

            The following is the location of the R Markdown, updated data files, and HTML for the markdown:
            https://bitbucket.org/mdavis4290/analysis-flavonoid-rnaseq/src/main/muday-144-analysis/

            Commit: https://bitbucket.org/mdavis4290/analysis-flavonoid-rnaseq/commits/8130c485f27029d6ece7f97adb47212d426d6652

            Next Steps: Work on making functions and making PCA plots for specific genotypes alone.

            Mdavis4290 Molly Davis added a comment - - edited

            Update:

            • Created a function and separated the genotypes so each can be analyzed individually. The PCA plots now export as a PDF file.

            Commit: https://bitbucket.org/mdavis4290/analysis-flavonoid-rnaseq/commits/8c73d9a711845b770384d811d45ef0d485a8ee16

            Next Steps: Get feedback and determine whether I also need to separate out the time points for analysis. Am I fulfilling the task that Gloria is requesting (see the ticket description)?

            Review Notes: Dr. Reid said the IQR setting of 50 is fine and does not need to be changed to produce more favorable results. The third request, IGBF-3280, may answer my question about whether to continue by selecting only the time point columns for comparison:

            "Request 3. Analyses of DE genes between genotypes. We have not yet seen those genes for the current analysis pipeline and want to match those up with genes in Number 2. Since there are 3 genotypes, I guess the way this needs to be done is 3 pairwise comparisons at each temperature (so 6 comparisons), but if you all have thoughts on another way, we'd love to hear them."

            I appreciate any feedback! Let me know what you think!
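The six comparisons described in the quoted request can be enumerated with a short base R sketch. The genotype names are the ones used elsewhere in this ticket (ARE, VF36, OE3); the temperature labels `control` and `stress` are assumed here for illustration and may differ from the column names in the actual data.

```r
# Assumed labels; adjust to match the sample sheet actually used.
genotypes <- c("ARE", "VF36", "OE3")
temperatures <- c("control", "stress")

# All unordered genotype pairs, one row per pair (3 rows for 3 genotypes).
pairs <- t(combn(genotypes, 2))

# Cross every pair with every temperature: 3 pairs x 2 temperatures = 6 rows.
comparisons <- data.frame(
  genotype_a = rep(pairs[, 1], times = length(temperatures)),
  genotype_b = rep(pairs[, 2], times = length(temperatures)),
  temperature = rep(temperatures, each = nrow(pairs)),
  stringsAsFactors = FALSE
)

print(comparisons)
```

A table like this could drive a loop over the differential expression calls, so each comparison is run and labeled the same way.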

            ann.loraine Ann Loraine added a comment - - edited

            PR merged. See repository for code. Repository address is https://bitbucket.org/hotpollen/flavonoid-rnaseq/src/main/ .

            ann.loraine Ann Loraine added a comment - - edited

            Review comments on Markdown:

            • One of the first code chunks references absolute file paths on a specific computer. As a result, the Markdown can only run on that computer. Instead, the files should be referenced relative to the current working directory. To make this easier, RStudio has a concept of a "project". Do this:
              1) Create a new project (an R project file with the .Rproj extension) in the same directory as the Markdown.
              2) Change the absolute file paths (which reference Molly's file system) so that they omit the directory portion of the path.
              3) To run the Markdown, users will open the project in RStudio, select the Markdown from the Files section in RStudio, and "knit" it. They can also run individual sections interactively within RStudio.
              When the Markdown is run this way, there is no need to provide the absolute path for input files. Talk with Ann for more information if this does not make sense.
            • Markdown should contain sections similar to a scientific paper: Introduction, Analysis/Results, Discussion, Conclusion. See https://bitbucket.org/hotpollen/flavonoid-rnaseq/src/main/Starter_Markdown_Example.Rmd. You can use this as a "starter Markdown" and customize the sections as needed to fit an analysis.
            • Each figure is a "result" and needs an explanation of what it shows, some text that states "The preceding figure shows that [ ... fill in whatever it shows ... ]."
            • First-pass interpretation: I see no separation between points except in the first PCA plot, the one that includes all the samples. This plot seems to show a strong separation between genotypes, with ARE samples separated from the others. However, because each sample is shown in its own unique color, I find the plot very difficult to read. I can see that the points separate into two areas in that plot, but I can't tell which variables differ between the two areas. Instead of color-coding each point individually, use shape to indicate genotype and color to indicate temperature: blue for cool and orange for warm. For shapes, I would use a triangle for ARE, a circle for VF36, and a square for OE3. ARE is a point mutant, and triangles are sometimes used to indicate mutations in some types of graphics. VF36 is a sort of parent, or mother variety, so it should be round, like the Venus of Willendorf. OE3 should be a shape with right angles, like a lab bench, as it was created in a laboratory.
            • Because PCA may not be able to detect or group samples very effectively in this experiment, I'd like to try an alternative clustering method called "MDS", for "multi-dimensional scaling". I did this for another Muday lab RNA-Seq dataset, and the knitted Markdown is here: https://bitbucket.org/hotpollen/flavonoid-rnaseq/src/main/Counts/ClusterCounts.html. I made a new ticket for the task: IGBF-3287
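The shape and color encoding suggested in the review comments could look roughly like the following base R sketch. It uses simulated data rather than the real counts, and the output file name and sample layout (4 samples per genotype) are assumptions made only so the example runs on its own.

```r
set.seed(1)

# Simulated stand-in for the expression data: 12 samples (rows) x 50 genes.
genotype <- rep(c("ARE", "VF36", "OE3"), each = 4)
temperature <- rep(c("cool", "warm"), times = 6)
expr <- matrix(rnorm(12 * 50), nrow = 12)

# PCA with samples as observations.
pca <- prcomp(expr, center = TRUE, scale. = TRUE)

# Shape encodes genotype: filled triangle = ARE, circle = VF36, square = OE3.
shapes <- c(ARE = 17, VF36 = 16, OE3 = 15)
# Color encodes temperature.
colors <- c(cool = "blue", warm = "orange")

# Hypothetical output location; written to the session's temporary directory.
out_file <- file.path(tempdir(), "pca_example.pdf")
pdf(out_file)
plot(pca$x[, 1], pca$x[, 2],
     pch = shapes[genotype], col = colors[temperature],
     xlab = "PC1", ylab = "PC2", main = "PCA (simulated data)")
legend("topright", legend = names(shapes), pch = shapes, title = "Genotype")
legend("bottomright", legend = names(colors), col = colors, pch = 16,
       title = "Temperature")
dev.off()
```

With only two colors and three shapes on the plot, any grouping by genotype or temperature should be visible at a glance, which is the point of the suggestion.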
            Mdavis4290 Molly Davis added a comment - - edited

            Updated Bitbucket

            Repository: https://bitbucket.org/mdavis4290/analysis-flavonoid-rnaseq/src/main/

            Commit: https://bitbucket.org/mdavis4290/analysis-flavonoid-rnaseq/commits/67b6a52222e71dd5dfd4f721c82cf8b88ebd4769

            Let me know if I should make a pull request to merge into main! Thanks! [~aloraine]

            ann.loraine Ann Loraine added a comment - - edited

            Please see notes on bitbucket for change request details / questions.

            ann.loraine Ann Loraine added a comment -

            Merged changes into the main branch of the team repository and then added 3 new commits to address comments on the above commit. Moving to Done.


              People

              • Assignee:
                Mdavis4290 Molly Davis
                Reporter:
                Mdavis4290 Molly Davis
