Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:
      None

      Description

      Markdown: https://bitbucket.org/mdavis4290/molly-flavonoid-rnaseq/src/main/72_F3H_PollenTube/DESeq_analysis/DESeq-Clusters.Rmd

      • Add labels
      • Square figure
      • Make all axes the same length
      • Circle clusters
      • Improve markdown with in-line R coding

      Note: some of these requests were changed because they did not work visually.
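A minimal sketch of the "square figure" and "same axis length" requests, assuming ggplot2 is used for the PCA plots. The data frame `scores` and its sample names are made up for illustration; they are not from the actual analysis.

```r
library(ggplot2)

# Hypothetical PCA scores; the real ones would come from the DESeq2 PCA step.
set.seed(1)
scores <- data.frame(
  name = paste0("sample", 1:8),
  PC1 = rnorm(8, sd = 10),
  PC2 = rnorm(8, sd = 4)
)

# One symmetric limit shared by both axes, padded by 10%.
lim <- max(abs(c(scores$PC1, scores$PC2))) * 1.1

p <- ggplot(scores, aes(PC1, PC2, label = name)) +
  geom_point() +
  geom_text(vjust = -0.8, size = 3) +   # sample labels above each point
  coord_fixed(ratio = 1, xlim = c(-lim, lim), ylim = c(-lim, lim))
```

With equal limits and `coord_fixed(ratio = 1)`, one unit on PC1 is drawn the same length as one unit on PC2, so the panel comes out square.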

        Attachments

          Issue Links

            Activity

            robofjoy Robert Reid added a comment -

            Muday-144-PCA-Plots.pdf

            Molly has produced MANY new PCA plots and shared the latest and best version with Gloria and Anthony.

            Anthony was delighted and quite thankful for the quick turnaround.

            The code likely still needs to be merged back into Bitbucket.

            Mdavis4290 Molly Davis added a comment -

            Next steps: Finalize markdown and commit changes.

            Mdavis4290 Molly Davis added a comment -

            Thank you!! [~RobertReid]

            robofjoy Robert Reid added a comment -

            The circling of the clusters was a mess in general and did not provide any extra insight.

            For future reference, to add an ellipse to a ggplot, the line is:

            ...
            stat_ellipse(segments = 151, aes(group = foo)) +
            ...

            EXAMPLE:

            library(ggplot2)
            library(magrittr)  # provides %>%

            plot_PCA <- function(pca, timepoint) {
              # percent variance explained by each PC (not used in the plot below)
              percentVar <- round(100 * attr(pca, "percentVar"))
              # add genotype back
              pca <- pca %>%
                tidyr::separate(name, into = c("genotype", NA, NA, NA), remove = FALSE)
              ggplot(pca, aes(PC1, PC2, shape = temperature, color = genotype)) +
                geom_point() +
                stat_ellipse(segments = 151, aes(group = genotype)) +
                ggtitle(sprintf("%s minutes", timepoint)) +
                theme(legend.position = "none") +
                coord_fixed(ratio = 1)
            }
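To try the ellipse layer without the project data, here is a small self-contained sketch. The `toy` data frame and its two genotypes "A" and "B" are simulated for illustration only. `level = 0.95` is the `stat_ellipse()` default, written out here so it is easy to change.

```r
library(ggplot2)

set.seed(42)
# Simulated scores for two hypothetical genotypes, 10 samples each.
toy <- data.frame(
  genotype = rep(c("A", "B"), each = 10),
  PC1 = c(rnorm(10, mean = -5), rnorm(10, mean = 5)),
  PC2 = rnorm(20)
)

p <- ggplot(toy, aes(PC1, PC2, color = genotype)) +
  geom_point() +
  stat_ellipse(aes(group = genotype), segments = 151, level = 0.95) +
  coord_fixed(ratio = 1)

# Layer 2 is the ellipse layer; each group gets its own path of vertices.
ellipse_data <- layer_data(p, 2)
table(ellipse_data$group)
```

Note that `stat_ellipse()` needs several points per group to estimate an ellipse. With only a few replicates per group it may draw nothing or produce very wide ellipses, which could be part of why the circled clusters looked messy.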
            Mdavis4290 Molly Davis added a comment -

            Branch: https://bitbucket.org/mdavis4290/molly-flavonoid-rnaseq/branch/IGBF-3369
            Pull Request: https://bitbucket.org/hotpollen/flavonoid-rnaseq/pull-requests/18
            ann.loraine Ann Loraine added a comment - - edited

            I decided to skip ahead to merging the PR and then move the ticket to Needs Testing, since [~RobertReid] and you both have looked closely at the new code together.

            ann.loraine Ann Loraine added a comment -

            Suggestions on how to test:

            • clone the repository to your local machine
            • open the .Rmd file in RStudio
            • install the required dependencies
            • run "knit" to compile the Markdown

            If there are errors, take a look. If possible, try to fix the error. Keep notes on your edits via git commit messages.
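The dependency and knit steps above can be sketched in R as follows. The helper names `find_rmd_packages()` and `missing_packages()` are hypothetical, not part of the repository. The regular expression only catches plain `library(pkg)` and `require(pkg)` calls, so treat the result as a hint rather than a complete list.

```r
# Find packages loaded with library() or require() in the lines of an .Rmd file.
find_rmd_packages <- function(lines) {
  pattern <- "(?:library|require)\\(\\s*[\"']?([A-Za-z][A-Za-z0-9.]*)"
  calls <- unlist(regmatches(lines, gregexpr(pattern, lines, perl = TRUE)))
  unique(sub(pattern, "\\1", calls, perl = TRUE))
}

# Return the subset of packages that cannot be loaded in this R installation.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

rmd <- "Muday-DESeq-PCA-Plots.Rmd"
if (file.exists(rmd)) {
  print(missing_packages(find_rmd_packages(readLines(rmd))))
  # Roughly what the "Knit" button does; uncomment once nothing is missing.
  # rmarkdown::render(rmd)
}
```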

            ann.loraine Ann Loraine added a comment - - edited

            After installing the dependencies, ran "knit" to compile the document.
            The Markdown stalled here:

            processing file: Muday-DESeq-PCA-Plots.Rmd
              |.......................................            |  77% (unnamed-chunk-11)
            

            I don't know if this is why, but I did notice something that could cause a delay like in the above output. Line 220 in Muday-DESeq-PCA-Plots.Rmd is knitting (writing to a file) what looks like a large data frame. This might be the problem. Looks like it could be leftover debugging code that just needs to be removed.

            Sending back to [~molly] to take a look when convenient.

            Mdavis4290 Molly Davis added a comment - - edited

            I have no issues knitting the Markdown. Line 220 contains the four-panel PCA plot figure. The problem may be with the working directory, or you may not have the packages required to pull the data installed. I recommend running the library chunk in the Markdown and double-checking that you have all of the required installs. If that is not the issue, I will investigate further! Thanks!

            [~aloraine]

            ann.loraine Ann Loraine added a comment - - edited

            I tried again. This time I tried running from the .Rmd file's directory location, by opening its companion "project" file (.Rproj) that was co-located in the same directory. It took a while to finish, but it finally finished the knit / compilation process.

            Some minor suggestions that I don't think are necessary to do, but would be "nice to haves" in terms of improving readability:

            • Figure out a nicer way to present the result from sanity checking. Currently, the message letting the reader know that the test passed does not fit on the page, making it harder to follow what is going on. See: p. 3, Muday-DESeq-PCA-Plots.pdf
            • First two words of the section of text appearing after the plot are confusing. It says: "The following plot.." suggesting the plot will be after this text. But the plot is actually appearing before the text, not after it. However, there is only one plot, so it's obvious what the text is talking about.
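One possible way to keep the sanity-check message on the page is to wrap it before printing. This is only a sketch: `report_check()` is a hypothetical helper, and the wording of the real check in the Markdown may differ.

```r
# Print a pass/fail message wrapped to a fixed width so it fits the PDF page.
report_check <- function(ok, description, width = 70) {
  status <- if (isTRUE(ok)) "PASSED" else "FAILED"
  text <- sprintf("Sanity check %s: %s", status, description)
  wrapped <- strwrap(text, width = width)
  cat(wrapped, sep = "\n")
  invisible(wrapped)
}

lines_out <- report_check(
  TRUE,
  "the sample names in the counts table match the sample names in the metadata table, in the same order"
)
```

Setting the chunk option `results = "asis"` or using `message()` instead of `cat()` are alternatives, depending on how the output should appear in the knitted document.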
            ann.loraine Ann Loraine added a comment - - edited

            Additional comment:

            • To me, this statement describing part "A" sounds nonsensical: "We also see that over a longer duration of time that there is even more significant variance between the samples." I see filled shapes spread about evenly across the two "slash"-shaped clusters. There is no such trend. Also, I don't think there is such a thing as "significant variance" in the statistical sense. Maybe you mean to say "a lot"?
            • Regarding "B": I agree with the statement made about temperature. I recommend wording it a little differently to refer back to the image itself a bit more. How about: "PC1 cleanly separates points into two visually obvious groups, one containing only earlier-stage sample points and the second containing only later-stage points."
            • Regarding "C": I agree, there's no trend here. There are two, maybe three clusters, plus a singleton. There are no obvious trends with respect to any of the variables examined. The current statement: "C) The OE3 genotype was an analysis of just OE3 samples and had no clear indication of clustering or separation of points except for maybe a little at 75 minutes." is correct. But I think you should change the word "just" to "only," or, better yet, say "This plot and the code used to produce it included OE3 genotype samples only." Also, we can easily check this is true by looking at the included code.
            • Regarding "D": There's an obvious visual separation between the treatment (heat stress) samples and the control (no heat stress) samples in the PC2 dimension. There is also a clear grouping based on early-versus-later sample points in the PC1 dimension. Editorial comment: I feel "D" is the most exciting image.
            • This statement again needs to be connected to the plot itself: "The ARE genotype has the largest variation out of all PCA plots and was analyzed with just ARE samples." Why do you say this about variance? How can the reader reach the same conclusion about variance for themselves? What do they need to look at in the image? Explain this better and talk about what it means in the Discussion. Recall that the true "result" here is your observation about what the plot looks like.
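One way to give the reader something concrete to check for a statement about variation is to report numbers next to each plot. The sketch below uses base R `prcomp()` on a simulated matrix, not the project's DESeq2 objects; `expr` and its dimensions are made up for illustration.

```r
set.seed(7)
# Hypothetical expression matrix: 6 samples (rows) x 50 genes (columns).
expr <- matrix(rnorm(6 * 50), nrow = 6,
               dimnames = list(paste0("sample", 1:6), NULL))

pca <- prcomp(expr, center = TRUE, scale. = FALSE)

# Percent of total variance captured by each principal component.
percent_var <- 100 * pca$sdev^2 / sum(pca$sdev^2)
round(percent_var[1:2], 1)

# Spread of the samples along PC1 and PC2, in the units of the plot axes.
axis_spread <- apply(pca$x[, 1:2], 2, function(v) diff(range(v)))
axis_spread
```

Putting the percent variance in the axis labels, or quoting the axis ranges in the text, would let a reader compare panels directly instead of taking the statement on trust.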
            ann.loraine Ann Loraine added a comment - - edited

            Comments on the Discussion:

            • First sentence is just filler. Remove it.
            • Second sentence has a lot of filler words. Reword it to remove unnecessary words. Change: "The goal was to answer these questions for the experiment to be successful:" to "Question (or questions) this Markdown aims to answer:"

            More thoughts:

            There are a lot of things an author can do with a Discussion, but the ultimate goal should be the same in each case: Explain the Results.

            There are a lot of choices about how to organize a Discussion section. Here are some:

            Listify:

            • Re-list the questions from the Introduction.
            • Edit by including one or two paragraphs following each question stating the answer to the question, and evidence. Interleave answers so that each re-stated question is followed by its answer and evidence for the answer.

            Then list the questions, one-by-one, and say what the answer is, using the Results.

            ann.loraine Ann Loraine added a comment - - edited

            If the question is: "Was the experiment successful" and you have arrived at an answer to this question, you have to explain it more.

            I feel it is too vague, anyway.

            ann.loraine Ann Loraine added a comment -

            Suggestion: Read the comments and then spend an afternoon or a couple of hours to make some improvements. If you like it better, submit a new PR. If you don't, we can close this.

            As far as I am concerned, I feel I can look at the plots and come to my own conclusions about them. I don't need a Markdown to explain it to me, because I can read code and understand it, generally. Someone else might not, however. I would recommend investing at least another half-day in making improvements, but not much more, unless [~molly] thinks more or less is needed.

            ann.loraine Ann Loraine added a comment -

            Moving back to "To-Do" for Molly to review the preceding comments by me, following the most recent comment made by [~molly].

            I am happy with it being closed, or worked-on a bit more, as she sees fit.

            Mdavis4290 Molly Davis added a comment -

            Branch: https://bitbucket.org/mdavis4290/molly3-flavonoid-rnaseq/branch/IGBF-3369
            Pull Request: https://bitbucket.org/hotpollen/flavonoid-rnaseq/pull-requests/19

            Notes: Dr. Loraine has already reviewed the cosmetic changes to the PCA plots, and this pull request contains only the cosmetic code changes to the plots. The commit should include only the new Markdown and the output file of the new figures. I will make more changes to the writing later and then also include the knitted PDF. Thanks!

            ann.loraine Ann Loraine added a comment -

            PR is merged. Moving to Done.


              People

              • Assignee:
                Mdavis4290 Molly Davis
                Reporter:
                Mdavis4290 Molly Davis
              • Votes:
                0
                Watchers:
                3

                Dates

                • Created:
                  Updated:
                  Resolved: