Details
- Type: Task
- Status: Closed
- Priority: Minor
- Resolution: Done
- Affects Version/s: None
- Fix Version/s: None
- Labels: None
- Epic Link:
- Sprint: Spring 4, Spring 5
Description
This task is more of a brainstorm based on the details that came out of Kelsey's committee meeting.
In her meeting, an advisor asked about the fact that Heinz is both the reference genome and a sample in the experiment. It makes sense that Heinz reads will align to the reference genome more easily than reads from the other 3 TMH varieties.
The PCA (attached) reflects this.
The goal is to come up with an unbiased way to check the MDS / PCA / PCoA, one in which the samples are not all aligned to the Heinz reference.
One idea:
- Do a de novo assembly via Trinity.
- Get a set of genes for each variety.
- Assign each of these genes to a SolyID via BLAST, along with a gene expression count.
- Run these through an MDS / PCA and look at the distribution.
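The last two steps can be sketched roughly as follows. This is only an illustration, not the actual pipeline: it assumes the BLAST hits have already been parsed into (transcript, SolyID, bitscore) tuples, that each transcript is assigned to its highest-bitscore hit, and that comparing varieties on log2(count + 1) values over the shared SolyIDs is acceptable. The function names and the toy IDs are made up for the example, and library-size normalization is left out and would be needed before comparing real counts.

```python
import math
from collections import defaultdict


def best_hits(blast_rows):
    """Keep the highest-bitscore SolyID for each assembled transcript.

    blast_rows: iterable of (query_id, subject_id, bitscore) tuples, e.g.
    columns 1, 2 and 12 of BLAST tabular output (-outfmt 6).
    """
    best = {}
    for query, subject, bitscore in blast_rows:
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def counts_per_soly_id(transcript_counts, transcript_to_soly):
    """Sum transcript-level counts into one count per SolyID."""
    totals = defaultdict(float)
    for transcript, count in transcript_counts.items():
        soly_id = transcript_to_soly.get(transcript)
        if soly_id is not None:  # transcripts without a BLAST hit are dropped
            totals[soly_id] += count
    return dict(totals)


def distance_matrix(sample_counts):
    """Euclidean distances between samples on log2(count + 1) values,
    restricted to the SolyIDs that are present in every sample."""
    samples = sorted(sample_counts)
    shared = sorted(set.intersection(*(set(sample_counts[s]) for s in samples)))
    logged = {
        s: [math.log2(sample_counts[s][g] + 1) for g in shared] for s in samples
    }
    dist = {
        (a, b): math.dist(logged[a], logged[b]) for a in samples for b in samples
    }
    return samples, shared, dist


if __name__ == "__main__":
    # Toy data only; the transcript names and SolyIDs are placeholders.
    hits = best_hits([
        ("t1", "Solyc01g000001", 50.0),
        ("t1", "Solyc01g000002", 80.0),
        ("t2", "Solyc01g000001", 60.0),
    ])
    heinz = counts_per_soly_id({"t1": 3, "t2": 4, "t3": 9}, hits)
    other = counts_per_soly_id({"t1": 1, "t2": 7}, hits)
    samples, shared, dist = distance_matrix({"Heinz": heinz, "variety_B": other})
    print(samples, shared, dist[("Heinz", "variety_B")])
```

The resulting distance matrix (or the per-SolyID count table behind it) can then be handed to whichever MDS / PCA / PCoA tool produced the original plot, so the two views are directly comparable.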
A more detailed plan is laid out in this email:
I think I can churn something out in time. Let's lay out a plan first. I can start the pipeline and I have a student rotating with me who can help as well. I think we can have it done in time.
But if not, you can show the pipeline, lay out the justification and describe the expected end result.
It will also be a good idea to run the pipeline in Arizona afterwards. I share the code with you, you run it, and we make it all publicly available for publication! You running it = repeatable.
Let me know if the following is a good strategy:
GOAL: To create gene counts for each plant variety using the same plant sequences and a de novo assembly approach to avoid HEINZ reference bias.
Also, I will have Molly run and fine-tune the pipeline, and she will then set it up in Bitbucket to make the code publicly available for others.