Suggestions:
- Talk could explain Paige's experience to date as a new trainer, outreach specialist, and data manager for the IGB project
- She could describe having the goal of understanding how a researcher would use Galaxy to align and ultimately visualize RNA-Seq expression data for tardigrade. (We started with this task because it's similar to our scientific work - we know what to do and what to expect.)
- She could explain why we picked tardigrade.
Reasons:
1) Researchers have published many RNA-Seq data sets without using either of the two currently available tardigrade genome assemblies in their analysis pipelines. By setting up a way for researchers to align RNA-Seq data to a genome assembly, we could have a positive effect on the field by making these two seemingly overlooked assemblies more accessible and more useful for research.
2) Thus far, researchers have mainly used RNA assembly pipelines (Trinity) to produce differential gene expression results. As an intermediate step, these pipelines produce (propose) RNA transcript sequences. Aligning these transcript sequences to a genome assembly and visualizing the alignments in a genome browser would be useful because it could help practitioners assess and understand the pipeline and the strategy of using RNA sequences alone instead of RNA together with DNA sequence.
3) Interesting bioinformatics and genomics mystery: The annotated genes reportedly have no homology to genes in other species. Is this an artifact of genome assembly and annotation, or do tardigrades make unusual, entirely novel proteins?
4) Interesting natural history: There are terrestrial tardigrade species that revive after desiccation. How does this process work? Genomic experiments could let us know whether some sets of genes begin to show expression during desiccation or re-animation, a step on the way to understanding these processes.
- Then, describe attempting to use Galaxy+Integrated Genome Browser to create interactive visualizations of RNA-Seq alignments and RNA-Seq scaled coverage graphs.
- End with visualizations - some truly amazing and stunning images that answer a question either directly stated or only suggested in the previous part of the talk.
Goals for attending and participating in the GCC:
- Communicate to the community that our team wants you (Paige) to learn as much as you can from the Galaxy Community about how to attract and support audiences
- A goal of the Galaxy Community is to "democratize science." We want to understand what they mean by that, and how to do it ourselves.