Details
- Type: Task
- Status: Closed
- Priority: Major
- Resolution: Done
- Affects Version/s: None
- Fix Version/s: None
- Labels: None
- Story Points: 2
- Epic Link:
- Sprint: Spring 6 2023 Mar 20, Spring 7 2023 Apr 10
Description
The Muday Lab has detected possible sample switching in their time course data from three genotypes: ARE (mutant), F3H-OX3 (transgenic), and VF36 (wild-type, parent strain for F3H-OX3).
It was easy to identify which samples were ARE and not VF36 or F3H-OX3.
We need a way to distinguish F3H-OX3 from VF36 samples due to the sample switching issue.
To do this, we can use the fact that F3H-OX3 contains a transgenic construct and VF36 does not.
Previously, we found that the plant selection marker gene was expressed in the seeds of a transgenic soybean line.
For this task, we'll investigate whether we can use the F3H-OX3 line's plant selection marker gene to distinguish transgenic from non-transgenic lines.
See:
- Open access article link: https://bmcbiotechnol.biomedcentral.com/articles/10.1186/s12896-015-0207-z
- Code repository: https://bitbucket.org/lorainelab/soyseq
Maarten has sent us three files containing the construct information. These have been added to the git repository here:
- ExternalDataSets/pK7WG2-F3H.fa
- ExternalDataSets/pK7WG2-F3H.gb
In this experiment, the plant selection gene was a kanamycin resistance gene.
For this task, use the kanamycin resistance gene as either the query or the target in a search of the fastq files to identify samples that contain RNA-Seq reads from that gene.
To do this, we need to investigate the different tools that might be available for this task.
Let's try to find an easy-to-use tool or approach that will let us identify RNA-Seq samples containing "kan" gene sequence.
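One simple approach to try is sketched below, using only the Python standard library. It is a minimal sketch, not a tested pipeline: it counts the reads in a fastq file that share at least one exact k-mer with the kanamycin resistance gene on either strand. The function names and the default k of 25 are assumptions made for illustration, and the kan gene sequence would first have to be extracted from the construct files.

```python
import gzip
from typing import Iterable, Iterator, Set

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (upper-cased)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def kmers(seq: str, k: int) -> Set[str]:
    """Return the set of all k-length substrings of seq (upper-cased)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def read_fasta_sequence(path: str) -> str:
    """Read a single-record FASTA file and return its sequence as one string."""
    with open(path) as handle:
        return "".join(line.strip() for line in handle if not line.startswith(">"))


def fastq_sequences(path: str) -> Iterator[str]:
    """Yield the sequence line of each four-line fastq record (plain or gzipped)."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:
                yield line.strip().upper()


def count_matching_reads(reads: Iterable[str], target: str, k: int = 25) -> int:
    """Count reads that share at least one exact k-mer with target on either strand."""
    index = kmers(target, k) | kmers(reverse_complement(target), k)
    hits = 0
    for read in reads:
        if any(read[i:i + k] in index for i in range(len(read) - k + 1)):
            hits += 1
    return hits
```

A sample would then be scored with something like `count_matching_reads(fastq_sequences(fastq_path), kan_sequence)`, where `fastq_path` and `kan_sequence` are placeholders. F3H-OX3 samples should show many matching reads and VF36 samples close to none. The target should be the kan gene only, not the whole construct, because the F3H portion of the construct may also match reads from the endogenous tomato gene.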
Issue Links
- relates to: IGBF-3290 Use new sample labels recommended by Muday Lab (Closed)
Ann's first idea on what to do:
Augment the tomato SL5 genome assembly and gene model annotations by adding a new "fake" chromosome containing the construct sequence and construct annotations provided by Maarten. Then, we use this as the target assembly in a fresh run of the nf-core rnaseq pipeline. This should produce a new "counts" file with counts for the newly added construct gene locations, including counts for the kanamycin resistance gene. This might not be a great idea! We might be able to find a more direct and simple approach that would be less work. But if we can't, I'm pretty sure this would get the job done.
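The "fake chromosome" idea above could be prototyped roughly as follows. This is a sketch under stated assumptions: the sequence name `construct`, the gene ID `kanR`, and the coordinates in the usage note are hypothetical placeholders, and the real coordinates would have to be read from the GenBank file Maarten provided. It also assumes the annotations are in GTF format; the pipeline may require additional attributes beyond the ones written here.

```python
from typing import Iterable, Iterator, List


def rename_single_record(fasta_lines: Iterable[str], new_name: str) -> Iterator[str]:
    """Yield the lines of a single-record FASTA with its header replaced by new_name."""
    seen_header = False
    for line in fasta_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if seen_header:
                raise ValueError("expected exactly one FASTA record")
            seen_header = True
            yield f">{new_name}"
        elif line:
            yield line


def augment_genome(genome_lines: Iterable[str],
                   construct_lines: Iterable[str],
                   new_name: str) -> Iterator[str]:
    """Yield the genome FASTA lines followed by the construct as an extra sequence."""
    for line in genome_lines:
        line = line.rstrip("\n")
        if line:
            yield line
    yield from rename_single_record(construct_lines, new_name)


def exon_gtf_rows(seq_name: str, gene_id: str, start: int, end: int,
                  strand: str = "+") -> List[str]:
    """Build minimal GTF rows (transcript and exon) for a single-exon gene.

    Coordinates are 1-based and inclusive, as in GTF.
    """
    attrs = f'gene_id "{gene_id}"; transcript_id "{gene_id}.1";'
    return [
        "\t".join([seq_name, "construct", feature, str(start), str(end),
                   ".", strand, ".", attrs])
        for feature in ("transcript", "exon")
    ]
```

For example, `exon_gtf_rows("construct", "kanR", 101, 895)` (made-up coordinates) gives two rows that could be appended to a copy of the gene model annotations, and the lines from `augment_genome` could be written to a new FASTA file that serves as the target assembly for the rerun.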