  1. IGB
  2. IGBF-3303

Investigate: Distinguish F3H-OX3 from VF36 samples

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:
      None

      Description

      The Muday Lab has detected possible sample switching in their time course data from three genotypes: ARE (mutant), F3H-OX3 (transgenic), and VF36 (wild-type, parent strain for F3H-OX3).

      It was easy to identify which samples were ARE and not VF36 or F3H-OX3.

      We need a way to distinguish F3H-OX3 from VF36 samples due to the sample switching issue.

      To do this, we can use the fact that F3H-OX3 contains a transgenic construct and VF36 does not.

      Previously, we found that the plant selection marker gene was expressed in soybean seeds in a transgenic line.

      For this task, we'll investigate whether we can use the F3H-OX3 line's plant selection marker gene to distinguish transgenic from non-transgenic lines.

      See:

      Maarten has sent us three files with the construct information in them. These are added to the git repository here:

      • ExternalDataSets/pK7WG2-F3H.fa
      • ExternalDataSets/pK7WG2-F3H.gb

      In this experiment, the plant selection gene was a kanamycin resistance gene.

      For this task, use the kanamycin gene either as a query or a target in a search of the fastq files to identify samples that contain RNA-Seq reads from the kanamycin-resistance gene.

      To do this, we need to investigate the different tools that might be available for this task.

      Let's try to find an easy-to-use tool or approach that will let us identify RNA-Seq samples containing "kan" gene sequence.
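One simple approach is a k-mer scan, which is a slightly more tolerant version of a plain grep. The sketch below is only an illustration under stated assumptions: the function names, the default k of 25, and the idea of counting a read as a match if it shares any single k-mer with the marker are all choices made for this example, not something the lab has settled on. It checks both strands and does not tolerate mismatches inside a k-mer.

```python
import gzip
from typing import Iterable, Iterator, Set

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmers(seq: str, k: int) -> Set[str]:
    """All k-mers of seq plus their reverse complements."""
    seq = seq.upper()
    forward = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    return forward | {reverse_complement(m) for m in forward}


def open_fastq(path: str):
    """Open a FASTQ file as text, transparently handling .gz files."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


def fastq_sequences(lines: Iterable[str]) -> Iterator[str]:
    """Yield the sequence line of each 4-line FASTQ record."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip().upper()


def count_matching_reads(lines: Iterable[str], marker: str, k: int = 25) -> int:
    """Count reads sharing at least one exact k-mer with the marker."""
    index = kmers(marker, k)
    hits = 0
    for read in fastq_sequences(lines):
        if any(read[i:i + k] in index for i in range(len(read) - k + 1)):
            hits += 1
    return hits
```

To use it on a real sample, one would pass the handle returned by `open_fastq(path)` together with the "kan" gene sequence read from the construct file; a nonzero count in a sample would suggest it deserves a closer look with a proper aligner.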

            Activity

            robofjoy Robert Reid added a comment -

            Some other notes related to this.

            At Tuesday's tech meeting, the group discussed other approaches that we could collectively try.
            In the end, few new ideas came up.
            Gloria is hunting down the original plasmid that was used in the transformation. Maybe it contains some sequence we can also hunt for.

            I talked about this with Rick White. He has done plant transformations and was fascinated by the problem. (And impressed by how IGB was able to detect the mislabeling from the read count profiles!!) He suspected that we would not be able to detect the kanamycin gene in the RNA-Seq reads. Maybe in genome sequence, but that too would be a long shot.

            ann.loraine Ann Loraine added a comment -

            During discussion in scrum, we realized that the results from the kanamycin grep strategy are inconclusive. Let's halt this approach and try the approach from the soybean paper mentioned above. I am making a new Jira issue to pursue this other strategy.

            Closing this "investigate" ticket to try the other strategy.

            robofjoy Robert Reid added a comment -

            STAR run: I checked whether ANY read sequences align to the kanamycin gene sequence that Anthony from Gloria's lab sent.

            He sent it in a Word doc (KANR2):

            GTCTGGAAAGAAATGCATAAACTTTTGCCATTCTCACCGGATTCAGTCGTCACTCATGGTGATTTCTCACTTGATAA
            CCTTATTTTTGACGAGGGGAAATTAATAGGTTGTATTGATGTTGGACGAGTCGGAATCGCAGACCGATACCAGGA
            TCTTGCCATCCTATGGAACTGCCTCGGTGAGTTTTCTCCTTCATTACAGAAACGGCTTTTTCAAAAATATGGTATTGA
            TAATCCTGATATGAATAAATTGCAGTTTCATTTGATGCTCGATGAGTTTTAACATGGATGCTGATTTATATGGGTAT
            AAATGGGCTCGCGATAATGTCGGGCAATCAGGTGCGACAATCTATCGCTTGTATGGGAAGCCCGATGCGCCAGAG
            TTGTTTCTGAAACATGGCAAAGGTAGCGTTGCCAATGATGTTACAGATGAGATGGTCAGACTAAACTGGCTGACG
            GAATTTATGCCTCTTCCGACCATCAAGCATTTTATCCGTACTCCTGATGATGCATGGTTACTCACCACTGCGATCCCC
            GGAAAAACAGCATTCCAGGTATTAGAAGAATATCCTGATTCAGGTGAAAATATTGTTGATGCGCTGGCAGTGTTCC
            TGCGCCGGTTGCATTCGATTCCTGTTTGTAATTGTCCTTTTAACAGCGATCGCGTATTTCGTCTCGCTCAGGCGCAA
            TCACGAATGAATAACGGTTTGGTTGATGCGAGTGATTTTGATGACGAGCGTAATGGCTGGCCTGTTGAACAAGTC
            TGGAAAGAAATGCATAAACTTTTGCCATTCTCACCGGATTCAGTCGTCACTCATGGTGATTTCTCACTTGATAACCT
            TATTTTTGACGAGGGGAAATTAATAGGTTGTATTGATGTTGGACGAGTCGGAATCGCAGACCGATACCAGGATCT
            TGCCATCCTATGGAACTGCCTCGGTGAGTTTTCTCCTTCATTACAGAAACGGCTTTTTCAAAAATATGGTATTGATA
            ATCCTGATATGAATAAATTGCAGTTTCATTTGATGCTCGATGAGTTTT

            I made a fasta file out of this; it is on the cluster at this location:
            /nobackup/tomato_genome/muday-144/nfcore-2022/kanhunt/kanR2_SeqfromAnthony-fullsequence.fna

            I treated the above sequence as the reference file for a STAR alignment.
            Results to go here:
            /nobackup/tomato_genome/muday-144/nfcore-2022/kanhunt/star

            The first step in STAR is to make a genome index file.
            This is done once via this script:
            /nobackup/tomato_genome/muday-144/nfcore-2022/kanhunt/star-generategenome.slurm

            And then we run 144 jobs as an array, 1 for each read sequence in the experiment:
            star-chunks.slurm
            (silly name, I first planned to align to small chunks, named it this way)

            We get 144 bam files, all of which are empty. The log files indicate that STAR ran successfully.
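One thing that may be worth checking for a reference this small: the STAR manual says that for small genomes the `--genomeSAindexNbases` option of the index-building step should be scaled down to min(14, log2(GenomeLength)/2 - 1). Whether the script above already does this is not stated here, and this is not necessarily related to the empty bam files. The helper below is just a sketch of that formula; the function names are made up for illustration, and rounding to the nearest integer is an assumption.

```python
import math
from typing import Iterable


def fasta_length(lines: Iterable[str]) -> int:
    """Total number of sequence characters in a FASTA file (headers skipped)."""
    return sum(len(line.strip()) for line in lines if not line.startswith(">"))


def suggested_sa_index_nbases(genome_length: int) -> int:
    """min(14, log2(GenomeLength)/2 - 1), rounded to the nearest integer.

    The rounding and the lower bound of 1 are assumptions made for this sketch.
    """
    if genome_length < 1:
        raise ValueError("genome_length must be positive")
    return max(1, min(14, round(math.log2(genome_length) / 2 - 1)))
```

For example, calling `suggested_sa_index_nbases(fasta_length(open(path)))` on the kanR2 fasta file would give a candidate value to compare against whatever the genome-generate script currently uses.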

            robofjoy Robert Reid added a comment -

            BLAST run:

            Just like the STAR run, but with BLAST.
            Location:
            /nobackup/tomato_genome/muday-144/nfcore-2022/kanhunt/blast

            Script:
            /nobackup/tomato_genome/muday-144/nfcore-2022/kanhunt/

            Note: need to run seqtk to convert the fastq to fasta before running blast.
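For reference, the conversion step itself is simple: each 4-line fastq record becomes a 2-line fasta record, with the quality lines dropped. The Python sketch below only illustrates what the conversion does; it is not the script used on the cluster, and the function name is made up. I believe the seqtk README documents the equivalent as `seqtk seq -a in.fq.gz > out.fa`, but check the help output of the installed version.

```python
from typing import Iterable, Iterator


def fastq_to_fasta(fastq_lines: Iterable[str]) -> Iterator[str]:
    """Yield FASTA lines (header, sequence) for each 4-line FASTQ record."""
    record = []
    for line in fastq_lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            header, seq = record[0], record[1]
            if not header.startswith("@"):
                raise ValueError("malformed FASTQ header: " + header)
            yield ">" + header[1:]  # swap the leading '@' for '>'
            yield seq               # the '+' and quality lines are dropped
            record = []
```

Writing the output is then a matter of something like `out.write("\n".join(fastq_to_fasta(handle)) + "\n")`. This assumes strictly 4-line records, which is the usual layout for Illumina fastq files.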

            So far there is an issue: 35 jobs completed and the rest did not.
            I keep getting kicked off the cluster because of my internet connection. I hope to resolve this from the office.
            Of the 35 jobs that completed, none produced a hit at an E-value cutoff of 1e-25.
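If the BLAST jobs are rerun with tabular output, the cutoff can also be applied after the fact, which makes it easy to see whether any weaker hits exist. The sketch below assumes the default 12-column `-outfmt 6` layout, where the E-value is the 11th column; the function name and the idea of post-filtering are illustrative and are not how the jobs above were actually run.

```python
import csv
from typing import Iterable, Iterator, List


def significant_hits(tab_lines: Iterable[str],
                     max_evalue: float = 1e-25) -> Iterator[List[str]]:
    """Yield BLAST tabular rows whose E-value is at or below max_evalue.

    Assumes the default 12-column '-outfmt 6' layout (E-value in column 11).
    """
    for row in csv.reader(tab_lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        if float(row[10]) <= max_evalue:
            yield row
```

Running it with a looser `max_evalue` on the same output would show whether the reads have no similarity to the sequence at all or only similarity that falls short of the 1e-25 threshold.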

            ann.loraine Ann Loraine added a comment -

            Using kan-r gene in this way does not yield much. Closing this ticket.
            Next strategy to try will be running an RNA-Seq alignment tool on the fastq files and the new plasmid genome, either by itself or with the rest of the tomato SL5 genome assembly, using the fasta file and bed file made by [~molly] in IGBF-3306.

            We will make a new ticket for running this RNA-Seq alignment strategy to detect Kan-r expression.


              People

              • Assignee:
                robofjoy Robert Reid
                Reporter:
                ann.loraine Ann Loraine
