IGB / IGBF-3434

Identify the aligned region and visualize it in IGB

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:
      None

      Description

      Relates to error: IGBF-3420

      Error:

      Exception in thread "main" java.lang.RuntimeException: Postion is too high (more than 64792705)
      at org.biojava.nbio.genome.parsers.twobit.TwoBitParser.setCurrentSequencePosition(TwoBitParser.java:191)
      at org.biojava.nbio.genome.parsers.twobit.TwoBitParser.loadFragment(TwoBitParser.java:332)
      at org.lorainelab.findjunctions.FindJunctions.main(FindJunctions.java:249)
      

      Issue: The BAM file itself looks fine, but some reads extend past the end of the chromosome. These reads have been soft-clipped, but find_junctions does not take the soft clipping into account, so it fails partway through the file and does not produce a complete .FJ.bed.gz file.
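
To make the coordinate arithmetic concrete, here is a minimal Python sketch of how a soft-clipped read can end inside the chromosome while its unclipped end lies past it. It is an illustration only, not the FindJunctions code. The chromosome length is taken from the exception message above (assuming that number is the sequence length); the read start and CIGAR string are hypothetical.

```python
import re

# Assumed chromosome length, taken from the exception message
# ("more than 64792705").
CHROM_LENGTH = 64_792_705

_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
_REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def aligned_end(start, cigar):
    """Return the 1-based inclusive reference end of the aligned part of a read."""
    ref_len = sum(int(n) for n, op in _CIGAR_OP.findall(cigar) if op in _REF_CONSUMING)
    return start + ref_len - 1


def trailing_soft_clip(cigar):
    """Return the number of soft-clipped bases at the 3' end of the CIGAR string."""
    ops = _CIGAR_OP.findall(cigar)
    while ops and ops[-1][1] == "H":  # hard clips may follow the soft clip
        ops.pop()
    if ops and ops[-1][1] == "S":
        return int(ops[-1][0])
    return 0


def unclipped_end(start, cigar):
    """Return the end the read would have if its soft-clipped tail were aligned."""
    return aligned_end(start, cigar) + trailing_soft_clip(cigar)


if __name__ == "__main__":
    # Hypothetical read: 90 aligned bases followed by 10 soft-clipped bases.
    start, cigar = 64_792_616, "90M10S"
    print(aligned_end(start, cigar))                   # 64792705
    print(unclipped_end(start, cigar))                 # 64792715
    print(unclipped_end(start, cigar) > CHROM_LENGTH)  # True
```

Any code that fetches reference sequence using the unclipped end (or the aligned end plus extra flanking bases) would request a position beyond the chromosome in a case like this.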

      Visualize the error in IGB.

        Attachments

        1. Junction_Error.png
          141 kB
          Molly Davis
        2. SL4_ch10.png
          122 kB
          Molly Davis
        3. SL40ch10_64782127_64792705.bam
          35 kB
          Nowlan Freese
        4. SL40ch10_64782127_64792705.bam.bai
          31 kB
          Nowlan Freese

            Activity

            Mdavis4290 Molly Davis added a comment - - edited

            Files used to visualize error:

            • BAM file (shows that nothing is wrong with the alignments): Malintka-R1-0hr-25C-self.bam
            • BAM index (needed to view the BAM file in IGB): Malintka-R1-0hr-25C-self.bam.bai
            • Error file: Malintka-R1-0hr-25C-self.FJ.bed.gz
            • Control file: Nagcarlang-R2-8hr-37C-self.FJ.bed.gz

            Directory: /projects/tomato_genome/fnb/dataprocessing/30-804059537-KP/S_lycopersicum_Sep_2019/results/star_salmon

            Visualization of the Junction Error in IGB:

            The issue appears at the end of chromosome 10 (SL4). In the red track (the error file), much of the junction data is missing; in the green track (the control file), the junction file is normal and was processed correctly.

            End of chromosome 10:

            nfreese Nowlan Freese added a comment -

            I have attached a test BAM file (SL40ch10_64782127_64792705.bam, plus its index) containing the problematic reads from Malintka-R1-0hr-25C-self.bam.
            Code:

            samtools view -b Malintka-R1-0hr-25C-self.bam SL4.0ch10:64782127-64792705 > SL40ch10_64782127_64792705.bam
            samtools index SL40ch10_64782127_64792705.bam
            

            To test SL40ch10_64782127_64792705.bam with FindJunctions, run:

            java -jar find-junctions-1.0.0-jar-with-dependencies.jar -u -f 5 -b S_lycopersicum_Sep_2019.2bit -o SL40ch10_64782127_64792705.FJ.bed SL40ch10_64782127_64792705.bam
            

            The S_lycopersicum_Sep_2019.2bit file can be found here.
            The find-junctions-1.0.0-jar-with-dependencies.jar can be found here.

            Mdavis4290 Molly Davis added a comment -

            Next step: Create a ticket to modify the FindJunctions code so that it handles soft-clipped reads that extend past the end of a chromosome, so that the junction file can be generated completely.
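
One possible shape for such a change, sketched in Python rather than in the FindJunctions Java code, is to clamp every sequence request to the chromosome bounds before it reaches the 2bit reader. The function name `clamp_fragment` and the 0-based, half-open coordinate convention are assumptions made for illustration; the actual fix is left to the follow-up ticket.

```python
def clamp_fragment(start, length, chrom_length):
    """Trim a (start, length) sequence request to the range [0, chrom_length].

    Coordinates are assumed to be 0-based and half-open. Returns the
    adjusted (start, length); the length is 0 when the request lies
    entirely outside the chromosome.
    """
    if length < 0:
        raise ValueError("length must be non-negative")
    clamped_start = min(max(start, 0), chrom_length)
    clamped_end = min(max(start + length, 0), chrom_length)
    return clamped_start, clamped_end - clamped_start


if __name__ == "__main__":
    chrom_length = 64_792_705  # value reported in the exception message
    # A request for 10 bases that starts 5 bases before the chromosome end
    # is shortened to the 5 bases that actually exist.
    print(clamp_fragment(64_792_700, 10, chrom_length))  # (64792700, 5)
```

Clamping keeps the remaining reads in the file processable; whether the trimmed bases matter for junction calling would still need to be checked against how FindJunctions uses the fetched sequence.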

            nfreese Nowlan Freese added a comment -

            I have created a new ticket, IGBF-3435, to address the edge case in FindJunctions.


              People

              • Assignee:
                Mdavis4290 Molly Davis
                Reporter:
                Mdavis4290 Molly Davis