IGBF-3434

Identify the aligned region and visualize it in IGB

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:
      None

      Description

      Relates to error: IGBF-3420

      Error:

      Exception in thread "main" java.lang.RuntimeException: Postion is too high (more than 64792705)
      at org.biojava.nbio.genome.parsers.twobit.TwoBitParser.setCurrentSequencePosition(TwoBitParser.java:191)
      at org.biojava.nbio.genome.parsers.twobit.TwoBitParser.loadFragment(TwoBitParser.java:332)
      at org.lorainelab.findjunctions.FindJunctions.main(FindJunctions.java:249)
      

      Issue: The BAM file itself looks fine, but it contains reads that extend past the end of the chromosome. These reads have been soft-clipped, but find_junctions does not take the soft clipping into account, so it fails before processing the whole file and the resulting .FJ.bed.gz file is incomplete.
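To illustrate why soft clipping matters at a chromosome end, here is a minimal sketch of how a read's reference span follows from its CIGAR string. It is not the FindJunctions code: the class `CigarSpan`, its methods, and the example read (start 64792606, CIGAR `100M50S`) are hypothetical, and it assumes that 64792705 is the length of SL4.0ch10, as the error message suggests. Per the SAM format, only the M, D, N, = and X operations consume reference bases; soft-clipped (S) bases do not.

```java
public class CigarSpan {

    /** Sums the lengths of the CIGAR operations selected by consumingOps. */
    private static int sumOps(String cigar, String consumingOps) {
        int total = 0;
        int count = 0;
        for (int i = 0; i < cigar.length(); i++) {
            char c = cigar.charAt(i);
            if (Character.isDigit(c)) {
                count = count * 10 + (c - '0');
            } else {
                if (consumingOps.indexOf(c) >= 0) {
                    total += count;
                }
                count = 0;
            }
        }
        return total;
    }

    /** Reference bases covered by the alignment (soft clips excluded). */
    public static int referenceLength(String cigar) {
        return sumOps(cigar, "MDN=X");
    }

    /** Bases stored in the read sequence (soft clips included). */
    public static int queryLength(String cigar) {
        return sumOps(cigar, "MIS=X");
    }

    /** 1-based inclusive end of the alignment on the reference. */
    public static int alignmentEnd(int start, String cigar) {
        return start + referenceLength(cigar) - 1;
    }

    public static void main(String[] args) {
        int chromLength = 64792705;   // assumed length of SL4.0ch10
        int start = 64792606;         // hypothetical read start (1-based)
        String cigar = "100M50S";     // hypothetical CIGAR with a trailing soft clip

        int clippedEnd = alignmentEnd(start, cigar);
        int naiveEnd = start + queryLength(cigar) - 1; // wrongly counts the soft clip

        System.out.println("End from CIGAR:       " + clippedEnd
                + " (past chromosome end: " + (clippedEnd > chromLength) + ")");
        System.out.println("End from read length: " + naiveEnd
                + " (past chromosome end: " + (naiveEnd > chromLength) + ")");
    }
}
```

In this made-up case the CIGAR-based end stays within the chromosome, while an end computed from the full read length runs past it. Whether FindJunctions goes out of bounds in exactly this way is not confirmed by this ticket.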

      Visualize the error in IGB.

            Activity

            Mdavis4290 Molly Davis added a comment - edited

            Files used to visualize the error:

            • BAM file (shows that nothing is wrong with the alignments): Malintka-R1-0hr-25C-self.bam
            • BAM index (needed to view the BAM file in IGB): Malintka-R1-0hr-25C-self.bam.bai
            • Error file: Malintka-R1-0hr-25C-self.FJ.bed.gz
            • Control file: Nagcarlang-R2-8hr-37C-self.FJ.bed.gz

            Directory: /projects/tomato_genome/fnb/dataprocessing/30-804059537-KP/S_lycopersicum_Sep_2019/results/star_salmon

            Visualization of the Junction Error in IGB:

            The issue appears at the end of chromosome 10 (SL4.0ch10). In the red track, much of the junction data is missing; in the green track, the junction file is complete and was processed correctly.

            End of chromosome 10:

            nfreese Nowlan Freese added a comment -

            I have added a test BAM file containing the problematic reads (SL40ch10_64782127_64792705.bam, along with its index), extracted from Malintka-R1-0hr-25C-self.bam.
            Commands used:

            samtools view -b Malintka-R1-0hr-25C-self.bam SL4.0ch10:64782127-64792705 > SL40ch10_64782127_64792705.bam
            samtools index SL40ch10_64782127_64792705.bam
            

            To test SL40ch10_64782127_64792705.bam with FindJunctions, run:

            java -jar find-junctions-1.0.0-jar-with-dependencies.jar -u -f 5 -b S_lycopersicum_Sep_2019.2bit -o SL40ch10_64782127_64792705.FJ.bed SL40ch10_64782127_64792705.bam
            

            The S_lycopersicum_Sep_2019.2bit file can be found here.
            The find-junctions-1.0.0-jar-with-dependencies.jar can be found here.

            Mdavis4290 Molly Davis added a comment -

            Next step: Create a ticket to modify the junction code so that it handles soft-clipped reads at the end of a chromosome and produces a complete junction file.
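One possible shape for such a change, offered only as a sketch: clamp any requested sequence window to the bounds of the chromosome before asking the 2bit parser for it. The class `FlankWindow` and its `clamp` method are hypothetical and are not part of FindJunctions or BioJava; the sketch assumes 0-based start coordinates and that 64792705 is the length of SL4.0ch10. Whether clamping is the right fix is a question for the follow-up ticket.

```java
public class FlankWindow {

    /**
     * Truncates a requested window so it lies within [0, sequenceLength).
     * Returns {clampedStart, clampedLength}; the length is 0 when the
     * window falls entirely outside the sequence.
     */
    public static long[] clamp(long start, long length, long sequenceLength) {
        long clampedStart = Math.max(0L, Math.min(start, sequenceLength));
        long requestedEnd = Math.min(start + length, sequenceLength);
        long clampedEnd = Math.max(clampedStart, requestedEnd);
        return new long[] {clampedStart, clampedEnd - clampedStart};
    }

    public static void main(String[] args) {
        long chromLength = 64792705L; // assumed length of SL4.0ch10

        // Hypothetical 5-base window that would run 3 bases past the chromosome end.
        long[] window = clamp(64792703L, 5L, chromLength);
        System.out.println("start=" + window[0] + " length=" + window[1]);
    }
}
```

A caller would then request only `window[1]` bases starting at `window[0]`, and skip the request when the clamped length is 0.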

            nfreese Nowlan Freese added a comment -

            I have created a new ticket, IGBF-3435, to address the edge case in FindJunctions.


              People

              • Assignee: Mdavis4290 Molly Davis
              • Reporter: Mdavis4290 Molly Davis
              • Votes: 0
              • Watchers: 2
