IGB / IGBF-4316

Use index files of VCF for faster loading time

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels: None
    • Story Points: 4
    • Sprint: Fall 1, Spring 3, Spring 5, Spring 6, Spring 8, Summer 2, Summer 3, Summer 4, Summer 6, Fall 2, Fall 3, Fall 4, Fall 5

      Description

      Two related issues involving the loading of VCF files were identified in IGB 10.2.0. Both are regressions from version 10.1.0, and both degrade performance and functionality when working with large or indexed VCF datasets.

      1. Performance and Memory Issue When Loading VCF Files

      When loading larger VCF files (e.g., 1KG.chr22.anno.infocol.vcf.gz), IGB 10.2.0 exhibits significantly increased memory usage and may crash or freeze when navigating to a gene.

      In IGB 10.1.0, loading this file and accessing gene-level data works as expected.

      In IGB 10.2.0, it appears that clicking "Load Data" loads the entire file into memory, rather than only the genomic region currently in view (as was the behavior in 10.1.0).

      This results in performance degradation and potential out-of-memory errors, especially with large datasets.
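
      To illustrate the difference between the two behaviors, here is a minimal sketch of region-limited loading. It is not IGB code: the class `RegionLimitedVcfReader`, its `Variant` type, and the method `readRegion` are hypothetical names, and the sketch uses only the Java standard library. It assumes tab-separated VCF data lines and 1-based positions.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Stream;

/** Hypothetical helper, not IGB code: keeps only VCF records inside a requested region. */
final class RegionLimitedVcfReader {

    /** Minimal record type holding only the fields this sketch needs. */
    static final class Variant {
        final String chrom;
        final int pos;
        final String ref;
        final String alt;

        Variant(String chrom, int pos, String ref, String alt) {
            this.chrom = chrom;
            this.pos = pos;
            this.ref = ref;
            this.alt = alt;
        }
    }

    /**
     * Consumes VCF lines one at a time and retains only records on `chrom`
     * whose 1-based POS lies in [start, end]. Records outside the region are
     * dropped immediately, so memory use tracks the region, not the file.
     */
    static List<Variant> readRegion(Stream<String> lines, String chrom, int start, int end) {
        List<Variant> kept = new ArrayList<>();
        Iterator<String> it = lines.iterator();
        while (it.hasNext()) {
            String line = it.next();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // skip meta-information and header lines
            }
            String[] f = line.split("\t", 6); // CHROM, POS, ID, REF, ALT, rest
            if (f.length < 5) {
                continue; // malformed line; a real loader would report it
            }
            int pos = Integer.parseInt(f[1]);
            if (f[0].equals(chrom) && pos >= start && pos <= end) {
                kept.add(new Variant(f[0], pos, f[3], f[4]));
            }
        }
        return kept;
    }
}
```

      Passing a lazy stream such as `Files.lines(path)` keeps memory bounded by the size of the region, but the sketch still scans every line. An index removes that scan: with a Tabix index, a library such as HTSJDK can seek directly to the requested region (for example through `VCFFileReader.query`), which this stdlib-only sketch does not attempt.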

      2. Failure to Load Tabix-Indexed VCF Files

      When attempting to load a VCF file with an associated Tabix index (e.g., Genome in a Bottle VCF), IGB 10.2.0 throws the following error:

      ClassCastException: VCFSymLoaderTabix cannot be cast to QuickLoadSymLoader

      These same files load without issue in IGB 10.1.0.

      This appears to be a class loading or module registration issue introduced in the newer version, likely related to recent changes in VCF parsing logic.
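
      For readers unfamiliar with this class of error, the sketch below shows how an unchecked downcast produces a ClassCastException and how an `instanceof` guard avoids it. The type names mirror the error message, but everything else is an assumption: the `SymLoader` interface, the `describe` method, and the class bodies are hypothetical stand-ins and do not reflect the actual IGB class hierarchy or the fix that was applied.

```java
/** Hypothetical stand-in for a common loader type; not the IGB definition. */
interface SymLoader {
    String describe();
}

/** Hypothetical stand-in named after the cast target in the error message. */
class QuickLoadSymLoader implements SymLoader {
    public String describe() {
        return "quickload";
    }
}

/** Hypothetical stand-in named after the runtime type in the error message. */
class VCFSymLoaderTabix implements SymLoader {
    public String describe() {
        return "tabix-indexed VCF";
    }
}

final class LoaderCastDemo {

    /** Unchecked downcast: throws ClassCastException for any other loader type. */
    static String unsafeDescribe(SymLoader loader) {
        QuickLoadSymLoader quickLoad = (QuickLoadSymLoader) loader;
        return quickLoad.describe();
    }

    /** Guarded version: narrows only when the runtime type actually matches. */
    static String safeDescribe(SymLoader loader) {
        if (loader instanceof QuickLoadSymLoader) {
            return ((QuickLoadSymLoader) loader).describe();
        }
        return loader.describe(); // fall back to the shared interface
    }
}
```

      Calling `LoaderCastDemo.unsafeDescribe(new VCFSymLoaderTabix())` fails at the cast, while `safeDescribe` handles both loader types through the shared interface.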

            Activity

            pkulzer Paige Kulzer added a comment -

            Downloaded the early release version of IGB from bioviz.org and tested on my Mac. VCF files with space-separated headings are being handled with appropriate warning messages. VCF files with an index in the same directory are loading much faster than they do without an index. VCF files can be added via URL or locally and seem to be working consistently.

            Closing ticket!

            ann.loraine Ann Loraine added a comment -

            New installers have been built and deployed to the early access section of the bioviz.org website.
            Ready for final testing.

            ann.loraine Ann Loraine added a comment -

            Branch is merged into main.

            sjagarap saideepthi jagarapu (Inactive) added a comment - edited

            Sure Nowlan Freese! Done with final requested changes: https://bitbucket.org/lorainelab/integrated-genome-browser/pull-requests/1082

            nfreese Nowlan Freese added a comment - edited

            saideepthi jagarapu - two requests:
            1) Can you please add back the following line to UnindexedSymLoader.java:

            // strategyList.add(LoadStrategy.CHROMOSOME);

            It's a minor thing but I would like to leave it in as it is out of scope for the VCF work.

            2) Can you change your commit message to something like:

            Update VCF parsing logic to use HTSJDK


              People

              • Assignee: sjagarap saideepthi jagarapu (Inactive)
              • Reporter: sjagarap saideepthi jagarapu (Inactive)
              • Votes: 0
              • Watchers: 4
