IGB / IGBF-4316

Use index files of VCF for faster loading time

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels: None
    • Story Points: 4
    • Sprint: Fall 1, Spring 3, Spring 5, Spring 6, Spring 8, Summer 2, Summer 3, Summer 4, Summer 6, Fall 2, Fall 3, Fall 4, Fall 5

      Description

      Two related issues were identified in IGB 10.2.0 involving the loading of VCF files; both are regressions from version 10.1.0. They degrade performance and functionality when working with large or indexed VCF datasets.

      1. Performance and Memory Issue When Loading VCF Files

      When loading larger VCF files (e.g., 1KG.chr22.anno.infocol.vcf.gz), IGB 10.2.0 exhibits significantly increased memory usage and may crash or freeze when navigating to a gene.

      In IGB 10.1.0, loading this file and accessing gene-level data works as expected.

      In IGB 10.2.0, it appears the entire file is loaded into memory when "Load Data" is clicked, rather than only the genomic region currently in view (as was the behavior in 10.1.0).

      This results in performance degradation and potential out-of-memory errors, especially with large datasets.
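
      The difference between the two loading behaviors can be illustrated with a small sketch. It does not use IGB's or htsjdk's API: RegionIndexSketch, Variant, loadAllThenFilter, and query are hypothetical names, and a sorted in-memory list stands in for a position-sorted, indexed VCF on a single chromosome. The only point is that an index lets a loader jump to the first record in the visible region and stop after the last one, instead of visiting every record.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class RegionIndexSketch {

    // Hypothetical stand-in for one VCF record (not an IGB or htsjdk type).
    static final class Variant {
        final String chrom;
        final int pos;
        final String ref;
        final String alt;

        Variant(String chrom, int pos, String ref, String alt) {
            this.chrom = chrom;
            this.pos = pos;
            this.ref = ref;
            this.alt = alt;
        }
    }

    // Records for a single chromosome, kept sorted by position.
    private final List<Variant> sorted;

    RegionIndexSketch(List<Variant> variants) {
        this.sorted = new ArrayList<>(variants);
        this.sorted.sort(Comparator.comparingInt(v -> v.pos));
    }

    // Whole-file behavior: every record is visited, then filtered by position.
    List<Variant> loadAllThenFilter(int start, int end) {
        List<Variant> hits = new ArrayList<>();
        for (Variant v : sorted) {
            if (v.pos >= start && v.pos <= end) {
                hits.add(v);
            }
        }
        return hits;
    }

    // Index of the first record with pos >= start (binary search).
    private int lowerBound(int start) {
        int lo = 0;
        int hi = sorted.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (sorted.get(mid).pos < start) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // Region-restricted behavior: seek to the region start, stop once past its end.
    List<Variant> query(int start, int end) {
        List<Variant> hits = new ArrayList<>();
        for (int i = lowerBound(start); i < sorted.size(); i++) {
            Variant v = sorted.get(i);
            if (v.pos > end) {
                break;
            }
            hits.add(v);
        }
        return hits;
    }
}
```

      Both methods return the same records for a given region (inclusive start and end); query simply avoids touching records outside it, which is the behavior described for 10.1.0.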

      2. Failure to Load Tabix-Indexed VCF Files

      When attempting to load a VCF file with an associated Tabix index (e.g., Genome in a Bottle VCF), IGB 10.2.0 throws the following error:

      ClassCastException: VCFSymLoaderTabix cannot be cast to QuickLoadSymLoader

      These same files load without issue in IGB 10.1.0.

      This appears to be a class loading or module registration issue introduced in the newer version, likely related to recent changes in VCF parsing logic.
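
      The exception message indicates that an object whose runtime type is VCFSymLoaderTabix was cast to QuickLoadSymLoader, a type it is not an instance of. The sketch below reproduces that failure mode with stand-in classes. The hierarchy shown (two sibling subclasses of a common base) is an assumption for illustration, not IGB's actual class layout, and BaseLoader, QuickLoader, TabixLoader, and CastSketch are hypothetical names. It contrasts an unconditional downcast with one guarded by instanceof.

```java
// Stand-in classes; names and hierarchy are assumptions, not IGB's real types.
class BaseLoader {
    String load() {
        return "base";
    }
}

class QuickLoader extends BaseLoader {
    @Override
    String load() {
        return "quickload";
    }

    String quickloadOnly() {
        return "quickload-specific";
    }
}

class TabixLoader extends BaseLoader {
    @Override
    String load() {
        return "tabix";
    }
}

final class CastSketch {

    // Unconditional downcast: throws ClassCastException when given a TabixLoader.
    static String unsafe(BaseLoader loader) {
        QuickLoader q = (QuickLoader) loader;
        return q.quickloadOnly();
    }

    // Guarded downcast: checks the runtime type first and otherwise
    // falls back to the method every loader provides.
    static String safe(BaseLoader loader) {
        if (loader instanceof QuickLoader) {
            return ((QuickLoader) loader).quickloadOnly();
        }
        return loader.load();
    }
}
```

      Calling CastSketch.unsafe(new TabixLoader()) fails with a ClassCastException of the same shape as the one reported, while CastSketch.safe handles both loader types.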

            Activity

            sjagarap saideepthi jagarapu (Inactive) created issue -
            sjagarap saideepthi jagarapu (Inactive) made changes -
            Field Original Value New Value
            Epic Link IGBF-3836 [ 23135 ]
            sjagarap saideepthi jagarapu (Inactive) made changes -
            Link This issue clones IGBF-4219 [ IGBF-4219 ]
            sjagarap saideepthi jagarapu (Inactive) made changes -
            Description Situation: The current implementation fails when trying to load a VCF file from a Quickload.

            Issue elaboration:
            The new implementation uses VCFFileReader rather than lineReader, and VCFFileReader can process only local files, not URLs (e.g., http).

            UPDATE: The issue was with loading a VCF file via URL. I do not think the issue was due to the file being part of a Quickload.
            sjagarap saideepthi jagarapu (Inactive) made changes -
            Link This issue relates to IGBF-4219 [ IGBF-4219 ]
            sjagarap saideepthi jagarapu (Inactive) made changes -
            Link This issue clones IGBF-4219 [ IGBF-4219 ]
            sjagarap saideepthi jagarapu (Inactive) made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            ann.loraine Ann Loraine made changes -
            Sprint Fall 1, Spring 3, Spring 5, Spring 6, Spring 8, Summer 2, Summer 3, Summer 4, Summer 6, Fall 2 [ 83, 212, 214, 215, 217, 219, 220, 221, 223, 225 ] Fall 1, Spring 3, Spring 5, Spring 6, Spring 8, Summer 2, Summer 3, Summer 4, Summer 6, Fall 2, Fall 3 [ 83, 212, 214, 215, 217, 219, 220, 221, 223, 225, 226 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            sjagarap saideepthi jagarapu (Inactive) made changes -
            Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
            sjagarap saideepthi jagarapu (Inactive) made changes -
            Assignee saideepthi jagarapu [ sjagarap ]
            nfreese Nowlan Freese made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            nfreese Nowlan Freese made changes -
            Status First Level Review in Progress [ 10301 ] Ready for Pull Request [ 10304 ]
            nfreese Nowlan Freese made changes -
            Status Ready for Pull Request [ 10304 ] Pull Request Submitted [ 10101 ]
            nfreese Nowlan Freese made changes -
            Status Pull Request Submitted [ 10101 ] Reviewing Pull Request [ 10303 ]
            nfreese Nowlan Freese made changes -
            Status Reviewing Pull Request [ 10303 ] To-Do [ 10305 ]
            nfreese Nowlan Freese made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            nfreese Nowlan Freese made changes -
            Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
            pkulzer Paige Kulzer made changes -
            Assignee Paige Kulzer [ pkulzer ]
            pkulzer Paige Kulzer made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            pkulzer Paige Kulzer made changes -
            Status First Level Review in Progress [ 10301 ] Ready for Pull Request [ 10304 ]
            pkulzer Paige Kulzer made changes -
            Assignee Paige Kulzer [ pkulzer ] saideepthi jagarapu [ sjagarap ]
            pkulzer Paige Kulzer made changes -
            Status Ready for Pull Request [ 10304 ] Pull Request Submitted [ 10101 ]
            pkulzer Paige Kulzer made changes -
            Status Pull Request Submitted [ 10101 ] Reviewing Pull Request [ 10303 ]
            pkulzer Paige Kulzer made changes -
            Status Reviewing Pull Request [ 10303 ] To-Do [ 10305 ]
            pkulzer Paige Kulzer made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            pkulzer Paige Kulzer made changes -
            Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
            pkulzer Paige Kulzer made changes -
            Assignee saideepthi jagarapu [ sjagarap ] Nowlan Freese [ nfreese ]
            pkulzer Paige Kulzer made changes -
            Link This issue relates to IGBF-4331 [ IGBF-4331 ]
            ann.loraine Ann Loraine made changes -
            Sprint Fall 1, Spring 3, Spring 5, Spring 6, Spring 8, Summer 2, Summer 3, Summer 4, Summer 6, Fall 2, Fall 3 [ 83, 212, 214, 215, 217, 219, 220, 221, 223, 225, 226 ] Fall 1, Spring 3, Spring 5, Spring 6, Spring 8, Summer 2, Summer 3, Summer 4, Summer 6, Fall 2, Fall 3, Fall 4 [ 83, 212, 214, 215, 217, 219, 220, 221, 223, 225, 226, 227 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            nfreese Nowlan Freese made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            nfreese Nowlan Freese made changes -
            Assignee Nowlan Freese [ nfreese ] saideepthi jagarapu [ sjagarap ]
            nfreese Nowlan Freese made changes -
            Status First Level Review in Progress [ 10301 ] To-Do [ 10305 ]
            sjagarap saideepthi jagarapu (Inactive) made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            pkulzer Paige Kulzer made changes -
            Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
            pkulzer Paige Kulzer made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            pkulzer Paige Kulzer made changes -
            Assignee saideepthi jagarapu [ sjagarap ] Paige Kulzer [ pkulzer ]
            pkulzer Paige Kulzer made changes -
            Status First Level Review in Progress [ 10301 ] Ready for Pull Request [ 10304 ]
            pkulzer Paige Kulzer made changes -
            Assignee Paige Kulzer [ pkulzer ] saideepthi jagarapu [ sjagarap ]
            ann.loraine Ann Loraine made changes -
            Sprint Fall 1, Spring 3, Spring 5, Spring 6, Spring 8, Summer 2, Summer 3, Summer 4, Summer 6, Fall 2, Fall 3, Fall 4 [ 83, 212, 214, 215, 217, 219, 220, 221, 223, 225, 226, 227 ] Fall 1, Spring 3, Spring 5, Spring 6, Spring 8, Summer 2, Summer 3, Summer 4, Summer 6, Fall 2, Fall 3, Fall 4, Fall 5 [ 83, 212, 214, 215, 217, 219, 220, 221, 223, 225, 226, 227, 228 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            pkulzer Paige Kulzer made changes -
            Status Ready for Pull Request [ 10304 ] Pull Request Submitted [ 10101 ]
            nfreese Nowlan Freese made changes -
            Status Pull Request Submitted [ 10101 ] Reviewing Pull Request [ 10303 ]
            nfreese Nowlan Freese made changes -
            Status Reviewing Pull Request [ 10303 ] To-Do [ 10305 ]
            sjagarap saideepthi jagarapu (Inactive) made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            sjagarap saideepthi jagarapu (Inactive) made changes -
            Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
            sjagarap saideepthi jagarapu (Inactive) made changes -
            Assignee saideepthi jagarapu [ sjagarap ]
            sjagarap saideepthi jagarapu (Inactive) made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            sjagarap saideepthi jagarapu (Inactive) made changes -
            Status First Level Review in Progress [ 10301 ] Ready for Pull Request [ 10304 ]
            sjagarap saideepthi jagarapu (Inactive) made changes -
            Status Ready for Pull Request [ 10304 ] Pull Request Submitted [ 10101 ]
            ann.loraine Ann Loraine made changes -
            Status Pull Request Submitted [ 10101 ] Reviewing Pull Request [ 10303 ]
            ann.loraine Ann Loraine made changes -
            Status Reviewing Pull Request [ 10303 ] Merged Needs Testing [ 10002 ]
            pkulzer Paige Kulzer made changes -
            Status Merged Needs Testing [ 10002 ] Post-merge Testing In Progress [ 10003 ]
            pkulzer Paige Kulzer made changes -
            Assignee Paige Kulzer [ pkulzer ]
            pkulzer Paige Kulzer made changes -
            Resolution Done [ 10000 ]
            Status Post-merge Testing In Progress [ 10003 ] Closed [ 6 ]
            pkulzer Paige Kulzer made changes -
            Assignee Paige Kulzer [ pkulzer ] saideepthi jagarapu [ sjagarap ]

              People

              • Assignee:
                sjagarap saideepthi jagarapu (Inactive)
                Reporter:
                sjagarap saideepthi jagarapu (Inactive)
              • Votes: 0
              • Watchers: 4
