IGB / IGBF-3169

Investigate IGB 9.1.10 opening genomes slowly

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels: None

      Description

      Situation: While testing IGBF-3164, I noticed that 9.1.10 master was opening genomes much more slowly than the IGB 9.1.8 release.

      Task: Investigate whether 9.1.10 master opens genomes more slowly than 9.1.8 and, if so, why.

            Activity

            Karthik Raveendran added a comment (edited)

            When loading genomes in 9.1.8 and 9.1.10, there is initially a difference in load times (22 s and over 1 min, respectively), but the wait times then decrease even when the caches are cleared. This is most likely a server issue rather than a code issue, because the primary and secondary genome Quickload sites were switched between these two versions. Whether switching the sites back would change this behavior has yet to be tested.

            Nowlan Freese added a comment (edited)

            Genomes are loading more slowly on the most recent version of master. I'm checking individual commits for 9.1.10 to identify which commit caused the issue, but so far I have not been able to identify it.

            I will continue to investigate.

            Notes so far:
            On ninerwifi

            IGB 9.1.10
            primary http://igbquickload.org/quickload/
            mirror http://lorainelab-quickload.scidas.org/quickload/
            hg38 loaded with a slight pause of ~7 seconds

            IGB 9.1.8
            primary http://igbquickload.org/quickload/
            mirror http://lorainelab-quickload.scidas.org/quickload/
            hg38 loaded almost immediately with a pause of ~2 seconds

            Testing in NetBeans

            git checkout 4131d0f
            hg38 loaded with a slight pause of ~7 seconds

            git checkout b8e645f6d5e3013a96e44868adbe85a63e84e38d
            hg38 loaded with a slight pause of ~7 seconds

            Testing on release-9.1.8
            hg38 loaded almost immediately with a pause of ~2 seconds

            git checkout 9f07c3c
            hg38 loaded almost immediately with a pause of ~2 seconds

            git checkout ddbb962
            Loaded RefSeq Curated in 1.246 s

            git checkout 6f00b4b
            Loaded RefSeq Curated in 1.312 s

            git checkout 25ea823
            hg38 loaded with a slight pause of ~7 seconds
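
            The manual checkouts above amount to a search for the first slow commit. A minimal sketch of that search as a binary search follows, assuming the commits are ordered oldest to newest and that every commit after the culprit is also slow. The commit names, pause values, and 4-second threshold are hypothetical placeholders, not measurements from this ticket; because load times vary between runs, a real `is_slow` check would likely need repeated measurements.

```python
from typing import Callable, Optional, Sequence


def first_slow_commit(
    commits: Sequence[str],
    is_slow: Callable[[str], bool],
) -> Optional[str]:
    """Return the earliest commit for which is_slow() is true.

    Assumes commits are ordered oldest to newest and that once a commit
    is slow, every later commit is slow too (the usual bisection premise).
    Returns None when no commit is slow.
    """
    lo, hi = 0, len(commits)
    while lo < hi:
        mid = (lo + hi) // 2
        if is_slow(commits[mid]):
            hi = mid        # the first slow commit is at mid or earlier
        else:
            lo = mid + 1    # the first slow commit is after mid
    return commits[lo] if lo < len(commits) else None


# Hypothetical history and pause lengths (seconds), for illustration only.
history = ["c1", "c2", "c3", "c4", "c5", "c6"]
pause_seconds = {"c1": 2, "c2": 2, "c3": 2, "c4": 7, "c5": 7, "c6": 7}

culprit = first_slow_commit(history, lambda c: pause_seconds[c] > 4)
```

            With six commits this needs three checks instead of six, which matters when each check means building and launching IGB by hand.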

            Nowlan Freese added a comment

            The difference in speed appears to be due to the UCSC DAS data provider being enabled by default in 9.1.10.

            The speed at which the hg38 genome loads is variable, but it is consistently faster either when the DAS data provider is disabled or when using a build of IGB master from before commit 25ea823, where DAS was enabled by default. At the moment, the difference in loading time is not large enough to affect the user experience. With DAS enabled, the hg38 genome loads in around 12 seconds, versus 7 seconds with DAS disabled. I have not encountered the longer loading times I was experiencing last week, where loading the genome could take up to 1 minute.

            The DAS data provider being the culprit would make sense, as I normally test using the Arabidopsis genome, which does not have any data provided by DAS. DAS being the cause may also explain the discrepancies in loading times seen while testing for this issue, as the DAS server's speed may be variable.

            As UCSC DAS provides additional data, genomes, and some functionality (external view, and I think it is important for our new Track Hubs page), we may want to leave it enabled. However, I think it would be good to try switching from DAS to the UCSC API, which may be faster.

            Nowlan Freese added a comment

            I have moved this ticket to needs review. Assuming no further action is necessary for this ticket, it can be closed upon review.

            Ann Loraine added a comment

            Thank you for the clear explanation.

            I have a thought:

            We could implement a proxy+caching solution on the server to reduce response time for the DAS service. We could replace the DAS endpoint with a "bioviz" endpoint that would cache output that is slow to retrieve and update the cache periodically or on demand.
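
            A minimal sketch of the caching half of that idea, assuming an in-memory cache keyed by request path. The class name `CachingProxy`, the `fetch` callable, and the 60-second lifetime are hypothetical. Expiry after `ttl_seconds` stands in for the periodic update (an entry is refreshed the next time it is requested after expiring, not by a background job), and `force_refresh` covers the on-demand case.

```python
import time
from typing import Callable, Dict, Tuple


class CachingProxy:
    """Cache responses from a slow upstream fetch for ttl_seconds."""

    def __init__(
        self,
        fetch: Callable[[str], bytes],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch          # hypothetical call to the slow upstream
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, bytes]] = {}

    def get(self, path: str, force_refresh: bool = False) -> bytes:
        now = self._clock()
        cached = self._entries.get(path)
        if cached is not None and not force_refresh:
            stored_at, body = cached
            if now - stored_at < self._ttl:
                return body          # still fresh: skip the upstream call
        body = self._fetch(path)     # missing, expired, or refresh requested
        self._entries[path] = (now, body)
        return body


# Illustration with a fake upstream and a fake clock.
calls = []
fake_now = [0.0]


def fake_upstream(path: str) -> bytes:
    calls.append(path)
    return f"payload for {path}".encode()


proxy = CachingProxy(fake_upstream, ttl_seconds=60.0, clock=lambda: fake_now[0])
proxy.get("/genomes")                      # miss: calls upstream
proxy.get("/genomes")                      # hit: served from cache
fake_now[0] = 61.0
proxy.get("/genomes")                      # expired: calls upstream again
proxy.get("/genomes", force_refresh=True)  # on demand: calls upstream again
```

            In this run the upstream is called three times for four requests. A real deployment would also need to decide what to serve when the upstream is down, which this sketch does not address.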

            Another idea: We could create an "uptime" and "response time" monitor to let us know if and when the response time from the DAS host varies.
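
            A minimal sketch of a single monitoring probe, assuming that "up" means the request completed without raising and "slow" means it took longer than a chosen threshold. The names `probe` and `ProbeResult` and the 5-second threshold are hypothetical; scheduling and alerting are left out.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ProbeResult:
    up: bool                      # the request completed without an error
    seconds: float                # elapsed time for the request
    slow: bool                    # completed, but slower than the threshold
    error: Optional[str] = None   # failure message when up is False


def probe(
    request: Callable[[], object],
    slow_after_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> ProbeResult:
    """Time one request and report whether it succeeded and was slow."""
    start = clock()
    try:
        request()
    except Exception as exc:  # any failure counts as "down" in this sketch
        return ProbeResult(up=False, seconds=clock() - start,
                           slow=False, error=str(exc))
    elapsed = clock() - start
    return ProbeResult(up=True, seconds=elapsed,
                       slow=elapsed > slow_after_seconds)


# Illustration with a fake request and a fake clock that reports 7.5 s elapsed.
ticks = iter([0.0, 7.5])
result = probe(lambda: None, slow_after_seconds=5.0, clock=lambda: next(ticks))
```

            Against a live host, `request` could be something like `lambda: urllib.request.urlopen(url, timeout=10).read()`, with the results logged over time so that changes in response time become visible.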

            Moving this to closed.


              People

              • Assignee: Nowlan Freese
              • Reporter: Nowlan Freese
              • Votes: 0
              • Watchers: 3
