IGB / IGBF-3780

Implement logic to load the available genomes from Ensembl API

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:
      None

      Description

      Implement logic to load the available genomes from Ensembl API:

      To get the list of species (genomes) available in Ensembl:
      GET info/divisions
      Get the list of all Ensembl divisions for which information is available: https://rest.ensembl.org/documentation/info/info_divisions
      Example: https://rest.ensembl.org/info/divisions?content-type=application/json

      GET info/species
      Lists all available species, their aliases, available adaptor groups, and data release.
      To get the species for a particular Ensembl division, specify the optional division parameter.
      API to get the list of all species: https://rest.ensembl.org/documentation/info/species
      Example: https://rest.ensembl.org/info/species?content-type=application/json
      Use name, display_name, and assembly from the API response as the species name, tooltip description, and version respectively.

      To get all the available species, call the info/species API above once for each Ensembl division returned by the first API.
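The two-step lookup described above can be sketched as follows. This is a minimal illustration in Python, not IGB's Java implementation. The function names (`fetch_json`, `to_genome`, `load_available_genomes`) and the output keys are hypothetical, and the sketch assumes that info/divisions returns a JSON array of division names and that info/species returns an object with a `species` array.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://rest.ensembl.org"


def fetch_json(path, params=None):
    """GET one endpoint and decode its JSON body (performs a real network call)."""
    query = dict(params or {})
    # Same query parameter as the example URLs in the description.
    query["content-type"] = "application/json"
    url = BASE_URL + path + "?" + urlencode(query, safe="/")
    with urlopen(url, timeout=30) as response:
        return json.load(response)


def to_genome(record):
    """Map one info/species record to the three fields named in the description."""
    return {
        "species_name": record["name"],          # species name
        "tooltip": record.get("display_name"),   # tooltip description
        "version": record.get("assembly"),       # version
    }


def load_available_genomes(fetch=fetch_json):
    """Call info/divisions, then info/species once per division.

    Assumption: info/divisions yields a list of division names and
    info/species yields {"species": [...]}.
    """
    genomes = []
    for division in fetch("/info/divisions"):
        payload = fetch("/info/species", {"division": division})
        genomes.extend(to_genome(record) for record in payload["species"])
    return genomes
```

Passing `fetch` as a parameter lets the loop be exercised with canned responses, without contacting rest.ensembl.org.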

      Add logic to the retrieveDataContainer() method to check whether the species being loaded by the Ensembl provider is already present in the species list: if it is present, don't add it; if not, add it to the list. Also write logic to get the species name for a provided versionName.
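One way to picture the duplicate check and the versionName lookup is a small registry keyed by species name. This is a sketch in Python under stated assumptions, not the actual retrieveDataContainer() code. The class `SpeciesRegistry` and its methods are hypothetical, and it assumes a version name maps to a single species (the first registration wins).

```python
class SpeciesRegistry:
    """Hypothetical stand-in for the species list and the version-to-species lookup."""

    def __init__(self):
        self._versions_by_species = {}  # species name -> set of version names
        self._species_by_version = {}   # version name -> species name

    def add(self, species_name, version_name):
        """Register a genome version.

        Returns True if the species was new, False if it was already listed
        (in which case no second species entry is created).
        """
        is_new = species_name not in self._versions_by_species
        self._versions_by_species.setdefault(species_name, set()).add(version_name)
        # Assumption: keep the first species seen for a given version name.
        self._species_by_version.setdefault(version_name, species_name)
        return is_new

    def species_for_version(self, version_name):
        """Return the species name for a version name, or None if unknown."""
        return self._species_by_version.get(version_name)

    def species_names(self):
        """Return the species list without duplicates, sorted by name."""
        return sorted(self._versions_by_species)
```

Both lookups are dictionary accesses, so neither the duplicate check nor the reverse lookup scans the whole species list.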

            Activity

            jsirigin Jaya Sravani Sirigineedi created issue -
            jsirigin Jaya Sravani Sirigineedi made changes -
            Field Original Value New Value
            Epic Link IGBF-3555 [ 22774 ]
            jsirigin Jaya Sravani Sirigineedi made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            jsirigin Jaya Sravani Sirigineedi made changes -
            Story Points 2 3
            jsirigin Jaya Sravani Sirigineedi made changes -
            Description Appended the retrieveDataContainer() requirement (skip species already present in the species list; get the species name for a provided versionName). The rest of the description is unchanged.
            jsirigin Jaya Sravani Sirigineedi added a comment -

            While investigating the ticket, I found that the assembly field in the response is better suited for the version name than accession, so I will use assembly instead. I updated the ticket description and also added one more requirement, since the two are related.
            jsirigin Jaya Sravani Sirigineedi made changes -
            Description Changed the version field from accession to assembly ("Use name, display_name, and assembly from the API response as the species name, tooltip description, and version respectively."). The rest of the description is unchanged.
            jsirigin Jaya Sravani Sirigineedi added a comment -

            Development is done for this ticket, but testing showed that Ensembl provides so many genomes that the application does not start. I have added multi-threading to the API calls, but the problem occurs after the provider is initialized. The loadSupportedGenomeVersions() method, which adds the species to the species list for a given provider, is the bottleneck: it loops over all of the provider's genomeVersions, and for Ensembl there are 33694, so the method takes a very long time to run. I am now optimizing it by introducing multi-threading in this method as well.
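The multi-threaded API calls mentioned in this comment could look roughly like the sketch below. It is written in Python for illustration only and does not reproduce the IGB code. `fetch_species_parallel`, its `fetch_species` callback, and the `max_workers` default are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_species_parallel(divisions, fetch_species, max_workers=4):
    """Fetch the species records of each division concurrently.

    fetch_species is any callable that takes a division name and returns a
    list of records (for example, a wrapper around the info/species call).
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the order of the input divisions.
        per_division = list(pool.map(fetch_species, divisions))
    # Flatten the per-division lists into one list of records.
    return [record for records in per_division for record in records]
```

Because `pool.map` preserves input order, the merged list comes out in the same order on every run, whichever request finishes first.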
            jsirigin Jaya Sravani Sirigineedi made changes -
            Link This issue is blocked by IGBF-3782 [ IGBF-3782 ]
            ann.loraine Ann Loraine made changes -
            Sprint Summer 2 [ 196 ] Summer 2, Summer 3 [ 196, 197 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            jsirigin Jaya Sravani Sirigineedi made changes -
            Link This issue is blocked by IGBF-3782 [ IGBF-3782 ]
            jsirigin Jaya Sravani Sirigineedi made changes -
            Link This issue relates to IGBF-3782 [ IGBF-3782 ]
            ann.loraine Ann Loraine made changes -
            Sprint Summer 2, Summer 3 [ 196, 197 ] Summer 2, Summer 3, Summer 4 [ 196, 197, 198 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            jsirigin Jaya Sravani Sirigineedi made changes -
            Story Points 3 5
            jsirigin Jaya Sravani Sirigineedi added a comment -

            Modified and optimized the code. With all the divisions, IGB now takes around 45 seconds to load, with Ensembl taking around 35 seconds to initialize. With only the three divisions, IGB takes around 16 seconds to load, with Ensembl taking only 4 seconds to initialize. I need to discuss these numbers with Nowlan Freese and either close the ticket or continue investigating.
            jsirigin Jaya Sravani Sirigineedi added a comment -

            Updated code is on the branch: https://bitbucket.org/jaya-sravani/integrated-genome-browser/branch/IGBF-3780. Nowlan Freese, please review and test it, and let me know if there are any issues.
            jsirigin Jaya Sravani Sirigineedi made changes -
            Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
            jsirigin Jaya Sravani Sirigineedi made changes -
            Assignee Jaya Sravani Sirigineedi [ jsirigin ] Nowlan Freese [ nfreese ]
            nfreese Nowlan Freese made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            ann.loraine Ann Loraine made changes -
            Sprint Summer 2, Summer 3, Summer 4 [ 196, 197, 198 ] Summer 2, Summer 3, Summer 4, Summer 5 [ 196, 197, 198, 199 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            nfreese Nowlan Freese made changes -
            Status First Level Review in Progress [ 10301 ] Needs 1st Level Review [ 10005 ]
            nfreese Nowlan Freese added a comment -

            Tested Sravani's branch on Mac.

            IGB built successfully without any issues.
            Measured by wall-clock time, the "Initializing Ensembl REST API: https://rest.ensembl.org/" step appeared to take about 30 seconds.
            IGB showed many, many genomes available.

            Jaya Sravani Sirigineedi - I'm a little concerned about how long it took to initialize the Ensembl data provider. It was consistently taking around 30 seconds on my laptop. I think we should go ahead and merge these changes, but we should probably consider further limiting which divisions we add by default, if any.

            Ready for pull request.
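The idea of limiting which divisions are added by default, and of checking the effect by wall-clock time, can be sketched as below. This is an illustrative Python sketch, not IGB code. `select_divisions`, its `enabled` parameter, and `timed` are hypothetical names, and the ticket does not say which divisions would be kept.

```python
import time


def select_divisions(available, enabled=None):
    """Keep only the enabled divisions, in the order the API returned them.

    enabled=None means "no restriction"; an empty collection disables all.
    """
    if enabled is None:
        return list(available)
    allowed = set(enabled)
    return [division for division in available if division in allowed]


def timed(func, *args):
    """Run func(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start
```

Wrapping the provider initialization in `timed` with different `enabled` sets would give comparable numbers for the all-divisions and restricted cases.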
            nfreese Nowlan Freese made changes -
            Assignee Nowlan Freese [ nfreese ] Jaya Sravani Sirigineedi [ jsirigin ]
            nfreese Nowlan Freese made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            nfreese Nowlan Freese made changes -
            Status First Level Review in Progress [ 10301 ] Ready for Pull Request [ 10304 ]
            ann.loraine Ann Loraine made changes -
            Sprint Summer 2, Summer 3, Summer 4, Summer 5 [ 196, 197, 198, 199 ] Summer 2, Summer 3, Summer 4, Summer 5, Summer 6 [ 196, 197, 198, 199, 200 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            jsirigin Jaya Sravani Sirigineedi added a comment -

            Raised the pull request: https://bitbucket.org/lorainelab/integrated-genome-browser/pull-requests/1027. Please review and let me know if there are any issues.

            jsirigin Jaya Sravani Sirigineedi made changes -
            Status Ready for Pull Request [ 10304 ] Pull Request Submitted [ 10101 ]
            jsirigin Jaya Sravani Sirigineedi made changes -
            Assignee Jaya Sravani Sirigineedi [ jsirigin ]
            ann.loraine Ann Loraine added a comment - edited

            Bold moves, Jaya Sravani Sirigineedi and Nowlan Freese!

            The PR is merged, and newly built installers are deployed to the bioviz.org "early access" section. Ready for testing.
            ann.loraine Ann Loraine made changes -
            Status Pull Request Submitted [ 10101 ] Reviewing Pull Request [ 10303 ]
            ann.loraine Ann Loraine made changes -
            Status Reviewing Pull Request [ 10303 ] Merged Needs Testing [ 10002 ]
            nfreese Nowlan Freese made changes -
            Status Merged Needs Testing [ 10002 ] Post-merge Testing In Progress [ 10003 ]
            nfreese Nowlan Freese made changes -
            Assignee Nowlan Freese [ nfreese ]
            nfreese Nowlan Freese added a comment -

            Tested main branch installer on Mac.

            IGB showed the genomes from Ensembl in the Species dropdown (note that sequence, annotations, and chromosomes do not load yet; that will be handled in another ticket).
            IGB starts quickly, in around 10 seconds of wall-clock time. I'm not sure why it is so much faster this time.

            Closing ticket.
            nfreese Nowlan Freese made changes -
            Assignee Nowlan Freese [ nfreese ] Jaya Sravani Sirigineedi [ jsirigin ]
            nfreese Nowlan Freese made changes -
            Resolution Done [ 10000 ]
            Status Post-merge Testing In Progress [ 10003 ] Closed [ 6 ]

              People

              • Assignee: Jaya Sravani Sirigineedi (jsirigin)
              • Reporter: Jaya Sravani Sirigineedi (jsirigin)
              • Votes: 0
              • Watchers: 3
