IGB / IGBF-2911

Handle araTha1 synonym in track hub facade

    Details

    • Story Points:
      0.5
    • Sprint:
      Fall 1 2021 Aug 16 - Aug 27, Fall 2 2021 Aug 30 - Sep 10, Fall 3 2021 Sep 13 - Sep 24

      Description

      The JASPAR TFBS track hub appears to have data for the Arabidopsis thaliana genome, but uses a name that IGB does not currently recognize: "araTha1".

      For this task, we need to figure out what the corresponding IGB name is and add it to IGB's synonyms system, or provide the synonym within the track hub synonyms endpoint.

      (See attached image.)

            Activity

            ann.loraine Ann Loraine added a comment -

            Recommend looking at synonyms.txt in IGB.
            attn: Nowlan Freese

            pbadzuh Philip Badzuh (Inactive) added a comment -

            I have updated the species.txt endpoint of the hub API to better translate UCSC organism names into IGB species names.

            Please see my changes here.

            These changes involve:

            • First, attempting to derive an IGB species name from the given UCSC genome version.
            • If the above fails, defaulting to the track hub organism name as the IGB species name.
            • If the track hub genome contains no organism information, giving the genome a tentatively named umbrella species name of 'Trackhub misc'.

            In the case of the genome described in this issue, however, neither it nor its provided organism name maps to the default/global IGB synonyms.txt or species.txt. Therefore, based on my changes, the genome shows up under the species name "thale cress". To get it to show up under the expected IGB species Arabidopsis thaliana, either "thale cress" needs to be added to the global species.txt, or "araTha1" needs to be added to the global synonyms.txt. Only one of these additions is required for the genome to be placed under the appropriate species.
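The fallback order described in this comment can be sketched roughly as follows. This is a minimal illustration, not the track hub facade's actual code: the function name `resolve_species_name` and the `version_to_species` dictionary are hypothetical, and the dictionary stands in for whatever the synonyms.txt and species.txt lookups provide. The "hg38" entry is illustrative only.

```python
from typing import Dict, Optional

# Umbrella species name quoted in the comment above.
FALLBACK_SPECIES = "Trackhub misc"


def resolve_species_name(
    genome_version: str,
    organism: Optional[str],
    version_to_species: Dict[str, str],
) -> str:
    """Pick the species name under which a track hub genome is listed.

    `version_to_species` is a hypothetical stand-in for the lookups that
    synonyms.txt and species.txt make possible.
    """
    # Step 1: try to derive the species from the UCSC genome version.
    species = version_to_species.get(genome_version)
    if species:
        return species
    # Step 2: fall back to the organism name supplied by the track hub.
    if organism and organism.strip():
        return organism.strip()
    # Step 3: the hub gave no organism information at all.
    return FALLBACK_SPECIES


if __name__ == "__main__":
    known = {"hg38": "Homo sapiens"}  # illustrative entry only
    # Before the synonym is added, the hub's organism name is used.
    print(resolve_species_name("araTha1", "thale cress", known))
    # After a mapping for araTha1 exists, the expected species is returned.
    known["araTha1"] = "Arabidopsis thaliana"
    print(resolve_species_name("araTha1", "thale cress", known))
```

Under these assumptions, the first call yields the hub's organism name and the second yields the expected species, which mirrors why one added entry is enough.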

            ann.loraine Ann Loraine added a comment -

            Did code review only, not functional review. Will assess functionality during testing phase.

            ann.loraine Ann Loraine added a comment -

            Philip Badzuh - Please submit PR.

            pbadzuh Philip Badzuh (Inactive) added a comment -

            Please see PR here.

            ann.loraine Ann Loraine added a comment -

            Merged and deployed to testing sites.

            ann.loraine Ann Loraine added a comment -

            The bug appears to persist: the data source mentioned above is not visible.
            I added the following URL as a data source, but when I opened the genome version for Arabidopsis by clicking on its image on the home page (which is the latest available), the data source did not appear.

            https://translate.bioviz.org/rest_api/?hubUrl=http://expdata.cmmt.ubc.ca/JASPAR/UCSC_tracks/hub.txt&fileName=/
            

            The host translate.bioviz.org is up-to-date with the latest track hub facade code.
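For reference, the data source URL above combines the facade's REST endpoint with two query parameters, `hubUrl` and `fileName`. A small sketch of how such a URL could be assembled is shown below; the helper name `facade_url` is hypothetical, and leaving ":" and "/" unescaped is a choice made here only to reproduce the URL exactly as written in this comment.

```python
from urllib.parse import urlencode

# Endpoint taken from the URL quoted in the comment above.
FACADE_BASE = "https://translate.bioviz.org/rest_api/"


def facade_url(hub_url: str, file_name: str = "/") -> str:
    """Build a track hub facade URL for the given UCSC hub.txt location."""
    # safe=":/" keeps the hub URL readable instead of percent-encoding it.
    query = urlencode({"hubUrl": hub_url, "fileName": file_name}, safe=":/")
    return f"{FACADE_BASE}?{query}"


if __name__ == "__main__":
    print(facade_url("http://expdata.cmmt.ubc.ca/JASPAR/UCSC_tracks/hub.txt"))
```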

            pbadzuh Philip Badzuh (Inactive) added a comment -

            Did you make the changes I describe near the end of this previous comment?

            ann.loraine Ann Loraine added a comment -

            Sorry, I did not. Working on it.

            ann.loraine Ann Loraine added a comment -

            Built and ran a new version of IGB from this branch: https://bitbucket.org/aloraine/integrated-genome-browser/branch/IGBF-2911?dest=master

            ann.loraine Ann Loraine added a comment -

            Adding the new synonym enables IGB to access the JASPAR track hub. However, all the data files appear to load extremely slowly, even when I zoom in and click "Load Data" for a smaller region.
            I tried downloading one of the files directly onto my computer, and the download will probably take 15 minutes or longer.
            (This is the file URL: http://expdata.cmmt.ubc.ca/JASPAR/downloads/UCSC_tracks/2022/JASPAR2022_araTha1.bb)
            Is the "bigbed" format being handled the way it should in IGB, with the index from the top of the file getting loaded first followed by a second request to retrieve precisely the data needed? Will make a separate issue for this. For now, will merge in the new synonym update to the main development branch and move this ticket to closed.
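To make the question about partial loading concrete, here is a minimal sketch of the access pattern being asked about. It does not describe how IGB reads bigBed files. It assumes the standard bigBed layout (a 64-byte header beginning with the magic number 0x8789F2EB) and a server that honors HTTP Range requests. The helper names `range_header` and `parse_bigbed_header` are hypothetical, and the block does no network I/O.

```python
import struct
from typing import Dict

BIGBED_MAGIC = 0x8789F2EB  # magic number at the start of a bigBed file
HEADER_SIZE = 64           # size in bytes of the fixed bigBed header
HEADER_FIELDS = "IHHQQQHHQQIQ"  # struct layout of those 64 bytes


def range_header(offset: int, length: int) -> Dict[str, str]:
    """HTTP header requesting `length` bytes starting at byte `offset`."""
    # Range bounds are inclusive, hence the "- 1".
    return {"Range": f"bytes={offset}-{offset + length - 1}"}


def parse_bigbed_header(data: bytes) -> Dict[str, int]:
    """Extract the offsets needed to locate the index and data sections."""
    if len(data) < HEADER_SIZE:
        raise ValueError("need at least 64 bytes of header")
    # The file may be written in either byte order; try both.
    for byte_order in ("<", ">"):
        fields = struct.unpack(byte_order + HEADER_FIELDS, data[:HEADER_SIZE])
        if fields[0] == BIGBED_MAGIC:
            break
    else:
        raise ValueError("not a bigBed file")
    return {
        "version": fields[1],
        "zoom_levels": fields[2],
        "chrom_tree_offset": fields[3],
        "full_data_offset": fields[4],
        "full_index_offset": fields[5],
    }


if __name__ == "__main__":
    # A first request would ask only for the header bytes.
    print(range_header(0, HEADER_SIZE))
```

A client following this pattern would send the first request with `range_header(0, 64)`, read the index location from the parsed header, and then issue further ranged requests for only the blocks that overlap the visible region. If every request instead returns the whole file, load times like those described above would be expected.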

            ann.loraine Ann Loraine added a comment -

            Merged into IGB master branch (POM version 9.1.10). Moving to Done.

              People

              • Assignee:
                pbadzuh Philip Badzuh (Inactive)
                Reporter:
                ann.loraine Ann Loraine