IGBF-3079 (project: IGB)

CLONE - Attempt to convert and load each track hub featured in the table interface

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:
      None
    • Story Points:
      2.5
    • Sprint:
      Fall 3 2021 Sep 13 - Sep 24, Fall 4 2021 Sep 27 - Oct 8, Spring 4 2022 Feb 14 - Feb 25, Spring 5 2022 Feb 28 - Mar 11, Spring 6 2022 Mar 14 - Mar 25, Spring 7 2022 Mar 28 - Apr 8, Summer 1 2022 May 23

      Description

      The goal of this issue is to identify unknown bugs or usability issues in the current implementation of the track hub facade interface and back end.

      Check at least one genome (ideally all of them) for each track hub.

      Also, make note of any odd behavior in the genome and species menus. (These are populated from "species" and "synonyms" and "contents" endpoints and are therefore vulnerable to bugs in those endpoints.)

            Activity

            ann.loraine Ann Loraine added a comment -

            Omkar Marne asks for clarification regarding "odd behavior":

            How to test:

            • Make sure you have activated the DAS data source in the Preferences window
            • Use the latest master branch version of IGB (it is 9.1.10, currently)
            • After testing a converted Track Hub URL, reset preferences to defaults

            During testing, check the Species and Genome Version menus. Ensure that you see the species and genome versions of the Quickload there. The Genome Versions should be listed using the IGB name, not the UCSC name. If you notice the UCSC name instead, make a note of it. This is because IGB listing the UCSC name instead of the IGB name indicates that we would need to update our synonyms infrastructure.
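The menu check described above can be sketched as a simple lookup. This is only an illustration: the UCSC/IGB name pairs are the ones reported elsewhere in this ticket, while the function name and the idea of passing in the menu entries as a list are assumptions, not part of IGB or its synonyms infrastructure.

```python
# UCSC name -> IGB name. Pairs copied from the genomes reported in this ticket.
UCSC_TO_IGB = {
    "hg19": "H_sapiens_Feb_2009",
    "hg38": "H_sapiens_Dec_2013",
    "mm9": "M_musculus_Jul_2007",
    "mm10": "M_musculus_Dec_2011",
    "canFam3": "C_familiaris_Sep_2011",
}


def names_needing_synonyms(menu_names):
    """Return the menu entries that are not known IGB genome version names.

    Hypothetical helper: anything returned here would be worth noting,
    since it may be a UCSC name that lacks an IGB synonym.
    """
    igb_names = set(UCSC_TO_IGB.values())
    return [name for name in menu_names if name not in igb_names]
```

Passing the entries seen in the Genome Version menu returns only those that do not match a known IGB name, in their original order.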

            pbadzuh Philip Badzuh (Inactive) added a comment -

            When a Quickload data source added from the converter site contains genomes without an 'Open in IGB' button, those genomes are listed in IGB under their UCSC genome names, which are the names used in the table in that case. Now that I think about it, for these genome versions the table has no information about the species they belong to, so the only way to open the genome is to try every possible species and see whether any of them has that genome version available. A single hub can also have data for multiple species, which would make this worse. [~aloraine], I think we removed from the table the species associated with each genome (these were in parentheses to the right of each genome version) so that we could show the equivalent IGB genome name in parentheses instead, if available. I think species information should be added back to the genome column in the table. We could maybe group genome versions by species. What do you think?

            ann.loraine Ann Loraine added a comment -

            Actually, I think what happens is that an unrecognized genome version name gets listed in the species menu. Can you check?

            omarne Omkar Marne (Inactive) added a comment - - edited

            Testing is in progress.

            The genomes below can be opened successfully in IGB. The 'Open in IGB' link appears to work fine.

            • hg19 (H_sapiens_Feb_2009)
            • hg38 (H_sapiens_Dec_2013)
            • hg19 (H_sapiens_Feb_2009)
            • hg19 (H_sapiens_Feb_2009)
            • canFam3 (C_familiaris_Sep_2011)
            • hg38 (H_sapiens_Dec_2013)
            • hg19 (H_sapiens_Feb_2009)
            • hg19 (H_sapiens_Feb_2009)
            • hg38 (H_sapiens_Dec_2013)
            • hg19 (H_sapiens_Feb_2009)
            • ailMel1 (A_melanoleuca_Dec_2009)
            • latCha1 (L_chalumnae_Aug_2011)
            • mm9 (M_musculus_Jul_2007)
            • hg19 (H_sapiens_Feb_2009)
            • hg19 (H_sapiens_Feb_2009)
            • hg38 (H_sapiens_Dec_2013)
            • mm9 (M_musculus_Jul_2007)
            • hg38 (H_sapiens_Dec_2013)
            • hg19 (H_sapiens_Feb_2009)
            • hg38 (H_sapiens_Dec_2013)
            • danRer11 (D_rerio_May_2017)
            • danRer10 (D_rerio_Sep_2014)
            • hg19 (H_sapiens_Feb_2009)
            • hg19 (H_sapiens_Feb_2009)
            • hg38 (H_sapiens_Dec_2013)
            • hg19 (H_sapiens_Feb_2009)
            • hg38 (H_sapiens_Dec_2013)
            • hg38 (H_sapiens_Dec_2013)
            • hg19 (H_sapiens_Feb_2009)
            • hg38 (H_sapiens_Dec_2013)
            • hg38 (H_sapiens_Dec_2013)
            • tair10 (A_thaliana_Jun_2009)
            • panTro4 (P_troglodytes_Feb_2011)
            • hg19 (H_sapiens_Feb_2009)
            • mm10 (M_musculus_Dec_2011)
            • hg19 (H_sapiens_Feb_2009)
            • hg38 (H_sapiens_Dec_2013)
            • hg38 (H_sapiens_Dec_2013)
            • mm10 (M_musculus_Dec_2011)
            • mm10 (M_musculus_Dec_2011)
            • hg19 (H_sapiens_Feb_2009)
            • hg38 (H_sapiens_Dec_2013)
            • hg19 (H_sapiens_Feb_2009)
            • mm10 (M_musculus_Dec_2011)
            • hg38 (H_sapiens_Dec_2013)
            • galGal5 (G_gallus_Dec_2015)
            • danRer7 (D_rerio_Jul_2010)
            • hg19 (H_sapiens_Feb_2009)
            • hg38 (H_sapiens_Dec_2013)

            The track hub URLs below are unreachable through IGB, but the files can be opened in a browser.

            • https://translate.bioviz.org/api/?hubUrl=https://ftp.ncbi.nlm.nih.gov/snp/population_frequency/TrackHub/20200227123210/hub.txt&filePath=/
            • https://translate.bioviz.org/api/?hubUrl=https://hgdownload.soe.ucsc.edu/hubs/birds/hub.txt&filePath=/
            • https://translate.bioviz.org/api/?hubUrl=https://s3.us-east-2.amazonaws.com/brainepigenome/hub.txt&filePath=/
            • https://translate.bioviz.org/api/?hubUrl=https://raw.githubusercontent.com/Kraus-Lab/publichubs/master/breast_cancer_lncrna_pmid_26236012/hub.txt&filePath=/
            • https://translate.bioviz.org/api/?hubUrl=https://raw.githubusercontent.com/Kraus-Lab/publichubs/master/breast_cancer_lncrna_pmid_26236012/hub.txt&filePath=/
            • https://translate.bioviz.org/api/?hubUrl=https://data.broadinstitute.org/vgb/dog/dog/hub.txt&filePath=/
            • https://translate.bioviz.org/api/?hubUrl=http://waterston.gs.washington.edu/trackhubs/isolates/hub.txt&filePath=/
            • https://translate.bioviz.org/api/?hubUrl=https://krishna.gs.washington.edu/download/CADD/bigWig/CADD-browserTracks/hub.txt&filePath=/
            • https://translate.bioviz.org/api/?hubUrl=http://www.bcgsc.ca/downloads/edcc/data/CEMT/hub/bcgsc_datahub.txt&filePath=/
            • https://translate.bioviz.org/api/?hubUrl=https://biochem.wisc.edu/sites/default/files/labs/pike/genome-ucsc/GSE79815_MSC_HUB/hub.txt&filePath=/
            • https://translate.bioviz.org/api/?hubUrl=http://openslice.fenyolab.org/tracks/CPTAC/cptac/v1/hub.txt&filePath=/
            • https://translate.bioviz.org/api/?hubUrl=https://raw.githubusercontent.com/Iuliana-Ionita-Laza/Colocalization_tracks/master/hub.txt&filePath=/
            • https://translate.bioviz.org/api/?hubUrl=http://lisanwanglab.org/DASHR/tracks/hub.txt&filePath=/
            • https://translate.bioviz.org/api/?hubUrl=http://dbrip.org/dbRIPHub/hub.txt&filePath=/
            • https://translate.bioviz.org/api/?hubUrl=https://www.dfam.org/web_download/DfamHub/hub.txt&filePath=/
            • https://translate.bioviz.org/api/?hubUrl=https://resources.altius.org/~jvierstra/projects/footprinting.2020/hub.txt&filePath=/
            • https://translate.bioviz.org/api/?hubUrl=http://ftp.ebi.ac.uk/pub/databases/ensembl/encode/integration_data_jan2011/hub.txt&filePath=/
            • https://translate.bioviz.org/api/?hubUrl=https://api.wenglab.org/screen_v13/hubs/integrative52/hub.txt&filePath=/
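The converted Quickload URLs listed in this comment all share one shape: the converter endpoint, a hubUrl parameter holding the original hub.txt address, and filePath=/. A minimal sketch for building and unpacking such URLs during testing follows. The function names are hypothetical, and the hub URL is left unencoded only because that is how the URLs appear in this ticket, not because the converter is known to require it.

```python
from urllib.parse import parse_qs, urlsplit

# Endpoint as it appears in the URLs listed in this ticket.
CONVERTER_ENDPOINT = "https://translate.bioviz.org/api/"


def build_converter_url(hub_url):
    """Build a converted Quickload URL in the same form as the ones listed here."""
    return f"{CONVERTER_ENDPOINT}?hubUrl={hub_url}&filePath=/"


def hub_url_from_converter_url(converter_url):
    """Recover the original hub.txt URL, e.g. to open it directly in a browser.

    Assumes the hub URL itself contains no '&', which holds for the URLs
    in this ticket but is not guaranteed in general.
    """
    query = urlsplit(converter_url).query
    return parse_qs(query)["hubUrl"][0]
```

The second helper is handy when a converted URL fails in IGB and the underlying hub.txt needs to be checked on its own.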
            omarne Omkar Marne (Inactive) added a comment - - edited

            Observed IGB Usability issues:

            • Prolonged genome data loading after clicking the 'Open in IGB' link (around 30 seconds).
            • After adding a Quickload URL as a data source, the data source wasn't immediately available for some of the genomes. I had to restart IGB to make it available.
            • Removing data sources caused IGB to exit or hang. Error in the log: Exception in thread "AWT-EventQueue-0" java.lang.OutOfMemoryError: unable to create a new native thread.
            omarne Omkar Marne (Inactive) added a comment - - edited

            Spreadsheet of genome and hub names that cannot be opened in IGB: https://docs.google.com/spreadsheets/d/1l8Fwvk3AsWngRsvv3os5kDn4Y1klw1NxLOvsN4KC3Sk/edit#gid=0

            I have observed a pattern in which Quickload URLs are reachable through IGB when added as a data source, though my observation could be wrong. Quickload URLs for the Homo sapiens and M_musculus genomes are reachable through IGB. Quickload URLs for other genomes are unreachable. Most of the Quickload URLs belong to the Homo sapiens genome. Quickload URLs for genomes that don't have the 'Open in IGB' link are unreachable.

            pbadzuh Philip Badzuh (Inactive) added a comment -

            The main issue seems to be related to IGBF-2964 and IGBF-3001. After doing some further investigation, I have noted the following:

            Sample track hub, for viewing the SSL certificate in Chrome: https://cgl.gi.ucsc.edu/data/cactus/363-avian-2020-hub/Antrostomus_carolinensis/

            • The current validation, as implemented here, is not robust: the last element in certDomainComponents is unexpected.
              • expected: gov
              • actual: gov, O=National Library of Medicine, L=Bethesda, ST=Maryland, C=US
            • The alternative name in IGB is different from the one shown in Chrome.
              • IGB uses v2 of X.509, Chrome uses v3. It could be that IGB is requesting outdated cert files from the server by using v2 instead of v3.

            Possible fix: update IGB to use v3. Update a dependency? The Java version? IGB currently uses Java 1.8.0_181.
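The expected/actual mismatch above is the kind of result produced by splitting a whole certificate subject string on "." instead of isolating the common name first. The sketch below illustrates that failure mode only. How IGB actually computes certDomainComponents is not shown in this ticket, the function names are hypothetical, and the CN portion of the sample subject is invented (only the text from "gov" onward comes from the comment above).

```python
def naive_last_component(subject):
    """Split the whole subject string on '.' and return the last piece."""
    return subject.split(".")[-1]


def common_name(subject):
    """Return the CN value from a comma-separated subject string, or None.

    Simplified parser: it does not handle escaped commas or multi-valued RDNs.
    """
    for part in subject.split(","):
        key, _, value = part.strip().partition("=")
        if key.upper() == "CN":
            return value
    return None


def top_level_domain(subject):
    """Return the last dot-separated label of the CN, or None if there is no CN."""
    cn = common_name(subject)
    return cn.split(".")[-1] if cn else None


# Hypothetical subject: the "CN=*.nih.gov" prefix is an assumption.
SUBJECT = "CN=*.nih.gov, O=National Library of Medicine, L=Bethesda, ST=Maryland, C=US"
```

With this sample subject, naive_last_component returns the "gov, O=..." string reported as the actual value, while top_level_domain returns "gov". In Java, javax.naming.ldap.LdapName is one standard way to parse such strings without hand-written splitting.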

            ann.loraine Ann Loraine added a comment - - edited

            Attaching spreadsheet of defective quickload urls downloaded from spreadsheet created by Omkar Marne.

            ann.loraine Ann Loraine added a comment - - edited

            The newer Java update included in the new MacOS installer might correct the possible problem with IGB using v2 instead of v3 of something?

            Re-opening this issue so that we can test it.

            To test, try opening track hub from previous comment, quoting:

            Sample trackhub, for viewing SSL certificate in chrome: https://cgl.gi.ucsc.edu/data/cactus/363-avian-2020-hub/Antrostomus_carolinensis/

            nfreese Nowlan Freese added a comment - - edited

            Testing on Mac OS 11.6.5 with IGB zulu. My overall testing strategy was to add the track hubs as a new data source and attempt to load the genome and data.

            I tested the track hub referenced in previous comments (https://translate.bioviz.org/api/?hubUrl=https://cgl.gi.ucsc.edu/data/cactus/363-avian-2020-hub/hub.txt&filePath=/) in IGB zulu and I was able to load data for Antrostomus carolinensis > b10kgenes > bigGenePred > Bird_10K_genes with no issues in the logs. However, when I tested this same file in IGB 9.1.8 release it also worked.

            Upon testing other trackhubs mentioned to have issues in previous comments I ran into varying issues.

            Attempting to load data from the track hub https://translate.bioviz.org/api/?hubUrl=https://data.broadinstitute.org/vgb/dog/dog/hub.txt&filePath=/ resulted in a null pointer exception in the BigBedSymLoader class.

            Attempting to load data from the track hub https://translate.bioviz.org/api/?hubUrl=https://ftp.ncbi.nlm.nih.gov/snp/population_frequency/TrackHub/20200227123210/hub.txt&filePath=/ resulted in a URL not reachable, status code = -1 error.

            Attempting to load data from the track hub https://translate.bioviz.org/api/?hubUrl=https://s3-us-west-2.amazonaws.com/croc-paper-hub/hub.txt&filePath=/ resulted in a URL not reachable, status code = 403 error.

            Attempting to load data from the track hub https://translate.bioviz.org/api/?hubUrl=http://augustus.uni-greifswald.de/hubs/pisaster/hub.txt&filePath=/ resulted in an invalid URL or annots.xml not found error.

            Attempting to load data from the track hub https://translate.bioviz.org/api/?hubUrl=https://data.cyverse.org/dav-anon/iplant/home/samksmith/steguina_track-hub/steguina/hub.txt&filePath=/ worked with no issues.

            It is unclear to me what the issue was with the Antrostomus carolinensis genome data, as I was able to load the sequence and data. Without specific testing instructions for this genome I am unable to replicate and test the issue as it appears to work on both Zulu and release versions of IGB.

            Regarding the other errors, there seems to be a myriad of errors occurring that are preventing data from loading from different track hubs, as outlined above. These public track hubs assume that the data are valid and the track hub hosting site configured correctly. Unfortunately I think this may not be the case as it appears some of the files may be corrupt, locked, or in general unreachable. It may be difficult and/or time consuming to track down why each of these data files fails to load in IGB.
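Since the failures above differ by hub, a first triage step could be to record what each URL returns outside of IGB. The following is a minimal sketch under stated assumptions: the function name is hypothetical, and returning -1 when no HTTP response arrives merely mirrors the "status code = -1" wording above; what that code means inside IGB is not confirmed here.

```python
import urllib.error
import urllib.request


def probe_status(url, opener=urllib.request.urlopen, timeout=10):
    """Return the HTTP status code for url, or -1 if no HTTP response was received.

    `opener` is injectable so the function can be exercised without network access.
    """
    try:
        with opener(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 403.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, TLS error, timeout, and similar.
        return -1
```

Running this over the converted URLs and the underlying hub.txt URLs separately would show whether a failure comes from the converter, the hub host, or IGB itself.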

            nfreese Nowlan Freese added a comment -

            Closing ticket.

            Additional tickets may be needed to further investigate issues identified in this ticket.


              People

              • Assignee:
                Unassigned
                Reporter:
                ann.loraine Ann Loraine
              • Votes:
                0
                Watchers:
                4