IGB / IGBF-2836

Get familiar with track hub format

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:
      None
    • Story Points:
      2
    • Sprint:
      Spring 2 2021 Apr 5 - Apr 16, Spring 3 2021 Apr 19 - Apr 30, Spring 4 2021 May 3 - May 14, Spring 5 2021 May 17 - May 28, Spring 6 2021 May 31 - June 11, Summer 1 2021 Jun 14 - Jun 25

      Description

      None of us is very familiar yet with the Track Hub specification.

      To learn it, we could write some code (e.g., a script) that, given a Track Hub URL, produces an IGB Quickload file and folder structure from the Track Hub.

      When a user "opens" this folder in IGB, they should then see the files from the Track Hub listed in the Available Data Sets section of IGB.

      This task should expose potential problems in mapping aspects of the Track Hub specification onto the IGB Quickload specification.

      If such a script existed, we could automate conversion of all available public Track Hubs.

      Request: Please add any new code to this repository: https://bitbucket.org/lorainelab/igbquickload
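
A minimal sketch of the folder structure such a script might produce, assuming the usual Quickload layout of a top-level contents.txt plus one folder per genome version holding genome.txt and annots.xml. The function name quickload_layout and the root folder name are hypothetical and are not taken from the igbquickload repository.

```python
from pathlib import Path


def quickload_layout(root, genome_names):
    """Return the file paths a converter would need to write.

    Assumed layout (hypothetical helper, not from the repository):
        root/contents.txt
        root/<genome>/genome.txt
        root/<genome>/annots.xml
    """
    root = Path(root)
    layout = {"contents": root / "contents.txt", "genomes": {}}
    for name in genome_names:
        layout["genomes"][name] = {
            "genome": root / name / "genome.txt",
            "annots": root / name / "annots.xml",
        }
    return layout


def create_layout(layout):
    """Create the folders and empty placeholder files on disk."""
    layout["contents"].parent.mkdir(parents=True, exist_ok=True)
    layout["contents"].touch()
    for paths in layout["genomes"].values():
        for path in paths.values():
            path.parent.mkdir(parents=True, exist_ok=True)
            path.touch()
```

Calling create_layout(quickload_layout("my_quickload", ["CAST_EiJ"])) would create a folder that IGB could, in principle, open as a Quickload site once the three files are filled in.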

            Activity

            chirag24 Chirag Chandrahas Shetty added a comment - edited

            Quickload website: http://igbquickload.org/quickload
            aloraine Ann Loraine added a comment -

            Note that in the UCSC system, there appear to be two types of hubs: track hubs and assembly hubs.
            Assembly hubs contain reference sequence data. Track hubs contain data that are mapped onto the reference genomes specified by assembly hubs. ("Assembly" is another name for the collection of sequences that make up a reference genome version.)
            UCSC has a REST API, which emits JSON, that clients can use to get information about track hubs and assembly hubs. I think the converter can probably take advantage of this API in several ways, such as locating the 2bit files IGB needs to display reference genome sequence data.
            See: http://genome.ucsc.edu/goldenPath/help/api.html

            attn: Chirag Chandrahas Shetty, Irvin Naylor, Rachel Weidenhammer, Omkar Marne
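
A minimal sketch of how a converter might call the REST API mentioned above, using only the Python standard library. The helper names hub_api_url and fetch_json are hypothetical. The base address and the semicolon-separated parameters follow the example URLs quoted in this ticket, and the shape of the returned JSON is not assumed here.

```python
import json
from urllib.request import urlopen

API_BASE = "https://api.genome.ucsc.edu"


def hub_api_url(endpoint, hub_url, **params):
    """Build an API URL such as .../list/hubGenomes?hubUrl=...;genome=...

    Parameters are joined with ';' to match the example URLs in this ticket.
    """
    pairs = [f"hubUrl={hub_url}"]
    pairs += [f"{key}={value}" for key, value in params.items()]
    return f"{API_BASE}/{endpoint}?" + ";".join(pairs)


def fetch_json(url, timeout=30):
    """Download a URL and decode the JSON body (requires network access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


HUB = "http://hgdownload.soe.ucsc.edu/hubs/mouseStrains/hub.txt"
genomes_url = hub_api_url("list/hubGenomes", HUB)
chromosomes_url = hub_api_url(
    "list/chromosomes", HUB, genome="CAST_EiJ", track="assembly"
)
```

Keeping URL construction separate from the network call makes it possible to test the converter without contacting UCSC.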

            chirag24 Chirag Chandrahas Shetty added a comment -

            This API call might help us create the quickload/contents.txt file:
            https://api.genome.ucsc.edu/list/hubGenomes?hubUrl=http://hgdownload.soe.ucsc.edu/hubs/mouseStrains/hub.txt
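
A rough sketch of turning a hubGenomes response into contents.txt lines (tab-separated genome folder name and description). It assumes the response has a top-level "genomes" object keyed by genome name, with optional "description" and "organism" fields. That shape should be checked against a real response. The sample dictionary below is made up for illustration and is not actual API output.

```python
def contents_txt_lines(hub_genomes):
    """Map a hubGenomes-style dict to Quickload contents.txt lines.

    Assumes hub_genomes["genomes"] maps genome name -> dict of settings.
    Falls back to the organism, then to the genome name, when no
    description is present.
    """
    lines = []
    for name, info in hub_genomes.get("genomes", {}).items():
        description = info.get("description") or info.get("organism") or name
        lines.append(f"{name}\t{description}")
    return lines


# Made-up stand-in for an API response, for illustration only.
sample_response = {
    "genomes": {
        "CAST_EiJ": {"organism": "example organism", "description": "example description"},
        "BALB_cJ": {},
    }
}

contents_txt = "\n".join(contents_txt_lines(sample_response)) + "\n"
```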
            chirag24 Chirag Chandrahas Shetty added a comment -

            This call returns data similar to genome.txt:
            https://api.genome.ucsc.edu/list/chromosomes?hubUrl=http://hgdownload.soe.ucsc.edu/hubs/mouseStrains/hub.txt;genome=CAST_EiJ;track=assembly
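
A rough sketch of writing genome.txt lines (tab-separated sequence name and length) from a chromosomes response. It assumes the response carries a "chromosomes" object mapping sequence names to lengths, which should be confirmed against a real response. The sample values are invented for illustration.

```python
def genome_txt_lines(chromosomes_response):
    """Map a chromosomes-style dict to Quickload genome.txt lines.

    Assumes chromosomes_response["chromosomes"] maps name -> length.
    """
    chromosomes = chromosomes_response.get("chromosomes", {})
    return [f"{name}\t{int(size)}" for name, size in chromosomes.items()]


# Invented sequence names and lengths, for illustration only.
sample_chromosomes = {"chromosomes": {"chr1": 1000, "chr2": 500}}

genome_txt = "\n".join(genome_txt_lines(sample_chromosomes)) + "\n"
```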
            chirag24 Chirag Chandrahas Shetty added a comment -

            trackDb.txt is similar to annots.xml
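
A minimal sketch of that correspondence: parse trackDb.txt stanzas (blank-line-separated blocks of "setting value" lines) and emit one annots.xml file element per stanza that has a bigDataUrl. The choice of attributes (name, title, description) and the use of bigDataUrl as the file name are assumptions, and relative bigDataUrl values would still need to be resolved against the trackDb.txt location. The sample stanza is invented.

```python
import xml.etree.ElementTree as ET


def parse_trackdb(text):
    """Split trackDb.txt text into a list of {setting: value} dicts."""
    stanzas, current = [], {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:  # a blank line ends the current stanza
            if current:
                stanzas.append(current)
                current = {}
            continue
        if line.startswith("#"):  # skip comment lines
            continue
        parts = line.split(None, 1)
        current[parts[0]] = parts[1] if len(parts) > 1 else ""
    if current:
        stanzas.append(current)
    return stanzas


def annots_xml(stanzas):
    """Build annots.xml text; stanzas without bigDataUrl are skipped."""
    root = ET.Element("files")
    for stanza in stanzas:
        url = stanza.get("bigDataUrl")
        if not url:
            continue
        ET.SubElement(
            root,
            "file",
            name=url,
            title=stanza.get("track", url),
            description=stanza.get("longLabel", ""),
        )
    return ET.tostring(root, encoding="unicode")


# Invented stanza, for illustration only.
sample_trackdb = """track exampleTrack
shortLabel Example
longLabel An example track
type bigWig
bigDataUrl example.bw
"""
```

Composite tracks and other nested trackDb features are ignored in this sketch, and they are likely where the mapping onto Quickload gets harder.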

            omarne Omkar Marne added a comment -

            https://bitbucket.org/omarne/track-hub-converter-webapp/src/main/
            Link for reference code: https://www.geeksforgeeks.org/how-to-convert-lists-to-xml-in-python/
            omarne Omkar Marne added a comment - edited

            Irvin Naylor, please pull the changes.
            Pull request: https://bitbucket.org/Inaylor01/track-hub-converter-webapp/pull-requests/4/igbf-2877
            omarne Omkar Marne added a comment - edited

            Dr. Ann Loraine - I compared all the trackDb.txt files, and they all share the same parameters (track, longLabel, shortLabel, and type), so we can use any of them as the title.
            Currently we use track as the title, which should be fine.
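
A small sketch of the comparison described above, assuming each trackDb.txt has already been parsed into a list of dictionaries (one per stanza). The helper names common_settings and track_title are hypothetical, and the fallback order is an illustration rather than the converter's actual rule.

```python
def common_settings(stanzas):
    """Return the setting names present in every stanza."""
    key_sets = [set(stanza) for stanza in stanzas]
    return set.intersection(*key_sets) if key_sets else set()


def track_title(stanza, preferred=("track", "shortLabel", "longLabel")):
    """Pick a title: the first non-empty setting in the preferred order."""
    for key in preferred:
        if stanza.get(key):
            return stanza[key]
    raise ValueError("stanza has no usable title setting")


# Invented stanzas, for illustration only.
example_stanzas = [
    {"track": "t1", "shortLabel": "T1", "longLabel": "Track one", "type": "bigBed"},
    {"track": "t2", "shortLabel": "T2", "longLabel": "Track two", "type": "bigWig", "visibility": "dense"},
]
shared = common_settings(example_stanzas)
titles = [track_title(stanza) for stanza in example_stanzas]
```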

            Below are the links to all of the trackDb.txt files.

            http://hgdownload.soe.ucsc.edu//hubs//mouseStrains//GCA_001632525.1_BALB_cJ_v1//GCA_001632525.1_BALB_cJ_v1.trackDb.txt

            http://hgdownload.soe.ucsc.edu//hubs//mouseStrains//GCA_001624445.1_CAST_EiJ_v1//GCA_001624445.1_CAST_EiJ_v1.trackDb.txt

            http://hgdownload.soe.ucsc.edu//hubs//mouseStrains//GCA_001624185.1_129S1_SvImJ_v1//GCA_001624185.1_129S1_SvImJ_v1.trackDb.txt

            http://hgdownload.soe.ucsc.edu//hubs//mouseStrains//GCA_001624535.1_FVB_NJ_v1//GCA_001624535.1_FVB_NJ_v1.trackDb.txt

            http://hgdownload.soe.ucsc.edu//hubs//mouseStrains//GCA_001632555.1_C57BL_6NJ_v1//GCA_001632555.1_C57BL_6NJ_v1.trackDb.txt

            http://hgdownload.soe.ucsc.edu//hubs//mouseStrains//GCA_001624295.1_AKR_J_v1//GCA_001624295.1_AKR_J_v1.trackDb.txt

            http://hgdownload.soe.ucsc.edu//hubs//mouseStrains//GCA_001624215.1_A_J_v1//GCA_001624215.1_A_J_v1.trackDb.txt

            http://hgdownload.soe.ucsc.edu//hubs//mouseStrains//GCA_001624745.1_NZO_HlLtJ_v1//GCA_001624745.1_NZO_HlLtJ_v1.trackDb.txt

            http://hgdownload.soe.ucsc.edu//hubs//mouseStrains//GCA_001624675.1_NOD_ShiLtJ_v1//GCA_001624675.1_NOD_ShiLtJ_v1.trackDb.txt

            http://hgdownload.soe.ucsc.edu//hubs//mouseStrains//GCA_001624865.1_SPRET_EiJ_v1//GCA_001624865.1_SPRET_EiJ_v1.trackDb.txt

            http://hgdownload.soe.ucsc.edu//hubs//mouseStrains//GCA_001632575.1_C3H_HeJ_v1//GCA_001632575.1_C3H_HeJ_v1.trackDb.txt

            http://hgdownload.soe.ucsc.edu//hubs//mouseStrains//GCA_001624835.1_WSB_EiJ_v1//GCA_001624835.1_WSB_EiJ_v1.trackDb.txt

            http://hgdownload.soe.ucsc.edu//hubs//mouseStrains//GCA_001624475.1_CBA_J_v1//GCA_001624475.1_CBA_J_v1.trackDb.txt

            http://hgdownload.soe.ucsc.edu//hubs//mouseStrains//GCA_001624775.1_PWK_PhJ_v1//GCA_001624775.1_PWK_PhJ_v1.trackDb.txt

            http://hgdownload.soe.ucsc.edu//hubs//mouseStrains//GCA_001632615.1_LP_J_v1//GCA_001632615.1_LP_J_v1.trackDb.txt

            http://hgdownload.soe.ucsc.edu//hubs//mouseStrains//GCA_001624505.1_DBA_2J_v1//GCA_001624505.1_DBA_2J_v1.trackDb.txt

            https://ftp.ncbi.nlm.nih.gov/snp/population_frequency/TrackHub/20200227123210/hg19/trackDb.txt


              People

              • Assignee:
                omarne Omkar Marne
                Reporter:
                aloraine Ann Loraine