  1. IGB
  2. IGBF-4104

Create a POC for track hub converter

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:
      None

      Attachments

        Issue Links

          Activity

          uchinta Udaya Chinta (Inactive) created issue -
          uchinta Udaya Chinta (Inactive) made changes -
          Field Original Value New Value
          Epic Link IGBF-2831 [ 19524 ]
          uchinta Udaya Chinta (Inactive) made changes -
          Status To-Do [ 10305 ] In Progress [ 3 ]
          uchinta Udaya Chinta (Inactive) added a comment - - edited

          Created an example hub for the POC from an existing QuickLoad:

          • hub.txt:
          1. hub IGBQuickLoad
          2. shortLabel IGB QUICK LOAD
          3. longLabel Generated from IGB QuickLoad
          4. genomesFile genomes.txt
          5. email myEmail@address
          • genomes.txt
          1. genome C_papaya_Aug_2010
            C_papaya_Aug_2010 is the IGB QuickLoad name, but the hub needs the UCSC genome name. To find the UCSC genome name, look it up in synonyms.txt. If it is not in synonyms.txt, reuse the IGB QuickLoad name.
          2. trackDb C_papaya_Aug_2010/trackDb.txt
          3. organism Carica papaya
            (The organism comes from species.txt: split the genome name on "_", join the first two parts with "_" to get C_papaya, and check whether that value is present in species.txt. If it is, take the first column of the matching row as the organism name.)
          4. defaultPos chrI:0-6239266
            From <http://igbquickload.org/quickload/C_papaya_Aug_2010/genome.txt>
          5. twoBitPath http://igbquickload.org/quickload/C_papaya_Aug_2010/C_papaya_Aug_2010.2bit

            The 2bit file can be taken directly from annots.xml (the entry with reference = true), or fetched from the genome version folder:
            http://igbquickload.org/quickload/C_papaya_Aug_2010/

          Note: organism, defaultPos and twoBitPath are not mandatory. They are required only when the UCSC genome name does not match the IGB QuickLoad name.
          • trackDb.txt
          1. track Caricapapaya
          2. bigDataUrl C_papaya_Aug_2010.bed.gz
            From <http://igbquickload.org/quickload/C_papaya_Aug_2010/annots.xml> (name)
          3. shortLabel Carica papaya gene models from Phytozome v7
            From <http://igbquickload.org/quickload/C_papaya_Aug_2010/annots.xml>
          4. longLabel protein-coding gene models
            From <http://igbquickload.org/quickload/C_papaya_Aug_2010/annots.xml>
          5. type bed
          6. visibility dense (hard coded)
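
          The trackDb.txt mapping above could be sketched as follows. This is only an illustration, not the converter's actual code: the source says bigDataUrl comes from the name attribute in annots.xml, but the use of title for shortLabel, description for longLabel, and the generated track names (track1, track2, ...) are assumptions, and trackdb_stanzas is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def trackdb_stanzas(annots_xml: str, track_type: str) -> str:
    """Build trackDb.txt text from the <file> elements of an annots.xml string."""
    stanzas = []
    for f in ET.fromstring(annots_xml).iter("file"):
        if f.get("reference") == "true":
            continue  # the reference sequence feeds twoBitPath, not a track
        name = f.get("name", "")
        # Assumption: "title" and "description" hold the label text.
        title = f.get("title") or name
        description = f.get("description") or title
        stanzas.append("\n".join([
            f"track track{len(stanzas) + 1}",  # hypothetical naming scheme
            f"bigDataUrl {name}",
            f"shortLabel {title}",
            f"longLabel {description}",
            f"type {track_type}",
            "visibility dense",  # hard coded, as in the example above
        ]))
    return "\n\n".join(stanzas) + "\n"
```

          The track type is passed in by the caller because annots.xml does not record it.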

          track hub : https://api.bitbucket.org/2.0/repositories/lorainelab_udaya/testquickload/src/main/trackHub1/hub.txt

          The above track hub did not work, since I still need to convert the bed file referenced by bigDataUrl.
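
          The genome name and organism lookups described for genomes.txt can be sketched as below. This is a rough sketch under stated assumptions: synonyms.txt and species.txt are treated as tab-separated files, and because the comment does not say which synonym is the UCSC one, the caller passes a set of known UCSC names (ucsc_names). All function names are hypothetical.

```python
from typing import List, Optional, Set


def read_tsv(text: str) -> List[List[str]]:
    """Split tab-separated text into rows, skipping blank lines."""
    return [line.split("\t") for line in text.splitlines() if line.strip()]


def ucsc_genome_name(igb_name: str, synonym_rows: List[List[str]],
                     ucsc_names: Set[str]) -> str:
    """Return a UCSC synonym for igb_name, or igb_name itself if none is listed."""
    for row in synonym_rows:
        if igb_name in row:
            for synonym in row:
                if synonym in ucsc_names:  # assumption: caller knows UCSC names
                    return synonym
    return igb_name  # fallback described in the comment above


def organism_name(igb_name: str, species_rows: List[List[str]]) -> Optional[str]:
    """Look up the organism via the first two '_'-separated parts of igb_name."""
    prefix = "_".join(igb_name.split("_")[:2])  # C_papaya_Aug_2010 -> C_papaya
    for row in species_rows:
        if prefix in row:
            return row[0]  # first column is the organism name
    return None
```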

          uchinta Udaya Chinta (Inactive) made changes -
          Link This issue relates to IGBF-4090 [ IGBF-4090 ]
          uchinta Udaya Chinta (Inactive) added a comment - - edited

          The papaya example above does not work because there is an issue with that genome in IGB, so I created another POC.

          Trackhub : https://api.bitbucket.org/2.0/repositories/lorainelab_udaya/testquickload/src/main/trackHub2/hub.txt
          UCSC connect hub : https://genome.ucsc.edu/cgi-bin/hgHubConnect

          To test whether the hub works, add the track hub URL above on the UCSC connect hub page.

          uchinta Udaya Chinta (Inactive) made changes -
          Link This issue relates to IGBF-4105 [ IGBF-4105 ]
          uchinta Udaya Chinta (Inactive) made changes -
          Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
          uchinta Udaya Chinta (Inactive) made changes -
          Epic Link IGBF-2831 [ 19524 ]
          uchinta Udaya Chinta (Inactive) made changes -
          Epic Link IGBF-2831 [ 19524 ]
          uchinta Udaya Chinta (Inactive) made changes -
          Assignee Udaya Chinta [ uchinta ]
          ann.loraine Ann Loraine made changes -
          Sprint Spring 2 [ 211 ] Spring 2, Spring 3 [ 211, 212 ]
          ann.loraine Ann Loraine made changes -
          Rank Ranked higher
          nfreese Nowlan Freese made changes -
          Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
          nfreese Nowlan Freese made changes -
          Assignee Nowlan Freese [ nfreese ]
          nfreese Nowlan Freese added a comment -

          When I add the hub to UCSC and then click GO I see the following error:
          Can't find description for hub_6367087_tair9

          nfreese Nowlan Freese made changes -
          Assignee Nowlan Freese [ nfreese ] Udaya Chinta [ uchinta ]
          nfreese Nowlan Freese made changes -
          Status First Level Review in Progress [ 10301 ] To-Do [ 10305 ]
          uchinta Udaya Chinta (Inactive) added a comment - Bitbucket : https://bitbucket.org/lorainelab_udaya/testquickload/src/main/trackHub2/
          uchinta Udaya Chinta (Inactive) added a comment - - edited

          Dr. Nowlan Freese, could you please check now? It is working.

          I made the change below in genomes.txt. Only the defaultPos line differs (chrI became Chr1):

          Before:
          genome tair9
          trackDb tair9/trackDb.txt
          organism araTha
          defaultPos chrI:0-30427671
          twoBitPath http://igbquickload.org/quickload/A_thaliana_Jun_2009/A_thaliana_Jun_2009.2bit

          Now:
          genome tair9
          trackDb tair9/trackDb.txt
          organism araTha
          defaultPos Chr1:0-30427671
          twoBitPath http://igbquickload.org/quickload/A_thaliana_Jun_2009/A_thaliana_Jun_2009.2bit
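
          One way to avoid typing the sequence name by hand is to derive defaultPos from genome.txt, which the first comment names as the source of this value. The sketch below assumes genome.txt lists one sequence per line as a name followed by a length, separated by whitespace; default_pos is a hypothetical helper, not existing converter code.

```python
def default_pos(genome_txt: str) -> str:
    """Return 'name:0-length' for the first sequence listed in genome.txt."""
    for line in genome_txt.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            name, length = fields[0], int(fields[1])
            return f"{name}:0-{length}"
    raise ValueError("no sequence lines found in genome.txt")
```

          Taking the name from genome.txt keeps its spelling (for example Chr1 rather than chrI) identical to the one the QuickLoad site itself uses.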

          nfreese Nowlan Freese made changes -
          Status To-Do [ 10305 ] In Progress [ 3 ]
          nfreese Nowlan Freese made changes -
          Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
          nfreese Nowlan Freese made changes -
          Assignee Udaya Chinta [ uchinta ] Nowlan Freese [ nfreese ]
          nfreese Nowlan Freese made changes -
          Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
          nfreese Nowlan Freese added a comment -

          There is one additional issue with the type parameter in trackDb.txt. It should be set to bigBed 12 +, which specifies that the file is BED 12 plus additional columns (the bigBed file is a bed detail file with 14 columns). Unfortunately, this means we may have to make assumptions about how many columns are present in a user's bigBed file, because this information is not required for an IGB QuickLoad.
          From: https://genome.ucsc.edu/goldenpath/help/hgTrackHubHelp.html

          type - the format of the file specified by bigDataUrl. Must be either bigWig, bigBed, bigBarChart, bigGenePred, bigInteract, bigNarrowPeak, bigChain, bigPsl, bigMaf, hic, bam, halSnake or vcfTabix (Note: use type bam for CRAM files). If the type is bigBed, it may be followed by an optional number denoting the number of fields in the bigBed file (e.g., "type bigBed 12" for a file with 12 fields or "type bigBed 12 +" for a file that contains additional non-standard columns). If no number is given, a default value of 3 is assumed (a very limited display that omits names, strand information, and exon boundaries).
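
          The rule quoted above can be turned into a small helper that picks the type string from a column count. This is a sketch under one assumption: any file with more than 12 columns is treated as BED 12 plus extra columns. A column count alone cannot distinguish other layouts (for example 6 standard fields plus extras), so those would need to be handled separately. bigbed_type and count_fields are hypothetical names.

```python
def count_fields(bed_line: str) -> int:
    """Count tab-separated fields in one BED record."""
    return len(bed_line.rstrip("\n").split("\t"))


def bigbed_type(field_count: int) -> str:
    """Return the trackDb 'type' value for a bigBed with field_count columns."""
    if field_count < 3:
        raise ValueError("a BED record needs at least 3 fields")
    if field_count > 12:
        return "bigBed 12 +"  # assumption: 12 standard fields plus extras
    return f"bigBed {field_count}"
```

          For the 14-column bed detail file discussed here, bigbed_type(14) gives "bigBed 12 +".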

          nfreese Nowlan Freese added a comment - - edited

          Note on the E_unicornis example file. It is a bed detail (12 + 2 columns), but note that there are no introns, so the gene models will appear as rectangles in UCSC/IGB.

          chrI 1806 2169 NM_001180043.1 0
          1806 2169 0 1 363, 0, PAU8 seripauperin PAU8
          nfreese Nowlan Freese added a comment -

          With the change to type bigBed 12 +, the gene models appear as expected in UCSC.

          nfreese Nowlan Freese made changes -
          Assignee Nowlan Freese [ nfreese ] Udaya Chinta [ uchinta ]
          nfreese Nowlan Freese made changes -
          Status First Level Review in Progress [ 10301 ] Ready for Pull Request [ 10304 ]
          ann.loraine Ann Loraine made changes -
          Sprint Spring 2, Spring 3 [ 211, 212 ] Spring 2, Spring 3, Spring 4 [ 211, 212, 213 ]
          ann.loraine Ann Loraine made changes -
          Rank Ranked higher
          uchinta Udaya Chinta (Inactive) made changes -
          Status Ready for Pull Request [ 10304 ] Pull Request Submitted [ 10101 ]
          uchinta Udaya Chinta (Inactive) made changes -
          Status Pull Request Submitted [ 10101 ] Reviewing Pull Request [ 10303 ]
          uchinta Udaya Chinta (Inactive) made changes -
          Status Reviewing Pull Request [ 10303 ] Merged Needs Testing [ 10002 ]
          uchinta Udaya Chinta (Inactive) made changes -
          Status Merged Needs Testing [ 10002 ] Post-merge Testing In Progress [ 10003 ]
          uchinta Udaya Chinta (Inactive) made changes -
          Resolution Done [ 10000 ]
          Status Post-merge Testing In Progress [ 10003 ] Closed [ 6 ]

            People

            • Assignee:
              uchinta Udaya Chinta (Inactive)
              Reporter:
              uchinta Udaya Chinta (Inactive)
            • Votes:
              0
              Watchers:
              2

              Dates

              • Created:
                Updated:
                Resolved: