  1. IGB
  2. IGBF-1879

Ensure Galaxy default dbkey values are in synonyms.txt

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:
    • Story Points: 2
    • Sprint:
      Summer 2019 Sprint 12, Fall 2019 Sprint 1, Fall 2019 Sprint 2, Fall 2019 Sprint 3, Fall 4 : 30 Sep to 11 Oct, Fall 5 : 14 Oct to 25 Oct, Fall 6 : 28 Oct to 8 Nov, Fall 7 : 11 Nov to 22 Nov, Fall 8 : 25 Nov to 6 Dec, Spring 5 2021 May 17 - May 28

      Description

      This ticket was first worked on in 2014 and needs to be revisited because the Galaxy code base has progressed and many new genome versions are now available.

      The Galaxy software relies heavily on the UCSC Genome Browser informatics system to support many different genome versions. Galaxy admins periodically run a cron job to update genome version data in Galaxy. We need to understand how that job works so that IGB's synonyms system stays up to date with Galaxy.

      The script(s) run during the cron job reside in:

      The script in this directory that "kicks off" the update is updateucsc.sh.sample.

      It also retrieves length files containing chromosome names and sizes. Note that this script also manages the import of non-UCSC genome names; look at the script to see how that part works.

      The above is actually a legacy system, but it has been used for a long time and seems to work quite well. In addition, there is a "data manager" system that lets an admin define new genomes through the Galaxy UI.

      Each genome version is identified by a "dbkey" that, in the case of UCSC genomes, is identical to the UCSC genome version name.

      An example dbkey is "apiMel2", which in IGB is called "A_mellifera_Jan_2005".

      Dan Blankenberg (https://galaxyproject.org/people/dan/) notes that there may be a REST endpoint that reports the dbkeys for all genome versions supported in a given Galaxy instance. If not, he recommends that we open a ticket in the Galaxy GitHub repository requesting one.
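
      If such an endpoint exists, collecting the dbkeys could look roughly like the sketch below. The path /api/genomes and the response shape (a JSON list of [display name, dbkey] pairs) are assumptions that need to be checked against the Galaxy instance in question; the function names are hypothetical and the sample display names in the usage note are made up.

```python
import json
import urllib.request


def dbkeys_from_payload(payload_text):
    """Extract dbkeys from a JSON payload.

    ASSUMPTION: the payload is a JSON list of [display_name, dbkey] pairs.
    Entries with any other shape are skipped.
    """
    entries = json.loads(payload_text)
    return sorted({entry[1] for entry in entries
                   if isinstance(entry, list) and len(entry) == 2})


def fetch_dbkeys(base_url):
    """Fetch dbkeys from a Galaxy instance.

    ASSUMPTION: the instance exposes the listing at <base_url>/api/genomes.
    """
    url = base_url.rstrip("/") + "/api/genomes"
    with urllib.request.urlopen(url) as response:
        return dbkeys_from_payload(response.read().decode("utf-8"))
```

      For example, dbkeys_from_payload('[["Honey bee (apiMel2)", "apiMel2"], ["Human (hg19)", "hg19"]]') yields the list ["apiMel2", "hg19"]. Parsing is kept separate from fetching so the parsing step can be checked without network access.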

      Galaxy also keeps track of a user-friendly name that is displayed to users. This is equivalent to column 2 in IGB Quickload's contents.txt file – see http://igbquickload.org/quickload/contents.txt.

      For this task, check that the synonyms.txt file includes all genome versions supported by Galaxy. If not, update it.
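
      The check itself can be sketched as follows. This assumes synonyms.txt is tab-delimited with one genome version per line and all of its synonyms on that line, and that lines starting with "#" are comments; both assumptions should be confirmed against the actual file. The function names are hypothetical.

```python
def load_synonyms(synonyms_text):
    """Return the set of every name that appears anywhere in synonyms.txt.

    ASSUMPTION: tab-delimited, one genome version per line, '#' starts a comment.
    """
    names = set()
    for line in synonyms_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        names.update(field.strip() for field in line.split("\t") if field.strip())
    return names


def missing_dbkeys(dbkeys, synonyms_text):
    """Return the Galaxy dbkeys that have no entry in synonyms.txt, sorted."""
    known = load_synonyms(synonyms_text)
    return sorted(key for key in set(dbkeys) if key not in known)
```

      With the example from this ticket, missing_dbkeys(["apiMel2", "hg38"], "A_mellifera_Jan_2005\tapiMel2\n") returns ["hg38"], since only apiMel2 appears in the sample synonyms line. Any dbkey reported this way would need a new or extended line in synonyms.txt.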

              People

              • Assignee: shamika Shamika Gajanan Kulkarni (Inactive)
              • Reporter: dcnorris David Norris (Inactive)
              • Votes: 0
              • Watchers: 3

                Dates

                • Created:
                  Updated:
                  Resolved:

                  Time Tracking

                  Estimated: 4h
                  Remaining: 4h
                  Logged: Not Specified