IGB / IGBF-1879

Ensure Galaxy default dbkey values are in synonyms.txt

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:
    • Story Points: 2
    • Sprint: Summer 2019 Sprint 12, Fall 2019 Sprint 1, Fall 2019 Sprint 2, Fall 2019 Sprint 3, Fall 4 : 30 Sep to 11 Oct, Fall 5 : 14 Oct to 25 Oct, Fall 6 : 28 Oct to 8 Nov, Fall 7 : 11 Nov to 22 Nov, Fall 8 : 25 Nov to 6 Dec, Spring 5 2021 May 17 - May 28

      Description

      This ticket was first worked on in 2014 and needs to be revisited because the Galaxy code base has progressed and many new genome versions are available.

      The Galaxy software relies heavily on the UCSC Genome Browser informatics system to support many different genome versions. There's a "cron job" that Galaxy admins periodically run to update genome version data in Galaxy. We need to understand how that works to make sure that IGB's synonyms system stays up to date with Galaxy.

      The script(s) run during the cron job reside in:

      • https://github.com/galaxyproject/galaxy/tree/dev/cron

      The script in this directory that "kicks off" the update is updateucsc.sh.sample.

      It also fetches the length files that list chromosome names and sizes, and it manages the import of non-UCSC genome names. Look at the script to see how that part works.

      The above is actually a legacy system but has been used for a long time and seems to work quite well. In addition, there's a "data manager" system that allows an admin to use the Galaxy UI to define new genomes.

      Each genome version is identified by a "dbkey" that, in the case of UCSC genomes, is identical to the UCSC genome version name.

      An example dbkey is "apiMel2", which in IGB is called "A_mellifera_Jan_2005".

      Dan Blankenberg (https://galaxyproject.org/people/dan/) notes that there may be a REST endpoint that reports the dbkeys for all genome versions supported in a given Galaxy instance. If not, he recommends we open a ticket in the Galaxy GitHub repository requesting one.

      Galaxy also keeps track of a user-friendly name that is displayed to users. This is equivalent to column 2 in IGB Quickload's contents.txt file; see http://igbquickload.org/quickload/contents.txt.

      For this task, check that the synonyms.txt file includes all genome versions supported by Galaxy. If not, update it.
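The check described above amounts to a set difference between the dbkeys Galaxy reports and the names already listed in synonyms.txt. The following is only a minimal sketch. It assumes synonyms.txt is tab-delimited with one genome version per line and all of its synonyms on that line, and that lines starting with "#" can be skipped. The function names are hypothetical, and the sample row reuses the apiMel2 example from this ticket.

```python
def load_synonyms(text):
    """Collect every name that appears in a synonyms file.

    Assumption: one genome version per line, synonyms separated by tabs,
    blank lines and lines starting with '#' ignored.
    """
    names = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        names.update(field.strip() for field in line.split("\t") if field.strip())
    return names


def missing_dbkeys(dbkeys, synonyms_text):
    """Return, sorted, the dbkeys that have no entry in the synonyms text."""
    known = load_synonyms(synonyms_text)
    return sorted(set(dbkeys) - known)


# Sample row built from the example in this ticket: apiMel2 <-> A_mellifera_Jan_2005.
sample_synonyms = "A_mellifera_Jan_2005\tapiMel2\n"
print(missing_dbkeys(["apiMel2", "Arabidopsis_thaliana_TAIR10"], sample_synonyms))
```

Any dbkey that this prints would need a new or extended row in synonyms.txt.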

            Activity

            dcnorris David Norris (Inactive) created issue -
            dcnorris David Norris (Inactive) made changes -
            Field Original Value New Value
            Status Open [ 1 ] Resolved [ 5 ]
            Resolution Fixed [ 1 ]
            dcnorris David Norris (Inactive) made changes -
            Status Resolved [ 5 ] Closed [ 6 ]
            ann.loraine Ann Loraine made changes -
            Resolution Fixed [ 1 ]
            Status Closed [ 6 ] Reopened [ 4 ]
            Assignee David Norris [ dcnorris ]
            ann.loraine Ann Loraine made changes -
            Story Points 0.5
            Sprint Summer 2019 Sprint 11 [ 70 ]
            Labels Intermediate
            ann.loraine Ann Loraine made changes -
            Project IGB 8.1 [ 10640 ] IGB [ 10840 ]
            Key IGBE-142 IGBF-1879
            Workflow classic default workflow [ 14357 ] Loraine Lab Workflow [ 18602 ]
            ann.loraine Ann Loraine made changes -
            Description This ticket was first worked on in 2014 and needs to be re-visited because Galaxy code base has progressed and many new genome versions are available.

            The Galaxy software relies heavily on the UCSC Genome Browser informatics system to support many different genome versions. There's a "cron job" that Galaxy admin's periodically run to update genome version data in Galaxy. We need to understand how that works to make sure that IGB's synonyms system stays up to date with Galaxy.

            The script(s) run during the cron job reside in:
            * https://github.com/galaxyproject/galaxy/tree/dev/cron

            The script in this directory that "kicks off" the update is updateucsc.sh.sample.

            It also handles getting length files with chromosome names and sizes. Note that this script also manages import of non-ucsc genome names. Look at the script to see how that aspect is working.

            The above is actually a legacy system but has been used for a long time and seems to work quite well. In addition, there's a "data manager" system that allows an admin to use the Galaxy UI to define new genomes.

            Each genome version is identified by a "dbkey" that in the case of ucsc genomes, is identical to ucsc genome version names.

            An example dbkey is "apiMel2", which is IGB, is called "A_mellifera_Jan_2005"

            Dan Blankenberg (https://galaxyproject.org/people/dan/) notes that there may actually be a REST endpoint that can report the dbkeys for all genome versions supported in a given Galaxy instance. If not, he recommends we open a ticket in Galaxy github requesting this.

            Galaxy also keeps track of a user-friendly name that is displayed to users. This is equivalent to column 2 in IGB Quickload's contents.txt file -- see http://igbquickload.org/quickload/contents.txt.

            For this task, check that the synonyms.txt file includes all genome versions supported by Galaxy. If not, update it.

            ann.loraine Ann Loraine made changes -
            Sprint Summer 2019 Sprint 11 [ 70 ] Summer 2019 Sprint 12 [ 71 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Epic Link IGBF-1880 [ 17970 ]
            ann.loraine Ann Loraine made changes -
            Story Points 0.5 2
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            shamika Shamika Gajanan Kulkarni (Inactive) made changes -
            Assignee Shamika Gajanan Kulkarni [ shamika ]
            ann.loraine Ann Loraine made changes -
            Sprint Summer 2019 Sprint 12 [ 71 ] Summer 2019 Sprint 12, Fall 2019 Sprint 1 [ 71, 72 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine added a comment -

            Unassigning since SK will be out of town during the sprint.

            ann.loraine Ann Loraine made changes -
            Assignee Shamika Gajanan Kulkarni [ shamika ]
            ann.loraine Ann Loraine made changes -
            Workflow Loraine Lab Workflow [ 18602 ] Fall 2019 Workflow Update [ 19085 ]
            Status Reopened [ 4 ] Open [ 1 ]
            ann.loraine Ann Loraine made changes -
            Sprint Summer 2019 Sprint 12, Fall 2019 Sprint 1 [ 71, 72 ] Summer 2019 Sprint 12, Fall 2019 Sprint 1, Fall 2019 Sprint 2 [ 71, 72, 73 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Sprint Summer 2019 Sprint 12, Fall 2019 Sprint 1, Fall 2019 Sprint 2 [ 71, 72, 73 ] Summer 2019 Sprint 12, Fall 2019 Sprint 1, Fall 2019 Sprint 2, Fall 2019 Sprint 3 [ 71, 72, 73, 74 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Sprint Summer 2019 Sprint 12, Fall 2019 Sprint 1, Fall 2019 Sprint 2, Fall 2019 Sprint 3 [ 71, 72, 73, 74 ] Summer 2019 Sprint 12, Fall 2019 Sprint 1, Fall 2019 Sprint 2, Fall 2019 Sprint 3, Fall 2019 Sprint 4 [ 71, 72, 73, 74, 75 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Sprint Summer 2019 Sprint 12, Fall 2019 Sprint 1, Fall 2019 Sprint 2, Fall 2019 Sprint 3, Fall 4 : 30 Sep to 11 Oct [ 71, 72, 73, 74, 75 ] Summer 2019 Sprint 12, Fall 2019 Sprint 1, Fall 2019 Sprint 2, Fall 2019 Sprint 3, Fall 4 : 30 Sep to 11 Oct, Fall 5 : 14 Oct to 25 Oct [ 71, 72, 73, 74, 75, 76 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Workflow Fall 2019 Workflow Update [ 19085 ] Revised Fall 2019 Workflow Update [ 20835 ]
            Status Open [ 1 ] To-Do [ 10305 ]
            ann.loraine Ann Loraine made changes -
            Fix Version/s 9.1.0 Major Release [ 10601 ]
            ann.loraine Ann Loraine made changes -
            Sprint Summer 2019 Sprint 12, Fall 2019 Sprint 1, Fall 2019 Sprint 2, Fall 2019 Sprint 3, Fall 4 : 30 Sep to 11 Oct, Fall 5 : 14 Oct to 25 Oct [ 71, 72, 73, 74, 75, 76 ] Summer 2019 Sprint 12, Fall 2019 Sprint 1, Fall 2019 Sprint 2, Fall 2019 Sprint 3, Fall 4 : 30 Sep to 11 Oct, Fall 5 : 14 Oct to 25 Oct, Fall 6 : 28 Oct to 8 Nov [ 71, 72, 73, 74, 75, 76, 77 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            shamika Shamika Gajanan Kulkarni (Inactive) made changes -
            Assignee Shamika Gajanan Kulkarni [ shamika ]
            shamika Shamika Gajanan Kulkarni (Inactive) made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            shamika Shamika Gajanan Kulkarni (Inactive) added a comment -

            I have a question about this. Are these the files being referred to here?
            Galaxy: https://github.com/galaxyproject/galaxy/blob/dev/tool-data/shared/ucsc/builds.txt.sample
            IGB synonyms.txt: http://www.igbquickload.org/quickload/synonyms.txt
            Also, could you please elaborate on the purpose of http://igbquickload.org/quickload/contents.txt?
            Thank you.
            shamika Shamika Gajanan Kulkarni (Inactive) added a comment -

            Prof. [~aloraine], the Galaxy team has provided some links for reference. Could you please take a look and let me know if this is what we need?
            1) https://usegalaxy.org/api/genomes - They said that every Galaxy instance is different: this list applies to the so-called Galaxy Main, and Galaxy EU would have a (more or less) different list.
            2) http://datacache.galaxyproject.org/ - They have a trove of genome references, shared across some of the Galaxy instances, that is usually available here.
            ann.loraine Ann Loraine added a comment -

            I would like to increase the scope to include Galaxy EU, since we also have a lot of users coming from Germany (mainly) and some other European countries. Can we do this as well?

            shamika Shamika Gajanan Kulkarni (Inactive) added a comment -

            I'll ask them for that too. I think we should be able to.

            shamika Shamika Gajanan Kulkarni (Inactive) added a comment - - edited

            Prof. [~aloraine]
            They said that every Galaxy has that path.
            https://usegalaxy.eu/api/genomes
            And the second link has 2 links which lead to reference data. We do have a list of dbkeys there, I checked. So we should be good I think. Could you please check too?

            ann.loraine Ann Loraine added a comment - - edited

            The REST endpoint looks much easier to parse and build into a workflow.

            My question:

            Where do we find the "dbkey" in the text returned by the REST endpoint?

            For example, here is one item in the text being returned from https://usegalaxy.org/api/genomes:

            ["Arabidopsis thaliana TAIR10 (Arabidopsis_thaliana_TAIR10) (Arabidopsis_thaliana_TAIR10)", "Arabidopsis_thaliana_TAIR10"]

            There are two items here. The second item looks like it is the dbkey. Is that correct?

            ann.loraine Ann Loraine added a comment -

            It would be great if this endpoint tells us exactly which genome versions are supported by a Galaxy instance.
            We can't support every Galaxy in the world, but I bet that if we make sure IGB recognizes every genome version reported by usegalaxy.org (Galaxy main), usegalaxy.eu (European Galaxy), and usegalaxy.org.au (Australian Galaxy), then we will probably cover all the major Galaxy user groups.
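One way to keep track of coverage across several Galaxy instances is to record, for each dbkey, which instances report it. The sketch below is only an illustration. The function name is hypothetical, the instance labels are arbitrary, and the sample membership is made up rather than copied from any real instance listing.

```python
def instances_by_dbkey(dbkeys_per_instance):
    """Map each dbkey to the sorted list of Galaxy instances that report it."""
    coverage = {}
    for instance, dbkeys in dbkeys_per_instance.items():
        for dbkey in set(dbkeys):  # ignore repeats within one instance
            coverage.setdefault(dbkey, []).append(instance)
    return {dbkey: sorted(names) for dbkey, names in coverage.items()}


# Made-up membership for illustration; real lists would come from each
# instance's /api/genomes endpoint.
reported = {
    "main": ["hg38", "Arabidopsis_thaliana_TAIR10"],
    "eu": ["hg38", "Solanum_lycopersicum_Sol_Genomics_itag2.4"],
}
coverage = instances_by_dbkey(reported)
for dbkey in sorted(coverage):
    print(dbkey, coverage[dbkey])
```

The keys of the resulting dictionary form the combined list that IGB would need to recognize.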
            ann.loraine Ann Loraine added a comment -

            Regarding this pair of strings served from https://usegalaxy.org/api/genomes:

            ["Arabidopsis thaliana TAIR10 (Arabidopsis_thaliana_TAIR10) (Arabidopsis_thaliana_TAIR10)", "Arabidopsis_thaliana_TAIR10"]

            It looks like the first string is the human-friendly string (i.e., "display name") that appears in Galaxy UI menus where a user can associate a genome version with a file. The second string looks like it could be the dbkey.

            For each genome version, it would be super great if we can get:

            • list of chromosome names and their sizes (in base pairs)
            • genus name
            • species name
            • variety or cultivar (only if relevant or required to avoid name collisions)
            • month and year of release

            If we have this information, then we can build everything we need to show a Galaxy file in IGB, for each genome version supported in the three Galaxies.
            shamika Shamika Gajanan Kulkarni (Inactive) added a comment - - edited

            Prof. [~aloraine]
            The second string in the list is the dbkey. So, for ["Arabidopsis thaliana TAIR10 (Arabidopsis_thaliana_TAIR10) (Arabidopsis_thaliana_TAIR10)", "Arabidopsis_thaliana_TAIR10"], "Arabidopsis_thaliana_TAIR10" is the dbkey.

            Also, if we hit /api/genomes/<dbkey>, we get the chromosome information (name and size) for that genome version. I am currently comparing some samples (such as hg38) to verify this. Could you please check whether this is the information we need for chromosome names and sizes?
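The pair structure described in this comment can be turned into a lookup table with a few lines of Python. This is a minimal sketch, assuming the listing is a JSON array of two-item arrays in the order [display name, dbkey], as in the example quoted in this thread. The function names are hypothetical, and the sample is parsed from a literal rather than fetched over the network.

```python
import json
from urllib.parse import quote


def parse_genome_listing(payload):
    """Turn the JSON text of a genome listing into {dbkey: display_name}.

    Assumption: each entry is a two-item list, [display name, dbkey].
    """
    return {dbkey: display_name for display_name, dbkey in json.loads(payload)}


def genome_detail_url(base_url, dbkey):
    """Build the /api/genomes/<dbkey> URL, percent-encoding the dbkey."""
    return f"{base_url.rstrip('/')}/api/genomes/{quote(dbkey, safe='')}"


# The single entry quoted in this ticket, wrapped in a list.
sample = json.dumps([
    ["Arabidopsis thaliana TAIR10 (Arabidopsis_thaliana_TAIR10) (Arabidopsis_thaliana_TAIR10)",
     "Arabidopsis_thaliana_TAIR10"]
])
genomes = parse_genome_listing(sample)
print(sorted(genomes))
print(genome_detail_url("https://usegalaxy.org", "Arabidopsis_thaliana_TAIR10"))
```

Percent-encoding the dbkey is a precaution for keys that contain characters that are not URL-safe; the key in this example passes through unchanged.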
            ann.loraine Ann Loraine added a comment - - edited

            This looks right.

            However, I have a question about Arabidopsis_thaliana_TAIR10.

            The output from this endpoint - https://usegalaxy.org/api/genomes/Arabidopsis_thaliana_TAIR10 - reports chromosome names as "chr1", "chr2", etc., with lower-case "c".

            The IGB Quickload site many years ago also used lower-case "c". But I changed us to using upper-case "C" when I realized that The Arabidopsis Information Resource and the entire Arabidopsis community were using upper-case "C". I'm not sure where Galaxy got Arabidopsis_thaliana_TAIR10 from, but they might have got it from me. If they got it from me, then they may have used the lower-case "c" because that is what I was using. However, I also have a vague memory of someone at the last Galaxy Community Conference saying something about how Galaxy might be making all chromosome names lower-case by default.

            Could you ask them about this?

            I also checked a few other plant genomes that I think might be using upper-case letters, and they were all lower-case in Galaxy as well.

            Also, could you find the part of their code base (it's Python) that supports this "genomes" endpoint?

            (By the way I'm really super-glad it is there!!!)
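If the only difference between the two naming schemes is letter case, the names can be paired with a case-insensitive comparison. The sketch below illustrates that idea only. It is not how IGB or Galaxy resolves chromosome names, the function name is hypothetical, and "chrUn" is a made-up name included to show what happens when there is no counterpart.

```python
def match_chromosome_names(galaxy_names, igb_names):
    """Pair each Galaxy chromosome name with the IGB name that differs only by case.

    Returns {galaxy_name: igb_name}; names without a counterpart are left out.
    """
    igb_by_folded = {name.casefold(): name for name in igb_names}
    return {
        name: igb_by_folded[name.casefold()]
        for name in galaxy_names
        if name.casefold() in igb_by_folded
    }


# "chr1"/"chr2" follow the lower-case style reported by Galaxy and
# "Chr1"/"Chr2" the upper-case style; "chrUn" is a made-up unmatched name.
pairs = match_chromosome_names(["chr1", "chr2", "chrUn"], ["Chr1", "Chr2"])
print(pairs)
```

Names that differ by more than case would still need an explicit synonym entry.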
            shamika Shamika Gajanan Kulkarni (Inactive) added a comment -

            Yes sure, I will ask them and let you know about the lower-case and I'll try to find the part of their code-base too. Will let you know soon. Thank you Prof. [~aloraine].

            ann.loraine Ann Loraine made changes -
            Sprint Summer 2019 Sprint 12, Fall 2019 Sprint 1, Fall 2019 Sprint 2, Fall 2019 Sprint 3, Fall 4 : 30 Sep to 11 Oct, Fall 5 : 14 Oct to 25 Oct, Fall 6 : 28 Oct to 8 Nov [ 71, 72, 73, 74, 75, 76, 77 ] Summer 2019 Sprint 12, Fall 2019 Sprint 1, Fall 2019 Sprint 2, Fall 2019 Sprint 3, Fall 4 : 30 Sep to 11 Oct, Fall 5 : 14 Oct to 25 Oct, Fall 6 : 28 Oct to 8 Nov, Fall 7 : 11 Nov to 22 Nov [ 71, 72, 73, 74, 75, 76, 77, 78 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Status In Progress [ 3 ] To-Do [ 10305 ]
            ann.loraine Ann Loraine made changes -
            Sprint Summer 2019 Sprint 12, Fall 2019 Sprint 1, Fall 2019 Sprint 2, Fall 2019 Sprint 3, Fall 4 : 30 Sep to 11 Oct, Fall 5 : 14 Oct to 25 Oct, Fall 6 : 28 Oct to 8 Nov, Fall 7 : 11 Nov to 22 Nov [ 71, 72, 73, 74, 75, 76, 77, 78 ] Summer 2019 Sprint 12, Fall 2019 Sprint 1, Fall 2019 Sprint 2, Fall 2019 Sprint 3, Fall 4 : 30 Sep to 11 Oct, Fall 5 : 14 Oct to 25 Oct, Fall 6 : 28 Oct to 8 Nov, Fall 7 : 11 Nov to 22 Nov, Fall 8 : 25 Nov to 6 Dec [ 71, 72, 73, 74, 75, 76, 77, 78, 79 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Fix Version/s 9.1.0 Major Release [ 10601 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            ann.loraine Ann Loraine made changes -
            Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
            ann.loraine Ann Loraine made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            ann.loraine Ann Loraine made changes -
            Status First Level Review in Progress [ 10301 ] Needs 1st Level Review [ 10005 ]
            ann.loraine Ann Loraine made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            ann.loraine Ann Loraine made changes -
            Status First Level Review in Progress [ 10301 ] Ready for Pull Request [ 10304 ]
            ann.loraine Ann Loraine made changes -
            Status Ready for Pull Request [ 10304 ] Pull Request Submitted [ 10101 ]
            ann.loraine Ann Loraine made changes -
            Status Pull Request Submitted [ 10101 ] Reviewing Pull Request [ 10303 ]
            ann.loraine Ann Loraine made changes -
            Status Reviewing Pull Request [ 10303 ] Merged Needs Testing [ 10002 ]
            ann.loraine Ann Loraine made changes -
            Status Merged Needs Testing [ 10002 ] Post-merge Testing In Progress [ 10003 ]
            ann.loraine Ann Loraine made changes -
            Resolution Done [ 10000 ]
            Status Post-merge Testing In Progress [ 10003 ] Closed [ 6 ]
            ann.loraine Ann Loraine added a comment -

            The dbkey is "?" if a user is working with a custom genome in Galaxy.
            Galaxy has a manually defined static map that maps dbkeys.
            nfreese Nowlan Freese made changes -
            Link This issue relates to IGBF-781 [ IGBF-781 ]
            nfreese Nowlan Freese made changes -
            Link This issue relates to IGBF-552 [ IGBF-552 ]
            nfreese Nowlan Freese made changes -
            Assignee Shamika Gajanan Kulkarni [ shamika ] Nowlan Freese [ nfreese ]
            nfreese Nowlan Freese made changes -
            Resolution Done [ 10000 ]
            Status Closed [ 6 ] To-Do [ 10305 ]
            nfreese Nowlan Freese made changes -
            Sprint Summer 2019 Sprint 12, Fall 2019 Sprint 1, Fall 2019 Sprint 2, Fall 2019 Sprint 3, Fall 4 : 30 Sep to 11 Oct, Fall 5 : 14 Oct to 25 Oct, Fall 6 : 28 Oct to 8 Nov, Fall 7 : 11 Nov to 22 Nov, Fall 8 : 25 Nov to 6 Dec [ 71, 72, 73, 74, 75, 76, 77, 78, 79 ] Summer 2019 Sprint 12, Fall 2019 Sprint 1, Fall 2019 Sprint 2, Fall 2019 Sprint 3, Fall 4 : 30 Sep to 11 Oct, Fall 5 : 14 Oct to 25 Oct, Fall 6 : 28 Oct to 8 Nov, Fall 7 : 11 Nov to 22 Nov, Fall 8 : 25 Nov to 6 Dec, Spring 5 2021 May 17 - May 28 [ 71, 72, 73, 74, 75, 76, 77, 78, 79, 121 ]
            nfreese Nowlan Freese made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            nfreese Nowlan Freese added a comment -

            I have included some updates from Galaxy as part of IGBF-781.

            Some notes:
            The dbkeys for genomes that are also found in UCSC appear to have the same key names. For example, Takifugu rubripes is fr1 in both Galaxy and UCSC. However, as UCSC does not provide plant genomes through its API, Galaxy uses a somewhat different set of keys. For example, the Galaxy EU key for Solanum lycopersicum is Solanum_lycopersicum_Sol_Genomics_itag2.4.

            Because the Galaxy keys for UCSC genomes are the same as the UCSC names, I focused on the non-UCSC genomes that are available in IGB. This list is mostly plant genomes, as many of the non-plant genomes in IGB are provided by UCSC. While the Galaxy main API does not include many plant genomes, the Galaxy EU API does. This news article discusses the newly added plant genomes and where the data came from.

            I manually added as many of the Galaxy EU keys to synonyms.txt as I could identify.
            nfreese Nowlan Freese made changes -
            Link This issue relates to IGBF-2862 [ IGBF-2862 ]
            nfreese Nowlan Freese added a comment -

            I have created IGBF-2862 regarding an edge case I found while testing loading Galaxy data in IGB.

            nfreese Nowlan Freese made changes -
            Assignee Nowlan Freese [ nfreese ] Shamika Gajanan Kulkarni [ shamika ]
            nfreese Nowlan Freese made changes -
            Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
            nfreese Nowlan Freese made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            nfreese Nowlan Freese made changes -
            Status First Level Review in Progress [ 10301 ] Ready for Pull Request [ 10304 ]
            nfreese Nowlan Freese made changes -
            Status Ready for Pull Request [ 10304 ] Pull Request Submitted [ 10101 ]
            nfreese Nowlan Freese made changes -
            Status Pull Request Submitted [ 10101 ] Reviewing Pull Request [ 10303 ]
            nfreese Nowlan Freese made changes -
            Status Reviewing Pull Request [ 10303 ] Merged Needs Testing [ 10002 ]
            nfreese Nowlan Freese made changes -
            Status Merged Needs Testing [ 10002 ] Post-merge Testing In Progress [ 10003 ]
            nfreese Nowlan Freese made changes -
            Resolution Done [ 10000 ]
            Status Post-merge Testing In Progress [ 10003 ] Closed [ 6 ]
            ann.loraine Ann Loraine made changes -
            Link This issue blocks IGBF-2849 [ IGBF-2849 ]

              People

              • Assignee: shamika Shamika Gajanan Kulkarni (Inactive)
              • Reporter: dcnorris David Norris (Inactive)
              • Votes: 0
              • Watchers: 3

              Time Tracking

              • Original Estimate: 4 hours
              • Remaining Estimate: 4 hours
              • Time Spent: Not Specified