Details

    • Type: Documentation
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:
      None
    • Story Points:
      0.5
    • Epic Link:
    • Sprint:
      Summer 2018 Part 2

      Description

      This page is unclear about the difference between species.txt vs synonyms.txt.

      https://wiki.transvar.org/display/igbman/Use+synonyms.txt+to+link+genome+version+names+to+each+other

      It also points to a file in the repo that is not there.

        Attachments

          Issue Links

            Activity

            ieclabau Ivory Blakley (Inactive) created issue -
            ann.loraine Ann Loraine added a comment -

            Update as required.

            ann.loraine Ann Loraine made changes -
            Field Original Value New Value
            Assignee Ann Loraine [ aloraine ] Ivory Blakley [ ieclabau ]
            ieclabau Ivory Blakley (Inactive) made changes -
            Link This issue relates to IGBF-1262 [ IGBF-1262 ]
            ieclabau Ivory Blakley (Inactive) made changes -
            Link This issue relates to IGBF-1229 [ IGBF-1229 ]
            ann.loraine Ann Loraine made changes -
            Epic Link IGBF-140 [ 14594 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Fix Version/s 9.0.2 [ 10600 ]
            ann.loraine Ann Loraine made changes -
            Story Points 0.5
            ann.loraine Ann Loraine made changes -
            Sprint Summer 2018 Part 2 [ 49 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine added a comment - See https://wiki.transvar.org/display/igbman/Use+species.txt+to+link+species+names+to+IGB+genome+names
            ieclabau Ivory Blakley (Inactive) made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            ann.loraine Ann Loraine added a comment -

            Look at Mason Meyer's recent comments. He wrote about this recently, possibly as a comment on one of the issues. If you can't find it, call or text.

            ieclabau Ivory Blakley (Inactive) added a comment -

            I think you are thinking of Mason's comment on issue: IGBF-1262

            That comment mentions species.txt once and highlights how it is distinct from chromosome.txt, but it does not clarify the difference between species.txt and synonyms.txt.

            ieclabau Ivory Blakley (Inactive) added a comment -

            In IGB, data is displayed relative to a genome version. The term "genome" might refer to an individual's genome (John Smith's genome) or the aggregate genetic pool of a species (the Homo sapiens genome), but a genome version refers to an exact set of sequences (H_sapiens_Dec_2013), usually associated with a particular publication, institution, and/or time. In IGB, the naming convention for genome versions is <first letter of genus>_<species>_<month published>_<year published>.
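
            As a rough illustration of that naming convention, the sketch below checks a name against the pattern and splits it into its parts. The regular expression and the function name parse_genome_version are assumptions made for this example, not IGB code; it assumes a one-letter genus initial, a lowercase species name, a three-letter month, and a four-digit year, which real genome version names may not always follow.

```python
import re

# Assumed pattern: <genus initial>_<species>_<Mon>_<year>, e.g. H_sapiens_Dec_2013.
# This is an illustration of the convention, not a rule enforced by IGB.
GENOME_VERSION = re.compile(
    r"^(?P<genus>[A-Z])_(?P<species>[a-z]+)_(?P<month>[A-Z][a-z]{2})_(?P<year>\d{4})$"
)


def parse_genome_version(name):
    """Return the parts of a genome version name as a dict, or None if it does not match."""
    match = GENOME_VERSION.match(name)
    return match.groupdict() if match else None


print(parse_genome_version("H_sapiens_Dec_2013"))
print(parse_genome_version("Homo sapiens"))  # does not follow the convention
```

            A name such as "Homo sapiens" does not match, which is the kind of name a synonyms file is meant to link to a genome version.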

            A genome version is an exact thing. It can have at most one Latin species name and at most one common name. It can have any number of synonyms, that is, other names for the same genome version. These are specified in a file called species.txt, which has this tab-delimited format:
            Column 1: binomial (Latin) name for the species
            Column 2: common name for the species
            Column 3: IGB-friendly genome version name prefix (e.g., H_sapiens or A_gambiae)
            Column 4, 5, etc. (optional): Genome version name (synonyms)

            In this format, column 3 is the key (identity) column. Columns 1 and 2 may contain repeats (and a repeat in one should also be a repeat in the other). Columns 4 onward should not contain any repeats.
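
            The column layout and uniqueness rules described above could be checked with something like the following sketch. It is not how IGB itself reads the file; SpeciesRow and parse_species_text are hypothetical names, and the sample row at the end is made up for illustration.

```python
from typing import NamedTuple, Tuple


class SpeciesRow(NamedTuple):
    latin_name: str            # column 1
    common_name: str           # column 2
    prefix: str                # column 3, the key column
    synonyms: Tuple[str, ...]  # columns 4 onward (optional)


def parse_species_text(text):
    """Parse species.txt-style text into {prefix: SpeciesRow}, enforcing the rules above."""
    rows = {}
    seen_synonyms = set()
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = [field.strip() for field in line.split("\t")]
        if len(fields) < 3:
            raise ValueError(f"line {number}: expected at least 3 columns")
        latin_name, common_name, prefix, *rest = fields
        synonyms = tuple(name for name in rest if name)
        if prefix in rows:
            raise ValueError(f"line {number}: duplicate key {prefix!r}")
        for name in synonyms:
            if name in seen_synonyms:
                raise ValueError(f"line {number}: repeated synonym {name!r}")
            seen_synonyms.add(name)
        rows[prefix] = SpeciesRow(latin_name, common_name, prefix, synonyms)
    return rows


# Made-up sample content; "example_synonym" is a placeholder, not a real genome name.
sample = "Homo sapiens\thuman\tH_sapiens\texample_synonym\n"
print(parse_species_text(sample)["H_sapiens"])
```

            Keying the result on column 3 mirrors the statement that it is the identity column, while columns 1 and 2 are allowed to repeat across rows.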

            As a supplement to the species.txt file, IGB also reads a synonyms.txt file, which has the same format minus the first two columns. The synonyms file lists the genome version name in the first column (what was column 3 in species.txt), followed by any number of synonyms.

            If you have data representing reads that were aligned to the H_sapiens_Dec_2013 genome, you'll want to display the data with the H_sapiens_Dec_2013 sequences, not the H_sapiens_Feb_2009 sequences. If you are setting up a Quickload site, you would put the files in a folder called "H_sapiens_Dec_2013" and include that folder name in the contents.txt file. Suppose instead that you are sending the data from Galaxy, and you are constrained to using whatever name Galaxy uses to refer to that genome version, perhaps "Homo sapiens". In that case, you create a synonyms file to link "Homo sapiens" data from Galaxy to the "H_sapiens_Dec_2013" genome version in IGB, like this:
            H_sapiens_Dec_2013<tab>Homo sapiens

            That will tell IGB that any data from "Homo sapiens" should be shown in the H_sapiens_Dec_2013 genome. IGB will still use its existing Latin name and common name for the H_sapiens_Dec_2013 genome version.
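
            To make the lookup concrete, here is a minimal sketch of how a synonyms.txt line like the one above could be turned into a name lookup. The functions load_synonyms and resolve are hypothetical and only model the behavior described in this comment; they are not IGB's implementation. Returning an unknown name unchanged is an assumption of this sketch.

```python
def load_synonyms(text):
    """Map every name on each tab-delimited line to the first name on that line."""
    lookup = {}
    for line in text.splitlines():
        names = [name.strip() for name in line.split("\t") if name.strip()]
        if not names:
            continue  # skip blank lines
        canonical = names[0]  # column 1: the genome version name
        for name in names:
            lookup[name] = canonical
    return lookup


def resolve(name, lookup):
    """Return the genome version a name refers to, or the name itself if unknown."""
    return lookup.get(name, name)


lookup = load_synonyms("H_sapiens_Dec_2013\tHomo sapiens\n")
print(resolve("Homo sapiens", lookup))        # H_sapiens_Dec_2013
print(resolve("H_sapiens_Dec_2013", lookup))  # H_sapiens_Dec_2013
```

            The lookup only changes which genome version the data is attached to; the Latin and common names would still come from species.txt.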

            ann.loraine Ann Loraine made changes -
            Resolution Done [ 10000 ]
            Status In Progress [ 3 ] Closed [ 6 ]
            ann.loraine Ann Loraine made changes -
            Fix Version/s 9.0.2 Minor Release [ 10600 ]
            ann.loraine Ann Loraine made changes -
            Workflow Loraine Lab Workflow [ 17989 ] Fall 2019 Workflow Update [ 19888 ]
            ann.loraine Ann Loraine made changes -
            Workflow Fall 2019 Workflow Update [ 19888 ] Revised Fall 2019 Workflow Update [ 22007 ]

              People

              • Assignee:
                ieclabau Ivory Blakley (Inactive)
                Reporter:
                ieclabau Ivory Blakley (Inactive)
              • Votes:
                0
                Watchers:
                2

                Dates

                • Created:
                  Updated:
                  Resolved: