Uploaded image for project: 'IGB'
  1. IGB
  2. IGBF-2790

Investigate: business logic that attempts to guess species and genome version from the file

    Details

    • Type: New Feature
    • Status: To-Do (View Workflow)
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:

      Description

      When users open the metadata panel for a file, the species and genome version is only shown if the user has previously set these values.

      This is fine, but it means that a user must manually annotate each file with this information. Also, sometimes it is not possible to do so because the metadata can only be edited by the file's owner.

      IGB Quickload sites already have file-based databases that map chromosome names and sizes onto genome versions. Some file types in bioinformatics (e.g., BAM) also contain this information. All genome data files contain location coordinates. Sometimes the structure of the data itself, esp. transcriptome data, can suggest what sort of creature or plant species provided the original biological material that was used to generate the data. In light of all these facts, it may be possible to build a simple computer system that will make a guess about which genome assembly to display the data on top of. I think we could implement a "species and genome version" guesser that fill in these metadata values if the user has not yet provided them.

        Attachments

          Activity

            People

            • Assignee:
              nfreese Nowlan Freese
              Reporter:
              ann.loraine Ann Loraine
            • Votes:
              0 Vote for this issue
              Watchers:
              2 Start watching this issue

              Dates

              • Created:
                Updated: