Details
- Type: New Feature
- Status: To-Do
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: None
- Fix Version/s: None
- Labels:
- Story Points: 1
- Epic Link:
Description
When users open the metadata panel for a file, the species and genome version are shown only if the user has previously set these values.
This is fine, but it means that a user must manually annotate each file with this information. Also, sometimes it is not possible to do so because the metadata can only be edited by the file's owner.
IGB Quickload sites already have file-based databases that map chromosome names and sizes onto genome versions. Some bioinformatics file formats (e.g., BAM) also contain this information. All genome data files contain location coordinates. Sometimes the structure of the data itself, especially transcriptome data, can suggest which species provided the biological material used to generate the data. Given all this, it may be possible to build a simple system that guesses which genome assembly the data should be displayed on. I think we could implement a "species and genome version" guesser that fills in these metadata values if the user has not yet provided them.
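To make the idea concrete, here is a rough sketch of how such a guesser might work, assuming we already have the chromosome names and sizes found in a file (for example, read from a BAM header) and a lookup table built from the Quickload genome databases. Everything named here is hypothetical: the `KNOWN_GENOMES` table, its made-up genome names and sizes, the `guess_genome_version` function, and the 0.8 threshold are placeholders for illustration, not existing IGB code or real assembly data.

```python
from typing import Dict, Optional

# Hypothetical lookup table: genome version -> {chromosome name: size}.
# In practice this might be built from Quickload genome files; the names
# and sizes below are invented toy values.
KNOWN_GENOMES: Dict[str, Dict[str, int]] = {
    "species_A_v1": {"chr1": 1000, "chr2": 800, "chr3": 650},
    "species_A_v2": {"chr1": 1010, "chr2": 800, "chr3": 655},
    "species_B_v1": {"1": 5000, "2": 4200},
}


def match_score(observed: Dict[str, int], reference: Dict[str, int]) -> float:
    """Fraction of observed sequences whose name AND size match the reference."""
    if not observed:
        return 0.0
    hits = sum(1 for name, size in observed.items() if reference.get(name) == size)
    return hits / len(observed)


def guess_genome_version(
    observed: Dict[str, int],
    known: Dict[str, Dict[str, int]] = KNOWN_GENOMES,
    min_score: float = 0.8,  # assumed cutoff; would need tuning on real data
) -> Optional[str]:
    """Return the best-matching genome version, or None if unsure."""
    best_name: Optional[str] = None
    best_score = 0.0
    tie = False
    for name, reference in known.items():
        score = match_score(observed, reference)
        if score > best_score:
            best_name, best_score, tie = name, score, False
        elif score == best_score and score > 0:
            # Two genomes explain the file equally well: do not guess.
            tie = True
    if tie or best_score < min_score:
        return None
    return best_name


# Example: sizes match only the second version of species A.
print(guess_genome_version({"chr1": 1010, "chr2": 800, "chr3": 655}))
```

Returning `None` on a tie or a weak match would let the metadata panel leave the fields blank rather than show a wrong guess, and the guess should only be applied when the user has not already set the values.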