  1. IGB
  2. IGBF-2790

Investigate: business logic that attempts to guess species and genome version from the file

    Details

    • Type: New Feature
    • Status: To-Do
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:

      Description

      When users open the metadata panel for a file, the species and genome version are only shown if the user has previously set these values.

      This is fine, but it means that a user must manually annotate each file with this information. Also, sometimes it is not possible to do so because the metadata can only be edited by the file's owner.

      IGB Quickload sites already have file-based databases that map chromosome names and sizes onto genome versions. Some file types in bioinformatics (e.g., BAM) also contain this information. All genome data files contain location coordinates. Sometimes the structure of the data itself, especially transcriptome data, can suggest which animal or plant species provided the original biological material used to generate the data. Given all this, it may be possible to build a simple system that guesses which genome assembly the data should be displayed on. I think we could implement a "species and genome version" guesser that fills in these metadata values if the user has not yet provided them.
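
      The matching idea described above could be sketched roughly as follows. This is only an illustration, not IGB code: the table `KNOWN_GENOMES`, the function names `parse_sq_lines` and `guess_genome`, the `min_score` threshold, and all chromosome names and sizes are hypothetical. It assumes the chromosome names and sizes from a Quickload site have already been loaded into a dictionary, and that the file exposes SAM-style `@SQ` header lines (with `SN` and `LN` tags), as BAM files do.

```python
from typing import Dict, Optional, Tuple

# Hypothetical reference table: genome version -> {chromosome name: size}.
# Names and sizes are invented for illustration only.
KNOWN_GENOMES: Dict[str, Dict[str, int]] = {
    "species_A_v1": {"chr1": 1000, "chr2": 800},
    "species_A_v2": {"chr1": 1010, "chr2": 800},
    "species_B_v1": {"Chr1": 500, "Chr2": 450, "Chr3": 300},
}


def parse_sq_lines(header_text: str) -> Dict[str, int]:
    """Collect sequence names (SN) and lengths (LN) from SAM-style @SQ lines."""
    sizes: Dict[str, int] = {}
    for line in header_text.splitlines():
        fields = line.split("\t")
        if fields[0] != "@SQ":
            continue
        # Each remaining field looks like TAG:VALUE.
        tags = dict(f.split(":", 1) for f in fields[1:] if ":" in f)
        if "SN" in tags and "LN" in tags:
            sizes[tags["SN"]] = int(tags["LN"])
    return sizes


def guess_genome(
    observed: Dict[str, int],
    known: Dict[str, Dict[str, int]] = KNOWN_GENOMES,
    min_score: float = 0.5,
) -> Optional[Tuple[str, float]]:
    """Return (genome_version, score) for the best match, or None.

    The score is the fraction of observed sequences whose name and size
    both match the candidate genome. None is returned when nothing was
    observed, when the best score is below min_score, or when two or
    more candidates tie for the best score (the guess is ambiguous).
    """
    if not observed:
        return None
    scores = {
        version: sum(1 for name, size in observed.items()
                     if chroms.get(name) == size) / len(observed)
        for version, chroms in known.items()
    }
    best = max(scores, key=scores.get)
    tied = [v for v, s in scores.items() if s == scores[best]]
    if scores[best] < min_score or len(tied) > 1:
        return None
    return best, scores[best]


header = "@HD\tVN:1.6\n@SQ\tSN:chr1\tLN:1010\n@SQ\tSN:chr2\tLN:800\n"
print(guess_genome(parse_sq_lines(header)))
```

      Returning nothing on a tie or a weak match is a deliberate choice in this sketch: a guesser that fills in metadata automatically should probably leave the fields empty rather than record a doubtful value. Files that carry only coordinates, with no sequence sizes, would need a different signal and are not handled here.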

        Attachments

          Activity

          ann.loraine Ann Loraine made changes -
          Assignee Nowlan Freese [ nfreese ]
          ann.loraine Ann Loraine made changes -
          Summary: "Write business logic that attempts to guess species and genome version from the file" -> "Investigate: business logic that attempts to guess species and genome version from the file"
          nfreese Nowlan Freese made changes -
          Sprint Winter 4 Feb 8 - Feb 19 [ 114 ]
          nfreese Nowlan Freese made changes -
          Description: removed the duplicated word in "All genome genome data files"; the text is otherwise unchanged from the Description section of this issue.
          ann.loraine Ann Loraine made changes -
          Summary: "Write" -> "Write business logic that attempts to guess species and genome version from the file"
          Issue Type: Task [ 3 ] -> New Feature [ 2 ]
          Description: set to the text shown in the Description section of this issue, with one duplicated word ("All genome genome data files").
          Story Points 1
          Sprint Winter 4 Feb 8 - Feb 19 [ 114 ]
          Labels Advanced
          Assignee Ann Loraine [ aloraine ]
          ann.loraine Ann Loraine made changes -
          Field Original Value New Value
          Epic Link IGBF-2376 [ 18533 ]
          ann.loraine Ann Loraine created issue -

            People

            • Assignee:
              nfreese Nowlan Freese
              Reporter:
              ann.loraine Ann Loraine
            • Votes:
              0
              Watchers:
              2

              Dates

              • Created:
                Updated: