Details
- Type: Task
- Status: Needs 1st Level Review
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: None
- Fix Version/s: None
- Labels: None
- Story Points: 2
- Epic Link:
- Sprint: Fall 3, Fall 4
Description
Situation: There is a new version of the human genome, referred to as telomere to telomere, T2T, or HS1. UCSC provides this genome (https://genome.ucsc.edu/cgi-bin/hgGateway?db=hs1), and IGB retrieves it through the UCSC REST API. As part of IGBF-3902, IGB now lists the hs1 genome under the "Homo sapiens" Species dropdown, but a Quickload for this genome still needs to be made.
Tasks: Create a bed14 annotation file for this new genome, and add it and the 2bit file (https://hgdownload.soe.ucsc.edu/downloads.html#human) to IGB Quickload.
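A quick sanity check on the bed14 file before adding it to Quickload could look like the sketch below. It assumes that "bed14" means the 12 standard BED columns followed by two extra columns (taken here to be a gene symbol and a description), which should be confirmed against existing IGB Quickload annotation files. The function name and the example record are hypothetical.

```python
def parse_bed14_line(line):
    """Split one tab-delimited line and check it has 14 fields with consistent blocks."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 14:
        raise ValueError(f"expected 14 fields, got {len(fields)}")

    start, end = int(fields[1]), int(fields[2])
    if start > end:
        raise ValueError(f"start {start} is greater than end {end}")

    # Columns 10-12 of standard BED: blockCount, blockSizes, blockStarts.
    block_count = int(fields[9])
    block_sizes = [int(x) for x in fields[10].split(",") if x]
    block_starts = [int(x) for x in fields[11].split(",") if x]
    if len(block_sizes) != block_count or len(block_starts) != block_count:
        raise ValueError("blockCount does not match blockSizes/blockStarts")

    return {
        "chrom": fields[0],
        "start": start,
        "end": end,
        "name": fields[3],
        "strand": fields[5],
        "block_count": block_count,
        # ASSUMPTION: the two extra bed14 columns hold a symbol and a description.
        "symbol": fields[12],
        "description": fields[13],
    }


# Hypothetical record, not taken from any real annotation file.
example = "\t".join([
    "chr1", "1000", "5000", "TX_EXAMPLE", "0", "+", "1200", "4800", "0",
    "2", "500,1000,", "0,3000,", "GENE_EXAMPLE", "made-up description",
])
record = parse_bed14_line(example)
print(record["chrom"], record["block_count"], record["symbol"])
```

Running a check like this over every line would catch truncated rows or mismatched block lists before the file is compressed and indexed.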
Activity
Nowlan Freese
created issue -
Nowlan Freese
made changes -
Field | Original Value | New Value |
---|---|---|
Epic Link | IGBF-1765 [ 17855 ] |
Paige Kulzer
made changes -
Paige Kulzer
made changes -
Summary | Add Telomere to Telomere human genome to IGB | Create Telomere to Telomere human genome Quickload |
Description |
Situation: There is a new version of the human genome referred to as either telomere to telomere or T2T or HS1. UCSC does provide this genome, [link|https://genome.ucsc.edu/cgi-bin/hgGateway?db=hs1], and IGB is pulling in the genome through the UCSC REST API. However, IGB is not including the hs1 genome under the "Homo sapiens" Species dropdown, and is instead labeling it as "hs". The annotations for this genome version are also somewhat limited.
Tasks: Modify IGB (synonyms.txt maybe?) such that the "hs1" genome appears under Homo sapiens. Create a bed14 annotation file for this new genome, and add it and the 2bit file ([link|https://hgdownload.soe.ucsc.edu/downloads.html#human]) to IGB Quickload. |
Situation: There is a new version of the human genome referred to as either telomere to telomere or T2T or HS1. UCSC does provide this genome, [link|https://genome.ucsc.edu/cgi-bin/hgGateway?db=hs1], and IGB is pulling in the genome through the UCSC REST API. As part of IGBF-3902, IGB is now including the hs1 genome under the "Homo sapiens" Species dropdown, but a Quickload for this genome still needs to be made.
Tasks: Create a bed14 annotation file for this new genome, and add it and the 2bit file ([link|https://hgdownload.soe.ucsc.edu/downloads.html#human]) to IGB Quickload. |
Assignee | Paige Kulzer [ pkulzer ] |
Ann Loraine
made changes -
Paige Kulzer
made changes -
Status | To-Do [ 10305 ] | In Progress [ 3 ] |
Paige Kulzer
made changes -
Status | In Progress [ 3 ] | Needs 1st Level Review [ 10005 ] |
Paige Kulzer
made changes -
Assignee | Paige Kulzer [ pkulzer ] | Nowlan Freese [ nfreese ] |
Nowlan Freese
made changes -
Status | Needs 1st Level Review [ 10005 ] | First Level Review in Progress [ 10301 ] |
Nowlan Freese
made changes -
Status | First Level Review in Progress [ 10301 ] | Needs 1st Level Review [ 10005 ] |
Nowlan Freese
made changes -
Sprint | Fall 2 [ 203 ] | Fall 3 [ 204 ] |
Ann Loraine
made changes -
Sprint | Fall 3 [ 204 ] | Fall 3, Fall 4 [ 204, 205 ] |
Ann Loraine
made changes -
Rank | Ranked higher |
As per discussion with Nowlan Freese after scrum, we would like to do these two things:
This is to allow this assembly to exist alongside the traditional hg38, hg19, etc. assemblies. We want to do this because most people are still using hg38 and hg19, not this new assembly just yet.
The version should be something like: H_sapiens_T2T_MMM_YYYY
Note that we need to be very careful that our version dates map correctly onto UCSC patch releases, or whatever mechanism UCSC uses to track how the sequence itself (and all the constituent contigs) changes over time.
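As an illustration of the H_sapiens_T2T_MMM_YYYY pattern, here is a minimal sketch that builds a version string from a release date. The helper name is hypothetical, and the date used below is only a placeholder; the real month and year have to be checked against the UCSC release information for hs1.

```python
from datetime import date

# Three-letter English month abbreviations, independent of the system locale.
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def make_version_name(species, label, release):
    """Return a name of the form <species>_<label>_MMM_YYYY."""
    return f"{species}_{label}_{MONTHS[release.month - 1]}_{release.year}"


# Placeholder date for illustration only; confirm against the UCSC release.
print(make_version_name("H_sapiens", "T2T", date(2022, 1, 1)))
```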
This will enable IGB to cache the genome file locally instead of using the JSON REST API every time it needs sequence data. Also, retrieving data from a 2bit file may be faster than getting sequence data from the JSON REST API.
Testing: Make sure that IGB can still retrieve sequence data from the JSON REST API in case the URL of the 2bit file changes or the UCSC Web site becomes unavailable.
To do this, a tester can edit the "file" tag to point to a bogus location. If everything works the way it is supposed to, the user won't even notice that the 2bit file is missing, and IGB will simply fall back to getting data from the UCSC JSON API.
However, note that the "load priority" numbers are related to which data source IGB retrieves sequence data from.
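The fallback behavior described in the testing note can be sketched as follows. This is not IGB's actual implementation: the function names are hypothetical, the rule that a lower priority number is tried first is an assumption, and the two fetchers are stand-ins so the example runs without network access. The URL helper follows the UCSC REST API's documented getData/sequence endpoint.

```python
def ucsc_sequence_url(genome, chrom, start, end):
    """Build a UCSC REST API request URL for a stretch of sequence."""
    return ("https://api.genome.ucsc.edu/getData/sequence"
            f"?genome={genome};chrom={chrom};start={start};end={end}")


def fetch_sequence(sources, chrom, start, end):
    """Try each source in priority order and return (source name, sequence)."""
    errors = []
    # ASSUMPTION: a lower load priority number means the source is tried first.
    for _priority, name, fetcher in sorted(sources, key=lambda s: s[0]):
        try:
            return name, fetcher(chrom, start, end)
        except OSError as exc:
            # A missing file or unreachable URL moves us on to the next source.
            errors.append(f"{name}: {exc}")
    raise RuntimeError("no sequence source available: " + "; ".join(errors))


def missing_twobit(chrom, start, end):
    """Stand-in for a 2bit file whose "file" tag points to a bogus location."""
    raise FileNotFoundError("bogus 2bit location")


def rest_api_stub(chrom, start, end):
    """Stand-in for a REST API call; a real one would request ucsc_sequence_url(...)."""
    return "ACGT"[: end - start]


sources = [
    (1, "quickload_2bit", missing_twobit),
    (2, "ucsc_rest_api", rest_api_stub),
]
print(fetch_sequence(sources, "chr1", 0, 4))
print(ucsc_sequence_url("hs1", "chr1", 0, 4))
```

With the 2bit source broken on purpose, the call returns data from the second source, which mirrors what a tester should observe in IGB after editing the "file" tag.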