[IGBF-2344] Normalize by entire genome instead of by individual chromosome - JIRA UNCC

Ann Loraine created issue - 22/Apr/20 10:47 AM

Ann Loraine made changes - 22/Apr/20 10:47 AM

Field	Original Value	New Value
Epic Link		~~IGBF-1919~~ [ 18010 ]

Ann Loraine made changes - 22/Apr/20 10:47 AM

Rank

Ranked higher

Ann Loraine made changes - 22/Apr/20 10:50 AM

Description

Right now, the graph made from a BAI file is scaled, with the lowest value at 0 and the largest value at 1.

The scaling is currently done by chromosome. That is, the y axis values are expressed as a fraction of the sum of values for an entire chromosome.

Instead, let's scale by values for the entire genome.

If we do this, then a user will be able to load the entire file and then click "genome" in the Current Sequence tab and then instantly recognize larger-scale differences in the sequenced genome versus the reference genome. This will be super-useful for helping researchers recognize when a patient (for example) has an extra copy of an entire chromosome. For example, some people with trisomy 21 (Down's Syndrome) have duplications of the entire chromosome. If we make the proposed change, then such cases will be very easy to recognize in IGB.

Right now, the graph made from a BAI file is scaled, with the lowest value being 0 and the largest value being 1.

The scaling is currently done by chromosome. That is, the y axis values appearing on the screen are a fraction of the sum of values for the chromosome being shown. For example, if the values for a chromosome sum to 1000 and the value for a "bin" region on the chromosome is 100, then the y axis value for that bin would be 100/1000 = 0.1.

Instead, let's scale by values for the entire genome.

If we do this, then a user will be able to load the entire file and then click "genome" in the Current Sequence tab and then instantly recognize larger-scale differences in the sequenced genome versus the reference genome. This will be super-useful for helping researchers recognize when a patient (for example) has an extra copy of an entire chromosome. For example, some people with trisomy 21 (Down's Syndrome) have duplications of the entire chromosome. If we make the proposed change, then such cases will be very easy to recognize in IGB.

Ann Loraine made changes - 22/Apr/20 10:51 AM

Description

Right now, the graph made from a BAI file is scaled, with the lowest value being 0 and the largest value being 1.

The scaling is currently done by chromosome. That is, the y axis values appearing on the screen are a fraction of the sum of values for the chromosome being shown. For example, if the values for a chromosome sum to 1000 and the value for a "bin" region on the chromosome is 100, then the y axis value for that bin would be 100/1000 = 0.1.

Instead, let's scale by values for the entire genome.

If we do this, then a user will be able to load the entire file and then click "genome" in the Current Sequence tab and then instantly recognize larger-scale differences in the sequenced genome versus the reference genome. This will be super-useful for helping researchers recognize when a patient (for example) has an extra copy of an entire chromosome. For example, some people with trisomy 21 (Down's Syndrome) have duplications of the entire chromosome. If we make the proposed change, then such cases will be very easy to recognize in IGB.

Right now, the graph made from a BAI file is scaled, with the lowest value being 0 and the largest value being 1.

The scaling is currently done by chromosome. That is, the y axis values appearing on the screen are a fraction of the sum of values for the chromosome being shown. For example, if the values for a chromosome sum to 1000 and the value for a "bin" region on the chromosome is 100, then the y axis value for that bin would be 100/1000 = 0.1.

Instead, let's scale by values for the entire genome.

If we do this, then a user will be able to load the entire file and then click "genome" in the Current Sequence tab and then instantly recognize larger-scale differences in the sequenced genome versus the reference genome. This will be super-useful for helping researchers recognize when a patient (for example) has an extra copy of an entire chromosome. For example, some people with trisomy 21 (Down's Syndrome) have duplications of the entire chromosome. If we make the proposed change, then such cases will be very easy to recognize in IGB.

Suggestion that may be useful:
To avoid very small fractional values, multiply by a scaling factor? Or, we can leave that to the user. They can easily multiply the entire graph by a number to make a new one.
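
To make the difference between the two scaling schemes concrete, here is a minimal sketch in Python. It is not IGB code (IGB is written in Java), and the function names, the `scale` parameter, and the bin values are all hypothetical, chosen only for illustration. It assumes each chromosome's BAI-derived graph is available as a plain list of per-bin values.

```python
from typing import Dict, List

# Hypothetical per-bin values keyed by chromosome name (made-up data).
# chr21 has 1.5x the per-bin value of chr1, as might happen with an extra copy.
Bins = Dict[str, List[float]]


def normalize_by_chromosome(bins: Bins) -> Bins:
    """Current behavior as described: divide each bin by its own chromosome's sum."""
    result = {}
    for chrom, values in bins.items():
        total = sum(values)
        result[chrom] = [v / total if total else 0.0 for v in values]
    return result


def normalize_by_genome(bins: Bins, scale: float = 1.0) -> Bins:
    """Proposed behavior: divide each bin by the sum over all chromosomes.

    `scale` is the optional multiplier from the suggestion above, used to
    avoid very small fractional values. It defaults to 1.0 (no rescaling).
    """
    genome_total = sum(sum(values) for values in bins.values())
    if genome_total == 0:
        return {chrom: [0.0] * len(values) for chrom, values in bins.items()}
    return {
        chrom: [scale * v / genome_total for v in values]
        for chrom, values in bins.items()
    }


bins = {
    "chr1": [100.0, 100.0, 100.0, 100.0],
    "chr21": [150.0, 150.0],
}

by_chrom = normalize_by_chromosome(bins)
by_genome = normalize_by_genome(bins)

# Ratio of a chr21 bin to a chr1 bin under each scheme.
chrom_ratio = by_chrom["chr21"][0] / by_chrom["chr1"][0]
genome_ratio = by_genome["chr21"][0] / by_genome["chr1"][0]
```

With per-chromosome scaling, the ratio between a chr21 bin and a chr1 bin depends on how many bins each chromosome has, so it says nothing reliable about copy number. With genome-wide scaling, every bin shares one denominator, so the ratio between bins is the same as in the raw values and an elevated chromosome stands out in the "genome" view.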

Ann Loraine made changes - 24/Apr/20 10:11 PM

Sprint

Spring 7 : 13 Apr to 24 Apr [ 92 ]

Spring 7 : 13 Apr to 24 Apr, Spring 8 : 24 Apr to 8 May [ 92, 93 ]

Ann Loraine made changes - 24/Apr/20 10:11 PM

Rank

Ranked higher

Ann Loraine made changes - 10/May/20 8:01 PM

Sprint

Spring 7 : 13 Apr to 24 Apr, Spring 8 : 24 Apr to 8 May [ 92, 93 ]

Spring 7 : 13 Apr to 24 Apr, Spring 8 : 24 Apr to 8 May, Spring 8 : 11 May to 25 May [ 92, 93, 94 ]

Ann Loraine made changes - 10/May/20 8:01 PM

Rank

Ranked higher