Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
- Labels: None
- Epic Link:
Description
*Version noticed: 8.2.0
THIS ISSUE HAS BEEN NOTICED FOR ONLY 1 GENE SO FAR!
(But still, what is it about this gene that causes the edge matching to display inappropriately?)
*From Ivory:
IGB is being randomly selective about which edges to highlight.
See attached images EdgeMapA and EdgeMapB. Depending on which feature is selected, the edge matching indicates either that only some of the exons shown are the same (A) or that all exons shown are the same (B).
To see more instances of inconsistent edge matching, in a case where you would really want to be able to use edge matching, see this region in the latest fruit fly genome (Drosophila melanogaster): 3R:21,350,072-21,378,314
Activity
Mason Meyer (Inactive)
created issue -
Mason Meyer (Inactive)
made changes -
Field | Original Value | New Value |
---|---|---|
Summary | Edge-matching is being randomly selective | Edge-matching is being randomly selective |
Mason Meyer (Inactive)
made changes -
Summary | Edge-matching is being randomly selective | Edge-matching is being randomly selective (for Fruit Fly Gene Model) |
Mason Meyer (Inactive)
made changes -
Resolution | Unresolved [ 10101 ] |
Mason Meyer (Inactive)
made changes -
Epic Link | IGBF-497 [ 15559 ] |
Mason Meyer (Inactive)
made changes -
Rank | Ranked lower |
Mason Meyer (Inactive)
made changes -
Rank | Ranked higher |
Mason Meyer (Inactive)
made changes -
Rank | Ranked higher |
Mason Meyer (Inactive)
made changes -
Resolution | Unresolved [ 10101 ] | |
Status | Open [ 1 ] | Open [ 1 ] |
Mason Meyer (Inactive)
made changes -
From Ivory:
These genes are all in the same general area (grouped in sets of nearly identical genes):
FBgn0002781
FBgn0261837
FBgn0261838
FBgn0261839
FBgn0261840
FBgn0261841
FBgn0261842
FBgn0261843
FBgn0261844
FBgn0261845
FBgn0266170
FBgn0266171
FBgn0266172
FBgn0266174
FBgn0266175
FBgn0266176
FBgn0266177
FBgn0266178
FBgn0267648
FBgn0267649
FBgn0267650
FBgn0267651
FBgn0267652
And these are in a second area:
FBgn0005630
FBgn0264817
I don’t understand how this gene (FBgn0264817) can have a trans-spliced exon; it only has one exon. I guess it’s translated in the reverse direction.
So the comment (SO:0000459:gene_with_trans_spliced_transcript in the GTF file) is limited to two regions of the genome.
Also, it looks like the strand issue that I noticed with FBgn0002781 (some of its components are listed on the + strand and some on the – strand) is unique to that gene. I assumed it was related to the comment, and that other genes with the comment would have the same issue, but all of the other genes that have this comment list all of their components as being on the same strand (see "Note 1" below).
Maybe the strand issue is just a fluke? Maybe we should just manually curate this gene in the file and move on. (and see if the edge matching is fixed)
As a test, I downloaded the file and manually changed the strand of the FBgn0002781 transcripts listed below.
The transcripts were: FBtr0084079, FBtr0084085, FBtr0084084, FBtr0084080, FBtr0084081, FBtr0084082, FBtr0307759, FBtr0307760, FBtr0084083
Just changing the strand to (–) didn’t solve it.
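For anyone repeating this test, the manual edit could be scripted roughly as follows. This is only a sketch: `force_minus_strand` is a hypothetical helper name, and it assumes the file is a tab-separated BED file whose name column (column 4) holds the transcript ID and whose strand is in column 6. It is not necessarily how the edit was originally made.

```shell
# Hypothetical helper: set the strand (BED column 6) to "-" for the named
# transcripts and print every other line unchanged.
# Assumes tab-separated BED with the transcript ID in column 4.
# Usage: force_minus_strand in.bed FBtr0084079 FBtr0084085 ... > out.bed
force_minus_strand() {
    bed=$1
    shift
    # The remaining arguments are transcript IDs; hand them to awk as one
    # space-separated string and turn them into a lookup table.
    awk -F'\t' -v OFS='\t' -v ids="$*" '
        BEGIN { n = split(ids, a, " "); for (i = 1; i <= n; i++) want[a[i]] = 1 }
        ($4 in want) { $6 = "-" }
        { print }
    ' "$bed"
}
```

BED block coordinates are always given relative to the start of the feature, so flipping the strand column alone leaves the exon positions as they were.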
If I select transcript FBtr0084061, all transcripts match at almost all edges.
If I select transcript FBtr0084083, nearly half of the transcripts do not edge match.
This holds in the original quickload file as well.
Attached is a BED file that has ONLY gene FBgn0002781, taken directly from the IGB QuickLoad file, and a GTF file for just this gene, taken from the FlyBase file.
Both of these can be viewed in IGB. The GTF file shows all the transcripts on the negative strand.
Strangely, the two small files do not have the edge matching issue (at least I didn’t notice it using the two transcripts that I noted in the last paragraph), so they may not be as helpful as I had hoped.
$ grep FBgn0002781 dmel-all-r6.03.gtf > FBgn0002781.gtf
$ gunzip -c D_melanogaster_Jul_2014.bed.gz | grep FBgn0002781 > FBgn0002781.bed
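To double-check which strand each transcript is on in the extracted GTF file, something like the following could be used. It is a sketch under assumptions: `strands_per_transcript` is a hypothetical helper name, and it assumes the attribute column (column 9) carries a `transcript_id "..."` entry, as GTF files normally do.

```shell
# Hypothetical helper: print each transcript ID found in a GTF file together
# with the strand(s) recorded for its features (GTF column 7).
# A transcript that appears on two lines of output has mixed strands.
strands_per_transcript() {
    awk -F'\t' '
        # Pull the transcript_id value out of the attribute column (column 9).
        # The prefix transcript_id " is 15 characters long.
        match($9, /transcript_id "[^"]+"/) {
            id = substr($9, RSTART + 15, RLENGTH - 16)
            print id "\t" $7
        }
    ' "$1" | LC_ALL=C sort -u
}
```

For example, `strands_per_transcript FBgn0002781.gtf` would print one line per transcript and strand combination.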
*Note 1:
All of the other genes that have this comment list all of their components as being on the same strand:
$ for GENE in $(grep "SO:0000459:gene_with_trans_spliced_transcript" dmel-all-r6.03.gtf | grep -o 'FBgn[0-9]*' | sort -u); do echo "$GENE"; grep "$GENE" dmel-all-r6.03.gtf | cut -f7 | sort -u | wc -l; done
FBgn0002781
2
FBgn0005630
1
FBgn0261837
1
FBgn0261838
1
FBgn0261839
1
FBgn0261840
1
FBgn0261841
1
FBgn0261842
1
FBgn0261843
1
FBgn0261844
1
FBgn0261845
1
FBgn0264817
1
FBgn0266170
1
FBgn0266171
1
FBgn0266172
1
FBgn0266173
1
FBgn0266174
1
FBgn0266175
1
FBgn0266176
1
FBgn0266177
1
FBgn0266178
1
FBgn0267648
1
FBgn0267649
1
FBgn0267650
1
FBgn0267651
1
FBgn0267652
1
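The same kind of check can be done in a single pass over the file, which may be handy for re-running it on a newer release. This is a sketch, not the command used above: `mixed_strand_genes` is a hypothetical helper name, it keys on the `gene_id "..."` attribute instead of grepping for the ID anywhere on the line, and it looks at every gene in the file, not only those carrying the trans-splicing comment.

```shell
# Hypothetical helper: list gene IDs whose features carry more than one
# distinct value in the strand column (GTF column 7).
# Note that "." counts as a value here, just as it does with cut -f7.
mixed_strand_genes() {
    awk -F'\t' '
        # The prefix gene_id " is 9 characters long.
        match($9, /gene_id "[^"]+"/) {
            gene = substr($9, RSTART + 9, RLENGTH - 10)
            # Count each (gene, strand) pair only once.
            if (!((gene, $7) in seen)) { seen[gene, $7] = 1; count[gene]++ }
        }
        END { for (g in count) if (count[g] > 1) print g }
    ' "$1" | LC_ALL=C sort
}
```

Run as `mixed_strand_genes dmel-all-r6.03.gtf`, it prints only the genes with mixed strands, one per line.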