Details
Type: Bug
Status: In Progress
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Labels: None
Story Points: 2
Epic Link:
Sprint: Winter 1
Description
Situation: A GFF file that has been converted to a BED file causes the error "WARNING!!! Lengths disagree: residues = 8043, seq = 8041.0". The error occurs after uploading the converted BED file prodigal_Lambda_phage_sequences.bed and running ProtAnnot.
Task: Investigate why the GFF-to-BED conversion is causing the error in ProtAnnot.
Attachments
Activity
| Field | Original Value | New Value |
|---|---|---|
| Epic Link | IGBF-1908 [ 17998 ] |
| Link | This issue relates to IGBF-4008 [ IGBF-4008 ] |
| Sprint | Winter 1 [ 209 ] | Winter 1, Spring 1 [ 209, 210 ] |
| Rank | Ranked higher |
| Attachment | prodigal_Lambda_phage_sequences.bed [ 18615 ] |
| Description |
Situation:
Task: |
Situation: A GFF file that has been converted to a BED file is throwing an error due to improper loading of residues. When uploading the converted BED file [^prodigal_Lambda_phage_sequences.bed] and running ProtAnnot, an error occurs. Upon checking the genomic sequence, it contains "-" symbols, which is unexpected.
Task: Investigate why the GFF-to-BED conversion is causing the error in ProtAnnot, specifically examining the genomic sequence for the unexpected "-" symbols. |
| Status | To-Do [ 10305 ] | In Progress [ 3 ] |
| Description |
Situation: A GFF file that has been converted to a BED file is throwing an error due to improper loading of residues. When uploading the converted BED file [^prodigal_Lambda_phage_sequences.bed] and running ProtAnnot, an error occurs. Upon checking the genomic sequence, it contains "-" symbols, which is unexpected.
Task: Investigate why the GFF-to-BED conversion is causing the error in ProtAnnot, specifically examining the genomic sequence for the unexpected "-" symbols. |
Situation: A GFF file that has been converted to a BED file is throwing an error ". When uploading the converted BED file [^prodigal_Lambda_phage_sequences.bed] and running ProtAnnot, an error occurs.
Task: Investigate why the GFF-to-BED conversion is causing the error in ProtAnnot, specifically examining the genomic sequence for the unexpected "-" symbols. |
| Description |
Situation: A GFF file that has been converted to a BED file is throwing an error ". When uploading the converted BED file [^prodigal_Lambda_phage_sequences.bed] and running ProtAnnot, an error occurs.
Task: Investigate why the GFF-to-BED conversion is causing the error in ProtAnnot, specifically examining the genomic sequence for the unexpected "-" symbols. |
Situation: A GFF file that has been converted to a BED file is throwing an error "WARNING!!! Lengths disagree: residues = 8043, seq = 8041.0". When uploading the converted BED file [^prodigal_Lambda_phage_sequences.bed] and running ProtAnnot, an error occurs.
Task: Investigate why the GFF-to-BED conversion is causing the error in ProtAnnot. |
| Status | In Progress [ 3 ] | To-Do [ 10305 ] |
| Sprint | Winter 1, Spring 1 [ 209, 210 ] | Winter 1, Spring 3 [ 209, 212 ] |
| Sprint | Winter 1, Spring 3 [ 209, 212 ] | Winter 1 [ 209 ] |
| Status | To-Do [ 10305 ] | In Progress [ 3 ] |
The genomic sequence was not loading properly because the chromosome name is expected in the format "chr1", but the file uses the format "NC_000001.11". After changing the chromosome name to the "chr1" format, I was able to load the genomic sequence. However, ProtAnnot then throws the error:
"* WARNING!!! Lengths disagree: residues = 8043, seq = 8041.0 *"
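The chromosome renaming step described above could be scripted roughly as follows. This is only a sketch: the function name `rename_chromosomes` and the mapping table are hypothetical, and the only accession-to-name pair taken from this ticket is "NC_000001.11" to "chr1". Any other pairs would need to be added for the genome in question.

```python
# Hypothetical mapping; only this one pair is taken from the comment above.
ACCESSION_TO_CHROM = {"NC_000001.11": "chr1"}


def rename_chromosomes(bed_lines, mapping):
    """Return BED lines with the first column replaced when it is in mapping.

    Comment, track, and browser lines and blank lines are passed through
    unchanged. The original column separator (tab or spaces) is preserved.
    """
    renamed = []
    for line in bed_lines:
        stripped = line.lstrip()
        if not stripped or stripped.startswith(("#", "track", "browser")):
            renamed.append(line)
            continue
        chrom = stripped.split(None, 1)[0]
        new_chrom = mapping.get(chrom, chrom)
        # Keep everything after the chromosome name exactly as it was.
        renamed.append(new_chrom + stripped[len(chrom):])
    return renamed
```

For example, `rename_chromosomes(["NC_000001.11\t10\t20\tx"], ACCESSION_TO_CHROM)` returns the same line with `chr1` in the first column; names that are not in the mapping are left as they are.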
I analyzed the Homo sapiens data from IGB and NCBI for chromosome 21. Below are the details:
IGB: chr21 20998408 21486285 NM_001352593.2 0 + 20998563 21484561 0 16 210,75,207,144,138,118,161,146,151,188,97,174,120,122,181,1831, 0,282169,285785,287860,293695,325974,337096,339980,375454,411865,420064,433699,468197,470253,478882,486046,
NCBI: chr21 20998408 21486285 NM_001352593.2 . + 20998408 21486285 0 16 210,75,207,144,138,118,161,146,151,188,97,174,120,122,181,1831, 0,282169,285785,287860,293695,325974,337096,339980,375454,411865,420064,433699,468197,470253,478882,486046,
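The thick start (7th column) and thick end (8th column) values are different: the IGB record has 20998563 and 21484561, while the NCBI record repeats the chromStart and chromEnd values (20998408 and 21486285). This difference in values is causing the error mentioned above.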
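The comparison of the two records can be reproduced with a short script. The sketch below assumes the standard BED12 column order and uses helper names (`parse_bed12`, `differing_fields`, `thick_exon_length`) that are made up for illustration. It lists the columns that differ and adds up the exon bases that fall inside the thick range. Whether that length is what ProtAnnot compares in its "Lengths disagree" warning is an assumption and is not confirmed in this ticket.

```python
BED12_FIELDS = [
    "chrom", "chromStart", "chromEnd", "name", "score", "strand",
    "thickStart", "thickEnd", "itemRgb", "blockCount",
    "blockSizes", "blockStarts",
]

# The two records quoted above, copied as-is.
IGB_LINE = (
    "chr21 20998408 21486285 NM_001352593.2 0 + 20998563 21484561 0 16 "
    "210,75,207,144,138,118,161,146,151,188,97,174,120,122,181,1831, "
    "0,282169,285785,287860,293695,325974,337096,339980,375454,411865,"
    "420064,433699,468197,470253,478882,486046,"
)
NCBI_LINE = (
    "chr21 20998408 21486285 NM_001352593.2 . + 20998408 21486285 0 16 "
    "210,75,207,144,138,118,161,146,151,188,97,174,120,122,181,1831, "
    "0,282169,285785,287860,293695,325974,337096,339980,375454,411865,"
    "420064,433699,468197,470253,478882,486046,"
)


def parse_bed12(line):
    """Split one whitespace-separated BED12 line into a field dictionary."""
    values = line.split()
    if len(values) != len(BED12_FIELDS):
        raise ValueError(f"expected 12 columns, got {len(values)}")
    return dict(zip(BED12_FIELDS, values))


def differing_fields(a, b):
    """Return the names of the columns whose raw text differs."""
    return [name for name in BED12_FIELDS if a[name] != b[name]]


def int_list(text):
    """Parse a comma-separated list, tolerating the trailing comma."""
    return [int(item) for item in text.split(",") if item]


def thick_exon_length(record):
    """Sum the exon (block) bases lying inside [thickStart, thickEnd)."""
    chrom_start = int(record["chromStart"])
    thick_start = int(record["thickStart"])
    thick_end = int(record["thickEnd"])
    total = 0
    for size, offset in zip(int_list(record["blockSizes"]),
                            int_list(record["blockStarts"])):
        start = chrom_start + offset
        end = start + size
        total += max(0, min(end, thick_end) - max(start, thick_start))
    return total
```

Calling `differing_fields(parse_bed12(IGB_LINE), parse_bed12(NCBI_LINE))` reports `thickStart` and `thickEnd`, and also `score` (0 versus "."). `thick_exon_length` gives 2184 for the IGB record, which is a multiple of three, and 4063 for the NCBI record, which is not.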
To confirm that the difference in column values is causing the error, I changed the thick start and thick end values in the NCBI file and ran ProtAnnot. This time no error occurred, but the result is not the same as the IGB annotation result.
NCBI annotation result:
https://www.ebi.ac.uk/Tools/services/rest/iprscan5/result/iprscan5-R20250123-025907-0912-81242886-p1m/xml
IGB annotation result:
https://www.ebi.ac.uk/Tools/services/rest/iprscan5/result/iprscan5-R20250123-030209-0277-88344788-p1m/xml