The genomic sequence does not load because the expected chromosome name format is "chr1", but the file uses the format "NC_000001.11". After changing the chromosome name to the "chr1" format, I was able to load the genomic sequence. However, ProtAnnot then throws this error:
"* WARNING!!! Lengths disagree: residues = 8043, seq = 8041.0 *"
I analyzed the Homo sapiens data from IGB and NCBI for chromosome 21. Below are the details:
IGB: chr21 20998408 21486285 NM_001352593.2 0 + 20998563 21484561 0 16 210,75,207,144,138,118,161,146,151,188,97,174,120,122,181,1831, 0,282169,285785,287860,293695,325974,337096,339980,375454,411865,420064,433699,468197,470253,478882,486046,
NCBI: chr21 20998408 21486285 NM_001352593.2 . + 20998408 21486285 0 16 210,75,207,144,138,118,161,146,151,188,97,174,120,122,181,1831, 0,282169,285785,287860,293695,325974,337096,339980,375454,411865,420064,433699,468197,470253,478882,486046,
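The thick start (7th column) and thick end (8th column) values differ: the IGB line has 20998563 and 21484561, while the NCBI line repeats the feature start and end (20998408 and 21486285). The 5th column also differs (0 in IGB, "." in NCBI). The difference in the thick start and thick end values is causing the error mentioned above.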
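The comparison of the two lines can be reproduced with a short Python sketch. It is only an illustration: the column names follow the standard BED12 layout, the helper names (parse_bed12, differing_columns, cds_length) are hypothetical, and the lines are split on whitespace as they appear in this comment.

```python
# Column names of the standard 12-column BED layout.
BED12_COLUMNS = [
    "chrom", "chromStart", "chromEnd", "name", "score", "strand",
    "thickStart", "thickEnd", "itemRgb", "blockCount",
    "blockSizes", "blockStarts",
]

# Exon sizes and offsets are identical in both files (copied from above).
BLOCKS = (
    "210,75,207,144,138,118,161,146,151,188,97,174,120,122,181,1831, "
    "0,282169,285785,287860,293695,325974,337096,339980,375454,411865,"
    "420064,433699,468197,470253,478882,486046,"
)
IGB = "chr21 20998408 21486285 NM_001352593.2 0 + 20998563 21484561 0 16 " + BLOCKS
NCBI = "chr21 20998408 21486285 NM_001352593.2 . + 20998408 21486285 0 16 " + BLOCKS


def parse_bed12(line):
    """Split one BED12 line on whitespace and label the columns."""
    return dict(zip(BED12_COLUMNS, line.split()))


def differing_columns(line_a, line_b):
    """Return {column: (value_a, value_b)} for every column that differs."""
    a, b = parse_bed12(line_a), parse_bed12(line_b)
    return {c: (a[c], b[c]) for c in BED12_COLUMNS if a[c] != b[c]}


def cds_length(line):
    """Count exonic bases that fall inside the thickStart..thickEnd interval."""
    rec = parse_bed12(line)
    chrom_start = int(rec["chromStart"])
    thick_start, thick_end = int(rec["thickStart"]), int(rec["thickEnd"])
    sizes = [int(s) for s in rec["blockSizes"].split(",") if s]
    offsets = [int(s) for s in rec["blockStarts"].split(",") if s]
    total = 0
    for size, offset in zip(sizes, offsets):
        exon_start = chrom_start + offset
        exon_end = exon_start + size
        # Overlap between this exon and the thick (coding) interval.
        total += max(0, min(exon_end, thick_end) - max(exon_start, thick_start))
    return total


for column, (igb, ncbi) in differing_columns(IGB, NCBI).items():
    print(f"{column}: IGB={igb} NCBI={ncbi}")
print("IGB thick length:", cds_length(IGB))
print("NCBI thick length:", cds_length(NCBI))
```

For these two lines the sketch reports differences in score, thickStart, and thickEnd. The IGB thick interval covers 2184 exonic bases, which is a multiple of 3, while the NCBI thick interval covers all 4063 exonic bases, which is not.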
To confirm this, I changed the thick start and thick end values in the NCBI file and ran ProtAnnot again. This time there was no error, but the resulting annotation does not match the annotation produced from the IGB data.
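The manual edit described above could be scripted roughly as follows. This is a sketch under assumptions: set_thick is a hypothetical helper, and the example substitutes the IGB thick start and thick end values, which this comment does not state explicitly.

```python
def set_thick(bed_line, thick_start, thick_end):
    """Return a copy of a BED line with columns 7 and 8 replaced."""
    fields = bed_line.split()
    if len(fields) < 8:
        raise ValueError("expected at least 8 BED columns")
    chrom_start, chrom_end = int(fields[1]), int(fields[2])
    # The thick interval has to stay inside the feature itself.
    if not (chrom_start <= thick_start <= thick_end <= chrom_end):
        raise ValueError("thick interval must lie inside the feature")
    fields[6], fields[7] = str(thick_start), str(thick_end)
    return "\t".join(fields)


# NCBI line copied from this comment.
NCBI_LINE = (
    "chr21 20998408 21486285 NM_001352593.2 . + 20998408 21486285 0 16 "
    "210,75,207,144,138,118,161,146,151,188,97,174,120,122,181,1831, "
    "0,282169,285785,287860,293695,325974,337096,339980,375454,411865,"
    "420064,433699,468197,470253,478882,486046,"
)

# Assumption: reuse the thick start and thick end from the IGB line.
patched = set_thick(NCBI_LINE, 20998563, 21484561)
print(patched)
```

The output is tab-separated, and all other columns, including the "." score, are left untouched.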
NCBI annotation result:
https://www.ebi.ac.uk/Tools/services/rest/iprscan5/result/iprscan5-R20250123-025907-0912-81242886-p1m/xml
IGB annotation result:
https://www.ebi.ac.uk/Tools/services/rest/iprscan5/result/iprscan5-R20250123-030209-0277-88344788-p1m/xml
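To see where the two results diverge, the XML files can be compared by the signatures they report. The sketch below assumes both results have been saved locally and that each match carries a `signature` element with an `ac` attribute, which is my understanding of the InterProScan 5 XML output and should be checked against the actual files. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def signature_accessions(xml_text):
    """Collect the `ac` attribute of every <signature> element, ignoring namespaces."""
    root = ET.fromstring(xml_text)
    accessions = set()
    for elem in root.iter():
        local_name = elem.tag.rsplit("}", 1)[-1]  # strip "{namespace}" prefix
        if local_name == "signature" and elem.get("ac"):
            accessions.add(elem.get("ac"))
    return accessions


def compare_results(first_xml, second_xml):
    """Report which signature accessions are shared or unique to one result."""
    first = signature_accessions(first_xml)
    second = signature_accessions(second_xml)
    return {
        "only_first": sorted(first - second),
        "only_second": sorted(second - first),
        "shared": sorted(first & second),
    }
```

With the two files read into strings, for example with `open(path).read()`, `compare_results(ncbi_xml, igb_xml)` lists the accessions that appear in only one of the annotations.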