Details
- Type: Bug
- Status: To-Do
- Priority: Major
- Resolution: Unresolved
- Labels:
- Story Points: 1.5
Description
Email from user, via sourceforge, from: sjack93@users.sourceforge.net
The IGB website says their program supports many different file formats, including gff. I have saved a genome from NCBI in several different file formats to my computer (using Ubuntu), but when I go to open the custom genome from IGB, it only recognizes the .fna file type, when I have genomes saved as .gbff and .gff in the same folder. It's like the program won't recognize these file types saved on my computer. Does anyone know why this might be happening? Thanks in advance.
and direct email from user:
Hi Ann,
I found your e-mail address on the troubleshooting page of the IGB (integrated genome browser) website and I'm hoping you can help me. The IGB user guide states that many different file formats are supported for this program (>20 file types including gff, gbff, fna and more). However, when I attempt to open genome from file, it recognizes only about half the file types it claims to be compatible with. I am running IGB on Ubuntu. Do you have any idea why this might be happening?
from: stephanie.jack@unb.ca
Reply from Dr. Loraine:
Thank you for getting in touch.
Can you send me URLs of the GFF files you are trying to open?
We've had some issues with NCBI's GFF files in the past and I think we may have fixed those problems ... but this may need to be updated!
If we are able to fix the problem, we should be able to roll it out to you fairly quickly as an "early access" IGB release. We are setting up the early access mechanism in the next few weeks, so hopefully we can get your problem addressed in a few weeks, as well.
Attachments
Issue Links
- relates to: IGBF-1546 Add American eel genome (Closed)
User help thread on sourceforge:
Stephanie:
The IGB website says their program supports many different file formats, including gff. I have saved a genome from NCBI in several different file formats to my computer (using Ubuntu), but when I go to open the custom genome from IGB, it only recognizes the .fna file type, when I have genomes saved as .gbff and .gff in the same folder. It's like the program won't recognize these file types saved on my computer. Does anyone know why this might be happening? Thanks in advance.
Nowlan:
To load a custom genome, IGB is looking for sequence files such as fasta, fna, or 2bit. If the genome for your data is not available, and you do not have a sequence file, you can drag and drop the .gff file directly into IGB and then click Load Data.
Stephanie:
Thanks for the info. I loaded the genome as a fna file, then the annotation as a .gff file, which was recommended to me on another online forum. The program has been "retrieving chromosomes" for many hours. Do you know why this might be? Also, do you know what the "maximum heap size" is? It's displayed in the bottom right corner of the IGB interface, and the proportion of max heap size being used continues to change as the program attempts to retrieve chromosomes.
Dr. Loraine:
Are there a lot of reference sequences mentioned in the GFF file?
I would recommend opening the same file sequence in an IDE with a debugger to see where the hang-up occurs.
It would be nice if IGB could handle the various issues that come up with NCBI gff — NCBI is a major clearinghouse for genomic data that many people use.
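Editor's note: one quick way to answer the question about reference sequences is to count the distinct values in the first column (seqid) of the GFF file. The sketch below is a minimal illustration, not part of IGB. The function name `count_reference_sequences` and the sample records are hypothetical, and it assumes a tab-delimited GFF in which `#` lines are comments and an optional `##FASTA` line starts an embedded sequence section.

```python
import io
from collections import Counter


def count_reference_sequences(lines):
    """Count feature lines per seqid (column 1) in GFF text.

    Hypothetical helper for illustration; not an IGB function.
    """
    counts = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # embedded sequence follows; no more feature lines
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        seqid = line.split("\t", 1)[0]
        counts[seqid] += 1
    return counts


# Made-up sample records, used only to show the call.
sample = io.StringIO(
    "##gff-version 3\n"
    "scaffold_1\tmaker\tgene\t100\t900\t.\t+\t.\tID=gene1\n"
    "scaffold_1\tmaker\tmRNA\t100\t900\t.\t+\t.\tID=mrna1;Parent=gene1\n"
    "scaffold_2\tmaker\tgene\t50\t400\t.\t-\t.\tID=gene2\n"
)
counts = count_reference_sequences(sample)
print(len(counts), "reference sequences")
```

For a file on disk, the same function can be given an open file handle, for example `count_reference_sequences(open("annotation.gff"))`, where the file name is a placeholder. A draft assembly with many thousands of scaffolds would show up here as a very large count.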
Nowlan:
The issue appears to be with the gff file you are trying to view. The file does not appear to contain gene annotations mapped to a genome.
The Anguilla rostrata (American eel) annotation is available from Dryad. If you unpack the archive, there is a file called american_eel_genome_v5.gff that appears to contain the annotation. I sorted, compressed, and indexed the file (attached). Try loading it (american_eel_genome_v5.sorted.gff.gz) in IGB and let me know if it works for you.
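Editor's note: the thread does not say which tools were used for the sorting step. As a rough sketch of what sorting a GFF file usually means, the code below orders feature lines by reference sequence name and then by start coordinate, keeping comment lines at the top. The function name `sort_gff_lines` and the sample records are hypothetical, and the sketch assumes every feature line is tab-delimited with an integer start in column 4. Compression and indexing are typically done afterwards with `bgzip` and `tabix -p gff`; that this was the workflow used here is an assumption.

```python
def sort_gff_lines(lines):
    """Return GFF lines with headers first, then features sorted by
    (seqid, start). Hypothetical helper for illustration only."""
    headers, features = [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # drop blank lines
        if line.startswith("#"):
            headers.append(line)
        else:
            features.append(line)

    def sort_key(line):
        cols = line.split("\t")
        # Column 1 is the seqid, column 4 is the 1-based start coordinate.
        return (cols[0], int(cols[3]))

    return headers + sorted(features, key=sort_key)


# Made-up sample records, deliberately out of order.
unsorted = [
    "##gff-version 3",
    "scaffold_2\tmaker\tgene\t50\t400\t.\t-\t.\tID=gene2",
    "scaffold_1\tmaker\tgene\t700\t900\t.\t+\t.\tID=gene3",
    "scaffold_1\tmaker\tgene\t100\t300\t.\t+\t.\tID=gene1",
]
sorted_lines = sort_gff_lines(unsorted)
for line in sorted_lines:
    print(line)
```

Note that Python's `gzip` module writes ordinary gzip rather than the block-compressed format that `tabix` expects, so the sketch stops at sorting and leaves compression to the external tools.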