Notes from a discussion with Nowlan Freese on Wednesday:
Given the large number of available genome assemblies, we should not read them all into memory at once or add them all to a user interface component for the user to choose from.
Instead, we need some kind of query interface that would allow a user to search for their genome assembly (or assemblies) of interest. This could be implemented as part of IGB itself. Alternatively, we could simply ask the user to enter a specific, named identifier for their assembly of interest into IGB, and IGB would then use that identifier to retrieve and display the requested genome assembly. For example, we could ask the user to enter a "Nucleotide accession", which should uniquely identify a single assembly and assembly version. Indeed, the word "accession" is even included in the REST URL path described above!
To start, we can implement a very simple interface where the user will enter the accession for their genome of interest, e.g., GCF_000001405.40.
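Before contacting NCBI at all, IGB could check that the text the user typed at least has the shape of an assembly accession. The sketch below is only an illustration: the class and method names (AssemblyAccession, looksValid) are hypothetical and not part of IGB, and it assumes the usual assembly accession layout of a GCA_ or GCF_ prefix, nine digits, a dot, and a version number, as in the example above.

```java
import java.util.regex.Pattern;

/** Hypothetical helper, not part of IGB: checks the shape of an assembly accession. */
final class AssemblyAccession {
    // Assumed layout: GCA_ (GenBank) or GCF_ (RefSeq), nine digits, a dot, a version number.
    private static final Pattern FORMAT = Pattern.compile("GC[AF]_\\d{9}\\.\\d+");

    private AssemblyAccession() {
    }

    /** Trims surrounding whitespace, then tests the whole string against the pattern. */
    static boolean looksValid(String input) {
        return input != null && FORMAT.matcher(input.trim()).matches();
    }
}
```

A check like this only catches typos early. Whether the accession actually exists still has to be decided by the NCBI service.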
Then, IGB would reach out to the APIs described above to retrieve the number of chromosomes (also called "contigs") and their sizes. Or, if the accession entered is not found, the interface would report that the accession has not been found, and invite the user to try again. The interface would also provide a link to the "genomes" page at NCBI so that the user could use the query interface there to look up the accession for the genome assembly of interest.
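The found / not-found flow described here could be sketched roughly as follows. Everything in this example is an assumption made for illustration: the class AssemblyLookup, its describe method, and the message wording are hypothetical, and the actual NCBI request is hidden behind a fetcher function so that no endpoint or response format is implied. The fetcher is expected to return the sequence names and sizes for an accession, or an empty Optional when NCBI does not know it.

```java
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical sketch, not part of IGB: turns a lookup result into a user-facing message. */
final class AssemblyLookup {
    // Stand-in for the real NCBI call: accession -> (sequence name -> size in bp), or empty.
    private final Function<String, Optional<Map<String, Long>>> fetcher;
    // Link to the NCBI "genomes" page, supplied by the caller.
    private final String helpUrl;

    AssemblyLookup(Function<String, Optional<Map<String, Long>>> fetcher, String helpUrl) {
        this.fetcher = fetcher;
        this.helpUrl = helpUrl;
    }

    /** Returns a summary if the accession is found, otherwise a try-again message with the link. */
    String describe(String accession) {
        String key = accession == null ? "" : accession.trim();
        Optional<Map<String, Long>> sizes =
                key.isEmpty() ? Optional.<Map<String, Long>>empty() : fetcher.apply(key);
        if (!sizes.isPresent()) {
            return "Accession '" + key + "' was not found. Please try again, or look it up at "
                    + helpUrl;
        }
        // Add up the individual sequence sizes to report a total length.
        long total = sizes.get().values().stream().mapToLong(Long::longValue).sum();
        return key + ": " + sizes.get().size() + " sequences, " + total + " bp in total";
    }
}
```

Keeping the network call behind a function makes it possible to exercise the interface logic with a fake fetcher while the NCBI services are unavailable.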
One easy way to do this would be to ask the user to provide a unique identifier for the genome assembly of their interest. Then, IGB could send a request to one of the NCBI APIs to retrieve the data and load the requested assembly, or reply to the user that the requested assembly could not be found.
In this way, we could leverage the query interfaces hosted at the NCBI Web site, which will always be well-maintained and will improve over time. Also, I believe most users will be learning and becoming expert with the identifier systems used at NCBI. Pretty much everybody in biology understands that there are these things called "accessions" that identify sequences, so I think we are in good shape with this approach.
Here is the Web page at NCBI with information about the genome assemblies available at the site - we would include this link in our interface to help the user find accessions and also learn about the vast amount of information available at NCBI:
Some NCBI services are currently down, and the link provided in the ticket is also unavailable. I will start working on this once they are back.