Currently, the search works on the contents of each genome card's HTML. Since genome versions are already present in the card container, the current search implementation could be extended to include them. However, the current search has no access to the main backend data structure, which is where the synonym data is. One solution would be to pass this data to the front end by embedding it as attributes on the genome card, but this isn't very clean. I think a better solution would be to create a search API endpoint for genome dashboard that takes the genome id and search terms as query parameters and uses them in a backend search.
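To make the backend search idea more concrete, here is a minimal sketch in Python. The record layout (`id`, `version`, `species`, `synonyms`), the function name `search_genomes`, and the query parameter names `q` and `genome_id` are all assumptions for illustration, not existing code.

```python
from urllib.parse import parse_qs


def search_genomes(genomes, query_string):
    """Return genome records matching every search term in the query string.

    Assumed (hypothetical) query parameters:
      q         - whitespace-separated search terms
      genome_id - optional; restricts the search to a single genome
    """
    params = parse_qs(query_string)
    terms = params.get("q", [""])[0].lower().split()
    genome_id = params.get("genome_id", [None])[0]

    results = []
    for genome in genomes:
        if genome_id is not None and genome["id"] != genome_id:
            continue
        # Searchable text includes synonyms, which the front end cannot see.
        haystack = " ".join(
            [genome["version"], genome["species"], *genome.get("synonyms", [])]
        ).lower()
        if all(term in haystack for term in terms):
            results.append(genome)
    return results


# Hypothetical sample records; real values would come from the backend data.
GENOMES = [
    {"id": "1", "version": "A_gambiae_Oct_2006",
     "species": "Anopheles gambiae", "synonyms": []},
    {"id": "2", "version": "H_sapiens_Dec_2013",
     "species": "Homo sapiens", "synonyms": ["hg38", "GRCh38"]},
]

matches = search_genomes(GENOMES, "q=hg38")
```

Because the match runs against the backend records, a synonym such as "hg38" can find a genome even though that string never appears in the card's HTML.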
Some reasoning behind this:
- Not all genome versions (e.g. A_gambiae_Oct_2006) have an associated entry in the synonyms file.
- There are also other discrepancies between the three files used to generate the backend data structure (species, synonyms, allquickloads).
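Discrepancies like these could be surfaced with a small check such as the sketch below. It assumes the genome version names from each file have already been parsed into sets. The function name and the sample contents are hypothetical; only the missing synonyms entry for A_gambiae_Oct_2006 comes from the observation above.

```python
def find_missing(versions_by_file):
    """Map each file name to the genome versions it lacks but another file has."""
    all_versions = set().union(*versions_by_file.values())
    return {
        name: sorted(all_versions - versions)
        for name, versions in versions_by_file.items()
    }


# Hypothetical contents; parsing of the real files is not shown here.
versions_by_file = {
    "species": {"A_gambiae_Oct_2006", "H_sapiens_Dec_2013"},
    "synonyms": {"H_sapiens_Dec_2013"},
    "allquickloads": {"A_gambiae_Oct_2006", "H_sapiens_Dec_2013"},
}

missing = find_missing(versions_by_file)
```

With the sample data, `missing["synonyms"]` lists A_gambiae_Oct_2006, and the other two files report nothing missing.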
Prior to implementing backend search, it would be good to move all the data from the files mentioned above into a database. I think that Amazon DynamoDB would work well for this.
The main task would then be determining the structure of the database, which I propose below. I also think it would be good to use it as the single source of truth on which both IGB and genome dashboard are based.
The backend data structure currently looks like so:
https://pastebin.com/raw/JieyQpTx
I think that in designing the database, it would be better to have a list of genome objects as outlined below:
https://pastebin.com/raw/M7GY9BCF
This groups data better and makes it easier to identify missing parts. Please let me know what you think, [~aloraine].