Details
-
Type: New Feature
-
Status: Closed
-
Priority: Major
-
Resolution: Done
-
Affects Version/s: None
-
Fix Version/s: None
-
Labels: None
-
Story Points: 1
-
Epic Link:
-
Sprint:Fall 7 Dec 14 - Dec 23, Winter 1 Dec 28 - Jan 8, Winter 2 Jan 11 - Jan 22, Winter 3 Jan 25 - Feb 5, Winter 4 Feb 8 - Feb 19, Winter 5 Feb 22 - Mar 5, Winter 6 Mar 8 - Mar 19, Spring 1 2021 Mar 22 - Apr 2, Spring 2 2021 Apr 5 - Apr 16
Description
The CGI script "geneIdLookup.py" in the cgi-bin directory of the bioviz code base connects to a small DynamoDB table while the bar.html code is running.
Add provisioning and configuration of the DynamoDB resource to the bioviz playbooks.
Attachments
Activity
A table named "Araport11" needs to be created and populated. FYI: It will likely never change and it is very small.
Code used to create the table: https://bitbucket.org/lorainelab/genomesource/src/master/IGBF-1495/
Data used to populate the table: http://igbquickload.org/quickload/A_thaliana_Jun_2009/Araport11.bed.gz
Data file is also version-controlled in a subversion repo: https://svn.bioviz.org/viewvc/genomes/quickload/
To test whether the dynamoDB access is working, just hit this URL, substituting your bioviz hostname:
[~aloraine] This is the command to create a json of all the content of dynamo db
aws dynamodb scan --table-name TABLE_NAME --region us-east-1 > export.json
The above command has to be run from an EC2 server with a role that permits reading the contents of the DynamoDB table.
Please share that file with me.
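For reference, the same export can be sketched in boto3, which is what the playbook scripts use elsewhere. This is a hedged sketch, not the project's actual code: scan results are paginated, so the helper follows LastEvaluatedKey; the table name and export.json file name mirror the CLI example above.

```python
# Sketch equivalent of: aws dynamodb scan --table-name Araport11 > export.json
# Assumes an EC2 role (or credentials) with read access, as noted above.
import json


def scan_all(table_name="Araport11", region="us-east-1"):
    """Return every item in the table, following scan pagination."""
    import boto3  # deferred import so the JSON helper below has no AWS dependency
    client = boto3.client("dynamodb", region_name=region)
    items, kwargs = [], {"TableName": table_name}
    while True:
        page = client.scan(**kwargs)
        items.extend(page.get("Items", []))
        if "LastEvaluatedKey" not in page:
            return items
        kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]


def export_to_json(items, path="export.json"):
    """Write scanned items to a JSON file, matching the CLI redirect."""
    with open(path, "w") as handle:
        json.dump(items, handle, indent=2)
```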
Data are available (temporarily) at the following URL:
The AWS CLI doesn't allow more than 25 items to be loaded into the table per batch-write request.
I have started using the .py scripts mentioned earlier and am trying to integrate them with the playbooks.
Thanks for the update!
Process
1. Grant the EC2 instance role DynamoDB admin privileges to provision the infrastructure
2. Create the DynamoDB table if it is not present
3. Copy the data and Python script to the EC2 instance
4. Load the data using the Python script if the data is not already present
5. Remove the admin privileges from the EC2 role and grant read-only access instead
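Step 2 ("create the table if not present") can be sketched in boto3 as below. This is an illustrative sketch only: the key schema and capacity units are assumptions based on the fields discussed in this ticket (Chromosome, Start, End, GeneId), not the project's actual createTable.py.

```python
# Hedged sketch: create the Araport11 table only if it does not already exist.
# Key schema is an assumption (hash key on GeneId).

def table_schema(table_name="Araport11"):
    """Build create_table arguments; the schema here is an assumption."""
    return {
        "TableName": table_name,
        "KeySchema": [
            {"AttributeName": "GeneId", "KeyType": "HASH"},
        ],
        "AttributeDefinitions": [
            {"AttributeName": "GeneId", "AttributeType": "S"},
        ],
        "ProvisionedThroughput": {
            "ReadCapacityUnits": 5,
            "WriteCapacityUnits": 5,
        },
    }


def ensure_table(table_name="Araport11", region="us-east-1"):
    """Create the table if absent, then wait until it is ready."""
    import boto3  # deferred import keeps table_schema() dependency-free
    client = boto3.client("dynamodb", region_name=region)
    if table_name not in client.list_tables()["TableNames"]:
        client.create_table(**table_schema(table_name))
        client.get_waiter("table_exists").wait(TableName=table_name)
```

Making the creation idempotent (check, then create) is what lets the playbook re-run safely, per step 2.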
Please review: https://bitbucket.org/chesterdias/appstore-playbooks/branch/IGBF-2686#diff
Chester Dias - Please confirm that the playbook "setup.yml" can run from start to finish in a non-lorainelab account.
Please also check that the site works – the main thing to verify is that the bar.js and bar.html functionality is working properly, since these are the parts of the site that use the DynamoDB table.
To test, please team up with another developer for a zoom session. Working together, walk through the key parts of the site to ensure they are working correctly.
To know how to test bar.html and bar.js, read the relevant sections of this paper, which describes the functionality: https://pubmed.ncbi.nlm.nih.gov/31350781/
To test with the efp-browser mentioned in the paper, copy the URLs for "view in IGB" shown in the efp-browser. Enter these links into a web browser, replacing the hostname with the address of your bioviz mirror site. You may also need to modify your local /etc/hosts file to "trick" your browser into thinking that your bioviz mirror site's hostname resolves to its public IP address. (This may be necessary if your bioviz mirror's hostname is not currently registered in DNS.)
[~aloraine] I have noticed that the data load takes more than 12 minutes for the full data set using the .py script. Would you like me to see if we can improve that script? It gives the impression that the Ansible session has gotten stuck.
Sure, I would say that counts as a bug that ought to be addressed. The table itself is tiny, so it does not make sense that it takes so long.
I was able to modify the configuration to increase the write speed. [~aloraine] Could you please let me know how often we would be writing to this table? There appears to be a monthly charge associated with the write capacity units. If we are not using them, we can dial down that value after populating all the data.
Read/write speed is controlled by the following settings:
Provisioned read capacity units: 5 (Auto Scaling disabled)
Provisioned write capacity units: 50 (Auto Scaling disabled)
Per the AWS specification, a single call to BatchWriteItem can write up to 16 MB of data, which can comprise as many as 25 put or delete requests.
Reference: https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_BatchWriteItem.html
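The 25-request cap quoted above is why the CLI could not load the whole table at once. A sketch of staying within that limit, assuming the table name and item shape used elsewhere in this ticket; the chunking helper is pure Python and testable without AWS access:

```python
# Hedged sketch of batch loading within the BatchWriteItem limits.
# chunks() is pure; batch_load() uses boto3's high-level batch_writer,
# which itself handles the 25-item cap and retries unprocessed items.

def chunks(items, size=25):
    """Split a list into batches no larger than `size`
    (25 is the BatchWriteItem maximum)."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def batch_load(items, table_name="Araport11", region="us-east-1"):
    """Write items to the table in batches."""
    import boto3  # deferred import; only needed when actually writing
    table = boto3.resource("dynamodb", region_name=region).Table(table_name)
    with table.batch_writer() as writer:
        for item in items:
            writer.put_item(Item=item)
```

Using batch_writer rather than hand-rolled BatchWriteItem calls is a design choice: it hides the 25-item batching and unprocessed-item retries shown in the referenced API docs.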
[~aloraine] Could you please review the write capacity units of https://console.aws.amazon.com/dynamodb/home?region=us-east-1#tables:selected=Araport11;tab=capacity
It is set to 20 (a $10/month charge) at the moment, but if we are not using writes, we can dial this down to 5, which is the default.
Thank you. I have set the write capacity to 1. Is that OK?
Setting it to 5 should be fine, as long as we are not writing. There could be some internal meta writes.
Chirag Chandrahas Shetty - Please try running this from your system as a fresh end-to-end run, as Dr. Loraine suggested earlier before the holiday break. That will give me an idea of what more needs to be done to simplify and improve this, based on your execution difficulties. I will remain available during the next week, so message me any time about any issues you come across.
Requests:
- Please modify the loading strategy to use the original BED file rather than a JSON file to load the data; see the scripts written by Charan V. (https://bitbucket.org/lorainelab/genomesource/src/master/IGBF-1495/)
- Use the Ansible svn module to obtain the BED file by checking out (read-only) the single directory where the BED file resides. Do NOT check out the entire repository, as it is huge.
- To support the above, add new default configurations to (a) specify the svn directory to be checked out and (b) specify the checked-out BED file to be used (Araport11.bed.gz). Use the current convention of specifying defaults by adding these to the role's "defaults" directory (for example, see https://bitbucket.org/lorainelab/bioviz-playbooks/src/master/roles/clone/defaults/main.yml)
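Loading directly from the BED file, as requested above, could look roughly like the sketch below. This is an illustration, not the IGBF-1495 script: the column mapping (column 4 = GeneId) is an assumption about this particular BED file, whose first four tab-separated columns are chrom, start, end, name.

```python
# Hedged sketch: read Araport11.bed.gz directly instead of a JSON export.
import gzip


def parse_bed_line(line):
    """Turn one BED line into a DynamoDB-style item dict, or None for
    blank lines and track/browser/comment lines."""
    line = line.strip()
    if not line or line.startswith(("#", "track", "browser")):
        return None
    fields = line.split("\t")
    return {
        "Chromosome": fields[0],
        "Start": int(fields[1]),
        "End": int(fields[2]),
        "GeneId": fields[3],  # assumed mapping for this file
    }


def read_bed(path):
    """Yield items from a gzipped BED file such as Araport11.bed.gz."""
    with gzip.open(path, "rt") as handle:
        for line in handle:
            item = parse_bed_line(line)
            if item is not None:
                yield item
```

Reading the gzipped file in text mode and streaming items keeps memory use flat even if the file were larger.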
Looks like the new code is referring to the appstore playbooks. The appstore actually does not need any DynamoDB code. Please get the latest copy of the master branch of the bioviz-playbooks repository and transfer the new code over there.
I'm in the middle of restructuring the bioviz-playbooks and will push my new code later today or Monday. Wait for that to happen before creating your new bioviz-playbooks branch.
question about:
if not request:
+ request.append({
+ 'Chromosome': Chromosome,
+ 'Start': Start,
+ 'End': End,
+ 'GeneId': GeneId,
+ })
That came from https://bitbucket.org/lorainelab/genomesource/src/master/IGBF-1495/createTable.py
if not data:
data.append({
'Chromosome': Chromosome,
'Start': Start,
'End': End,
'GeneId': GeneId,
})
My understanding is that it was possibly done to prevent adding duplicate entries with lines 28 to 53 in my code, since the for loop cannot execute when the request list is empty.
Please investigate: What happens if "data" (or "request") is "None"?
data is declared as an empty array on line 9. Since there is no first element in `data` on the first execution, running the for loop would throw an error. The `if` branch of the logic only handles the first element; all the remaining executions go through the `else` branch, which runs the for loop.
OK I understand. Please modify the "if" statement to literally test the size of the array instead of relying on a quirk of the language in which an empty array evaluates to False in "if" statements.
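The requested change can be sketched as below. It only shows the rewritten `if` branch (an explicit size test instead of relying on an empty list being falsy); the duplicate-checking `else` branch of the real createTable.py is omitted, and the helper name is hypothetical.

```python
# Hedged sketch: test the list's length literally, per the review request.
# Names mirror the snippet quoted above from createTable.py.

def add_entry(data, Chromosome, Start, End, GeneId):
    """Append the first entry only when the list is literally empty."""
    if len(data) == 0:  # explicit size test instead of `if not data:`
        data.append({
            'Chromosome': Chromosome,
            'Start': Start,
            'End': End,
            'GeneId': GeneId,
        })
    return data
```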
Merged.
Making some changes / updates to the code:
- Define dynamo_db_table_name to be Araport11
- Add role to aws.yml
The loading script appears to search through the entire "results" list, from start to finish, each time it reads a line.
Fixing this.
Also, I will enable AWS and other variables to be passed as command-line options. This will enable the script to be run with no editing required.
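The command-line options mentioned above could be wired up with argparse, roughly as follows. The specific option names (--table-name, --region, --bed-file) and defaults are assumptions, not the final interface.

```python
# Hedged sketch of a no-editing-required command line for the load script.
import argparse


def build_parser():
    parser = argparse.ArgumentParser(
        description="Load BED data into a DynamoDB table.")
    parser.add_argument("--table-name", default="Araport11",
                        help="DynamoDB table to load")
    parser.add_argument("--region", default="us-east-1",
                        help="AWS region")
    parser.add_argument("--bed-file", default="Araport11.bed.gz",
                        help="gzipped BED file to load")
    return parser
```

With sensible defaults, a plain invocation works out of the box, and the playbook can override any value, e.g. `python load.py --region us-west-2`.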
How to get a single file out of a subversion repository without checking out an entire directory:
svn export https://svn.bioviz.org/repos/genomes/quickload/A_thaliana_Jun_2009/Araport11.bed.gz --username=guest --password=guest
A role "Bioviz" needs to be created that has the AmazonDynamoDBReadOnlyAccess policy.