IGBF-4266

Synchronize species.txt and synonyms.txt between IGB and SVN

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: 10.2.0
    • Labels:
      None

      Description

      Situation: The species.txt and synonyms.txt files are used to display human-readable species names and genome versions when a user selects a genome. They are also used to match data from UCSC and Galaxy to the correct genome. We currently store species.txt and synonyms.txt both in our SVN repository and, separately, within IGB itself. The two sets of files are not synced and have diverged significantly.

      Task: Synchronize the files between the SVN repository and IGB.
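
      A minimal sketch of how divergence between the two copies could be listed before reconciling them by hand. The function names and any paths passed in are hypothetical; nothing here is part of IGB or the SVN repository, and it assumes only that the files are line-oriented text.

```python
from pathlib import Path


def diff_lines(igb_text: str, svn_text: str) -> tuple[list[str], list[str]]:
    """Return (lines only in the IGB copy, lines only in the SVN copy).

    Lines are compared exactly, so differences in tabs or spacing show up.
    Blank lines and line order are ignored.
    """
    igb = {line for line in igb_text.splitlines() if line.strip()}
    svn = {line for line in svn_text.splitlines() if line.strip()}
    return sorted(igb - svn), sorted(svn - igb)


def diff_files(igb_path: str, svn_path: str) -> tuple[list[str], list[str]]:
    """Read both copies from disk (paths are hypothetical) and compare them."""
    igb_text = Path(igb_path).read_text(encoding="utf-8")
    svn_text = Path(svn_path).read_text(encoding="utf-8")
    return diff_lines(igb_text, svn_text)
```

      Two empty lists mean both copies contain the same set of non-blank lines. Because order is ignored, this does not prove the files are byte-for-byte identical.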

      Links:
      IGB synonyms.txt file: link
      IGB species.txt file: link
      SVN repository: link
      Species.txt documentation: link
      Synonyms.txt documentation: link

      Note: There are several entries in the SVN synonyms.txt that are incorrect and will need to be fixed.

            Activity

            pkulzer Paige Kulzer added a comment -

            I've combed through both copies of species.txt and synonyms.txt, in IGB and on the SVN server, and they are now identical. To check that I didn't break any of the genomes in the process, I built IGB from my branch and confirmed that all of the genomes display correctly in the species drop-down menu.

            I'm sure there are still some lingering issues, so please carefully review these files and let me know if anything needs to be changed. One thing I would like to point out to the reviewer is line 166 of synonyms.txt: this line seems out of sorts, but I'm not sure of the best way to fix it.

            Branch: https://bitbucket.org/pkulzer-lorainelab/integrated-genome-browser/branch/IGBF-4266

            nfreese Nowlan Freese added a comment - - edited

            Testing process:

            1. Pull Paige's branch and run mvn clean install.
            2. Copy the IGB species.txt and synonyms.txt into my local SVN quickload.
            3. Start IGB and disable all data providers except for UCSC Rest.
            4. Add the local SVN quickload as a quickload data provider.
            5. Close IGB.
            6. Start IGB and look for errors in the logs.
            7. Check that each genome that had a modification in the species/synonyms files is able to load correctly.

            I found an issue with the tardigrade genomes not loading correctly. Because the two species shared a common name, Hypsibius was appearing as a genome version for Ramazzottius, which is not correct. I modified the files so that one has the common name tardigrade and the other has tardigrade water bear. This issue should be addressed in IGBF-2861.
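
            A minimal sketch of how a shared common name could be detected before it reaches IGB. It assumes species.txt is tab-delimited with the species name in the first column and the common name in the second; that layout is an assumption here and should be checked against the species.txt documentation. The function name is hypothetical.

```python
from collections import defaultdict


def shared_common_names(species_text: str) -> dict[str, list[str]]:
    """Map each common name used by more than one species to those species.

    Assumes column 1 is the species name and column 2 is the common name
    (an assumption about the species.txt layout). Common names are compared
    case-insensitively.
    """
    species_by_common = defaultdict(set)
    for line in species_text.splitlines():
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # blank or malformed line: nothing to compare
        species, common = fields[0].strip(), fields[1].strip()
        if species and common:
            species_by_common[common.lower()].add(species)
    return {
        common: sorted(names)
        for common, names in species_by_common.items()
        if len(names) > 1
    }
```

            An empty result means every common name belongs to exactly one species under the assumed layout.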

            Next I noticed some other things that could be fixed, such as double tabs and IGB prefix names in species.txt that were not in the 3rd column (unclear if this makes a difference). When I first corrected these issues, IGB loaded no genomes and logged multiple errors. I was unable to discern why, so I redid the changes and saw no issues the second time.
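
            A minimal sketch of a check for the double-tab problem, assuming only that the files are tab-delimited. It flags consecutive tabs and trailing tabs, both of which produce an empty field; whether IGB tolerates empty fields is not established in this ticket. The function name is hypothetical.

```python
def find_empty_fields(text: str) -> list[tuple[int, str]]:
    """Return (line number, problem) pairs for lines with empty tab-delimited fields."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        if "\t\t" in line:
            problems.append((number, "consecutive tabs"))
        if line.endswith("\t"):
            problems.append((number, "trailing tab"))
    return problems
```

            Running this on both species.txt and synonyms.txt before and after an edit gives a quick way to confirm that a cleanup removed the empty fields without introducing new ones.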

            Paige Kulzer - please test my branch following the steps above to double check that genomes are able to load in IGB: https://bitbucket.org/nfreese/nowlanfork-igb/branch/IGBF-4266-squash

            pkulzer Paige Kulzer added a comment -

            Tested on my Mac with IGB 10.2.0 following Dr. Freese's instructions above. Several genomes aren't appearing in IGB but that shouldn't be breaking anything. We believe these to be genomes added during the Ensembl integration.

            I just checked in these changes to SVN (revision 230), ready for PR!

            nfreese Nowlan Freese added a comment - PR: https://bitbucket.org/lorainelab/integrated-genome-browser/pull-requests/1081
            ann.loraine Ann Loraine added a comment - - edited

            PR is merged. New installers are built and deployed to bioviz.org.
            Ready for testing.

            pkulzer Paige Kulzer added a comment -

            Prior to testing, the SVN changes need to be deployed to IGB quickload. Ann Loraine, could you help me with that?

            ann.loraine Ann Loraine added a comment -

            Yes, I can help!

            I have a suggestion:

            I feel like we should create a "testing" quickload.

            I can check out a copy of the up-to-date Quickload main (with new files ready to go) onto RENCI hosting and then set up a new Web URL for it in DNS. You could then deactivate the production Quickload main in IGB, restart IGB (to force it to reload synonyms from the new file), and proceed with testing.

            cc:

            Paige Kulzer
            Nowlan Freese

            pkulzer Paige Kulzer added a comment -

            That sounds just fine, thank you! Let me know when it's complete and I will begin testing.

            ann.loraine Ann Loraine added a comment - - edited

            Update:

            I logged onto the RENCI host as usual, but discovered that the "svn" program had not been installed. So, I changed to the root user and installed svn by running "yum install subversion". (The host is running this OS: Red Hat Enterprise Linux 9 for x86_64.) This installed subversion version 1.14.1-5.el9_0.

            I then started the checkout of the main quickload repo with:

            svn co --username=guest https://svn.bioviz.org/repos/genomes/quickload

            (password "guest") as per the README file available from here: https://svn.bioviz.org/viewvc/genomes/quickload/HEADER.md?revision=180&view=markup

            The checkout is proceeding very slowly. I'm not sure what is making it so slow, however.

            When complete, the address of the new testing quickload site will be: http://igbquickload-main.bioviz.org/IGBF-4266/quickload/

            The location of the site in the VM's file system is: /projects/igbquickload/lorainelab/www/main/htdocs/IGBF-4266

            ann.loraine Ann Loraine added a comment -

            Checking space on the RENCI host:

            [aloraine@lorainelab-quickload IGBF-4266]$ df -h
            Filesystem                  Size  Used Avail Use% Mounted on
            devtmpfs                    4.0M     0  4.0M   0% /dev
            tmpfs                       3.8G  4.0K  3.8G   1% /dev/shm
            tmpfs                       1.5G   87M  1.5G   6% /run
            efivarfs                    256K   56K  196K  23% /sys/firmware/efi/efivars
            /dev/mapper/VGos-root       8.0G  2.1G  5.9G  27% /
            /dev/sda2                   960M  376M  585M  40% /boot
            /dev/sda1                   200M  7.1M  193M   4% /boot/efi
            /dev/mapper/VGos-tmp        8.0G   90M  7.9G   2% /tmp
            /dev/mapper/VGos-home       8.0G   90M  7.9G   2% /home
            /dev/mapper/VGos-var         16G  1.8G   15G  11% /var
            /dev/mapper/VGos-opt        8.0G  199M  7.8G   3% /opt
            /dev/mapper/VGos-log        4.0G  540M  3.5G  14% /var/log
            /dev/mapper/VGos-var_tmp    8.0G   90M  7.9G   2% /var/tmp
            /dev/mapper/VGos-audit      2.0G   83M  1.9G   5% /var/log/audit
            172.25.8.101:/igbquickload  2.9T  2.7T  243G  92% /projects/igbquickload
            tmpfs                       765M     0  765M   0% /run/user/4822
            
            ann.loraine Ann Loraine added a comment - - edited

            Sorry for the delay. The testing-only Quickload is now available at URL http://igbquickload-main.bioviz.org/IGBF-4266/quickload/

            cc: Paige Kulzer

            pkulzer Paige Kulzer added a comment -

            We've tested using the test Quickload in IGB 10.1.0 and 10.2.0. All genomes and sequences are loading as expected. Ann Loraine, it should be safe to deploy these changes to the main IGB Quickload for final testing now.

            ann.loraine Ann Loraine added a comment -

            Thank you Paige Kulzer and Nowlan Freese! I am proceeding to deploy the updates to RENCI and UNC Charlotte hosting.

            ann.loraine Ann Loraine added a comment - - edited

            Both Quickload main sites (primary and mirror) are updated with the new species.txt and synonyms.txt files.

            Ready for final testing.

            attn: Paige Kulzer and Nowlan Freese

            pkulzer Paige Kulzer added a comment -

            Completed final testing on Mac and found no issues. Closing ticket!


              People

              • Assignee: nfreese Nowlan Freese
              • Reporter: nfreese Nowlan Freese
              • Votes: 0
              • Watchers: 3
