  1. IGB
  2. IGBF-4206

Add jojoba (Simmondsia chinensis) genome to IGB

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels: None

      Description

      Situation: Dr. Kent Chapman, a professor at the University of North Texas, attended my talk at the SS-ASPB meeting earlier this year. After the conference, he followed up to request that the jojoba genome be added to IGB. We recently integrated GenArk, a genome-rich data source, with IGB, but GenArk doesn't host this genome yet either.

      Task: Add the jojoba (Simmondsia chinensis) genome to IGB. Kent co-authored a paper describing this new genome assembly, linked below. The genome assembly and metadata have been deposited in the Beijing Institute of Genomics (BIG) Data Center under accession no. GWHAASQ00000000.

      Link to the paper: https://doi.org/10.1126/sciadv.aay3240
      Link to the genome/annotations: https://ngdc.cncb.ac.cn/gwh/Assembly/486/show

        Attachments

          Activity

          pkulzer Paige Kulzer created issue -
          pkulzer Paige Kulzer made changes -
          Field Original Value New Value
          Epic Link IGBF-1765 [ 17855 ]
          pkulzer Paige Kulzer added a comment -

          Kent's email from March 30th, 2025:

          Paige, Thank you again for your excellent presentation about your integrated genomics resources. If you could include jojoba (Simmondsia chinensis) in your database and web interface, that might be useful.

          I think everything is referenced here in this paper and/or accessible online or through supplemental materials. If you need something else, let me know.

          https://www.science.org/doi/10.1126/sciadv.aay3240

          Best,
          Kent

          Kent D. Chapman, Ph.D.
          Regents Professor of Biochemistry
          Member, BioDiscovery Institute

          University of North Texas
          Department of Biological Sciences
          1155 Union Circle #305220
          Denton, TX 76203-5017
          +1-940-565-2969 (office)
          +1-940-300-6961 (cell)
          https://bdi.unt.edu/kent-chapman

          pkulzer Paige Kulzer made changes -
          Rank Ranked higher
          pkulzer Paige Kulzer made changes -
          Sprint Fall 4 [ 227 ]
          pkulzer Paige Kulzer made changes -
          Status To-Do [ 10305 ] In Progress [ 3 ]
          pkulzer Paige Kulzer added a comment -

          Below is an outline of the steps I followed to create the Simmondsia chinensis Quickload:
          1. Convert genome FASTA to .2bit

          ./faToTwoBit GWHAASQ00000000.genome.fasta S_chinensis_Apr_2019.2bit
          

          2. Create genome.txt

          ./twoBitInfo S_chinensis_Apr_2019.2bit genome.txt 
          

          3. Convert gene models from .gff to .bed

          ~/Documents/Repos/genomesource/gff3ToBedDetail.py -g GWHAASQ00000000.gff -b S_chinensis_Apr_2019.bed
          

          4. Sort, bgzip, and tabix-index the .bed file

          sort -k1,1 -k2,2n S_chinensis_Apr_2019.bed | bgzip > S_chinensis_Apr_2019.bed.gz
          tabix -0 -s 1 -b 2 -e 3 S_chinensis_Apr_2019.bed.gz
          

          5. Sanity check the .bed and .2bit files - Add the .2bit file as a reference, then drag/drop the .bed file into IGB. Confirm that gene models are present, labeled correctly, and the chromosomes listed are in a logical order. Also check that no error messages are present in the Log.

          6. Create annots.xml and HEADER.md, then add S_chinensis to contents.txt, .htaccess, synonyms.txt, and species.txt

          mkdir S_chinensis_Apr_2019
          cp V_cardui_Feb_2021/annots.xml S_chinensis_Apr_2019
          cp H_vulgaris_Apr_2024/HEADER.md S_chinensis_Apr_2019
          nano contents.txt 
          nano .htaccess 
          nano synonyms.txt 
          nano species.txt 
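
Part of the step 5 sanity check can also be scripted. The following is a minimal sketch, not part of the original workflow: it lists any sequence name used in the BED file that is missing from genome.txt. The function name `check_bed_chroms` is hypothetical, and the sketch assumes genome.txt is the tab-separated name/size output of twoBitInfo and that the BED file is tab-separated.

```shell
#!/usr/bin/env bash
# Report sequence names used in a BED file that are missing from genome.txt.
# Usage: check_bed_chroms genome.txt annotations.bed[.gz]
# Prints each missing name once and returns non-zero if any were found.
check_bed_chroms() {
  local genome="$1" bed="$2" reader="cat"
  # bgzip output is gzip-compatible, so gzip -dc can read a .bed.gz file.
  case "$bed" in *.gz) reader="gzip -dc" ;; esac
  $reader "$bed" | awk -F'\t' '
    NR == FNR { known[$1] = 1; next }   # first input: names from genome.txt
    /^#/ { next }                       # skip comment lines in the BED input
    !($1 in known) && !seen[$1]++ { print $1; missing = 1 }
    END { exit (missing ? 1 : 0) }
  ' "$genome" -
}
```

For this Quickload the call would be `check_bed_chroms genome.txt S_chinensis_Apr_2019.bed.gz`. An empty result means every gene model sits on a sequence that IGB knows about. It does not replace the visual check in IGB.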
          
          pkulzer Paige Kulzer made changes -
          Description The link line was changed from "*Link to the database:* https://ngdc.cncb.ac.cn/gwh/" to "*Link to the genome/annotations:* https://ngdc.cncb.ac.cn/gwh/Assembly/486/show". The *Situation* and *Task* paragraphs are unchanged.
          pkulzer Paige Kulzer added a comment -

          I've placed a zipped version of the new quickload folder in Google Drive for the reviewer to take a look at:
          Path: research-big-lorainelab > IGB Project Documentation and Plans > IGB Genomes > S_chinensis.zip
          Link: https://drive.google.com/drive/folders/1bFRx4PqldxNf400n7Vr9SD_dNeNmtpvk?usp=drive_link

          Ready for review! Please note that I'm not confident in the annots.xml file as this is the first time I've added a genome that's not from NCBI or UCSC.

          pkulzer Paige Kulzer made changes -
          Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
          pkulzer Paige Kulzer made changes -
          Assignee Paige Kulzer [ pkulzer ] Nowlan Freese [ nfreese ]
          pkulzer Paige Kulzer added a comment - - edited

          I attempted to add more information to the BED file by converting it to BED14 format with the ucscToBedDetail.py script from the GenomeSource repo. Here's the command I used:

          ~/Documents/Repos/genomesource/ucscToBedDetail.py -a ../../3999.gene2accession.txt -g ../../3999.gene_info.txt ~/Documents/Repos/quickload/S_chinensis_Apr_2019/S_chinensis_Apr_2019.bed.gz S_chinensis.bed
          

          At first I hit an error that looked like a bug in the script, but it turned out to be a Python version issue. To switch to an earlier version of Python, I used pyenv, a Python version manager available through Homebrew:

          pyenv global 2.7.18
          

          Then I was able to run the script without issue.

          No additional information was available for the BED file, so the .zip file above should still contain the right files to be tested.
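
The Python version problem described above can be caught before the conversion script runs. This is a minimal sketch, assuming (as the comment implies but does not state outright) that ucscToBedDetail.py needs a Python 2 interpreter. The function name `is_python2` is hypothetical.

```shell
#!/usr/bin/env bash
# Succeed only if the given interpreter (default: python) reports a 2.x version.
# Python 2 prints its version to stderr, so stderr is merged into stdout.
is_python2() {
  local version
  version=$("${1:-python}" --version 2>&1) || return 1
  case "$version" in
    "Python 2."*) return 0 ;;
    *) echo "Expected Python 2.x but found: $version" >&2
       return 1 ;;
  esac
}
```

A call such as `is_python2 && ~/Documents/Repos/genomesource/ucscToBedDetail.py ...` would then skip the script under the wrong interpreter. As an alternative to `pyenv global`, pyenv also offers `pyenv local 2.7.18`, which writes a `.python-version` file in the current directory, and `pyenv shell 2.7.18`, which applies only to the current shell session. Either leaves the default interpreter for other projects untouched.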

          nfreese Nowlan Freese added a comment -

          Testing the S. chinensis Quickload found at the Google Drive link.

          The only issue I found is that the annots.xml "name" field for the reference sequence file points to a gzipped FASTA file on the web, and I was unable to load the sequence in IGB. It should point to the 2bit file.

          Synonyms/species look good.
          Gene models look good.
          Header markdown looks good.
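
The annots.xml problem found in this review could be caught with a quick check before packaging. The following is a rough sketch under stated assumptions: it assumes the reference sequence is declared as a single-line `<file>` element carrying `reference="true"` alongside its `name` attribute. That layout is not shown in this ticket and should be confirmed against a working annots.xml. The function name `check_reference_is_2bit` is hypothetical.

```shell
#!/usr/bin/env bash
# Verify that every reference entry in an annots.xml file names a .2bit file.
# ASSUMPTION: each <file .../> element sits on one line and the reference
# sequence entry is marked with reference="true".
check_reference_is_2bit() {
  local annots="$1" names
  # Extract the name="..." value from each line flagged as the reference.
  names=$(grep 'reference="true"' "$annots" | sed -n 's/.*name="\([^"]*\)".*/\1/p')
  if [ -z "$names" ]; then
    echo "No reference entry found in $annots" >&2
    return 1
  fi
  # Fail if any extracted name does not end in .2bit.
  if printf '%s\n' "$names" | grep -qv '\.2bit$'; then
    echo "Reference entry does not point at a .2bit file: $names" >&2
    return 1
  fi
}
```

Running `check_reference_is_2bit S_chinensis_Apr_2019/annots.xml` would flag a reference entry that names a gzipped FASTA URL. It only inspects the file name, so loading the sequence in IGB remains the real test.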

          nfreese Nowlan Freese made changes -
          Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
          nfreese Nowlan Freese made changes -
          Status First Level Review in Progress [ 10301 ] To-Do [ 10305 ]
          nfreese Nowlan Freese made changes -
          Assignee Nowlan Freese [ nfreese ] Paige Kulzer [ pkulzer ]
          pkulzer Paige Kulzer made changes -
          Status To-Do [ 10305 ] In Progress [ 3 ]
          pkulzer Paige Kulzer added a comment -

          I've updated the annots.xml file locally and tested that change. I'm able to load sequence in IGB now.

          Ann Loraine, could you please restart the svn server? I will check in my changes once that's done.

          ann.loraine Ann Loraine made changes -
          Sprint Fall 4 [ 227 ] Fall 4, Fall 5 [ 227, 228 ]
          ann.loraine Ann Loraine made changes -
          Rank Ranked higher
          pkulzer Paige Kulzer added a comment -

          The Simmondsia chinensis genome has been pushed to the SVN repo.

          Ready for final review!

          pkulzer Paige Kulzer made changes -
          Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
          pkulzer Paige Kulzer made changes -
          Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
          pkulzer Paige Kulzer made changes -
          Status First Level Review in Progress [ 10301 ] Ready for Pull Request [ 10304 ]
          pkulzer Paige Kulzer made changes -
          Status Ready for Pull Request [ 10304 ] Pull Request Submitted [ 10101 ]
          pkulzer Paige Kulzer made changes -
          Assignee Paige Kulzer [ pkulzer ]
          nfreese Nowlan Freese made changes -
          Status Pull Request Submitted [ 10101 ] Reviewing Pull Request [ 10303 ]
          nfreese Nowlan Freese made changes -
          Status Reviewing Pull Request [ 10303 ] Merged Needs Testing [ 10002 ]
          nfreese Nowlan Freese made changes -
          Status Merged Needs Testing [ 10002 ] Post-merge Testing In Progress [ 10003 ]
          nfreese Nowlan Freese added a comment - - edited

          Looks good, able to load annotations and sequence. Synonyms show up correctly.

          Ann Loraine - Ready for deployment to Quickload servers.

          nfreese Nowlan Freese made changes -
          Status Post-merge Testing In Progress [ 10003 ] Merged Needs Testing [ 10002 ]
          nfreese Nowlan Freese made changes -
          Assignee Ann Loraine [ aloraine ]
          ann.loraine Ann Loraine added a comment -

          Updated data is deployed to Quickload servers hosted at RENCI and UNC Charlotte.
          Ready for final testing.

          ann.loraine Ann Loraine made changes -
          Assignee Ann Loraine [ aloraine ]
          nfreese Nowlan Freese added a comment -

          Tested on IGB 10.1.0 release. Able to load annotations and sequence. Synonyms appear correctly. Linkout goes to Simmondsia chinensis quickload page.

          Closing ticket.

          nfreese Nowlan Freese made changes -
          Assignee Paige Kulzer [ pkulzer ]
          nfreese Nowlan Freese made changes -
          Status Merged Needs Testing [ 10002 ] Post-merge Testing In Progress [ 10003 ]
          nfreese Nowlan Freese made changes -
          Resolution Done [ 10000 ]
          Status Post-merge Testing In Progress [ 10003 ] Closed [ 6 ]

            People

            • Assignee:
              pkulzer Paige Kulzer
              Reporter:
              pkulzer Paige Kulzer
            • Votes: 0
            • Watchers: 3
