Details
- Type: Task
- Status: Closed
- Priority: Major
- Resolution: Done
- Affects Version/s: None
- Fix Version/s: None
- Labels: None
- Story Points: 0.25
- Epic Link:
- Sprint: Summer 5 2022 July 18, Summer 6 2022 Aug 1, Fall 1 2022 Aug 15, Fall 2 2022 Sep 5, Fall 3 2022 Sep 26, Fall 4 2022 Oct 10, Fall 5 2022 Oct 24, Fall 6 2022 Nov 7
Description
Git repository for this sub-project: https://bitbucket.org/hotpollen/splicing-analysis/src/master/
Use the gene mapping table to add gene descriptions to the new BED-detail annotation file for the SL5.0 genome assembly release.
Specifically:
- Get the BED-detail gene model file from IGBQuickload repository: http://lorainelab-quickload.scidas.org/quickload/S_lycopersicum_Jun_2022/S_lycopersicum_Jun_2022.bed.gz
The 13th column of this file contains a gene identifier. The 4th column contains a transcript identifier. The final (14th) column contains "NA" for "Not Available." We would like it to contain a description of the gene instead. Since no descriptions are available yet for the SL5.0 genes, we would like to insert the descriptions of their counterpart genes from the SL4.0 annotations.
- Note that the SL4 annotations can be found here: http://lorainelab-quickload.scidas.org/quickload/S_lycopersicum_Sep_2019/S_lycopersicum_Sep_2019.bed.gz
To map genes from SL4 onto SL5, you need a mapping file. The people who made SL5.0 gave us that mapping file back in July 2022:
Dear Ann,
Yes, we do have tables for the conversion between different versions. They are now available on our website (http://solomics.agis.org.cn/tomato/ftp/ID_convert/).
Thanks so much for the integration! I would be happy to forward it to my colleagues when it is available.
Best regards,
Yao
Mapping files needed are also available in: /nobackup/tomato_genome/alt_splicing/mappingfiles
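The steps described above could be sketched roughly as follows in Python. This is only an illustration, not the code from the repository. The function names are hypothetical, and the layout of the mapping file is an assumption (two tab-separated columns, SL5 gene ID first and SL4 gene ID second); check the actual ID_convert tables before relying on it. The column positions (13th for gene ID, 14th for description) come from the description above.

```python
def sl4_descriptions(sl4_bed_lines):
    """Map each SL4 gene ID (column 13) to its description (column 14)."""
    descriptions = {}
    for line in sl4_bed_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 14:
            continue  # skip rows that are not BED-detail
        gene_id, description = fields[12], fields[13]
        if description != "NA":
            # Keep the first description seen for a gene.
            descriptions.setdefault(gene_id, description)
    return descriptions


def parse_id_map(map_lines):
    """Build a dict of SL5 gene ID -> SL4 gene ID.

    ASSUMPTION: each row is 'SL5_gene_id<TAB>SL4_gene_id'; lines
    starting with '#' are treated as comments.
    """
    id_map = {}
    for line in map_lines:
        if line.startswith("#"):
            continue
        parts = line.rstrip("\n").split("\t")
        if len(parts) >= 2:
            id_map[parts[0]] = parts[1]
    return id_map


def annotate_sl5(sl5_bed_lines, id_map, descriptions):
    """Yield SL5 BED-detail lines with column 14 filled in where possible.

    Rows whose gene has no SL4 counterpart, or whose counterpart has no
    description, keep their original 14th column (e.g. "NA").
    """
    for line in sl5_bed_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 14:
            sl4_gene = id_map.get(fields[12])
            fields[13] = descriptions.get(sl4_gene, fields[13])
        yield "\t".join(fields)
```

Working on iterables of lines rather than file paths keeps the sketch independent of where the files live; the lines could come from the downloaded .bed.gz files or from the copies on the cluster.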
Potentially useful code is available here:
Check out the code used to read the BED annotation files.
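For orientation only, reading a gzipped BED-detail file might look like the sketch below. This is a generic illustration and not the repository code referred to above; the function name `read_bed_detail` is hypothetical. It relies only on the column layout given in the description (4th column transcript ID, 13th gene ID, 14th description).

```python
import gzip


def read_bed_detail(path):
    """Yield (transcript_id, gene_id, description) from a gzipped BED-detail file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 14:
                continue  # skip rows without the two detail columns
            yield fields[3], fields[12], fields[13]
```

A call such as `list(read_bed_detail("S_lycopersicum_Jun_2022.bed.gz"))` would then give one tuple per transcript, assuming the file has been downloaded to the working directory.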