IGB / IGBF-3145

Transfer SL4.0 gene descriptions to SL5.0 annotations

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:
      None
    • Story Points:
      0.25
    • Sprint:
      Summer 5 2022 July 18, Summer 6 2022 Aug 1, Fall 1 2022 Aug 15, Fall 2 2022 Sep 5, Fall 3 2022 Sep 26, Fall 4 2022 Oct 10, Fall 5 2022 Oct 24, Fall 6 2022 Nov 7

      Description

      Git repository for this sub-project: https://bitbucket.org/hotpollen/splicing-analysis/src/master/

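      Use the gene mapping table to add gene descriptions to the new BED-detail annotation file for the SL5.0 genome assembly release.

      Specifically:

      • Get the BED-detail gene model file from the IGB Quickload repository: http://lorainelab-quickload.scidas.org/quickload/S_lycopersicum_Jun_2022/S_lycopersicum_Jun_2022.bed.gz

      The 13th column of this file contains a gene identifier. The 4th column contains a transcript identifier. The final (14th) column contains "NA" for "Not Available." We would like it to contain a description of the gene instead. Since we don't have descriptions for the SL5.0 genes, we would like to insert the descriptions of their counterpart genes in the SL4.0 annotations.

      • Note that the SL4 annotations can be found here: http://lorainelab-quickload.scidas.org/quickload/S_lycopersicum_Sep_2019/S_lycopersicum_Sep_2019.bed.gz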

      To map genes from SL4 onto SL5, you need a mapping file. The people who made SL5.0 gave us that mapping file back in July 2022:

      Dear Ann,
      Yes, we do have tables for the conversion between different versions. They are now available on our website (http://solomics.agis.org.cn/tomato/ftp/ID_convert/).
      Thanks so much for the integration! I would be happy to forward it to my colleagues when it is available.
      Best regards,
      Yao

      Mapping files needed are also available in: /nobackup/tomato_genome/alt_splicing/mappingfiles
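
The transfer described above can be sketched as a two-step lookup: SL5 gene identifier to SL4 gene identifier (from the mapping table), then SL4 gene identifier to description (from the SL4 annotations). This is a minimal sketch in Python, not the project's R Markdown code. It assumes that the SL4 BED-detail file uses the same layout (gene identifier in column 13, description in column 14) and that each SL5 gene maps to at most one SL4 gene. The function names and all identifiers in the demo rows are hypothetical.

```python
def sl4_descriptions_from_bed(sl4_rows):
    """Collect SL4 gene id -> description from BED-detail rows.

    Assumes column 13 holds the gene id and column 14 the description.
    """
    return {fields[12]: fields[13] for fields in sl4_rows}


def transfer_descriptions(sl5_rows, sl5_to_sl4, sl4_descriptions):
    """Return copies of SL5 rows with column 14 filled from SL4 descriptions.

    Rows whose gene has no SL4 counterpart, or whose counterpart has no
    usable description, keep the placeholder "NA".
    """
    updated = []
    for fields in sl5_rows:
        fields = list(fields)                    # do not modify the input rows
        sl4_gene = sl5_to_sl4.get(fields[12])    # column 13: SL5 gene id
        description = sl4_descriptions.get(sl4_gene)
        if description and description != "NA":
            fields[13] = description             # column 14: description
        updated.append(fields)
    return updated


# Hypothetical demo rows (14 tab-separated fields each in the real files).
sl4_rows = [
    ["chr1", "0", "100", "SL4_geneA.1", "0", "+", "0", "100", "0",
     "1", "100,", "0,", "SL4_geneA", "hypothetical protein kinase"],
]
sl5_rows = [
    ["chr1", "10", "110", "SL5_geneA.1", "0", "+", "10", "110", "0",
     "1", "100,", "0,", "SL5_geneA", "NA"],
    ["chr1", "500", "600", "SL5_geneB.1", "0", "-", "500", "600", "0",
     "1", "100,", "0,", "SL5_geneB", "NA"],
]
sl5_to_sl4 = {"SL5_geneA": "SL4_geneA"}          # SL5_geneB has no counterpart

result = transfer_descriptions(
    sl5_rows, sl5_to_sl4, sl4_descriptions_from_bed(sl4_rows)
)
```

If the real mapping table turns out to contain one-to-many relationships, the plain dictionary lookup would need to be replaced with a rule for choosing or combining descriptions.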

            Activity

            ann.loraine Ann Loraine created issue -
            ann.loraine Ann Loraine made changes -
            Field Original Value New Value
            Epic Link IGBF-2993 [ 21429 ]
            ann.loraine Ann Loraine made changes -
            Link This issue is blocked by IGBF-3135 [ IGBF-3135 ]
            ann.loraine Ann Loraine made changes -
            Sprint Summer 5 2022 July 18 [ 151 ] Summer 5 2022 July 18, Summer 6 2022 Aug 1 [ 151, 152 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Description
            Original Value: Use gene mapping table to add gene descriptions to new annotations bed-detail file for SL5.0 genome assembly release and annotations.
            New Value: Use gene mapping table to add gene descriptions to new annotations bed-detail file for SL5.0 genome assembly release and annotations.

            Specifically:

            * Get the BED-detail gene model file from IGBQuickload repository: http://lorainelab-quickload.scidas.org/quickload/S_lycopersicum_Jun_2022/S_lycopersicum_Jun_2022.bed.gz

            The 13th column of this file contains a gene identifier. The 4th column contains a transcript identifier. The final (14th) column contains "NA" for "Not Available." We would like it to instead contain a description of the gene. Since we don't have that, we would like to insert the description of gene counterparts present in the SL4.0 annotations.

            Note that the SL4 annotations can be found here: http://lorainelab-quickload.scidas.org/quickload/S_lycopersicum_Sep_2019/S_lycopersicum_Sep_2019.bed.gz
            ann.loraine Ann Loraine made changes -
            Description (repeats the previous value, with this text appended):

            To map genes from SL4 onto SL5, you need a mapping file. The people who made SL5.0 gave us that mapping file.
            ann.loraine Ann Loraine made changes -
            Description (repeats the previous value, with this text appended):

            Author Yao sent me this: (followed by Yao's email, which is quoted in full in the Description above)
            ann.loraine Ann Loraine made changes -
            Description (repeats the previous value, with two changes): the sentence introducing Yao's email now reads "The people who made SL5.0 gave us that mapping file back in July 2022:", and this line is appended:

            Mapping files needed are also available in: /nobackup/tomato_genome/alt_splicing/mappingfiles
            ann.loraine Ann Loraine made changes -
            Sprint Summer 5 2022 July 18, Summer 6 2022 Aug 1 [ 151, 152 ] Summer 5 2022 July 18, Summer 6 2022 Aug 1, Fall 1 2022 Aug 15 [ 151, 152, 153 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            Mdavis4290 Molly Davis made changes -
            Assignee Molly Davis [ molly ]
            ann.loraine Ann Loraine added a comment - - edited

            Potentially useful code is available here:

            • https://bitbucket.org/hotpollen/flavonoid-rnaseq/src/main/
            • https://bitbucket.org/hotpollen/rna-seq/src/master/

            Check out the code used to read the BED annotation files.
            ann.loraine Ann Loraine made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            ann.loraine Ann Loraine added a comment -

            In the "bed-detail" file from IGB Quickload:

            • Each row represents a transcript (also called a "gene model").
            • Field 4 is the transcript identifier, one per row.
            • Field 13 is the gene identifier.
            ann.loraine Ann Loraine made changes -
            Description (repeats the previous value, with this line added at the top):

            Git repository for this sub-project: https://bitbucket.org/hotpollen/splicing-analysis/src/master/
            ann.loraine Ann Loraine made changes -
            Sprint Summer 5 2022 July 18, Summer 6 2022 Aug 1, Fall 1 2022 Aug 15 [ 151, 152, 153 ] Summer 5 2022 July 18, Summer 6 2022 Aug 1, Fall 1 2022 Aug 15, Fall 2 2022 Sep 5 [ 151, 152, 153, 154 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            Mdavis4290 Molly Davis added a comment - - edited

            Needs reviewing:

            I committed new code to Bitbucket for the file 'MappingNames.Rmd'.

            https://bitbucket.org/mdavis4290/splicing-analysis/src/master/DescriptionMapping/MappingNames.Rmd

            [~aloraine]

            Currently working on code for checking the merge.
            Still learning git commands!
            ann.loraine Ann Loraine made changes -
            Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
            ann.loraine Ann Loraine made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            ann.loraine Ann Loraine made changes -
            Status First Level Review in Progress [ 10301 ] Ready for Pull Request [ 10304 ]
            ann.loraine Ann Loraine added a comment - - edited

            [~molly] - please submit PR from your fork's master branch to the team repository master branch. Use the Bitbucket interface; google "how to submit PR using bitbucket" for tips if required.

            nfreese Nowlan Freese made changes -
            Sprint Summer 5 2022 July 18, Summer 6 2022 Aug 1, Fall 1 2022 Aug 15, Fall 2 2022 Sep 5 [ 151, 152, 153, 154 ] Summer 5 2022 July 18, Summer 6 2022 Aug 1, Fall 1 2022 Aug 15, Fall 2 2022 Sep 5, Fall 3 2022 Sep 26 [ 151, 152, 153, 154, 155 ]
            nfreese Nowlan Freese made changes -
            Rank Ranked higher
            Mdavis4290 Molly Davis added a comment -

            Made a pull request for the master branch for file 'MappingNames.Rmd'.

            Notes include:

            • Added correct files with all of the data included.
            • Made write commands to save files to local machine.
            • Creates two files. The first one includes the exact columns Ann requested. The second one looks like the original bed file SL5 but now with the descriptions.

            [~aloraine]

            ann.loraine Ann Loraine made changes -
            Status Ready for Pull Request [ 10304 ] Pull Request Submitted [ 10101 ]
            ann.loraine Ann Loraine made changes -
            Status Pull Request Submitted [ 10101 ] Reviewing Pull Request [ 10303 ]
            ann.loraine Ann Loraine made changes -
            Status Reviewing Pull Request [ 10303 ] Merged Needs Testing [ 10002 ]
            ann.loraine Ann Loraine made changes -
            Assignee Molly Davis [ molly ]
            ann.loraine Ann Loraine made changes -
            Sprint Summer 5 2022 July 18, Summer 6 2022 Aug 1, Fall 1 2022 Aug 15, Fall 2 2022 Sep 5, Fall 3 2022 Sep 26 [ 151, 152, 153, 154, 155 ] Summer 5 2022 July 18, Summer 6 2022 Aug 1, Fall 1 2022 Aug 15, Fall 2 2022 Sep 5, Fall 3 2022 Sep 26, Fall 4 2022 Oct 10 [ 151, 152, 153, 154, 155, 156 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Status Merged Needs Testing [ 10002 ] Post-merge Testing In Progress [ 10003 ]
            ann.loraine Ann Loraine made changes -
            Assignee Ann Loraine [ aloraine ]
            ann.loraine Ann Loraine added a comment - - edited

            Testing "MappingNames.Rmd" in https://bitbucket.org/hotpollen/splicing-analysis/src/master/DescriptionMapping/.

            An issue:

            • Markdown fails to compile (knit) due to files not being present on my system. Paths are hard-coded to a location not on my computer:

            SL5.fname="/Users/mollydavis333/Desktop/S_lycopersicum_Jun_2022.bed"
            SL4.fname="/Users/mollydavis333/Desktop/S_lycopersicum_Sep_2019.bed"
            ...
            mapping.fname="/Users/mollydavis333/Desktop/SL4-SL5_covert.tsv"

            This can be fixed by directly accessing the files using relative paths as these are version controlled in splicing_analysis/ExternalData.
            R can open "gzip'd" files. No need to uncompress them first. Google "read compressed data file into R" or something to see how this can be done.

            attn: [~molly]
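
The two suggestions above (relative paths, and reading the compressed files directly) can be illustrated with a short sketch. It is written in Python rather than the R used in 'MappingNames.Rmd'; in R, the base function gzfile() serves the same purpose. The directory layout, with the script in DescriptionMapping/ and the data in a sibling ExternalData/ directory, and the exact file name under ExternalData are assumptions, and read_bed_gz is a hypothetical helper name.

```python
import gzip
import tempfile
from pathlib import Path

# Assumed layout: this script sits in DescriptionMapping/ and the
# version-controlled data files sit in a sibling ExternalData/ directory.
DATA_DIR = Path("..") / "ExternalData"
SL5_BED = DATA_DIR / "S_lycopersicum_Jun_2022.bed.gz"   # assumed file name


def read_bed_gz(path):
    """Read a gzip-compressed, tab-delimited file without unpacking it first."""
    with gzip.open(path, "rt") as handle:
        return [line.rstrip("\n").split("\t") for line in handle if line.strip()]


# Demonstration on a throwaway file, so the sketch runs on any machine.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "demo.bed.gz"
    with gzip.open(demo_path, "wt") as handle:
        handle.write("chr1\t100\t200\ttranscript1\n")
    demo_rows = read_bed_gz(demo_path)
```

Because SL5_BED is relative to the repository rather than to one person's Desktop, the same code would work for anyone who clones the repository, provided the layout assumption holds.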

            ann.loraine Ann Loraine made changes -
            Status Post-merge Testing In Progress [ 10003 ] To-Do [ 10305 ]
            ann.loraine Ann Loraine made changes -
            Assignee Ann Loraine [ aloraine ] Molly Davis [ molly ]
            nfreese Nowlan Freese made changes -
            Sprint Summer 5 2022 July 18, Summer 6 2022 Aug 1, Fall 1 2022 Aug 15, Fall 2 2022 Sep 5, Fall 3 2022 Sep 26, Fall 4 2022 Oct 10 [ 151, 152, 153, 154, 155, 156 ] Summer 5 2022 July 18, Summer 6 2022 Aug 1, Fall 1 2022 Aug 15, Fall 2 2022 Sep 5, Fall 3 2022 Sep 26, Fall 4 2022 Oct 10, Fall 5 2022 Oct 24 [ 151, 152, 153, 154, 155, 156, 157 ]
            nfreese Nowlan Freese made changes -
            Rank Ranked higher
            Mdavis4290 Molly Davis added a comment -

            Link to the updated mapping file on Bitbucket, under Molly's fork:

            https://bitbucket.org/mdavis4290/splicing-analysis/src/master/DescriptionMapping/MappingNames.Rmd

            Not sure why there is a conflict. I have committed my changes, and when I try 'git merge remotes/origin/master' it says that it is already up to date.

            Will need help making the pull request with the conflict in the way.

            [~aloraine]

            ann.loraine Ann Loraine made changes -
            Sprint Summer 5 2022 July 18, Summer 6 2022 Aug 1, Fall 1 2022 Aug 15, Fall 2 2022 Sep 5, Fall 3 2022 Sep 26, Fall 4 2022 Oct 10, Fall 5 2022 Oct 24 [ 151, 152, 153, 154, 155, 156, 157 ] Summer 5 2022 July 18, Summer 6 2022 Aug 1, Fall 1 2022 Aug 15, Fall 2 2022 Sep 5, Fall 3 2022 Sep 26, Fall 4 2022 Oct 10, Fall 5 2022 Oct 24, Fall 6 2022 Nov 7 [ 151, 152, 153, 154, 155, 156, 157, 158 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine added a comment -

            PR is merged and code is ready for review and testing.

            ann.loraine Ann Loraine made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            ann.loraine Ann Loraine made changes -
            Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
            ann.loraine Ann Loraine made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            ann.loraine Ann Loraine made changes -
            Status First Level Review in Progress [ 10301 ] Ready for Pull Request [ 10304 ]
            ann.loraine Ann Loraine made changes -
            Status Ready for Pull Request [ 10304 ] Pull Request Submitted [ 10101 ]
            ann.loraine Ann Loraine made changes -
            Status Pull Request Submitted [ 10101 ] Reviewing Pull Request [ 10303 ]
            ann.loraine Ann Loraine made changes -
            Assignee Molly Davis [ molly ]
            ann.loraine Ann Loraine made changes -
            Status Reviewing Pull Request [ 10303 ] Merged Needs Testing [ 10002 ]
            ann.loraine Ann Loraine made changes -
            Status Merged Needs Testing [ 10002 ] Post-merge Testing In Progress [ 10003 ]
            ann.loraine Ann Loraine made changes -
            Assignee Ann Loraine [ aloraine ]
            ann.loraine Ann Loraine added a comment -

            The .Rmd file can't "knit".
            I think this is due to the final line of one of the code chunks in which a large data frame is being printed to the document.
            So I ran the file interactively, chunk by chunk.
            This resulted in creating two data files, one with identifiers and another that tries to be a BED file.
            However, when I opened the BED file in IGB, I got an error:

            09:03:26.741 ERROR c.a.igb.view.load.GeneralLoadUtils - Error in loadOnSequence
            java.lang.NumberFormatException: For input string: ""270"
            at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) ~[na:1.8.0_332]
            at java.lang.Integer.parseInt(Integer.java:569) ~[na:1.8.0_332]
            at java.lang.Integer.parseInt(Integer.java:615) ~[na:1.8.0_332]
            at com.affymetrix.genometry.parsers.BedParser.parseIntArray(BedParser.java:414) ~[genometry-9.1.10.jar:na]
            at org.lorainelab.igb.bed.BedSymloader.parseLine(BedSymloader.java:310) ~[na:na]
            at org.lorainelab.igb.bed.BedSymloader.parseLines(BedSymloader.java:184) ~[na:na]
            at org.lorainelab.igb.bed.BedSymloader.parse(BedSymloader.java:170) ~[na:na]
            at org.lorainelab.igb.bed.BedSymloader.parse(BedSymloader.java:146) ~[na:na]
            at org.lorainelab.igb.bed.BedSymloader.getRegion(BedSymloader.java:139) ~[na:na]
            at com.affymetrix.genometry.quickload.QuickLoadSymLoader.getRegion(QuickLoadSymLoader.java:287) ~[genometry-9.1.10.jar:na]
            at com.affymetrix.genometry.quickload.QuickLoadSymLoader.loadAndAddSymmetries(QuickLoadSymLoader.java:164) ~[genometry-9.1.10.jar:na]
            at com.affymetrix.genometry.quickload.QuickLoadSymLoader.loadSymmetriesThread(QuickLoadSymLoader.java:139) ~[genometry-9.1.10.jar:na]
            at com.affymetrix.genometry.quickload.QuickLoadSymLoader.loadFeatures(QuickLoadSymLoader.java:119) ~[genometry-9.1.10.jar:na]
            at com.affymetrix.igb.view.load.GeneralLoadUtils.loadFeaturesForSym(GeneralLoadUtils.java:749) ~[igb-9.1.10.jar:na]
            at com.affymetrix.igb.view.load.GeneralLoadUtils$1.loadOnSequence(GeneralLoadUtils.java:664) [igb-9.1.10.jar:na]
            at com.affymetrix.igb.view.load.GeneralLoadUtils$1.lambda$multiThreadedLoad$178(GeneralLoadUtils.java:607) [igb-9.1.10.jar:na]
            at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_332]
            at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_332]
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[na:1.8.0_332]
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[na:1.8.0_332]
            at java.lang.Thread.run(Thread.java:750) ~[na:1.8.0_332]

            I looked at the BED file in Terminal and observed that many of the fields have quotation marks around them.
            This is likely the cause of the error, as IGB reports failing with a NumberFormatException when attempting to parse "270 (note the leading quotation mark).

            The table output command needs to indicate that no quotation marks will be used.
            Moving back to "To-Do"

            Next steps:

            • Modify the output file format to not wrap any of the fields with quotation marks.
            • Modify the code chunk to no longer print out an entire data frame to the final knitted document
            • Ensure that the file can "knit" without error
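
The first next step can be illustrated with a small sketch of why quoted fields break a BED parser. It is in Python rather than R, and it only imitates the failure; it is not IGB's parser. The demo row and the function names are hypothetical. If the R code writes the file with write.table, the corresponding change would be to pass quote = FALSE.

```python
import csv
import io

# Hypothetical 14-field BED-detail row; field 11 holds the block sizes.
row = ["chr1", "100", "500", "transcript1", "0", "+", "100", "500", "0",
       "2", "270,90,", "0,310,", "gene1", "NA"]


def write_quoted(rows):
    """Write tab-delimited text with every field wrapped in quotation marks."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", quoting=csv.QUOTE_ALL,
                        lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


def write_plain(rows):
    """Write tab-delimited text with no quotation marks at all."""
    return "".join("\t".join(fields) + "\n" for fields in rows)


def parse_block_sizes(line):
    """Parse the comma-separated block sizes (field 11) as integers."""
    field = line.rstrip("\n").split("\t")[10]
    return [int(size) for size in field.split(",") if size]


quoted = write_quoted([row])
plain = write_plain([row])

plain_sizes = parse_block_sizes(plain)     # parses cleanly

try:
    parse_block_sizes(quoted)              # first item is the string '"270'
    quoted_failed = False
except ValueError:
    quoted_failed = True
```

In the quoted output the block-size field starts with a quotation mark, so the first item handed to the integer parser is "270 with the mark attached, which is the same string the IGB stack trace reports.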
            ann.loraine Ann Loraine made changes -
            Status Post-merge Testing In Progress [ 10003 ] To-Do [ 10305 ]
            ann.loraine Ann Loraine made changes -
            Assignee Ann Loraine [ aloraine ] Molly Davis [ molly ]
            ann.loraine Ann Loraine made changes -
            Link This issue relates to IGBF-3229 [ IGBF-3229 ]
            ann.loraine Ann Loraine added a comment -

            Moving the above request changes into a new ticket and closing this one.

            ann.loraine Ann Loraine made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            ann.loraine Ann Loraine made changes -
            Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
            ann.loraine Ann Loraine made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            ann.loraine Ann Loraine made changes -
            Status First Level Review in Progress [ 10301 ] Ready for Pull Request [ 10304 ]
            ann.loraine Ann Loraine made changes -
            Status Ready for Pull Request [ 10304 ] Pull Request Submitted [ 10101 ]
            ann.loraine Ann Loraine made changes -
            Status Pull Request Submitted [ 10101 ] Reviewing Pull Request [ 10303 ]
            ann.loraine Ann Loraine made changes -
            Status Reviewing Pull Request [ 10303 ] Merged Needs Testing [ 10002 ]
            ann.loraine Ann Loraine made changes -
            Status Merged Needs Testing [ 10002 ] Post-merge Testing In Progress [ 10003 ]
            ann.loraine Ann Loraine made changes -
            Resolution Done [ 10000 ]
            Status Post-merge Testing In Progress [ 10003 ] Closed [ 6 ]
            ann.loraine Ann Loraine added a comment - - edited

            Update:

            • Aligned SL4 gene models onto SL5 genome assembly using blat.sh:
            #!/bin/bash
            G=S_lycopersicum_Jun_2022
            D=$G.2bit
            Q=S_lycopersicum_Sep_2019_models_cDNA.fa
            PSL=SL42SL5.psl
            MI=15000
            blat -noTrimA -maxIntron=$MI -noHead -minIdentity=95 -dots=100 $D $Q $PSL
            
            • Sorted and tabix-indexed PSL output file with:
            sort -k14,14 -k16,16n SL42SL5.psl | bgzip -c > SL42SL5.psl.gz
            tabix -s 14 -b 16 -e 17 SL42SL5.psl.gz
            
            • Added output and code to repository - bitbucket.org/hotpollen/splicing-analysis.git
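            The sort and tabix commands above key on PSL columns 14 (target name), 16 (target start), and 17 (target end). As a hedged illustration, the sketch below uses sort -c to check that a PSL-like file is in that order before indexing. The records are invented placeholders, and the sketch sets an explicit tab delimiter, which the original sort command did not.

```shell
#!/bin/bash
# Sketch: confirm a PSL file is ordered by target name (column 14) and
# target start (column 16), the order used for tabix indexing.
# The alignments below are invented placeholders.
set -euo pipefail

workdir=$(mktemp -d)
psl="$workdir/demo.psl"
sorted_psl="$workdir/demo.sorted.psl"

# Emit one 21-column PSL-like line; only columns 14, 16, 17 matter here.
make_line() {  # args: tName tStart tEnd
  printf '100\t0\t0\t0\t0\t0\t0\t0\t+\tquery\t100\t0\t100\t%s\t1000\t%s\t%s\t1\t100,\t0,\t%s,\n' \
    "$1" "$2" "$3" "$2"
}

{
  make_line chr2 500 600
  make_line chr1 900 1000
  make_line chr1 100 200
} > "$psl"

# sort -c exits non-zero when the input is not already in key order.
is_sorted() {
  sort -c -t $'\t' -k14,14 -k16,16n "$1" 2>/dev/null
}

if is_sorted "$psl"; then before=sorted; else before=unsorted; fi

sort -t $'\t' -k14,14 -k16,16n "$psl" > "$sorted_psl"

if is_sorted "$sorted_psl"; then after=sorted; else after=unsorted; fi
echo "before: $before, after: $after"
```

            Running the same check on the real SL42SL5.psl before compressing it would catch an unsorted file early, rather than at indexing time.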
            ann.loraine Ann Loraine added a comment -

            Reference: https://bitbucket.org/lorainelab/affyprobesetsforigb/src/master/ (documents how to make a tabix-indexed file from PSL blat output)

            ann.loraine Ann Loraine added a comment -

            Update:

            Modifying SL5 description field to include SL4 locus identifier, for example:

            hexokinase-1 protein (AHRD V3.3 *** AT1G05205.1)

            becomes:

            hexokinase-1 protein (AHRD V3.3 *** AT1G05205.1) ITAG4.0:Solyc07g052420.3
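            As a rough illustration of this change, the sketch below fills the 14th (description) column of a bed-detail file from a mapping table and appends the SL4 locus identifier with an ITAG4.0: prefix. It is not the code in the repository. The three-column mapping layout, the file names, and the SL5 gene identifiers (SL5_GENE_A, SL5_GENE_B) are assumptions made for the example; only the description and SL4 identifier come from the example above.

```shell
#!/bin/bash
# Sketch: replace "NA" in column 14 with "<SL4 description> ITAG4.0:<SL4 id>".
# Hypothetical mapping layout: SL5 gene id <TAB> SL4 locus id <TAB> SL4 description.
set -euo pipefail

workdir=$(mktemp -d)
map="$workdir/demo_map.tsv"
bed="$workdir/demo.bed"
out="$workdir/demo_described.bed"

printf 'SL5_GENE_A\tSolyc07g052420.3\thexokinase-1 protein (AHRD V3.3 *** AT1G05205.1)\n' > "$map"

# Two invented 14-column bed-detail records: column 13 is the gene id,
# column 14 is the description placeholder.
printf 'chr7\t100\t500\tSL5_GENE_A.1\t0\t+\t100\t500\t0\t1\t400,\t0,\tSL5_GENE_A\tNA\n' > "$bed"
printf 'chr7\t900\t1200\tSL5_GENE_B.1\t0\t-\t900\t1200\t0\t1\t300,\t0,\tSL5_GENE_B\tNA\n' >> "$bed"

# First file: build a lookup keyed on SL5 gene id.
# Second file: overwrite column 14 only when the gene has a mapping.
awk -F '\t' -v OFS='\t' '
  NR == FNR     { desc[$1] = $3 " ITAG4.0:" $2; next }
  ($13 in desc) { $14 = desc[$13] }
                { print }
' "$map" "$bed" > "$out"

described=$(awk -F '\t' '$13 == "SL5_GENE_A" { print $14 }' "$out")
unmapped=$(awk -F '\t' '$13 == "SL5_GENE_B" { print $14 }' "$out")
echo "$described"
```

            Genes without an SL4 counterpart keep "NA" in this sketch, so they remain easy to find afterward.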

            ann.loraine Ann Loraine added a comment - - edited

            Added new file to svn repository after creating tabix-indexed file with:

            sort -k1,1 -k2,2n ~/src/splicing-analysis2/DescriptionMapping/output/S_lycopersicum_Jun_2022.bed | bgzip > S_lycopersicum_Jun_2022.bed.gz 
            tabix -s 1 -b 2 -e 3 S_lycopersicum_Jun_2022.bed.gz
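            Before sorting and indexing, it may help to check that every record has the expected 14 tab-separated fields and plain integer start and end coordinates, since a stray quotation mark or a missing description column is otherwise only noticed when IGB fails to load the file. The sketch below shows one way to do this with awk; the file name and records are invented for the example.

```shell
#!/bin/bash
# Sketch: count bed-detail records that lack 14 fields or have
# non-integer start/end coordinates (columns 2 and 3).
# The records below are invented for the demonstration.
set -euo pipefail

workdir=$(mktemp -d)
bed="$workdir/demo.bed"

# Record 1 is well formed. Record 2 has a quoted start coordinate
# and is missing the description column.
printf 'chr1\t100\t500\ttx1\t0\t+\t100\t500\t0\t1\t400,\t0,\tgene1\tNA\n' > "$bed"
printf 'chr1\t"900"\t1200\ttx2\t0\t-\t900\t1200\t0\t1\t300,\t0,\tgene2\n' >> "$bed"

bad_count=$(awk -F '\t' '
  NF != 14 || $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ { n++ }
  END { print n + 0 }
' "$bed")

first_bad=$(awk -F '\t' '
  NF != 14 || $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ { print NR; exit }
' "$bed")

echo "malformed records: $bad_count (first at line ${first_bad:-none})"
```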
            

            svn repo info:

            Browse by visiting https://svn.bioviz.org/viewvc/

            svn repo URLs:
            URL: https://svn.bioviz.org/repos/genomes/quickload/S_lycopersicum_Jun_2022
            Repository Root: https://svn.bioviz.org/repos/genomes

            To check out the repo as the read-only user, enter user name "guest" and password "guest".
            ann.loraine Ann Loraine added a comment - - edited

            Updated quickload sites on RENCI and UNCC hosting.

            At UNCC, logged in with:

            ssh -J aloraine@cci-jump.uncc.edu -p 1657 aloraine@igbquickload.org
            

            At RENCI, logged in with:

            ssh -J aloraine@hop.renci.org aloraine@lorainelab-quickload.scidas.org
            

              People

              • Assignee:
                Mdavis4290 Molly Davis
                Reporter:
                ann.loraine Ann Loraine