  1. IGB
  2. IGBF-4215

Mapping Soly IDs to KEGG by way of RefSeq / NCBI via BLAST

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:
      None

      Description

      The problem is that, as KEGG currently stands, we only have 4000 KEGG IDs that map to about 8700 SL4 genes. Since there are 34,000 genes, we are leaving many genes out of consideration.

      However, there is a KEGG API mapping file that associates NCBI IDs for tomato with KEGG IDs (see attached).

      It looks like this:
      100037489 NP_001234263
      100037490 NP_001234268
      100037491 NP_001234273
      100037492 NP_001234279
      100037493 NP_001234283
      100037494 NP_001234289
      100037498 NP_001234310
      100037495 NP_001234471
      100037496 XP_004230843
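
      A minimal sketch of reading this file into a lookup table. It assumes a whitespace-separated two-column file with the KEGG gene ID in column 1 and the NCBI protein accession in column 2; the meaning of column 1 is inferred from the sample above, and `parse_kegg_mapping` is a hypothetical helper name.

```python
def parse_kegg_mapping(lines):
    """Map NCBI protein accession (column 2) to KEGG gene ID (column 1).

    Assumption: two whitespace-separated columns per line, as in the sample.
    """
    ncbi_to_kegg = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        kegg_id, ncbi_id = fields[0], fields[1]
        ncbi_to_kegg[ncbi_id] = kegg_id
    return ncbi_to_kegg


# Rows copied from the sample above.
sample = """\
100037489 NP_001234263
100037490 NP_001234268
100037496 XP_004230843
"""

ncbi_to_kegg = parse_kegg_mapping(sample.splitlines())
ncbi_ids = sorted(ncbi_to_kegg)  # the column-2 accessions to fetch for BLAST
print(ncbi_ids)
```

      For the real file, the same function can be called on an open file handle, for example `parse_kegg_mapping(open("kegg-api-sly.txt"))`, assuming the attachment has the layout shown.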

      Ultimately we need a NORMALIZED salmon counts table with the KEGG IDs in the first column.
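
      A rough sketch of the target table layout, assuming the counts are already normalized (the normalization method is not specified in this ticket) and that a Soly-to-KEGG lookup is available. The gene IDs, sample names, and the `add_kegg_column` helper are hypothetical.

```python
import csv
import io


def add_kegg_column(rows, soly_to_kegg, missing="NA"):
    """Prepend a KEGG ID column to a counts table.

    `rows` is an iterable of lists; the first row is the header and the
    first field of every other row is the Soly gene ID (an assumption).
    """
    rows = list(rows)
    if not rows:
        return []
    header, body = rows[0], rows[1:]
    out = [["kegg_id"] + header]
    for row in body:
        # Genes without a KEGG mapping are kept and marked as missing.
        out.append([soly_to_kegg.get(row[0], missing)] + row)
    return out


# Hypothetical, already-normalized counts for two made-up gene IDs.
counts_tsv = "gene_id\tsample_A\tsample_B\ngeneX\t10.5\t12.0\ngeneY\t0.0\t3.2\n"
soly_to_kegg = {"geneX": "100037489"}

reader = csv.reader(io.StringIO(counts_tsv), delimiter="\t")
table = add_kegg_column(reader, soly_to_kegg)
for row in table:
    print("\t".join(row))
```

      Because several Soly genes can map to the same KEGG ID, this sketch keeps one row per gene rather than summing rows per KEGG ID; whether to aggregate is left open.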

      Steps:
      1. Download all of the NCBI IDs from column 2.
      2. BLAST the Soly genes against these IDs and make a new table with the top hit.
      3. Summarize how many hits and how many misses we get.
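
      The top-hit and hit/miss steps above could be sketched as follows. This assumes BLAST was run with tabular output (`-outfmt 6`, where column 1 is the query, column 2 the subject, and column 12 the bit score), that the Soly genes were the queries, and that "top hit" means the highest bit score. The query IDs and score values below are made up for illustration.

```python
import csv


def best_hits(blast_rows):
    """Keep the highest-bitscore subject for each query.

    Assumes BLAST tabular rows (outfmt 6): qseqid, sseqid, ..., bitscore.
    """
    best = {}
    for row in blast_rows:
        if len(row) < 12:
            continue  # skip malformed rows
        query, subject, bitscore = row[0], row[1], float(row[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


def count_hits_and_misses(all_queries, best):
    """Return (queries with a top hit, queries without any hit)."""
    hits = sum(1 for q in all_queries if q in best)
    return hits, len(all_queries) - hits


# Hypothetical BLAST rows; only the subject accessions come from the ticket.
blast_lines = [
    "soly_gene_1\tNP_001234263\t98.5\t200\t3\t0\t1\t200\t1\t200\t1e-100\t400.0",
    "soly_gene_1\tNP_001234268\t80.0\t180\t36\t0\t1\t180\t5\t184\t1e-50\t210.0",
    "soly_gene_2\tXP_004230843\t99.0\t150\t1\t0\t1\t150\t1\t150\t1e-80\t300.0",
]
all_queries = ["soly_gene_1", "soly_gene_2", "soly_gene_3"]

best = best_hits(csv.reader(blast_lines, delimiter="\t"))
hits, misses = count_hits_and_misses(all_queries, best)
print(f"hits={hits} misses={misses}")
```

      Misses can only be counted against the full list of query IDs, since queries with no alignment do not appear in the BLAST table at all.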

        Attachments

        1. kegg-api-sly.txt
          592 kB
        2. soly_mapping.tsv
          1021 kB
        3. best_hits_kegg.tsv
          1.43 MB

          Activity

          bbendick Brandon Bendickson added a comment -

          Ran BLAST and got the resulting table. Time to parse!

          bbendick Brandon Bendickson added a comment -

          Successfully mapped Soly IDs to NCBI IDs and finally to KEGG IDs. Attaching the map file to this comment: soly_mapping.tsv

          bbendick Brandon Bendickson added a comment -

          Of the 26,484 NCBI IDs, 23,288 were successfully mapped to Soly IDs and their respective KEGG IDs.

          bbendick Brandon Bendickson added a comment -

          Number of Soly IDs with 3 or more NCBI hits: 17847
          Number of Soly IDs with 2 NCBI hits: 3884
          Number of Soly IDs with 1 NCBI hit: 6074
          Number of Soly IDs with 0 NCBI hits: 0
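
          A breakdown like this can be computed along the following lines. This is only a sketch: it assumes the counts refer to distinct NCBI subjects per Soly ID, which the comment does not state, and the IDs and the `hit_count_buckets` helper are hypothetical.

```python
from collections import Counter


def hit_count_buckets(pairs, all_soly_ids):
    """Bucket Soly IDs by number of distinct NCBI hits: 0, 1, 2, or 3+."""
    subjects = {}
    for soly_id, ncbi_id in pairs:
        subjects.setdefault(soly_id, set()).add(ncbi_id)
    buckets = Counter()
    for soly_id in all_soly_ids:
        n = len(subjects.get(soly_id, ()))
        buckets["3+" if n >= 3 else str(n)] += 1
    return buckets


# Hypothetical (Soly ID, NCBI ID) pairs taken from a BLAST table.
pairs = [
    ("a", "N1"), ("a", "N2"), ("a", "N3"),
    ("b", "N1"),
    ("c", "N1"), ("c", "N2"),
]
all_soly_ids = ["a", "b", "c", "d"]

buckets = hit_count_buckets(pairs, all_soly_ids)
for label in ("3+", "2", "1", "0"):
    print(f"Number of Soly IDs with {label} NCBI hits: {buckets[label]}")
```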

          bbendick Brandon Bendickson added a comment -

          Attaching the file with the BLAST data.


            People

            • Assignee:
              bbendick Brandon Bendickson
              Reporter:
              robofjoy Robert Reid
