IGB / IGBF-3741

Align Trinity Contigs to SL4 & SL5 via BLAT

    Details

    • Type: Task
    • Status: Closed
    • Priority: Minor
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels: None

      Description

      The goal is to visualize how the BLAT alignments look.
      To do this, we need to align the contigs to the SL4 and SL5 genomes (not to the CDS sequences).

      But before we do this, we want to implement Ann's suggestions for the blat parameters (e.g., max intron size: 10K).

      The current blat command being used is this:

      blat /projects/tomato_genome/db/SL5/SL5.cds.fa \
      /dev/null /dev/null -tileSize=11 -makeOoc=${file}.11.ooc -repMatch=1024

      The script will point to the proper reference genome files.
      We will also incorporate the other parameters Ann suggests,
      which are..........
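
      As a rough sketch of the first change described above, the same over-used tile (`.ooc`) command could be pointed at a genome file instead of the CDS file. The genome path and the output naming below are hypothetical placeholders, not the project's actual files. The function only prints the command so it can be checked before anything is run.

```shell
#!/usr/bin/env bash
# Sketch only: build the .ooc command for a genome target.
# The genome path is a hypothetical placeholder, not a real project path.

ooc_cmd() {
    local genome_2bit="$1"                       # genome in .2bit format
    local prefix
    prefix="$(basename "$genome_2bit" .2bit)"    # e.g. SL5_genome
    # Same options as the current command; only the target file changes.
    echo blat "$genome_2bit" /dev/null /dev/null \
        -tileSize=11 -makeOoc="${prefix}.11.ooc" -repMatch=1024
}

# Print the command for inspection; pipe the output to `bash` to run it.
ooc_cmd path/to/SL5_genome.2bit
```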

        Attachments

          Activity

          ann.loraine Ann Loraine added a comment -

          Please note I am moving this to the next sprint since it seems unlikely we will get around to doing further work on this during the current sprint, which ends tomorrow.

          ann.loraine Ann Loraine added a comment -

          I added the requested details in the preceding comment.
          If you have any questions, please let me know.

          attn:

          Robert Reid
          Brandon Bendickson

          ann.loraine Ann Loraine added a comment -

          As requested, here are the parameters and inputs you should use to align the contig sequences against the SL4 and SL5 genome sequences.

          • You need "target" genome sequence files in ".2bit" format. Please use these files as the target sequences:

          SL5 genome assembly, in blat-friendly ".2bit" format: http://lorainelab-quickload.scidas.org/quickload/S_lycopersicum_Jun_2022/S_lycopersicum_Jun_2022.2bit
          SL4 genome assembly, in blat-friendly ".2bit" format: http://lorainelab-quickload.scidas.org/quickload/S_lycopersicum_Sep_2019/S_lycopersicum_Sep_2019.2bit

          • The output format you should use for blat is "pslx". This will ensure that the aligned sequence gets included in the output. If we do that, IGB will be able to display the sequence and we can more easily check that the results are good. To specify this output format, use this option: -out=pslx
          • Please run blat using the option that does not include a header. That option is: -noHead
          • Do not use the default maximum intron size. It is too big. Use 13,000 bases. That option is: -maxIntron=13000

          For more information about running blat, see this link: https://genome.ucsc.edu/goldenPath/help/blatSpec.html
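
          Putting the options listed above together, a single alignment command might look like the sketch below. The contig and output file names are hypothetical, and the `.2bit` file is assumed to have been downloaded to the working directory already. The function prints the command rather than running it, so the arguments can be checked first.

```shell
#!/usr/bin/env bash
# Sketch only: one blat run with the options from the list above.
# contigs.fa and contigs.SL5.pslx are hypothetical file names.

blat_cmd() {
    local target_2bit="$1"   # genome in .2bit format (SL4 or SL5)
    local query_fa="$2"      # Trinity contigs in FASTA format
    local out_pslx="$3"      # alignment output file
    # Argument order follows blat usage: database, query, output.
    echo blat "$target_2bit" "$query_fa" "$out_pslx" \
        -out=pslx -noHead -maxIntron=13000
}

# Print the command for inspection; pipe the output to `bash` to run it.
blat_cmd S_lycopersicum_Jun_2022.2bit contigs.fa contigs.SL5.pslx
```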

          Last but not least, I recommend that you run blat in parallel. If you do that, make sure that you keep track of which sequence came from which collection of assembled contigs. Otherwise, it will be very hard to sanity-check the results. For example, I would like to open the blat alignments for a set of transcript contigs where I know exactly which RNA-Seq reads were used to make them. Then I can use the Genome Browser to compare the genome alignments of the input RNA-Seq reads to the alignments of their assembled contigs. That way, we can get an idea of how the Trinity assembly software performs when a given region has many reads or only a few. It will make more sense when I can show you what I mean in IGB.
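
          One possible way to keep track of which contig collection each result came from is to run one blat job per collection and per genome, and to put both names in the output file name. The sketch below assumes a hypothetical directory with one FASTA file per contig collection. It also assumes that paths contain no spaces and that `xargs` supports `-P` (GNU and BSD versions do). The function only prints the planned commands.

```shell
#!/usr/bin/env bash
# Sketch only: plan one blat job per (contig collection, genome) pair.
# Directory layout and file names are hypothetical; paths must not contain spaces.

plan_jobs() {
    local contig_dir="$1"; shift     # directory with one .fa file per collection
    local genome fasta sample
    for genome in "$@"; do           # remaining arguments: .2bit genome files
        for fasta in "$contig_dir"/*.fa; do
            [ -e "$fasta" ] || continue      # skip when the glob matches nothing
            sample="$(basename "$fasta" .fa)"
            # Output name is <collection>.<genome>.pslx, so provenance is kept.
            echo "blat $genome $fasta ${sample}.$(basename "$genome" .2bit).pslx" \
                 "-out=pslx -noHead -maxIntron=13000"
        done
    done
}

# To run four jobs at a time once the printed commands look right:
# plan_jobs contigs S_lycopersicum_Sep_2019.2bit S_lycopersicum_Jun_2022.2bit \
#     | xargs -P 4 -I{} sh -c '{}'
```

          Each output file then maps back to exactly one input FASTA file and one genome, which is what the sanity check described above needs.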


            People

            • Assignee:
              bbendick Brandon Bendickson
              Reporter:
              robofjoy Robert Reid
            • Votes:
              0
              Watchers:
              2
