Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels: None

      Description


      GOAL: Run the DBR1 raw sequences through the NCBI filtering tools.

            Activity

            robofjoy Robert Reid added a comment -

            Location of results
            /projects/tomato_genome/fnb/dataprocessing/TSA-transcriptomeShotgunAssembly/kelsieData/FCS_tools/FCS_GX/tom_runs/fin_results

            Brandon has run a Python script to remove the last 11 sequences that the TSA staff flagged. The script also handles any duplicated sequences.

            remove_Ns.py
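
For readers without access to the script, here is a minimal sketch of how flagged sequences could be dropped from a FASTA file. It is not the actual script mentioned above; the helper names `read_fasta` and `drop_flagged` are hypothetical, and it assumes the flagged sequences are identified by the first whitespace-delimited token of each header.

```python
from typing import Iterable, Iterator, Set, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def drop_flagged(records: Iterable[Record], flagged_ids: Set[str]) -> Iterator[Record]:
    """Yield only records whose ID is not in flagged_ids.

    Assumption: the ID is the first whitespace-delimited token of the header.
    """
    for header, seq in records:
        tokens = header.split()
        seq_id = tokens[0] if tokens else ""
        if seq_id not in flagged_ids:
            yield header, seq
```

In practice `read_fasta` would be given an open file handle, and `flagged_ids` would hold the identifiers reported back after submission.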

            bbendick Brandon Bendickson added a comment -

            Removed duplicates and wrote them to their own file (Variety_dupes.fa). The new files are located in /projects/tomato_genome/fnb/dataprocessing/TSA-transcriptomeShotgunAssembly/kelsieData/FCS_tools/FCS_GX/tom_runs/fin_results

            They follow the pattern Variety_contaminant3_removed.fa. We should be ready for another submission. The Python script is fix_contam3_v2.py, in the same directory.
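
As an illustration only (this is not the contents of fix_contam3_v2.py), the duplicate-splitting step could look roughly like the sketch below. It assumes "duplicate" means an identical sequence regardless of case, and that the first occurrence is kept; the function names `split_duplicates` and `write_fasta` are hypothetical.

```python
from typing import Iterable, List, TextIO, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def split_duplicates(records: Iterable[Record]) -> Tuple[List[Record], List[Record]]:
    """Separate records into (unique, dupes).

    Assumption: two records are duplicates when their sequences match
    case-insensitively. The first occurrence is kept as unique; every
    later copy goes to dupes.
    """
    seen = set()
    unique: List[Record] = []
    dupes: List[Record] = []
    for header, seq in records:
        key = seq.upper()
        if key in seen:
            dupes.append((header, seq))
        else:
            seen.add(key)
            unique.append((header, seq))
    return unique, dupes


def write_fasta(records: Iterable[Record], handle: TextIO, width: int = 60) -> None:
    """Write records in FASTA format, wrapping sequence lines at `width`."""
    for header, seq in records:
        handle.write(f">{header}\n")
        for start in range(0, len(seq), width):
            handle.write(seq[start:start + width] + "\n")
```

The two returned lists would then be written to separate files, for example one following the Variety_contaminant3_removed.fa pattern and one following Variety_dupes.fa.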

            robofjoy Robert Reid added a comment -

            I think this comment is in the wrong task! I'll copy it over to the TSA task.

            bbendick Brandon Bendickson added a comment -

            FCS-adaptor keeps dying because the BLAST step uses too much memory. I have gone up to 350 GB per run and am still running out of memory. Trying to find a way around this.

            bbendick Brandon Bendickson added a comment -

            Checked the troubleshooting page on GitHub and found this:

            Can FCS-adaptor run on sequencing reads?

            FCS-adaptor is developed to operate on assembled genomes and is not intended to replace adaptor trimming on reads. There are various tools for read trimming; tools that are specialized for sequencing technologies may have a more expansive adaptor catalogue relative to FCS-adaptor.

            This could be why my jobs are dying no matter how much memory I give them: the tool can't BLAST the reads.

            robofjoy Robert Reid added a comment -

            OK, good to know. Let's stop trying to filter the reads!

            R

            robofjoy Robert Reid added a comment -

            This ticket can be closed as we now know that these tools are not suitable for large raw reads.


              People

              • Assignee:
                robofjoy Robert Reid
                Reporter:
                robofjoy Robert Reid
              • Votes:
                0
                Watchers:
                2