Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:
      None

      Description


      GOAL: Run the DBR1 raw sequences through the NCBI filtering tools.

        Attachments

          Issue Links

            Activity

            robofjoy Robert Reid created issue -
            robofjoy Robert Reid made changes -
            Field Original Value New Value
            Epic Link IGBF-2993 [ 21429 ]
            robofjoy Robert Reid made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            robofjoy Robert Reid added a comment -

            Location of results
            /projects/tomato_genome/fnb/dataprocessing/TSA-transcriptomeShotgunAssembly/kelsieData/FCS_tools/FCS_GX/tom_runs/fin_results

            Brandon has run a Python script to remove the last 11 sequences that the TSA reviewers flagged.

            The script also handles any duplicated sequences.

            remove_Ns.py
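
A minimal sketch of the removal step described above, for readers who have not seen the script. This is not the contents of remove_Ns.py; the function names (`read_fasta`, `seq_id`, `drop_flagged`) are hypothetical, and it assumes the flagged sequences are identified by the first word of each FASTA header.

```python
from typing import Iterable, Iterator, List, Set, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def seq_id(header: str) -> str:
    """Return the first whitespace-delimited word of a header (assumed ID)."""
    parts = header.split()
    return parts[0] if parts else ""


def drop_flagged(records: Iterable[Record], flagged_ids: Set[str]) -> List[Record]:
    """Keep only records whose ID is not in the flagged set."""
    return [(h, s) for h, s in records if seq_id(h) not in flagged_ids]
```

In practice `read_fasta` would be given an open file handle, and `flagged_ids` would hold the 11 IDs reported back by TSA.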

            bbendick Brandon Bendickson added a comment -

            Removed duplicates and wrote them to their own file (Variety_dupes.fa). The new files are located in /projects/tomato_genome/fnb/dataprocessing/TSA-transcriptomeShotgunAssembly/kelsieData/FCS_tools/FCS_GX/tom_runs/fin_results

            They follow the naming pattern Variety_contaminant3_removed.fa. We should be ready for another submission. The Python script is fix_contam3_v2.py and is in the same directory.
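
The duplicate handling described above could look roughly like the following. This is a sketch under assumptions, not the contents of fix_contam3_v2.py: it assumes "duplicate" means an identical sequence string (case-insensitive) and that the first occurrence is the one kept. The function names are hypothetical.

```python
from typing import Iterable, List, Set, Tuple

Record = Tuple[str, str]  # (header without ">", sequence)


def split_duplicates(records: Iterable[Record]) -> Tuple[List[Record], List[Record]]:
    """Split records into (unique, dupes), keeping the first copy of each sequence."""
    seen: Set[str] = set()
    unique: List[Record] = []
    dupes: List[Record] = []
    for header, seq in records:
        key = seq.upper()  # assumption: case differences are not meaningful
        if key in seen:
            dupes.append((header, seq))
        else:
            seen.add(key)
            unique.append((header, seq))
    return unique, dupes


def format_fasta(records: Iterable[Record], width: int = 60) -> str:
    """Render records as FASTA text, wrapping sequences at `width` characters."""
    lines: List[str] = []
    for header, seq in records:
        lines.append(">" + header)
        for start in range(0, len(seq), width):
            lines.append(seq[start:start + width])
    return "\n".join(lines) + ("\n" if lines else "")
```

The `unique` list would be written to the Variety_contaminant3_removed.fa file and the `dupes` list to Variety_dupes.fa. If duplicates were instead defined by repeated IDs, the key would be the header ID rather than the sequence.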

            bbendick Brandon Bendickson made changes -
            Assignee Brandon Bendickson [ bbendick ] Robert Reid [ robertreid ]
            robofjoy Robert Reid added a comment -

            I think this comment is in the wrong task! I'll copy it over to the TSA task.

            robofjoy Robert Reid made changes -
            Link This issue blocks IGBF-4001 [ IGBF-4001 ]
            ann.loraine Ann Loraine made changes -
            Sprint Fall 6 [ 207 ] Fall 6, Fall 7 [ 207, 208 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Sprint Fall 6, Fall 7 [ 207, 208 ] Fall 6, Fall 7, Fall 8 [ 207, 208, 209 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            bbendick Brandon Bendickson added a comment -

            FCS-adaptor keeps dying because the BLAST step uses too much memory. I have gone up to 350 GB per run and the jobs still run out of memory. Trying to find a way around this.

            bbendick Brandon Bendickson added a comment -

            Checked the troubleshooting page on GitHub and found this:

            Can FCS-adaptor run on sequencing reads?

            FCS-adaptor is developed to operate on assembled genomes and is not intended to replace adaptor trimming on reads. There are various tools for read trimming; tools that are specialized for sequencing technologies may have a more expansive adaptor catalogue relative to FCS-adaptor.

            This could be why my jobs are dying no matter how much memory I give them: the tool can't BLAST the reads.

            robofjoy Robert Reid added a comment -

            OK, good to know. Let's stop trying to filter the reads!

            R

            robofjoy Robert Reid added a comment -

            This ticket can be closed as we now know that these tools are not suitable for large raw reads.

            robofjoy Robert Reid made changes -
            Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
            robofjoy Robert Reid made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            robofjoy Robert Reid made changes -
            Status First Level Review in Progress [ 10301 ] Ready for Pull Request [ 10304 ]
            robofjoy Robert Reid made changes -
            Status Ready for Pull Request [ 10304 ] Pull Request Submitted [ 10101 ]
            robofjoy Robert Reid made changes -
            Status Pull Request Submitted [ 10101 ] Reviewing Pull Request [ 10303 ]
            robofjoy Robert Reid made changes -
            Status Reviewing Pull Request [ 10303 ] Merged Needs Testing [ 10002 ]
            robofjoy Robert Reid made changes -
            Status Merged Needs Testing [ 10002 ] Post-merge Testing In Progress [ 10003 ]
            robofjoy Robert Reid made changes -
            Resolution Done [ 10000 ]
            Status Post-merge Testing In Progress [ 10003 ] Closed [ 6 ]

              People

              • Assignee:
                robofjoy Robert Reid
                Reporter:
                robofjoy Robert Reid
              • Votes:
                0
                Watchers:
                2

                Dates

                • Created:
                  Updated:
                  Resolved: