  1. IGB
  2. IGBF-3944

Python script to remove NCBI's contaminants

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:
      None

      Description

      GOAL: After we successfully removed the UniVec adapters, TSA (NCBI's Transcriptome Shotgun Assembly database) came back with a new list of sequences they suspect are contaminants. We will now write a Python script to remove these regions.

      Location of the data on the cluster:
      /projects/tomato_genome/fnb/dataprocessing/TSA-transcriptomeShotgunAssembly/kelsieData

      Files to use:
      -rw-r--r-- 1 rreid2 tomato_genome 80M Oct 14 12:38 Heinz_new_ID_clean.fna
      -rw-r--r-- 1 rreid2 tomato_genome 146M Oct 14 12:38 Nagcarlang_new_ID_clean.fna
      -rw-r--r-- 1 rreid2 tomato_genome 69M Oct 14 12:38 Malintka_new_ID_clean.fna
      -rw-r--r-- 1 rreid2 tomato_genome 52M Oct 14 12:38 Tamaulipas_new_ID_clean.fna
      -rw-r--r-- 1 rreid2 tomato_genome 1.4M Oct 15 09:30 contaminant2.txt

      FLOW:
      1. Read contaminant2.txt into a dictionary, with the header as the key and the region to be removed as the value.
      2. Iterate through the FASTA file, checking whether each header is in the dictionary.
      3. If NO, write the sequence out to a new file.
      4. If YES, chop away the region:
         1. If the region is in the middle of the sequence, make 2 new sequences.
         2. If the region is within 50 bp of the beginning or the end, truncate the sequence.
      5. Write the new sequence(s) to file, renaming the header if making 2 sequences (A and B).
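
The flow above could be sketched roughly as follows. This is a minimal illustration, not the script that was merged for this issue. Several details are assumptions, because the ticket does not state them: that contaminant2.txt is tab-separated with columns header, start, end (1-based, inclusive); that a flank of 50 bp or less next to the removed region is dropped (one reading of "within 50 bp of the beginning or the end"); and that split sequences are renamed with `_A` and `_B` suffixes. The function names are hypothetical.

```python
EDGE = 50  # bp; flanks this short or shorter are dropped (assumed reading of the 50 bp rule)


def read_contaminants(lines):
    """Map header -> (start, end). ASSUMED format: header<TAB>start<TAB>end, 1-based inclusive."""
    regions = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        header, start, end = line.split("\t")[:3]
        regions[header] = (int(start), int(end))
    return regions


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def remove_region(header, seq, start, end, edge=EDGE):
    """Cut out seq[start..end] (1-based, inclusive) and return the surviving records."""
    left, right = seq[:start - 1], seq[end:]
    kept = [piece for piece in (left, right) if len(piece) > edge]
    if len(kept) == 2:
        # Region is in the middle: emit two sequences with renamed headers (suffixes assumed).
        return [(f"{header}_A", left), (f"{header}_B", right)]
    # Region is near one end (or both): keep only the long flank, header unchanged.
    return [(header, piece) for piece in kept]


def clean_records(fasta_lines, regions):
    """Yield cleaned (header, sequence) records; unflagged sequences pass through as-is."""
    for header, seq in read_fasta(fasta_lines):
        if header in regions:
            start, end = regions[header]
            yield from remove_region(header, seq, start, end)
        else:
            yield header, seq
```

To use it, one would open contaminant2.txt and one of the `*_new_ID_clean.fna` files, pass the file handles to `read_contaminants` and `clean_records`, and write each yielded record as `>header` followed by the sequence. Note that a sequence whose flanks are both 50 bp or shorter yields no records at all under this reading; whether that is the desired behavior would need to be confirmed.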

            Activity

            robofjoy Robert Reid created issue
            robofjoy Robert Reid made changes -
            Field Original Value New Value
            Epic Link IGBF-2993 [ 21429 ]
            robofjoy Robert Reid made changes -
            Link This issue blocks IGBF-3928 [ IGBF-3928 ]
            bbendick Brandon Bendickson made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            bbendick Brandon Bendickson made changes -
            Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
            bbendick Brandon Bendickson made changes -
            Assignee Brandon Bendickson [ bbendick ] Robert Reid [ robertreid ]
            robofjoy Robert Reid made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            robofjoy Robert Reid made changes -
            Status First Level Review in Progress [ 10301 ] To-Do [ 10305 ]
            robofjoy Robert Reid made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            robofjoy Robert Reid made changes -
            Assignee Robert Reid [ robertreid ] Brandon Bendickson [ bbendick ]
            bbendick Brandon Bendickson made changes -
            Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
            bbendick Brandon Bendickson made changes -
            Assignee Brandon Bendickson [ bbendick ] Robert Reid [ robertreid ]
            robofjoy Robert Reid made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            robofjoy Robert Reid made changes -
            Status First Level Review in Progress [ 10301 ] To-Do [ 10305 ]
            robofjoy Robert Reid made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            ann.loraine Ann Loraine made changes -
            Sprint Fall 4 [ 205 ] Fall 4, Fall 5 [ 205, 206 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            bbendick Brandon Bendickson made changes -
            Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
            robofjoy Robert Reid made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            robofjoy Robert Reid made changes -
            Status First Level Review in Progress [ 10301 ] Ready for Pull Request [ 10304 ]
            robofjoy Robert Reid made changes -
            Status Ready for Pull Request [ 10304 ] Pull Request Submitted [ 10101 ]
            robofjoy Robert Reid made changes -
            Status Pull Request Submitted [ 10101 ] Reviewing Pull Request [ 10303 ]
            robofjoy Robert Reid made changes -
            Status Reviewing Pull Request [ 10303 ] Merged Needs Testing [ 10002 ]
            robofjoy Robert Reid made changes -
            Status Merged Needs Testing [ 10002 ] Post-merge Testing In Progress [ 10003 ]
            robofjoy Robert Reid made changes -
            Resolution Done [ 10000 ]
            Status Post-merge Testing In Progress [ 10003 ] Closed [ 6 ]

              People

              • Assignee:
                robofjoy Robert Reid
                Reporter:
                robofjoy Robert Reid
              • Votes:
                0
                Watchers:
                2
