Details
Type: Task
Status: Closed
Priority: Major
Resolution: Done
Affects Version/s: None
Fix Version/s: None
Labels: None
Story Points: 3
Epic Link:
Sprint: Fall 1, Fall 2
Description
GOAL: Prep a new table that harnesses the results from Ticket 3772, the salmon counts.
The first step is to make a new counts table with the SolyIDs added back in. We associate the SolyIDs with the de novo contigs via a previous BLAT alignment.
We take the 4 tables and a SolyID table produced previously (ticket # I don't recall).
Make a Python script that reads in all the data and makes a large table where each row is a SolyID gene and each column is an experiment.
Will need good column labels!
This will be a 2-task process, with both tasks involving writing Python scripts.
This task is step 1: adding a SolyID to a salmon counts table using our BLAT results from many steps ago.
We run this script repeatedly, once for each plant variety.
1. We need the BLAT result fna file in which we BLATted the rnaSPAdes contigs against SL5.
That can be found in this location:
/projects/tomato_genome/fnb/dataprocessing/brandon_work/mal/malintka-spades/spades_blat/blat-SL5-CDS-malintka-bestLongHit.fna
We read this file into a dict with the NODE ID as the key and the SolyID as the value.
2. We need the salmon gene count file for the same variety:
/projects/tomato_genome/fnb/dataprocessing/brandon_work/NEXTFLOW/start_fresh/Mal-run-2/results-3.14.0/star_salmon/salmon.merged.gene_counts.tsv
The first column in the table is the NODE ID; we ignore the 2nd column and keep all of the remaining read-count columns.
We read a line, parse it, and check whether the ID in the 1st column is in our dict from above.
If so, we write out a line using the SolyID as the first column, followed by all of the remaining count fields.
In the end we have written out a table in which each row has a SolyID and all of its gene counts.
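A minimal sketch of the two steps above, not the final script. The function names `load_node_to_soly` and `add_soly_ids` are made up for illustration. It assumes (unverified) that each header line in the BLAT fna file looks like `>NODE_id SolyID ...`, and that the salmon counts table is tab-separated with one header row. If the fna headers are laid out differently, only the parsing in `load_node_to_soly` needs to change.

```python
import csv


def load_node_to_soly(fna_path):
    """Build {NODE ID: SolyID} from the header lines of the BLAT result file.

    ASSUMPTION: each FASTA header is '>NODE_id<whitespace>SolyID ...'.
    """
    node_to_soly = {}
    with open(fna_path) as handle:
        for line in handle:
            if not line.startswith(">"):
                continue  # skip sequence lines
            fields = line[1:].split()
            if len(fields) >= 2:
                node_to_soly[fields[0]] = fields[1]
    return node_to_soly


def add_soly_ids(counts_path, node_to_soly, out_path):
    """Write a counts table keyed by SolyID; return the number of rows written."""
    written = 0
    with open(counts_path, newline="") as src, \
            open(out_path, "w", newline="") as dst:
        reader = csv.reader(src, delimiter="\t")
        writer = csv.writer(dst, delimiter="\t", lineterminator="\n")
        # ASSUMPTION: the first row is a header; drop its 2nd column too.
        header = next(reader)
        writer.writerow(["SolyID"] + header[2:])
        for row in reader:
            if not row:
                continue  # tolerate blank lines
            soly_id = node_to_soly.get(row[0])
            if soly_id is not None:
                # Column 1 becomes the SolyID, column 2 is ignored.
                writer.writerow([soly_id] + row[2:])
                written += 1
    return written
```

For one variety, call `load_node_to_soly` on the BLAT fna path, then pass the resulting dict to `add_soly_ids` along with the salmon counts path and an output path. Contigs with no BLAT hit are dropped silently, and several contigs can map to the same SolyID, so the output may contain repeated SolyIDs.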
We then rerun this script pointed at a new plant variety (e.g. MAL).
After that we move to the next phase: merging the 4 tables into 1 (new ticket, not yet created)!
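A rough sketch of what that merge phase could look like, so the column-label question is concrete. Nothing here is decided. The names `read_variety_table` and `merge_tables` are hypothetical, and the sketch makes three assumptions: labels are made unique by prefixing each column with the variety name, rows that share a SolyID within one variety are summed, and a gene missing from a variety gets 0.0 counts.

```python
import csv
from collections import defaultdict


def read_variety_table(path, variety):
    """Return (labels, {SolyID: [counts]}) for one per-variety table.

    Column labels get a 'variety:' prefix so they stay unique after merging.
    ASSUMPTION: rows sharing a SolyID are summed.
    """
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader)
        labels = [f"{variety}:{name}" for name in header[1:]]
        counts = defaultdict(lambda: [0.0] * len(labels))
        for row in reader:
            if not row:
                continue  # tolerate blank lines
            totals = counts[row[0]]
            for i, value in enumerate(row[1:]):
                totals[i] += float(value)
    return labels, dict(counts)


def merge_tables(tables):
    """Merge {variety: path} into (header, rows), one row per SolyID.

    ASSUMPTION: a gene absent from a variety gets 0.0 in that variety's columns.
    """
    header, per_variety = ["SolyID"], []
    for variety, path in tables.items():
        labels, counts = read_variety_table(path, variety)
        header += labels
        per_variety.append((len(labels), counts))
    all_ids = sorted(set().union(*(c for _, c in per_variety)))
    rows = []
    for soly_id in all_ids:
        row = [soly_id]
        for width, counts in per_variety:
            row += counts.get(soly_id, [0.0] * width)
        rows.append(row)
    return header, rows
```

Whether summing duplicate SolyIDs and zero-filling missing genes is the right call should be settled in the new ticket; both are placeholders here.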