Details
- Type: Task
- Status: Closed
- Priority: Minor
- Resolution: Done
- Affects Version/s: None
- Fix Version/s: None
- Labels: None
- Story Points: 4
- Epic Link:
- Sprint: Summer 2, Summer 3, Summer 4, Summer 5, Summer 6, Summer 7, Fall 1
Description
GOAL:
To set up and run the nf-core/rnaseq pipeline to align our RNA-Seq reads against the de novo assembled contigs (Trinity, SPAdes).
The Nextflow pipeline requires a GTF and a BED file along with the contigs. There are various ways to generate these files, and they may themselves be a source of problems.
Otherwise, we can follow Molly's protocol:
https://docs.google.com/document/d/1ig9ET-ykXF5nAX3P487cXWmZDGUlQpcwrvFXpbyP5vw/edit?usp=sharing
Issue Links
- blocks: IGBF-3817 Create a method to compare NextFlow gene expression results (To-Do)
-
Activity
To make a BED file:
awk '
  /^>/  { name = substr($0, 2); printf "%s\t1\t", name }   # header line: contig name, then start
  !/^>/ { printf "%d\t%s\n", length($0), name }            # sequence line: length as end, then name
' ../postblat/blat-heinz-bestLongHit.fna > blat-heinz.bed
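The awk command above emits one output line per sequence line, so it is only correct when every contig sits on a single line. It also writes a start coordinate of 1, whereas the BED format counts starts from 0. As a hedged alternative, here is a minimal Python sketch that sums the length over wrapped sequence lines and writes a 0-based start. The function name fasta_to_bed is hypothetical, and taking only the first whitespace-delimited token of the header as the contig name is an assumption (the awk command keeps the whole header line).

```python
import io


def fasta_to_bed(lines):
    """Yield one 4-column BED line (chrom, start, end, name) per FASTA record.

    Assumptions (not from the ticket): the contig name is the first
    whitespace-delimited token of the header, and the start is 0-based.
    """
    name = None
    length = 0
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # Flush the previous record before starting a new one.
            if name is not None:
                yield f"{name}\t0\t{length}\t{name}"
            tokens = line[1:].split()
            name = tokens[0] if tokens else ""
            length = 0
        elif name is not None:
            # Sequence may be wrapped over several lines; add them up.
            length += len(line.strip())
    if name is not None:
        yield f"{name}\t0\t{length}\t{name}"


# Small in-memory example with one wrapped record.
example = io.StringIO(">contig1 some description\nACGT\nAC\n>contig2\nGGG\n")
bed_lines = list(fasta_to_bed(example))
```

To run it on the real input, open ../postblat/blat-heinz-bestLongHit.fna and write each yielded line to blat-heinz.bed.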
All of the BAM files exist. To check:
/projects/tomato_genome/fnb/dataprocessing/brandon_work/NEXTFLOW/start_fresh$ ls -lrt Tam-run-2/results-3.14.0/star_salmon/*bam
The BAM files vary considerably in size, from 800 MB (smallest) to 5 GB (largest). It is unclear whether this is anything to be concerned about.
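One quick way to put the size differences in context is to compare each file against the median size. The sketch below is only a rough screen, since file size says nothing about mapping rates; relative_sizes is a hypothetical helper, and the glob pattern is the one from the ls command above, relative to the start_fresh directory.

```python
import glob
import os
import statistics


def relative_sizes(sizes):
    """Map each file name to its size divided by the median size."""
    if not sizes:
        return {}
    median = statistics.median(sizes.values())
    if median == 0:
        return {name: 0.0 for name in sizes}
    return {name: size / median for name, size in sizes.items()}


if __name__ == "__main__":
    # Same pattern as the ls command; run from the start_fresh directory.
    pattern = "Tam-run-2/results-3.14.0/star_salmon/*bam"
    sizes = {path: os.path.getsize(path) for path in glob.glob(pattern)}
    ratios = relative_sizes(sizes)
    for name in sorted(ratios, key=ratios.get):
        print(f"{ratios[name]:6.2f}x median  {sizes[name] / 1e9:6.2f} GB  {name}")
```

Files far from 1.00x median are the ones worth checking further, for example by comparing read counts per sample.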
Next phase: Create one combined Salmon counts file from the results of these 4 Salmon files. That means a new ticket!
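As a starting point for that ticket, here is a minimal sketch of merging per-run counts into one table. It assumes each input is a tab-separated Salmon quant.sf-style file with Name and NumReads columns. The function name merge_counts and the run labels are hypothetical, and filling contigs missing from a run with 0 is also an assumption.

```python
import csv
import io


def merge_counts(samples):
    """Merge NumReads columns from several quant.sf-style tables.

    samples: dict mapping a run label to an open text handle.
    Returns (header, rows) with one row per contig name, sorted by name.
    """
    labels = list(samples)
    counts = {}
    for label, handle in samples.items():
        for row in csv.DictReader(handle, delimiter="\t"):
            counts.setdefault(row["Name"], {})[label] = row["NumReads"]
    header = ["Name"] + labels
    # Assumption: a contig absent from a run is reported as 0.
    rows = [
        [name] + [per_run.get(label, "0") for label in labels]
        for name, per_run in sorted(counts.items())
    ]
    return header, rows


# Tiny in-memory example with made-up values.
cols = "Name\tLength\tEffectiveLength\tTPM\tNumReads\n"
run_a = io.StringIO(cols + "c1\t100\t80\t5.0\t10\nc2\t200\t180\t1.0\t3\n")
run_b = io.StringIO(cols + "c1\t100\t80\t2.0\t4\n")
header, rows = merge_counts({"runA": run_a, "runB": run_b})
```

With real data, each handle would be an opened quant.sf file, and the result can be written out with csv.writer using a tab delimiter.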