Details
Type: Task
Status: Closed
Priority: Major
Resolution: Done
Affects Version/s: None
Fix Version/s: None
Labels: None
Story Points: 3
Epic Link:
Sprint: Spring 7, Spring 8
Description
SRP499796
Directory: /projects/tomato_genome/fnb/dataprocessing/SRP499796
This data set needs to be run against both SL4 and SL5.
For this task, we need to review and sanity-check the ARE 120 minute flavonoid data that Rob recently uploaded and submitted to the Sequence Read Archive.
If the data are good, we will replace all existing BAM, junctions, and related files deployed on the "hotpollen" quickload site with the newly processed data.
Steps:
- Check SRP on NCBI and review submission
- Download the data onto the cluster by using the SRP name
- Run nf-core/rnaseq pipeline
- Run our coverage graph and junctions scripts on the data
Note that all files should now use their "SRR" names instead of the existing file names.
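For the nf-core/rnaseq step, the pipeline takes a CSV samplesheet with the columns sample, fastq_1, fastq_2, and strandedness. The following is a minimal sketch of building one from SRR-named paired files, not the samplesheet actually used for this task. The function name make_samplesheet, the placeholder directory, the gzip-compressed <SRR>_1.fastq.gz / <SRR>_2.fastq.gz naming, and the "auto" strandedness value are all assumptions; whether "auto" is accepted depends on the pipeline version.

```shell
#!/bin/bash
# Sketch: write an nf-core/rnaseq samplesheet for SRR-named paired-end reads.
# Assumptions (not from this ticket): reads are gzip-compressed as
# <SRR>_1.fastq.gz / <SRR>_2.fastq.gz, and strandedness is left as "auto".

make_samplesheet() {
    local fastq_dir=$1   # directory holding the FASTQ files
    shift                # remaining arguments are SRR accessions
    echo "sample,fastq_1,fastq_2,strandedness"
    local acc
    for acc in "$@"; do
        echo "${acc},${fastq_dir}/${acc}_1.fastq.gz,${fastq_dir}/${acc}_2.fastq.gz,auto"
    done
}

# Example with two of the accessions from this ticket and a placeholder path.
make_samplesheet /path/to/fastq SRR28558218 SRR28558219
```

Using the SRR accession as the sample name keeps the pipeline outputs consistent with the note above about SRR-based file names.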
Activity
Re-run Directory: /projects/tomato_genome/fnb/dataprocessing/SRP499796/nfcore-SL4
Prefetch SRR Script:
#!/bin/bash
#SBATCH --job-name=prefetch_SRR
#SBATCH --partition=Orion
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --mem=4gb
#SBATCH --output=%x_%j.out
#SBATCH --time=24:00:00

cd /projects/tomato_genome/fnb/dataprocessing/SRP499796/nfcore-SL4
module load sra-tools/2.11.0
# Opens the sra-tools configuration UI; it normally expects a terminal,
# so it may have no effect when run inside a batch job.
vdb-config --interactive

# The 24 runs in SRP499796
files=(
SRR28558218 SRR28558219 SRR28558220 SRR28558221
SRR28558222 SRR28558223 SRR28558224 SRR28558225
SRR28558226 SRR28558227 SRR28558228 SRR28558229
SRR28558230 SRR28558231 SRR28558232 SRR28558233
SRR28558234 SRR28558235 SRR28558236 SRR28558237
SRR28558238 SRR28558239 SRR28558240 SRR28558241
)

# Download each run; prefetch writes <SRR>/<SRR>.sra under the current directory
for f in "${files[@]}"; do
    echo "$f"
    prefetch "$f"
done
Execute:
chmod u+x prefetch.slurm
sbatch prefetch.slurm
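The fasterq-dump array job in the next step reads line N of Sra_ids.txt for array task N, so that file needs one accession per line, in a fixed order. How the file was created is not recorded here; the following is a minimal sketch, assuming the 24 accessions are the consecutive range SRR28558218 to SRR28558241 listed in the prefetch script. The function name write_sra_ids and the temporary output file are illustrative only.

```shell
#!/bin/bash
# Sketch: write consecutive SRR accessions, one per line, to a file.
# Assumption: the accessions form an unbroken numeric range.

write_sra_ids() {
    local first=$1 last=$2 out=$3
    local n
    for (( n = first; n <= last; n++ )); do
        echo "SRR${n}"
    done > "$out"
}

# Demo: write to a temporary file rather than the project directory.
ids_file=$(mktemp)
write_sra_ids 28558218 28558241 "$ids_file"
wc -l < "$ids_file"      # number of accessions written
sed -n '1p' "$ids_file"  # accession that array task 1 would pick up
rm -f "$ids_file"
```

The line count should equal the upper bound of the --array range in the job script, otherwise some array tasks would read an empty accession.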
fasterq-dump Script:
#!/bin/bash
#SBATCH --job-name=fastqdump_SRR
#SBATCH --partition=Orion
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --mem=40gb
#SBATCH --output=%x_%j.out
#SBATCH --time=24:00:00
#SBATCH --array=1-24

# Pick the accession for this array task: line N of Sra_ids.txt for task N
file=$(sed -n -e "${SLURM_ARRAY_TASK_ID}p" /projects/tomato_genome/fnb/dataprocessing/SRP499796/nfcore-SL4/Sra_ids.txt)

cd /projects/tomato_genome/fnb/dataprocessing/SRP499796/nfcore-SL4
module load sra-tools/2.11.0

echo "Starting fasterq-dump on $file"
# prefetch stores each run in a directory named after its accession
cd "/projects/tomato_genome/fnb/dataprocessing/SRP499796/nfcore-SL4/$file"
fasterq-dump "${file}.sra"

# Validate the read pairs, then copy both mate files up to the run directory
perl /projects/tomato_genome/scripts/validateHiseqPairs.pl "${file}_1.fastq" "${file}_2.fastq"
cp "${file}_1.fastq" "/projects/tomato_genome/fnb/dataprocessing/SRP499796/nfcore-SL4/${file}_1.fastq"
cp "${file}_2.fastq" "/projects/tomato_genome/fnb/dataprocessing/SRP499796/nfcore-SL4/${file}_2.fastq"
echo "finished"
Execute:
chmod u+x fasterdump.slurm
sbatch fasterdump.slurm
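Independently of validateHiseqPairs.pl, whose checks are not shown in this ticket, a quick sanity check on the fasterq-dump output is that both mate files contain the same number of reads. The following is a minimal sketch, assuming standard four-line FASTQ records; count_reads and check_pair are illustrative names, and the demo uses synthetic reads rather than the real data.

```shell
#!/bin/bash
# Sketch: compare the read counts of two mate files.
# Assumption: each FASTQ record spans exactly four lines.

count_reads() {
    local lines
    lines=$(wc -l < "$1")
    echo $(( lines / 4 ))
}

check_pair() {
    local r1 r2
    r1=$(count_reads "$1")
    r2=$(count_reads "$2")
    if [ "$r1" -eq "$r2" ]; then
        echo "OK: $r1 read pairs"
    else
        echo "MISMATCH: $r1 vs $r2 reads"
        return 1
    fi
}

# Tiny demo with two synthetic read pairs in a temporary directory.
tmp=$(mktemp -d)
printf '@r1/1\nACGT\n+\nIIII\n@r2/1\nTTGA\n+\nIIII\n' > "$tmp/demo_1.fastq"
printf '@r1/2\nTGCA\n+\nIIII\n@r2/2\nCCAT\n+\nIIII\n' > "$tmp/demo_2.fastq"
check_pair "$tmp/demo_1.fastq" "$tmp/demo_2.fastq"
rm -r "$tmp"
```

On the real data this could be called as check_pair "${file}_1.fastq" "${file}_2.fastq" for each accession. It only compares counts and does not confirm that read names match between mates.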
I reviewed the files as follows:
Testing passes. Moving to DONE.