Details
- Type: Task
- Status: Closed
- Priority: Minor
- Resolution: Done
- Affects Version/s: None
- Fix Version/s: None
- Labels: None
- Story Points: 1
- Epic Link:
- Sprint: Spring 8
Description
To ensure that the sequences we have match what the SRA has, I came up with a simple check: take the first 10 bases of a read from a given SRA run and look for them across all of our sequence files.
For speed, I head each fastq.gz file to get only the first 5 sequences (20 lines, since each FASTQ record is 4 lines).
Use zcat so the files stay compressed on disk.
Grep for the sequence of interest, anchored to the start of the line.
for f in *gz; do echo "$f"; zcat < "$f" | head -n 20 | grep "^CTGGCTTTTC"; done
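Probability dictates that we should find just 1 result: a specific 10-base prefix is 1 of 4^10 (about a million) possibilities, so a chance match among the first 5 reads of any other file is unlikely.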
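The steps above can also be sketched as a small shell function. This is only a sketch, not what was run for this ticket: the function name `find_prefix` and its arguments are made up for illustration, it assumes standard 4-line FASTQ records and files named `*.fastq.gz`, and it swaps the grep for an awk filter that checks only the sequence line of each record, so a quality line can never match by accident. It prints only the files with a hit, along with the hit count.

```shell
# Hypothetical helper (not part of the original check): list the files whose
# first N reads include a read starting with PREFIX.
# Assumes 4-line FASTQ records and files named *.fastq.gz in the current directory.
find_prefix() {
  prefix="$1"
  n_reads="${2:-5}"              # number of reads to inspect per file (default 5)
  n_lines=$((n_reads * 4))       # 4 lines per FASTQ record
  for f in *.fastq.gz; do
    [ -e "$f" ] || continue      # skip the literal pattern if nothing matches
    # gzip -dc behaves like zcat; the file on disk stays compressed.
    # awk looks only at line 2 of each record (the sequence line).
    hits=$(gzip -dc < "$f" | head -n "$n_lines" |
      awk -v p="$prefix" 'NR % 4 == 2 && index($0, p) == 1 { n++ } END { print n + 0 }')
    if [ "$hits" -gt 0 ]; then
      printf '%s\t%s\n' "$f" "$hits"
    fi
  done
}

# Usage: find_prefix CTGGCTTTTC 5
```

If the check passes, a call like the one in the usage comment should print exactly one file name.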
Issue Links
- blocks: IGBF-3683 Update SRA to use the correct sample codes for Muday lab time course data (Closed)
Activity
Field | Original Value | New Value |
---|---|---|
Epic Link | IGBF-2993 [ 21429 ] |
Status | To-Do [ 10305 ] | In Progress [ 3 ] |
Status | In Progress [ 3 ] | Needs 1st Level Review [ 10005 ] |
Status | Needs 1st Level Review [ 10005 ] | First Level Review in Progress [ 10301 ] |
Status | First Level Review in Progress [ 10301 ] | Ready for Pull Request [ 10304 ] |
Status | Ready for Pull Request [ 10304 ] | Pull Request Submitted [ 10101 ] |
Status | Pull Request Submitted [ 10101 ] | Reviewing Pull Request [ 10303 ] |
Status | Reviewing Pull Request [ 10303 ] | Merged Needs Testing [ 10002 ] |
Status | Merged Needs Testing [ 10002 ] | Post-merge Testing In Progress [ 10003 ] |
Resolution | | Done [ 10000 ] |
Status | Post-merge Testing In Progress [ 10003 ] | Closed [ 6 ] |
When successful, the results will look something like this:
.....
V.28.45.9_R2.fastq.gz
V.28.75.7_R1.fastq.gz
V.28.75.7_R2.fastq.gz
V.28.75.8_R1.fastq.gz
V.28.75.8_R2.fastq.gz
V.28.75.9_R1.fastq.gz
V.28.75.9_R2.fastq.gz
V.34.15.7_R1.fastq.gz
V.34.15.7_R2.fastq.gz
V.34.15.8_R1.fastq.gz
V.34.15.8_R2.fastq.gz
V.34.15.9_R1.fastq.gz
V.34.15.9_R2.fastq.gz
V.34.30.7_R1.fastq.gz
V.34.30.7_R2.fastq.gz
V.34.30.8_R1.fastq.gz
V.34.30.8_R2.fastq.gz
V.34.30.9_R1.fastq.gz
CTGGCTTTTCAGATTTCTCATCCCTGTATGCTTTTCTTCGAGGTGGAGACACCTTCGGCACCTTGTCCACTACATCAGCTGAACTTTGCAAATTGGTTGTCGAGTACAGTTTCTGACCAGCTGGAATGCTGTACGCATTCTTCACCTCAA
V.34.30.9_R2.fastq.gz
V.34.45.7_R1.fastq.gz
V.34.45.7_R2.fastq.gz
V.34.45.8_R1.fastq.gz
V.34.45.8_R2.fastq.gz
V.34.45.9_R1.fastq.gz
V.34.45.9_R2.fastq.gz
V.34.75.7_R1.fastq.gz
V.34.75.7_R2.fastq.gz
V.34.75.8_R1.fastq.gz
V.34.75.8_R2.fastq.gz
V.34.75.9_R2.fastq.gz
We see just the 1 result, and it matches the SRA title and ID for this sample. (And this is one of the 16 samples that got switched!!)
Will spot-check a few more.