Details
Type: Task
Status: Closed
Priority: Minor
Resolution: Done
Affects Version/s: None
Fix Version/s: None
Labels: None
Story Points: 1
Epic Link:
Sprint: Spring 8
Description
To confirm that our sequence files match what the SRA has, I came up with a simple check: look for the first 10 bases of a read from a given SRA run across all of our sequence files.
For speed, I use head on each fastq.gz file to read only the first 5 sequences (20 lines, since each FASTQ record is 4 lines).
Use zcat to stream the decompressed contents, so the files stay compressed on disk.
Grep for the sequence of interest.
for f in *gz; do echo "$f"; zcat < "$f" | head -n 20 | grep "^CTGGCTTTTC"; done
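We should find just 1 result: there are 4^10 (about 1 million) possible 10-base prefixes, so a chance match among the first 5 reads of an unrelated file is very unlikely.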
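If the check needs to be repeated for other runs, the loop could be wrapped in a small function. This is only a sketch, not what was run for this ticket: the name find_read_prefix and its arguments are made up for illustration, and it uses gzip -dc in place of zcat and awk in place of grep so that only the sequence line of each record (line 2 of 4) is tested, never a header or quality line.

```shell
# Hypothetical helper (not part of the original check): print the name of
# every file whose first N reads include a sequence starting with PREFIX.
# Usage: find_read_prefix PREFIX N_READS FILE...
find_read_prefix() {
    prefix=$1
    n_reads=$2
    shift 2
    for f in "$@"; do
        # gzip -dc streams the decompressed data; the file on disk stays compressed.
        # Each FASTQ record is 4 lines, and line 2 of each record is the sequence.
        if gzip -dc < "$f" | head -n $((n_reads * 4)) |
            awk -v p="$prefix" 'NR % 4 == 2 && index($0, p) == 1 { found = 1 } END { exit !found }'
        then
            echo "$f"
        fi
    done
}
```

For example, find_read_prefix CTGGCTTTTC 5 *gz should list only the file that matches, rather than echoing every file name as the one-liner does.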
Issue Links
blocks: IGBF-3683 Update SRA to use the correct sample codes for Muday lab time course data (Closed)
Nothing more to this. Moving to done and adding it to the archive.