Details
- Type: Task
- Status: To-Do
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: None
- Fix Version/s: None
- Labels: None
- Story Points: 2
- Epic Link:
- Sprint: Fall 1, Summer 1, Summer 2, Summer 3, Summer 4, Summer 5, Summer 6, Fall 2, Fall 3, Fall 4
Description
Many of the alignments we see in BAM tracks are redundant, meaning that they show exactly the same pattern of blocks, gaps, insertions, strand affiliation, etc. as other alignments in the same track.
Sometimes, it is helpful to only see the alignments that are different from each other.
For this task, create a new track filter for alignment tracks that only shows alignments that are different from each other - that is, unique.
One possible way to create this filter would be to use a hashtable, with hash keys constructed from a combination of the chromosome name, strand, start position, and CIGAR string for each alignment.
The hash key could look like:
- chr:strand:start:CIGAR
For example:
- chr1:+:66666:50M1500N20M
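A minimal sketch of how such a key could be assembled, written in Python purely for illustration. The `Alignment` record and its field names are hypothetical and are not taken from any existing codebase.

```python
from typing import NamedTuple


class Alignment(NamedTuple):
    """Hypothetical alignment record; the field names are assumptions."""
    chrom: str   # chromosome (sequence) name
    strand: str  # "+" or "-"
    start: int   # start position of the alignment
    cigar: str   # CIGAR string, e.g. "50M1500N20M"


def make_key(aln: Alignment) -> str:
    """Build a chr:strand:start:CIGAR key for one alignment."""
    return f"{aln.chrom}:{aln.strand}:{aln.start}:{aln.cigar}"


example = Alignment("chr1", "+", 66666, "50M1500N20M")
print(make_key(example))  # chr1:+:66666:50M1500N20M
```

A tuple of the four fields would work equally well as a key, and it avoids any ambiguity if a sequence name itself contains a colon.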
How this would work from a user's perspective:
When the filter is added to a track, it creates a hashtable (or a similar data structure) that is associated with that particular track.
When the user clicks "Load Data," the data get loaded into memory, and the filter creates a hash key for each new alignment that is read. If no value exists in the filter's hashtable for that key, then the alignment gets shown, and the hashtable stores a value to signal that this newly encountered key exists in the data. If a value already exists for that key, the alignment is redundant and is not shown.
I bet the hashtable will be surprisingly small.
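The workflow described above could be sketched roughly as follows, in Python for illustration only. The class name `UniqueAlignmentFilter` and its methods are hypothetical, and a set stands in for the hashtable, since only the presence of a key matters here.

```python
class UniqueAlignmentFilter:
    """Hypothetical per-track filter: shows each distinct alignment pattern once."""

    def __init__(self):
        # One set of seen keys per filter instance, i.e. per track.
        self._seen = set()

    def accept(self, chrom, strand, start, cigar):
        """Return True the first time a key is seen, False for repeats."""
        key = f"{chrom}:{strand}:{start}:{cigar}"
        if key in self._seen:
            return False  # redundant alignment, do not show
        self._seen.add(key)
        return True

    def reset(self):
        """Forget all keys, e.g. before data are reloaded (an assumption)."""
        self._seen.clear()


track_filter = UniqueAlignmentFilter()
reads = [
    ("chr1", "+", 66666, "50M1500N20M"),
    ("chr1", "+", 66666, "50M1500N20M"),  # exact repeat of the first read
    ("chr1", "-", 66666, "50M1500N20M"),  # differs in strand
    ("chr1", "+", 66666, "70M"),          # differs in CIGAR
]
shown = [r for r in reads if track_filter.accept(*r)]
print(len(shown))  # 3
```

The memory cost is one key per distinct alignment pattern, so the structure grows with the number of unique alignments rather than with the total number of reads.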