[IGBF-4239] Make a "only show unique alignments" IGB track filter App for alignments tracks - JIRA UNCC

Details

Type: Task
Status: To-Do
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Labels:
None

Story Points:
2
Epic Link:
Build & maintain IGB Apps
Sprint:
Fall 1, Summer 1, Summer 2, Summer 3, Summer 4, Summer 5, Summer 6, Fall 2, Fall 3, Fall 4

Description

Many of the alignments we see in BAM tracks are redundant, meaning: they show the same exact pattern of blocks, gaps, insertions, strand affiliation, etc. as other alignments in the same track.

Sometimes, it is helpful to only see the alignments that are different from each other.

For this task, create a new track filter for alignment tracks that only shows alignments that are different from each other - that is, unique.

One possible way to create this filter would be to use a hashtable, with hash keys constructed from a combination of the chromosome name, strand, start position, and CIGAR string for each alignment.

The hash key could look like:

chr:strand:start:CIGAR

as an example:

chr1:+:66666:50M1500N20M

How this would work from a user's perspective:

When a filter is started or added to a track, it creates some kind of hashtable data structure, linked to that particular track somehow.

When the user clicks "Load Data," the data get loaded into memory, and the filter creates a hash key for each new alignment that is read. If no value exists in the filter's hashtable for that key, then the alignment gets shown, and the hashtable stores a value to signal that this newly encountered key exists in the data. If a value already exists for that key, the alignment is a duplicate and is not shown.
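The hashtable approach described above can be sketched roughly as follows. This is a minimal, language-neutral illustration in Python, not the IGB filter API (IGB Apps are written in Java); the `Alignment` type, `make_key`, and `UniqueAlignmentFilter` are hypothetical names introduced only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alignment:
    """Minimal stand-in for one BAM record (hypothetical, not an IGB class)."""
    chrom: str
    strand: str  # "+" or "-"
    start: int
    cigar: str


def make_key(aln: Alignment) -> str:
    """Build the chr:strand:start:CIGAR key described in this ticket."""
    return f"{aln.chrom}:{aln.strand}:{aln.start}:{aln.cigar}"


class UniqueAlignmentFilter:
    """Shows the first alignment seen for each key and hides later duplicates."""

    def __init__(self) -> None:
        # One set per filter instance, i.e. per track.
        self.seen: set[str] = set()

    def should_show(self, aln: Alignment) -> bool:
        key = make_key(aln)
        if key in self.seen:
            return False  # same key already encountered: redundant alignment
        self.seen.add(key)
        return True


reads = [
    Alignment("chr1", "+", 66666, "50M1500N20M"),
    Alignment("chr1", "+", 66666, "50M1500N20M"),  # exact duplicate, hidden
    Alignment("chr1", "-", 66666, "50M1500N20M"),  # differs by strand, shown
    Alignment("chr1", "+", 66666, "70M"),          # differs by CIGAR, shown
]
track_filter = UniqueAlignmentFilter()
shown = [r for r in reads if track_filter.should_show(r)]
```

Because only the key needs to be remembered, a set is enough here; a hashtable with dummy values would behave the same way.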

I bet the hashtable will be surprisingly small.

Activity

Ann Loraine created issue - 04/Jun/25 11:27 AM

Ann Loraine made changes - 04/Jun/25 11:27 AM

Field	Original Value	New Value
Epic Link		IGBF-1908 [ 17998 ]

Ann Loraine made changes - 04/Jun/25 11:27 AM

Summary

Make show unique only IGB track filter for alignments tracks

Make a show unique only IGB track filter for alignments tracks

Ann Loraine made changes - 04/Jun/25 11:27 AM

Summary

Make a show unique only IGB track filter for alignments tracks

Make a "only show unique alignments" IGB track filter App for alignments tracks

Ann Loraine made changes - 04/Jun/25 4:47 PM

Comment

[ I think the "Score" menu item [~karthik] saw may be coming from the core IGB codebase, not the App. ]

Ann Loraine made changes - 04/Jun/25 4:52 PM

Description

Original value: the third paragraph ended "... that are different from each other, or unique." and the description had no closing remark.

New value: the third paragraph ends "... that are different from each other - that is, unique." and the description closes with "I bet the hashtable will be surprisingly small." The rest of the text is unchanged.

Ann Loraine made changes - 04/Jun/25 5:24 PM

Description

Original value: the first paragraph listed "blocks, gaps, insertions, etc."

New value: the first paragraph lists "blocks, gaps, insertions, strand affiliation, etc." The rest of the text is unchanged.

Nowlan Freese made changes - 11/Jun/25 9:51 AM

Sprint

Summer 1 [ 218 ]

Summer 1, Summer 2 [ 218, 219 ]

Nowlan Freese made changes - 11/Jun/25 9:51 AM

Rank

Ranked higher

Ann Loraine made changes - 13/Jun/25 4:05 PM

Link

This issue relates to IGBF-4214 [ IGBF-4214 ]

Karthik Raveendran made changes - 18/Jun/25 9:58 AM

Status

To-Do [ 10305 ]

In Progress [ 3 ]

Karthik Raveendran made changes - 18/Jun/25 9:59 AM

Status

In Progress [ 3 ]

To-Do [ 10305 ]

Karthik Raveendran made changes - 18/Jun/25 10:00 AM

Status

To-Do [ 10305 ]

In Progress [ 3 ]

Karthik Raveendran made changes - 18/Jun/25 10:00 AM

Status

In Progress [ 3 ]

To-Do [ 10305 ]

Karthik Raveendran made changes - 18/Jun/25 1:05 PM

Status

To-Do [ 10305 ]

In Progress [ 3 ]

Karthik Raveendran made changes - 18/Jun/25 1:05 PM

Status

In Progress [ 3 ]

To-Do [ 10305 ]

Karthik Raveendran made changes - 18/Jun/25 2:02 PM

Status

To-Do [ 10305 ]

In Progress [ 3 ]

Ann Loraine made changes - 23/Jun/25 12:26 PM

Sprint

Summer 1, Summer 2 [ 218, 219 ]

Summer 1, Summer 2, Summer 3 [ 218, 219, 220 ]

Ann Loraine made changes - 23/Jun/25 12:26 PM

Rank

Ranked higher

Karthik Raveendran added a comment - 27/Jun/25 3:28 PM

Only Show Unique Alignments repo: https://bitbucket.org/KarthikRavee91/only_show_unique_alignments_filter/src/main/

Karthik Raveendran added a comment - 01/Jul/25 1:54 PM - edited

The App is available in the Downloads section of the repo: https://bitbucket.org/KarthikRavee91/only_show_unique_alignments_filter/downloads/

There is a known issue with this version of the App. The unique IDs that are generated are saved in the App, and when the user clicks Load Data while the filter is active, the reads with unique IDs disappear. I would like someone to check the unique ID logic while I fix this issue.

Note: A quick workaround for testing is to Load Data -> Remove Filter -> Add Filter again.
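One plausible reading of this issue (an assumption; the App's source was not inspected for this sketch) is that the saved IDs persist across filter passes, so a second pass finds every key already recorded and hides everything. The Python sketch below reproduces that effect and shows one possible remedy: recording which read first claimed each key. The `(read_id, key)` tuples and both function names are hypothetical.

```python
def naive_pass(reads, seen):
    """Show a read only if its key is not yet in the persistent `seen` set."""
    shown = []
    for read_id, key in reads:
        if key not in seen:
            seen.add(key)
            shown.append(read_id)
    return shown


def stable_pass(reads, first_owner):
    """Show a read if it is the one that first claimed its key."""
    shown = []
    for read_id, key in reads:
        # setdefault stores read_id only when the key is new,
        # and always returns the recorded owner of the key.
        owner = first_owner.setdefault(key, read_id)
        if owner == read_id:
            shown.append(read_id)
    return shown


reads = [
    ("r1", "chr1:+:100:50M"),
    ("r2", "chr1:+:100:50M"),  # duplicate of r1
    ("r3", "chr1:+:200:50M"),
]

seen = set()
first_naive = naive_pass(reads, seen)    # r1 and r3 are shown
second_naive = naive_pass(reads, seen)   # nothing is shown: all keys known

owners = {}
first_stable = stable_pass(reads, owners)    # r1 and r3 are shown
second_stable = stable_pass(reads, owners)   # r1 and r3 are shown again
```

Clearing the saved keys at the start of each pass would be another option; the Remove Filter -> Add Filter workaround above has the same effect of starting from an empty table.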

Karthik Raveendran made changes - 01/Jul/25 1:54 PM

Status

In Progress [ 3 ]

Needs 1st Level Review [ 10005 ]

Karthik Raveendran made changes - 01/Jul/25 1:54 PM

Assignee

Karthik Raveendran [ karthik ]

Ann Loraine made changes - 05/Jul/25 11:34 AM

Sprint

Summer 1, Summer 2, Summer 3 [ 218, 219, 220 ]

Summer 1, Summer 2, Summer 3, Summer 4 [ 218, 219, 220, 221 ]

Ann Loraine made changes - 05/Jul/25 11:34 AM

Rank

Ranked higher

Ann Loraine made changes - 07/Jul/25 8:01 AM

Assignee

Ann Loraine [ aloraine ]

Karthik Raveendran added a comment - 15/Jul/25 11:53 AM - edited

A new commit has been pushed with improvements to the Load Data and Load Sequence workflows. See commit

When the user applies the filter to alignments that are already loaded, then scrolls to another gene and clicks Load Data, the previously loaded and filtered alignments should not disappear. Similarly, alignments should not disappear if the user loads the sequence.

Paige Kulzer made changes - 16/Jul/25 11:52 AM

Assignee

Ann Loraine [ aloraine ]

Paige Kulzer [ pkulzer ]

Paige Kulzer made changes - 16/Jul/25 11:52 AM

Status

Needs 1st Level Review [ 10005 ]

First Level Review in Progress [ 10301 ]

Paige Kulzer added a comment - 16/Jul/25 12:03 PM

Fetched Karthik's new commit and tested locally on my Mac using data from the smoke testing quickload (https://wiki.bioviz.org/confluence/display/ITD/File+Formats). This new commit has fixed the issue that was being observed previously where filtered alignments were disappearing with a second click of the Load Data and/or Load Sequence buttons.

However, I noticed during testing that alignments that start and stop at the same position were being collapsed even if they had nucleotide differences. I believe this is due to the way the filter uses the CIGAR string to compare alignments. This is also a poor dataset to test with, because its CIGAR strings do not appear to be properly formatted. I'm not sure whether more "real-world" data will have better CIGAR strings, so this might be something to look into as part of a separate ticket.

Additionally, I found an edge case that breaks the filter. Adding multiple instances of the same filter to a track brings back the issue of reads disappearing after clicking Load Data/Load Sequence. The same happens if a filter is added to a single strand of a dataset and then the same filter is added to the track once the strands are combined.

Overall, this filter app is working really well and the scope of this ticket has been completed – recommending PR.
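A note on the collapsing of alignments with nucleotide differences: in the SAM specification, the CIGAR operator M means "alignment match", which covers both matching and mismatching bases, so two reads that differ only by a substitution can share the same CIGAR string. The Python sketch below illustrates this with made-up reads; `position_key` and `sequence_aware_key` are hypothetical helpers, and adding the read sequence to the key is only one possible option, not something the App is known to do.

```python
def position_key(chrom, strand, start, cigar):
    """Key from the ticket description: chr:strand:start:CIGAR."""
    return f"{chrom}:{strand}:{start}:{cigar}"


def sequence_aware_key(chrom, strand, start, cigar, seq):
    """Same key with the read bases appended (an assumed variant)."""
    return f"{position_key(chrom, strand, start, cigar)}:{seq}"


# Two made-up reads at the same position with the same CIGAR string
# but a one-base difference in sequence.
read_a = ("chr1", "+", 100, "8M", "ACGTACGT")
read_b = ("chr1", "+", 100, "8M", "ACGTTCGT")

# The position-only key cannot tell the reads apart.
collapsed_by_position_key = position_key(*read_a[:4]) == position_key(*read_b[:4])

# Appending the sequence separates them.
collapsed_by_sequence_key = sequence_aware_key(*read_a) == sequence_aware_key(*read_b)
```

The trade-off is that sequence-aware keys are longer and will treat reads with sequencing errors as distinct, so fewer alignments would be hidden.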

Paige Kulzer made changes - 16/Jul/25 12:03 PM

Status

First Level Review in Progress [ 10301 ]

Ready for Pull Request [ 10304 ]

Paige Kulzer made changes - 16/Jul/25 12:04 PM

Assignee

Paige Kulzer [ pkulzer ]

Karthik Raveendran [ karthik ]

Ann Loraine added a comment - 17/Jul/25 4:49 PM - edited

I added comments to the commit, with suggestions for a way you could potentially implement the App without requiring a change to the IGB filters API.

I'm worried that making a change to the API could break existing Apps or cause other unforeseen problems. Also, it means we would not be able to use the App in IGB 10.1.0, the released version.

Ann Loraine made changes - 17/Jul/25 4:49 PM

Status

Ready for Pull Request [ 10304 ]

Pull Request Submitted [ 10101 ]

Ann Loraine made changes - 17/Jul/25 4:49 PM

Status

Pull Request Submitted [ 10101 ]

Reviewing Pull Request [ 10303 ]

Ann Loraine made changes - 17/Jul/25 4:49 PM

Status

Reviewing Pull Request [ 10303 ]

To-Do [ 10305 ]

Ann Loraine made changes - 21/Jul/25 10:04 AM

Sprint

Summer 1, Summer 2, Summer 3, Summer 4 [ 218, 219, 220, 221 ]

Summer 1, Summer 2, Summer 3, Summer 4, Summer 5 [ 218, 219, 220, 221, 222 ]

Ann Loraine made changes - 21/Jul/25 10:04 AM

Rank

Ranked higher

Karthik Raveendran made changes - 22/Jul/25 10:58 AM

Status

To-Do [ 10305 ]

In Progress [ 3 ]

Ann Loraine made changes - 06/Aug/25 9:42 AM

Sprint

Summer 1, Summer 2, Summer 3, Summer 4, Summer 5 [ 218, 219, 220, 221, 222 ]

Summer 1, Summer 2, Summer 3, Summer 4, Summer 5, Summer 6 [ 218, 219, 220, 221, 222, 223 ]

Ann Loraine made changes - 06/Aug/25 9:42 AM

Rank

Ranked higher

Ann Loraine made changes - 18/Aug/25 10:33 AM

Sprint

Summer 1, Summer 2, Summer 3, Summer 4, Summer 5, Summer 6 [ 218, 219, 220, 221, 222, 223 ]

Testing 3 : 19 Nov - 29 Nov 2, Summer 1, Summer 2, Summer 3, Summer 4, Summer 5, Summer 6 [ 83, 218, 219, 220, 221, 222, 223 ]

Karthik Raveendran made changes - 21/Aug/25 9:39 AM

Status

In Progress [ 3 ]

To-Do [ 10305 ]

Ann Loraine made changes - 02/Sep/25 7:52 AM

Sprint

Fall 1, Summer 1, Summer 2, Summer 3, Summer 4, Summer 5, Summer 6 [ 83, 218, 219, 220, 221, 222, 223 ]

Fall 1, Summer 1, Summer 2, Summer 3, Summer 4, Summer 5, Summer 6, Fall 2 [ 83, 218, 219, 220, 221, 222, 223, 225 ]

Ann Loraine made changes - 02/Sep/25 7:52 AM

Rank

Ranked higher

Ann Loraine made changes - 16/Sep/25 5:42 PM

Sprint

Fall 1, Summer 1, Summer 2, Summer 3, Summer 4, Summer 5, Summer 6, Fall 2 [ 83, 218, 219, 220, 221, 222, 223, 225 ]

Fall 1, Summer 1, Summer 2, Summer 3, Summer 4, Summer 5, Summer 6, Fall 2, Fall 3 [ 83, 218, 219, 220, 221, 222, 223, 225, 226 ]

Ann Loraine made changes - 16/Sep/25 5:42 PM

Rank

Ranked higher

Ann Loraine made changes - 28/Sep/25 11:52 AM

Sprint

Fall 1, Summer 1, Summer 2, Summer 3, Summer 4, Summer 5, Summer 6, Fall 2, Fall 3 [ 83, 218, 219, 220, 221, 222, 223, 225, 226 ]

Fall 1, Summer 1, Summer 2, Summer 3, Summer 4, Summer 5, Summer 6, Fall 2, Fall 3, Fall 4 [ 83, 218, 219, 220, 221, 222, 223, 225, 226, 227 ]

Ann Loraine made changes - 28/Sep/25 11:52 AM

Rank

Ranked higher

Ann Loraine made changes - 12/Oct/25 10:15 AM

Sprint

Fall 1, Summer 1, Summer 2, Summer 3, Summer 4, Summer 5, Summer 6, Fall 2, Fall 3, Fall 4 [ 83, 218, 219, 220, 221, 222, 223, 225, 226, 227 ]

Fall 1, Summer 1, Summer 2, Summer 3, Summer 4, Summer 5, Summer 6, Fall 2, Fall 3, Fall 4, Fall 5 [ 83, 218, 219, 220, 221, 222, 223, 225, 226, 227, 228 ]

Ann Loraine made changes - 12/Oct/25 10:15 AM

Rank

Ranked higher

Nowlan Freese made changes - 21/Oct/25 10:33 AM

Sprint

Fall 1, Summer 1, Summer 2, Summer 3, Summer 4, Summer 5, Summer 6, Fall 2, Fall 3, Fall 4, Fall 5 [ 83, 218, 219, 220, 221, 222, 223, 225, 226, 227, 228 ]

Fall 1, Summer 1, Summer 2, Summer 3, Summer 4, Summer 5, Summer 6, Fall 2, Fall 3, Fall 4 [ 83, 218, 219, 220, 221, 222, 223, 225, 226, 227 ]

People

Assignee:

Karthik Raveendran

Reporter:

Ann Loraine

Votes:

0

Watchers:

3

Dates

Created:

04/Jun/25 11:27 AM

Updated:

21/Oct/25 10:33 AM