Background:
For both Color by and Filter for SAMtags, users select or enter a SAMtag, such as CB, CR, UB, or MI, and the corresponding values of interest, e.g. TGTACTCAGA, that they want to see or color. Users can select only one SAMtag at a time, but they can add multiple values for it. When the user clicks OK, this information is packaged and handed over to SAMtagsFilter or SAMtagsColor, which makes the corresponding changes in the track.
Each read in the track is compared with the user values one by one, and the read is then either filtered or not filtered, or colored or not colored. In SAMtagsFilter and SAMtagsColor, this conditional step happens in one particular function, which is called for every read alignment:
- For SAMtagsFilter (extends SymmetryFilter), the function is filterSymmetry(BioSeq seq, SeqSymmetry sym).
- For SAMtagsColor (extends ColorProvider), the function is getColor(SeqSymmetry sym).
Each time the function is called, the read alignment's information is passed to it.
To begin the investigation, I suggest starting from the algorithms in these functions. For single-cell data, a track can contain a very large number of reads, and processing every one of them takes computing power. There was an earlier issue where the filterSymmetry function was called twice for each read; that was fixed in an earlier ticket. Given the volume of data that needs to be handled, even a small change in the efficiency of the algorithm could make a large difference.
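As a possible starting point for that investigation, here is a minimal sketch that contrasts the one-by-one comparison described above with a hash-set lookup. It is not the actual SAMtagsFilter or SAMtagsColor code: the class SamTagMatchSketch, the methods matchesByScan and matchesBySet, and the representation of a read's tags as a Map<String, String> are all hypothetical, chosen only to illustrate the idea without depending on the real SeqSymmetry API.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: none of these names come from the real code base.
public class SamTagMatchSketch {

    // One-by-one comparison: the cost per read grows with the number of user values.
    static boolean matchesByScan(Map<String, String> readTags, String tag, List<String> userValues) {
        String value = readTags.get(tag);
        if (value == null) {
            return false; // the read does not carry the selected SAMtag
        }
        for (String candidate : userValues) {
            if (candidate.equals(value)) {
                return true;
            }
        }
        return false;
    }

    // Set lookup: the set is built once (e.g. when the user clicks OK),
    // so each read costs a single hash lookup on average.
    static boolean matchesBySet(Map<String, String> readTags, String tag, Set<String> userValues) {
        String value = readTags.get(tag);
        return value != null && userValues.contains(value);
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("TGTACTCAGA", "ACGTACGTAC");
        Set<String> valueSet = new HashSet<>(values); // built once, reused for every read

        // A stand-in for one read alignment carrying a CB tag.
        Map<String, String> read = new HashMap<>();
        read.put("CB", "TGTACTCAGA");

        System.out.println(matchesByScan(read, "CB", values));   // true
        System.out.println(matchesBySet(read, "CB", valueSet));  // true
        System.out.println(matchesBySet(read, "UB", valueSet));  // false: tag absent on this read
    }
}
```

Both methods return the same answer for any read; the difference is only in how the work scales when the user enters many values. Whether this matters in practice would need to be confirmed by profiling the real functions on a single-cell track.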