[IGBF-2741] Make an App that allows filtering by "id" - part 1 - JIRA UNCC

Details

Type: Task
Status: Closed
Priority: Major
Resolution: Done
Affects Version/s: None
Fix Version/s: None
Labels:
None

Story Points:
3
Epic Link:
Improve IGB for users
Sprint:
Winter 1 Dec 28 - Jan 8, Winter 2 Jan 11 - Jan 22, Winter 3 Jan 25 - Feb 5, Winter 4 Feb 8 - Feb 19, Winter 5 Feb 22 - Mar 5, Winter 6 Mar 8 - Mar 19, Spring 1 2021 Mar 22 - Apr 2

Description

Because the methods for creating the sequence data start with minute amounts of RNA per cell, the protocols use PCR amplification to produce enough material for sequencing. This means that the same fragment of RNA will often get copied many, many times. As a result, the number of sequences observed per gene will be only loosely related to the number of mRNAs from that gene that were in the original sample.

To get around this, the experimental protocols include a step that adds a "UMI" sequence tag to every read that came from the same RNA molecule. (UMI stands for "unique molecular identifier.") So instead of counting every single read, the data analysis protocols instead count the number of unique "UMIs" per gene. To keep track of UMIs in the data processing steps, the computational pipelines typically append the UMI sequence to the read name. In IGB, this "read name" is also the "id" attribute.
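As a rough illustration of counting unique UMIs per gene instead of counting every read, here is a minimal sketch. It assumes the UMI is the text after the last underscore in the read name; that naming convention, the function name, and the sample reads are all hypothetical, and real pipelines differ in how they format read names.

```python
from collections import defaultdict


def count_umis_per_gene(reads):
    """Count distinct UMIs per gene.

    reads: iterable of (read_name, gene) pairs.
    Assumption: the UMI is the text after the last "_" in read_name.
    """
    umis_by_gene = defaultdict(set)
    for read_name, gene in reads:
        umi = read_name.rsplit("_", 1)[-1]  # hypothetical naming convention
        umis_by_gene[gene].add(umi)
    return {gene: len(umis) for gene, umis in umis_by_gene.items()}


# Made-up example: three reads on geneA, two of them PCR copies of one molecule.
reads = [
    ("read1_AACGT", "geneA"),
    ("read2_AACGT", "geneA"),
    ("read3_GGTCA", "geneA"),
    ("read4_AACGT", "geneB"),
]
umi_counts = count_umis_per_gene(reads)
read_counts = defaultdict(int)
for _, gene in reads:
    read_counts[gene] += 1
```

In this made-up data, geneA has three reads but only two distinct UMIs, which is the difference between read counts and UMI counts that the paragraph above describes.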

In addition to copying the UMI sequence, the pipelines also often copy another string that uniquely identifies the particular cell that the read came from. This is called a "cell barcode" and is also introduced into every read as part of the experimental protocol that produces the data.

Therefore, it would be very useful if IGB had a filter that limits the reads being shown to those matching a specific string that the user enters. This would allow users to get a better sense of how often a given UMI appears in their data. It would also allow them to visualize gene expression for a single cell instead of looking at all of the data at once for every cell.

This kind of filter would also be useful for any type of track, not only BAM tracks.

For this App, please implement a new "filter by name" option that lets a user hide all items in a track that do not match the name.
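The requested behavior can be sketched outside of IGB as follows. This is not IGB code: the `TrackItem` class, the `filter_by_id` function, and the sample ids are hypothetical, and the sketch assumes that "match" means a case-sensitive substring match on the "id" attribute, which this ticket does not specify.

```python
from dataclasses import dataclass


@dataclass
class TrackItem:
    """Hypothetical stand-in for any item shown in a track (read, gene model, etc.)."""
    id: str
    start: int
    end: int


def filter_by_id(items, query):
    """Return only the items whose id contains the query string.

    Assumption: substring, case-sensitive matching. An empty query hides nothing.
    """
    if not query:
        return list(items)
    return [item for item in items if query in item.id]


# Made-up ids of the form readname_cellbarcode_umi.
items = [
    TrackItem("r1_CELLAAA_UMI111", 100, 175),
    TrackItem("r2_CELLAAA_UMI222", 120, 195),
    TrackItem("r3_CELLBBB_UMI111", 300, 375),
]
one_cell = filter_by_id(items, "CELLAAA")  # reads from a single cell
one_umi = filter_by_id(items, "UMI111")    # reads sharing one UMI string
```

Because the function only looks at the `id` field, the same logic would apply to any track type that carries an id, not only BAM tracks.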

Lastly, it is not clear yet whether we can introduce new filters as Apps. So a big part of this task will involve understanding the existing filtering system for tracks to see if an App can be added to implement a new one.

Attachments

Issue Links

relates to

IGBF-2817 Improve user-facing documentation for Filter-by-Id app

Closed

Activity

Ann Loraine created issue - 06/Jan/21 9:40 AM

Ann Loraine made changes - 06/Jan/21 9:40 AM

Field	Original Value	New Value
Epic Link		IGBF-1765 [ 17855 ]

Ann Loraine made changes - 06/Jan/21 9:40 AM

Link

This issue relates to IGBF-2712 [ IGBF-2712 ]

Ann Loraine made changes - 06/Jan/21 9:40 AM

Rank

Ranked higher

Ann Loraine made changes - 06/Jan/21 9:41 AM

Description

Nearly every single-cell RNA-Seq pipeline aligns the "raw" sequence data onto a reference, usually a reference genome sequence. This is an essential step in the workflow. This step is done in order to generate "counts per gene per cell" spreadsheets that are then analyzed using unsupervised clustering methods and other such things. The "counts per gene per cell" are supposed to reflect the number of RNAs observed from the gene. However, there is a problem with that!

Because the methods for creating the sequence data start with minute amounts of RNA per cell, the protocols use PCR amplification in order to produce enough material for sequencing. One problem with this however is that this means that the same fragment of RNA will often get copied many, many times. Thanks to this, the number of sequences observed per gene will be only loosely related to the number of mRNAs from that gene that were in the original sample.

To get around this, the experimental protocols include a step that adds a "UMI" sequence tag to every read that came from the same RNA molecule. So instead of counting every single read, the data analysis protocols instead count the number of unique "UMIs" per gene. To keep track of UMIs, the read alignment and processing computational pipelines typically append the UMI sequence to the read name. In IGB, this "read name" is actually the "id" attribute.

In addition to copying the UMI sequence, the pipelines also often copy another string that uniquely identifies the particular cell that the read came from.

Therefore, it would be very useful if IGB had a filter that limits the reads being shown to those matching a specific string that the user enters.

Also, this type of thing would be useful for any type of track, not only BAM tracks.

For this App, please implement a new "filter by name" option that lets a user hide all items in a track that do not match the name.

Ann Loraine made changes - 06/Jan/21 9:41 AM

Description

Ann Loraine made changes - 06/Jan/21 9:41 AM

Description

Ann Loraine made changes - 06/Jan/21 9:42 AM

Description

Ann Loraine made changes - 06/Jan/21 9:42 AM

Description

Ann Loraine made changes - 06/Jan/21 9:43 AM

Description

Ann Loraine made changes - 06/Jan/21 9:43 AM

Description

Ann Loraine made changes - 06/Jan/21 9:45 AM

Description

Nearly every single-cell RNA-Seq pipeline aligns the "raw" sequence data onto a reference, usually a reference genome sequence. This is an essential step in the workflow. This step is done in order to generate "counts per gene per cell" spreadsheets that are then analyzed using unsupervised clustering methods and other such methods. The "counts per gene per cell" numbers are supposed to reflect the number of RNAs observed from the gene. However, there is a problem with that!

Because the methods for creating the sequence data start with minute amounts of RNA per cell, the protocols use PCR amplification in order to produce enough material for sequencing. This means that the same fragment of RNA will often get copied many, many times. Thanks to this, the number of sequences observed per gene will be only loosely related to the number of mRNAs from that gene that were in the original sample.

To get around this, the experimental protocols include a step that adds a "UMI" sequence tag to every read that came from the same RNA molecule. (UMI stands for "unique molecular identifier.") So instead of counting every single read, the data analysis protocols instead count the number of unique "UMIs" per gene. To keep track of UMIs in the data processing steps, the computational pipelines typically append the UMI sequence to the read name. In IGB, this "read name" is also the "id" attribute.

In addition to copying the UMI sequence, the pipelines also often copy another string that uniquely identifies the particular cell that the read came from. This is called a "cell barcode" and is also introduced into every read as part of the experimental protocol that produces the data.

Therefore in IGB it would be super-useful if we could create a filter that limits the reads being shown to those matching a specific string that the user enters. This would allow users to get a better sense of how often a given UMI appears in their data. It would also allow them to visualize gene expression for a single cell instead of looking at all of the data for every cell at once.

Also, this type of filter would be useful for any type of track, not only BAM tracks.

For this App, please implement a new "filter by name" option that lets a user hide all items in a track that do not match the name.

Nearly every single-cell RNA-Seq pipeline aligns the "raw" sequence data onto a reference, usually a reference genome sequence. This is an essential step in the workflow. This step is done in order to generate "counts per gene per cell" spreadsheets that are then analyzed using unsupervised clustering methods and other such methods. The "counts per gene per cell" numbers are supposed to reflect the number of RNAs observed from the gene. However, there is a problem with that!

Because the methods for creating the sequence data start with minute amounts of RNA per cell, the protocols use PCR amplification in order to produce enough material for sequencing. This means that the same fragment of RNA will often get copied many, many times. As a result, the number of sequences observed per gene is only loosely related to the number of mRNAs from that gene that were in the original sample.

To get around this, the experimental protocols include a step that adds the same "UMI" sequence tag to every read that came from the same RNA molecule. (UMI stands for "unique molecular identifier.") So instead of counting every single read, the data analysis protocols count the number of unique UMIs per gene. To keep track of UMIs in the data processing steps, the computational pipelines typically append the UMI sequence to the read name. In IGB, this "read name" is also the "id" attribute.
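
As a rough illustration of the counting idea above, the sketch below compares reads per gene with unique UMIs per gene. The read-name layout (`<read>_<UMI>`), the gene labels, and the function name are made-up assumptions for illustration; real pipelines differ in how they append the UMI to the read name.

```python
from collections import defaultdict

# Hypothetical (read name, gene) pairs. The "_<UMI>" suffix is an assumed
# layout for illustration only; pipelines format read names differently.
reads = [
    ("read1_GGTCA", "geneA"),
    ("read2_GGTCA", "geneA"),  # PCR copy: same UMI as read1
    ("read3_TTAGC", "geneA"),
    ("read4_GGTCA", "geneB"),
]


def count_reads_and_umis(reads):
    """Return (reads per gene, unique UMIs per gene) as two dicts."""
    read_counts = defaultdict(int)
    umis = defaultdict(set)
    for name, gene in reads:
        # Split off the UMI, assumed to follow the last underscore.
        _, umi = name.rsplit("_", 1)
        read_counts[gene] += 1
        umis[gene].add(umi)
    return dict(read_counts), {gene: len(s) for gene, s in umis.items()}


read_counts, umi_counts = count_reads_and_umis(reads)
print(read_counts)  # {'geneA': 3, 'geneB': 1}
print(umi_counts)   # {'geneA': 2, 'geneB': 1}
```

Here geneA has three reads but only two distinct UMIs, so the UMI count is the better estimate of how many RNA molecules were observed.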

In addition to copying the UMI sequence, the pipelines also often copy another string that uniquely identifies the particular cell that the read came from. This is called a "cell barcode" and is also introduced into every read as part of the experimental protocol that produces the data.

Therefore in IGB it would be super-useful if we could create a filter that limits the reads being shown to those matching a specific string that the user enters. This would allow users to get a better sense of how often a given UMI appears in their data. It would also allow them to visualize gene expression for a single cell instead of looking at all of the data for every cell at once.

Also, this type of filter would be useful for any type of track, not only BAM tracks.

For this App, please implement a new "filter by name" option that lets a user hide all items in a track that do not match the name.

Lastly, it is not clear yet whether we can introduce new filters as Apps. So a big part of this task will involve understanding the existing filtering system for tracks to see if an App can be added to implement a new one.
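
To make the requested "filter by name" behavior concrete, here is a minimal sketch of the matching logic only. It is not IGB code and does not use the IGB filter or App API: the `Item` class, the `filter_by_name` function, and the read-name layout are hypothetical, and substring matching is an assumption, since the description only says that items must "match" the name.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """Stand-in for a track item; not an IGB class."""
    id: str
    start: int
    end: int


def filter_by_name(items, query):
    """Keep items whose id contains `query`; all other items would be hidden.

    Substring matching is an assumption made for this sketch.
    """
    return [item for item in items if query in item.id]


# Hypothetical read names with an assumed "<read>_<cell barcode>_<UMI>" layout.
track = [
    Item("read1_AAACCTG_GGTCA", 100, 150),
    Item("read2_AAACCTG_TTAGC", 120, 170),
    Item("read3_CCGTTAG_GGTCA", 300, 350),
]

one_cell = filter_by_name(track, "AAACCTG")  # reads from a single cell barcode
one_umi = filter_by_name(track, "GGTCA")     # reads carrying a given UMI
print([item.id for item in one_cell])
print([item.id for item in one_umi])
```

Because the match works only on the "id" string, the same logic would apply to any track type whose items have an id, not only BAM tracks.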

Irvin Naylor (Inactive) made changes - 06/Jan/21 10:04 AM

Assignee

Irvin Naylor [ inaylor ]

Irvin Naylor (Inactive) made changes - 06/Jan/21 3:13 PM

Status

To-Do [ 10305 ]

In Progress [ 3 ]

Ann Loraine made changes - 07/Jan/21 10:20 AM

Assignee

Irvin Naylor [ inaylor ]

Noor Zahara [ noor91zahara ]

Ann Loraine made changes - 11/Jan/21 9:16 AM

Summary

Make an App that allows filtering by "id"

Make an App that allows filtering by "id" - part 1

Ann Loraine made changes - 11/Jan/21 9:40 AM

Sprint

Winter 1 Dec 28 - Jan 8 [ 111 ]

Winter 1 Dec 28 - Jan 8, Winter 2 Jan 11 - Jan 22 [ 111, 112 ]

Ann Loraine made changes - 11/Jan/21 9:40 AM

Rank

Ranked higher

Irvin Naylor (Inactive) made changes - 13/Jan/21 1:12 PM

Assignee

Noor Zahara [ noor91zahara ]

Irvin Naylor [ inaylor ]

Irvin Naylor (Inactive) made changes - 21/Jan/21 2:02 PM

Status

In Progress [ 3 ]

Needs 1st Level Review [ 10005 ]

Irvin Naylor (Inactive) made changes - 21/Jan/21 2:19 PM

Assignee

Irvin Naylor [ inaylor ]

Ann Loraine made changes - 25/Jan/21 9:20 AM

Sprint

Winter 1 Dec 28 - Jan 8, Winter 2 Jan 11 - Jan 22 [ 111, 112 ]

Winter 1 Dec 28 - Jan 8, Winter 2 Jan 11 - Jan 22, Winter 3 Jan 25 - Feb 5 [ 111, 112, 113 ]

Ann Loraine made changes - 25/Jan/21 9:20 AM

Rank

Ranked higher

Ann Loraine made changes - 28/Jan/21 9:01 AM

Status

Needs 1st Level Review [ 10005 ]

First Level Review in Progress [ 10301 ]

Ann Loraine made changes - 28/Jan/21 9:01 AM

Assignee

Irvin Naylor [ inaylor ]

Ann Loraine made changes - 28/Jan/21 9:01 AM

Status

First Level Review in Progress [ 10301 ]

To-Do [ 10305 ]

Irvin Naylor (Inactive) made changes - 28/Jan/21 9:51 AM

Assignee

Irvin Naylor [ inaylor ]

Ann Loraine made changes - 29/Jan/21 9:10 AM

Assignee

Irvin Naylor [ inaylor ]

Irvin Naylor (Inactive) made changes - 29/Jan/21 10:04 AM

Status

To-Do [ 10305 ]

In Progress [ 3 ]

Irvin Naylor (Inactive) made changes - 29/Jan/21 10:04 AM

Status

In Progress [ 3 ]

Needs 1st Level Review [ 10005 ]

Irvin Naylor (Inactive) made changes - 29/Jan/21 10:04 AM

Assignee

Irvin Naylor [ inaylor ]

Ann Loraine made changes - 06/Feb/21 8:05 PM

Sprint

Winter 1 Dec 28 - Jan 8, Winter 2 Jan 11 - Jan 22, Winter 3 Jan 25 - Feb 5 [ 111, 112, 113 ]

Winter 1 Dec 28 - Jan 8, Winter 2 Jan 11 - Jan 22, Winter 3 Jan 25 - Feb 5, Winter 4 Feb 8 - Feb 19 [ 111, 112, 113, 114 ]

Ann Loraine made changes - 06/Feb/21 8:06 PM

Rank

Ranked higher

Ann Loraine made changes - 22/Feb/21 12:23 PM

Sprint

Winter 1 Dec 28 - Jan 8, Winter 2 Jan 11 - Jan 22, Winter 3 Jan 25 - Feb 5, Winter 4 Feb 8 - Feb 19 [ 111, 112, 113, 114 ]

Winter 1 Dec 28 - Jan 8, Winter 2 Jan 11 - Jan 22, Winter 3 Jan 25 - Feb 5, Winter 4 Feb 8 - Feb 19, Winter 5 Feb 22 - Mar 5 [ 111, 112, 113, 114, 115 ]

Ann Loraine made changes - 22/Feb/21 12:23 PM

Rank

Ranked higher

Irvin Naylor (Inactive) made changes - 23/Feb/21 10:33 AM

Assignee

Irvin Naylor [ inaylor ]

Ann Loraine made changes - 05/Mar/21 6:49 PM

Sprint

Winter 1 Dec 28 - Jan 8, Winter 2 Jan 11 - Jan 22, Winter 3 Jan 25 - Feb 5, Winter 4 Feb 8 - Feb 19, Winter 5 Feb 22 - Mar 5 [ 111, 112, 113, 114, 115 ]

Winter 1 Dec 28 - Jan 8, Winter 2 Jan 11 - Jan 22, Winter 3 Jan 25 - Feb 5, Winter 4 Feb 8 - Feb 19, Winter 5 Feb 22 - Mar 5, Winter 6 Mar 8 - Mar 19 [ 111, 112, 113, 114, 115, 116 ]

Ann Loraine made changes - 05/Mar/21 6:49 PM

Rank

Ranked higher

Ann Loraine made changes - 09/Mar/21 10:30 AM

Assignee

Irvin Naylor [ inaylor ]

Logan Weidenhammer (Inactive) made changes - 12/Mar/21 10:02 AM

Assignee

Rachel Weidenhammer [ rweidenh ]

Logan Weidenhammer (Inactive) made changes - 12/Mar/21 10:02 AM

Status

Needs 1st Level Review [ 10005 ]

First Level Review in Progress [ 10301 ]

Logan Weidenhammer (Inactive) made changes - 12/Mar/21 12:46 PM

Status

First Level Review in Progress [ 10301 ]

To-Do [ 10305 ]

Logan Weidenhammer (Inactive) made changes - 12/Mar/21 12:46 PM

Assignee

Rachel Weidenhammer [ rweidenh ]

Irvin Naylor [ inaylor ]

Ann Loraine made changes - 16/Mar/21 8:37 AM

Link

This issue relates to ~~IGBF-2817~~ [ ~~IGBF-2817~~ ]

Ann Loraine made changes - 16/Mar/21 10:15 AM

Rank

Ranked higher

Ann Loraine made changes - 16/Mar/21 10:15 AM

Status

To-Do [ 10305 ]

In Progress [ 3 ]

Ann Loraine made changes - 16/Mar/21 10:15 AM

Status

In Progress [ 3 ]

Needs 1st Level Review [ 10005 ]

Ann Loraine made changes - 16/Mar/21 10:15 AM

Status

Needs 1st Level Review [ 10005 ]

First Level Review in Progress [ 10301 ]

Ann Loraine made changes - 16/Mar/21 10:15 AM

Status

First Level Review in Progress [ 10301 ]

Ready for Pull Request [ 10304 ]

Irvin Naylor (Inactive) made changes - 18/Mar/21 5:25 PM

Status

Ready for Pull Request [ 10304 ]

Pull Request Submitted [ 10101 ]

Irvin Naylor (Inactive) made changes - 18/Mar/21 5:25 PM

Status

Pull Request Submitted [ 10101 ]

Reviewing Pull Request [ 10303 ]

Irvin Naylor (Inactive) made changes - 18/Mar/21 5:25 PM

Status

Reviewing Pull Request [ 10303 ]

Merged Needs Testing [ 10002 ]

Irvin Naylor (Inactive) made changes - 18/Mar/21 5:25 PM

Status

Merged Needs Testing [ 10002 ]

Post-merge Testing In Progress [ 10003 ]

Irvin Naylor (Inactive) made changes - 18/Mar/21 5:26 PM

Resolution

Done [ 10000 ]

Status

Post-merge Testing In Progress [ 10003 ]

Closed [ 6 ]

Irvin Naylor (Inactive) made changes - 23/Mar/21 10:18 AM

Sprint

Winter 1 Dec 28 - Jan 8, Winter 2 Jan 11 - Jan 22, Winter 3 Jan 25 - Feb 5, Winter 4 Feb 8 - Feb 19, Winter 5 Feb 22 - Mar 5, Winter 6 Mar 8 - Mar 19 [ 111, 112, 113, 114, 115, 116 ]

Winter 1 Dec 28 - Jan 8, Winter 2 Jan 11 - Jan 22, Winter 3 Jan 25 - Feb 5, Winter 4 Feb 8 - Feb 19, Winter 5 Feb 22 - Mar 5, Winter 6 Mar 8 - Mar 19, Spring 1 Mar 22 - Apr 2 [ 111, 112, 113, 114, 115, 116, 117 ]

People

Assignee:

Irvin Naylor (Inactive)

Reporter:

Ann Loraine

Votes:

0

Watchers:

5

Dates

Created:

06/Jan/21 9:40 AM

Updated:

23/Mar/21 10:18 AM

Resolved:

18/Mar/21 5:26 PM