  1. IGB
  2. IGBF-3838

Obtain and view sequence read alignments for PBMC dataset from Seurat vignette

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:
      None

      Description

      For this task, locate a BAM-format data file for the RNA-Seq data set described in this Seurat tutorial:

      The above tutorial demonstrates features of the Seurat single-cell RNA-Seq data analysis library using data from peripheral blood mononuclear cells (PBMC), originally from the 10X Genomics website.

      Open the data file in IGB. Use the Seurat library (in R) to locate some genes with high counts and then check that IGB shows the same or similar numbers of read alignments.

      Then, make it possible for other people to also open the file and view the contents in IGB.

      Note: If you get stuck on figuring out how to use Seurat for this, ask for help from one of our many local R experts.
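The description asks for genes with high counts to be located with Seurat in R. As a rough cross-check outside R, the same ranking can be sketched in Python directly from the 10X count matrix files. This is a minimal sketch under stated assumptions: the input is a Matrix Market `matrix.mtx` whose rows are genes and whose columns are cell barcodes, plus a two-column `genes.tsv` (gene ID, gene symbol). The function name `top_genes` is hypothetical and is not part of Seurat or Cell Ranger.

```python
from collections import defaultdict


def top_genes(mtx_lines, gene_lines, n=10):
    """Return the n gene symbols with the highest total count across all cells.

    mtx_lines:  lines of a Matrix Market coordinate file (assumed layout:
                rows are genes, columns are cell barcodes, 1-based indices).
    gene_lines: lines of a genes.tsv file, "<gene_id>\t<gene_symbol>".
    """
    symbols = [ln.rstrip("\n").split("\t")[1] for ln in gene_lines if ln.strip()]
    totals = defaultdict(int)
    seen_size_line = False
    for line in mtx_lines:
        if line.startswith("%") or not line.strip():
            continue  # skip the banner, comments, and blank lines
        if not seen_size_line:
            seen_size_line = True  # first data line is "rows cols entries"
            continue
        row, _col, value = line.split()
        # Counts are summed by symbol, so gene IDs that share a symbol merge.
        totals[symbols[int(row) - 1]] += int(value)
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]
```

With real files, the two arguments could be open file handles, for example `top_genes(open("matrix.mtx"), open("genes.tsv"))`; those file names are assumptions about how the download is laid out.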

        Attachments

          Issue Links

            Activity

            karthik Karthik Raveendran added a comment -

            Finding some genes with high UMI counts for a cell is done, but checking that IGB shows the same number of read alignments for a gene from a particular cell is a challenge. After discussing with Nowlan Freese on Wednesday: working on a filter-by or color-by function, in which the alignments from one cell can be filtered or colored, would make it easy to finish the task.
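The per-cell check described above can be approximated outside IGB by filtering alignments on the cell barcode tag. The sketch below assumes SAM-format text lines (for example, the text form of the BAM file) that carry the Cell Ranger `CB:Z` (cell barcode) and `UB:Z` (UMI) tags. The function name, the barcode, and the coordinates used with it are hypothetical, and only the alignment start position is tested against the region, which is a simplification.

```python
def count_reads_for_cell(sam_lines, barcode, chrom, start, end):
    """Count alignments and distinct UMIs for one cell barcode in a region.

    start and end are 1-based and inclusive, like the SAM POS field.
    Only the alignment start is compared with the region (a simplification).
    Returns (number_of_alignments, number_of_distinct_UMIs).
    """
    n_reads = 0
    umis = set()
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        rname, pos = fields[2], int(fields[3])
        if rname != chrom or not (start <= pos <= end):
            continue
        # Optional fields look like TAG:TYPE:VALUE, e.g. CB:Z:<barcode>
        tags = {}
        for field in fields[11:]:
            name, _type, value = field.split(":", 2)
            tags[name] = value
        if tags.get("CB") != barcode:
            continue
        n_reads += 1
        if "UB" in tags:
            umis.add(tags["UB"])
    return n_reads, len(umis)
```

Because the count matrix holds UMI counts rather than raw read counts, the second number is probably the closer comparison to the matrix value, while the first is nearer to what is seen as individual alignments in a viewer.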

            nfreese Nowlan Freese added a comment -

            Karthik Raveendran - I attached a Quickload that points at the 10X_PBMC3k data files online. Try downloading it, adding it to IGB, and making sure the data load and look the same as the BAM file you downloaded.

            karthik Karthik Raveendran added a comment -

            The following genes (with high counts in the count matrix file) were observed:
            IQSEC1
            TCEANC
            DNASE1L1
            MLLT3
            GDPD5
            SNHG9
            CPNE7
            FTL
            RPL13A

            No match found for the following:
            TCEB3
            AP003733.1
            AP000769.7
            TRIM69

            No reads for the following:
            B3GALT4

            ann.loraine Ann Loraine added a comment - - edited

            Review from Ann Loraine:

            I looked at IQSEC1. The alignments are really weird looking! Please see the attached image. I see what appears to be a pretty significant problem: many of the alignments do not match the exon/intron boundaries of the gene models.

            Request for Karthik Raveendran:

            Is there any online documentation for this alignment file? Can you provide some links so that we can investigate further? If this is truly the source of the big matrix of counts, then there is a very serious problem here that might blow up the entire field. Unlikely! But we need to figure out why the alignments do not appear to match up with the gene models very well, regardless.

            To complete this task, please add relevant links and pass the ticket back to "Needs first level review." We'll then make a new task to investigate why the alignments do not match the gene models very well.
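One way to put a number on the mismatch described in this review is to derive the introns implied by each spliced alignment and compare them with the introns of the gene models. The sketch below is illustrative only: it assumes each alignment is given as a SAM `POS` and `CIGAR` string, and that annotated introns are available as a set of 1-based inclusive `(start, end)` tuples. Both function names are hypothetical.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def introns_from_alignment(pos, cigar):
    """Return the 1-based inclusive (start, end) intervals of each N operation."""
    introns = []
    ref = pos  # reference coordinate of the next base to be consumed
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op == "N":
            introns.append((ref, ref + length - 1))
        if op in "MDN=X":  # operations that consume reference bases
            ref += length
    return introns


def unannotated_introns(pos, cigar, annotated):
    """Return the introns implied by the alignment that are not in `annotated`."""
    return [i for i in introns_from_alignment(pos, cigar) if i not in annotated]
```

A high fraction of unannotated introns for a gene such as IQSEC1 would support the visual impression; a low fraction would suggest that the odd appearance comes from something else, such as unspliced reads falling within introns.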

            nfreese Nowlan Freese added a comment - - edited

            The 10X data is currently located at: https://www.10xgenomics.com/datasets/3-k-pbm-cs-from-a-healthy-donor-1-standard-1-1-0
            Note that the website will ask for some personal information.

            The BAM file is aligned against the hg19 genome (H_sapiens_Feb_2009).

            Documentation for 10X Genomics Cell Ranger Barcoded BAM tags: https://www.10xgenomics.com/support/software/cell-ranger/latest/analysis/outputs/cr-outputs-bam

            Part of the SAM header:

            STAR	ID:STAR	VN:STAR_2.5.1b	CL:STAR   --runThreadN 4   --genomeDir /mnt/opt/refdata_cellranger/hg19-1.1.0/star   --readFilesIn /mnt/yard/rudy/testing/cellranger_110/public_datasets/pbmc3k/pbmc3k/CELLRANGER_CS/CELLRANGER/EXTRACT_READS/fork0/chnk0/files/reads.fastq/1.fastq      --outStd BAM_Unsorted   --outSAMtype BAM   Unsorted      --outSAMunmapped Within      --chimOutType WithinBAM   --quantMode TranscriptomeSAM      --quantTranscriptomeBan Singleend
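To read the aligner settings out of a header line like the one above, the command-line string can be split into options and their values. This is a minimal sketch, assuming the text after `CL:` is passed in as a plain string and that every option begins with `--`. The function name `parse_star_command` is hypothetical and is not part of STAR or Cell Ranger.

```python
def parse_star_command(cl):
    """Split a STAR command line into {option_name: [values, ...]}.

    Every token starting with "--" opens a new option; the tokens that
    follow, up to the next option, are collected as its values.
    """
    tokens = cl.split()
    options = {}
    current = None
    for tok in tokens[1:]:  # tokens[0] is the program name
        if tok.startswith("--"):
            current = tok[2:]
            options[current] = []
        elif current is not None:
            options[current].append(tok)
    return options
```

For the header shown here, the `genomeDir` entry would hold the `hg19-1.1.0` reference path, which agrees with the note that the file is aligned against hg19.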
            

              People

              • Assignee:
                karthik Karthik Raveendran
                Reporter:
                ann.loraine Ann Loraine
              • Votes:
                0
                Watchers:
                3
