  1. IGB
  2. IGBF-1469

Investigate supporting CRAM files

    Details

    • Story Points:
      3
    • Sprint:
      Winter 2018 Sprint 3, Summer 2019 Sprint 8, Fall 2 2023 Sep 17, Fall 4 2023 Oct 16

      Description

      Discussion on Twitter:

      https://twitter.com/pjacock/status/500208907651153920

      CRAM is a compressed, random-access file format that can be used instead of BAM. It may be better. It is not yet widely used but may be in the future, so supporting CRAM could benefit users.

        Attachments

          Issue Links

            Activity

            ann.loraine Ann Loraine created issue -
            ann.loraine Ann Loraine made changes -
            Field Original Value New Value
            Epic Link IGBF-165 [ 14687 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Rank Ranked lower
            ann.loraine Ann Loraine made changes -
            Rank Ranked lower
            ann.loraine Ann Loraine made changes -
            Rank Ranked lower
            ann.loraine Ann Loraine made changes -
            Rank Ranked lower
            mason Mason Meyer (Inactive) made changes -
            Epic Link IGBF-165 [ 14687 ] IGBF-440 [ 15498 ]
            mason Mason Meyer (Inactive) made changes -
            Rank Ranked lower
            mason Mason Meyer (Inactive) made changes -
            Epic Link IGBF-440 [ 15498 ] IGBF-501 [ 15563 ]
            mason Mason Meyer (Inactive) made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine added a comment -

            Probably this is a low-ish priority as CRAM doesn't seem to be gaining much traction. Other things are probably a lot higher priority.

            ann.loraine Ann Loraine made changes -
            Project IGB [ 10840 ] Deep Backlog [ 11041 ]
            Key IGBF-166 DB-174
            Workflow Loraine Lab Workflow [ 14817 ] jira [ 16881 ]
            nfreese Nowlan Freese added a comment -

            The Genome in a Bottle project has ultralong Oxford Nanopore alignments in CRAM format.

            nfreese Nowlan Freese made changes -
            Assignee David Norris [ dcnorris ]
            nfreese Nowlan Freese made changes -
            Project Deep Backlog [ 11041 ] IGB [ 10840 ]
            Key DB-174 IGBF-1469
            Workflow jira [ 16881 ] Loraine Lab Workflow [ 18183 ]
            nfreese Nowlan Freese made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            nfreese Nowlan Freese made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Story Points 1
            Sprint Winter 2018 Sprint 3 [ 58 ]
            Labels Advanced
            ann.loraine Ann Loraine added a comment -

            Task:

            • Investigate and understand the format - make notes here
            • Investigate tooling in htsjdk library to determine if parsing code is available there
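
            As a starting note on the format: a CRAM file begins with the four ASCII bytes "CRAM", while BAM is stored in BGZF, a gzip-compatible container, so it begins with the gzip magic bytes 0x1F 0x8B. The sketch below guesses a file's format from those leading bytes alone. The class and method names are hypothetical and are not part of IGB or htsjdk; it is an illustration under those assumptions, not the parsing code this task asks about.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper (not part of IGB or htsjdk): guesses a format from leading bytes. */
final class AlignmentFormatSniffer {

    enum Format { CRAM, GZIP_OR_BGZF, SAM_TEXT, UNKNOWN }

    /** Classifies a header prefix; at most the first four bytes are inspected. */
    static Format sniff(byte[] head) {
        if (head.length >= 4
                && head[0] == 'C' && head[1] == 'R' && head[2] == 'A' && head[3] == 'M') {
            return Format.CRAM; // CRAM file definition starts with the ASCII magic "CRAM"
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0x1F && (head[1] & 0xFF) == 0x8B) {
            // gzip magic: BAM uses BGZF, but any gzip file matches here too
            return Format.GZIP_OR_BGZF;
        }
        if (head.length >= 1 && head[0] == '@') {
            return Format.SAM_TEXT; // SAM header lines begin with '@'; headerless SAM is missed
        }
        return Format.UNKNOWN;
    }

    /** Reads the first four bytes of a file and classifies them. */
    static Format sniff(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return sniff(in.readNBytes(4));
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java AlignmentFormatSniffer.java file1 [file2 ...]
        for (String arg : args) {
            System.out.println(arg + "\t" + sniff(Path.of(arg)));
        }
    }
}
```

            A check like this only separates the container types. Confirming that a gzip-style file is really BAM would require decompressing the first block and looking for the BAM magic, which this sketch deliberately leaves out.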
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Sprint Winter 2018 Sprint 3 [ 58 ] Winter 2018 Sprint 3, Spring 2019 Sprint 1 [ 58, 59 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Sprint Winter 2018 Sprint 3, Spring 2019 Sprint 1 [ 58, 59 ] Winter 2018 Sprint 3 [ 58 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked lower
            ann.loraine Ann Loraine made changes -
            Sprint Winter 2018 Sprint 3 [ 58 ] Winter 2018 Sprint 3, Spring 2019 Sprint 2 [ 58, 60 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Sprint Winter 2018 Sprint 3, Spring 2019 Sprint 2 [ 58, 60 ] Winter 2018 Sprint 3 [ 58 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked lower
            ann.loraine Ann Loraine added a comment -

            Blog post about CRAM - https://brentp.github.io/post/cram-speed/

            ann.loraine Ann Loraine added a comment -

            Also see: https://www.ga4gh.org/news/cram4gh-twitter-chat-recap/
            ann.loraine Ann Loraine made changes -
            Fix Version/s 9.1.0 Major Release [ 10601 ]
            Story Points 1 2
            Sprint Winter 2018 Sprint 3 [ 58 ] Winter 2018 Sprint 3, Spring 2019 Sprint 6 [ 58, 64 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Sprint Winter 2018 Sprint 3, Spring 2019 Sprint 6 [ 58, 64 ] Winter 2018 Sprint 3, Spring 2019 Sprint 7 [ 58, 66 ]
            ann.loraine Ann Loraine made changes -
            Story Points 2 3.5
            ann.loraine Ann Loraine made changes -
            Comment [ BAM files are too big :-) ]
            ann.loraine Ann Loraine made changes -
            Sprint Winter 2018 Sprint 3, Spring 2019 Sprint 7 [ 58, 66 ] Winter 2018 Sprint 3, Summer 2019 Sprint 8 [ 58, 67 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked lower
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Epic Link IGBF-501 [ 15563 ] IGBF-1765 [ 17855 ]
            ann.loraine Ann Loraine made changes -
            Story Points 3.5 2
            ann.loraine Ann Loraine made changes -
            Story Points 2 3
            ann.loraine Ann Loraine made changes -
            Story Points 3 2
            shamika Shamika Gajanan Kulkarni (Inactive) made changes -
            Assignee Shamika Gajanan Kulkarni [ shamika ]
            shamika Shamika Gajanan Kulkarni (Inactive) made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            shamika Shamika Gajanan Kulkarni (Inactive) made changes -
            Status In Progress [ 3 ] Open [ 1 ]
            shamika Shamika Gajanan Kulkarni (Inactive) made changes -
            Status Open [ 1 ] In Progress [ 3 ]
            ann.loraine Ann Loraine made changes -
            Sprint Winter 2018 Sprint 3, Summer 2019 Sprint 8 [ 58, 67 ] Winter 2018 Sprint 3, Summer 2019 Sprint 8, Summer 2019 Sprint 9 [ 58, 67, 68 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Sprint Winter 2018 Sprint 3, Summer 2019 Sprint 8, Summer 2019 Sprint 9 [ 58, 67, 68 ] Winter 2018 Sprint 3, Summer 2019 Sprint 8, Summer 2019 Sprint 11 [ 58, 67, 70 ]
            Assignee Shamika Gajanan Kulkarni [ shamika ]
            ann.loraine Ann Loraine made changes -
            Status In Progress [ 3 ] Open [ 1 ]
            ann.loraine Ann Loraine made changes -
            Sprint Winter 2018 Sprint 3, Summer 2019 Sprint 8, Summer 2019 Sprint 11 [ 58, 67, 70 ] Winter 2018 Sprint 3, Summer 2019 Sprint 8, Summer 2019 Sprint 12 [ 58, 67, 71 ]
            ann.loraine Ann Loraine made changes -
            Sprint Winter 2018 Sprint 3, Summer 2019 Sprint 8, Summer 2019 Sprint 12 [ 58, 67, 71 ] Winter 2018 Sprint 3, Summer 2019 Sprint 8, Summer 2019 Sprint 13 [ 58, 67, 72 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked lower
            noor91zahara Noor Zahara (Inactive) made changes -
            Rank Ranked higher
            noor91zahara Noor Zahara (Inactive) made changes -
            Rank Ranked lower
            ann.loraine Ann Loraine made changes -
            Sprint Winter 2018 Sprint 3, Summer 2019 Sprint 8, Summer 2019 Sprint 13 [ 58, 67, 72 ] Winter 2018 Sprint 3, Summer 2019 Sprint 8, Summer 2019 Sprint 14 [ 58, 67, 73 ]
            ann.loraine Ann Loraine made changes -
            Workflow Loraine Lab Workflow [ 18183 ] Fall 2019 Workflow Update [ 18723 ]
            ann.loraine Ann Loraine made changes -
            Sprint Winter 2018 Sprint 3, Summer 2019 Sprint 8, Fall 2019 Sprint 2 [ 58, 67, 73 ] Winter 2018 Sprint 3, Summer 2019 Sprint 8, Fall 2019 Sprint 6 [ 58, 67, 76 ]
            ann.loraine Ann Loraine made changes -
            Sprint Winter 2018 Sprint 3, Summer 2019 Sprint 8, Fall 2019 Sprint 5 [ 58, 67, 76 ] Winter 2018 Sprint 3, Summer 2019 Sprint 8, Fall 2019 Sprint 4 [ 58, 67, 75 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Sprint Winter 2018 Sprint 3, Summer 2019 Sprint 8, Fall 4 : 30 Sep to 11 Oct [ 58, 67, 75 ] Winter 2018 Sprint 3, Summer 2019 Sprint 8, Fall 9 : 9 Dec to 20 Dec [ 58, 67, 80 ]
            ann.loraine Ann Loraine made changes -
            Workflow Fall 2019 Workflow Update [ 18723 ] Revised Fall 2019 Workflow Update [ 20845 ]
            Status Open [ 1 ] In Progress [ 3 ]
            ann.loraine Ann Loraine made changes -
            Fix Version/s 9.1.0 Major Release [ 10601 ]
            ann.loraine Ann Loraine made changes -
            Sprint Winter 2018 Sprint 3, Summer 2019 Sprint 8, Fall 9 : 9 Dec to 20 Dec [ 58, 67, 80 ] Winter 2018 Sprint 3, Summer 2019 Sprint 8 [ 58, 67 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine made changes -
            Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
            ann.loraine Ann Loraine made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            ann.loraine Ann Loraine made changes -
            Status First Level Review in Progress [ 10301 ] To-Do [ 10305 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            ann.loraine Ann Loraine added a comment -

            Nowlan Freese : Consumer genomics companies are using CRAM to distribute data to customers.
            [~aloraine] : Situation has changed. We now have even better reasons to support CRAM natively.

            ann.loraine Ann Loraine made changes -
            Rank Ranked lower
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            nfreese Nowlan Freese made changes -
            Rank Ranked lower
            nfreese Nowlan Freese made changes -
            Story Points 2 3
            Sprint Winter 2018 Sprint 3, Summer 2019 Sprint 8 [ 58, 67 ] Winter 2018 Sprint 3, Summer 2019 Sprint 8, Fall 2 2023 Sep 17 [ 58, 67, 178 ]
            Assignee Kaushik Gopu [ kgopu ]
            nfreese Nowlan Freese made changes -
            Attachment A_thaliana_Jun_2009_Chr1.bam [ 17963 ]
            Attachment A_thaliana_Jun_2009_Chr1.bam.bai [ 17964 ]
            Attachment A_thaliana_Jun_2009_Chr1.cram [ 17965 ]
            Attachment A_thaliana_Jun_2009_Chr1.sam [ 17966 ]
            nfreese Nowlan Freese added a comment - - edited

            I have attached example files for BAM (with BAI index), SAM, and CRAM (with CRAI index). The files were created with samtools from the same original BAM file: http://lorainelab-quickload.scidas.org/rnaseq/A_thaliana_Jun_2009/auxin_arf19/Col.C.bam . They contain RNA-Seq data from Arabidopsis thaliana (A_thaliana_Jun_2009) within Chr1:6,689-8,835.

            nfreese Nowlan Freese made changes -
            Attachment A_thaliana_Jun_2009_Chr1.cram.crai [ 17967 ]
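
            The attachments listed above follow the usual companion-index naming: the BAM file has a ".bam.bai" index, the CRAM file has a ".cram.crai" index, and the SAM file has none. A minimal sketch of that mapping is below. The class and method names are hypothetical and chosen only for illustration; they are not IGB or htsjdk APIs.

```java
import java.util.Locale;
import java.util.Optional;

/** Hypothetical helper: maps an alignment file name to the index file name expected beside it. */
final class AlignmentIndexNames {

    /** Returns the companion index name, or empty when no index convention applies. */
    static Optional<String> indexNameFor(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".bam")) {
            return Optional.of(fileName + ".bai");  // e.g. A_thaliana_Jun_2009_Chr1.bam.bai
        }
        if (lower.endsWith(".cram")) {
            return Optional.of(fileName + ".crai"); // e.g. A_thaliana_Jun_2009_Chr1.cram.crai
        }
        return Optional.empty(); // e.g. plain-text SAM, attached here without an index
    }

    public static void main(String[] args) {
        // Usage: java AlignmentIndexNames.java file1 [file2 ...]
        for (String arg : args) {
            System.out.println(arg + "\t" + indexNameFor(arg).orElse("(no index)"));
        }
    }
}
```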
            nfreese Nowlan Freese made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            kgopu Kaushik Gopu added a comment - - edited

            Blog posts I am using to understand the various mapped sequence data formats and their differences: https://gatk.broadinstitute.org/hc/en-us/articles/360035890791-SAM-or-BAM-or-CRAM-Mapped-sequence-data-formats
            ann.loraine Ann Loraine made changes -
            Sprint Winter 2018 Sprint 3, Summer 2019 Sprint 8, Fall 2 2023 Sep 17 [ 58, 67, 178 ] Winter 2018 Sprint 3, Summer 2019 Sprint 8, Fall 2 2023 Sep 17, Fall 3 2023 Oct 2 [ 58, 67, 178, 179 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            nfreese Nowlan Freese made changes -
            Link This issue is blocked by IGBF-3464 [ IGBF-3464 ]
            nfreese Nowlan Freese made changes -
            Status In Progress [ 3 ] To-Do [ 10305 ]
            nfreese Nowlan Freese made changes -
            Sprint Winter 2018 Sprint 3, Summer 2019 Sprint 8, Fall 2 2023 Sep 17, Fall 3 2023 Oct 2 [ 58, 67, 178, 179 ] Winter 2018 Sprint 3, Summer 2019 Sprint 8, Fall 2 2023 Sep 17, Fall 4 [ 58, 67, 178, 180 ]
            ann.loraine Ann Loraine made changes -
            Epic Link IGBF-1765 [ 17855 ] IGBF-3467 [ 22628 ]
            ann.loraine Ann Loraine made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            ann.loraine Ann Loraine made changes -
            Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
            ann.loraine Ann Loraine made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            ann.loraine Ann Loraine made changes -
            Status First Level Review in Progress [ 10301 ] Ready for Pull Request [ 10304 ]
            ann.loraine Ann Loraine made changes -
            Status Ready for Pull Request [ 10304 ] Pull Request Submitted [ 10101 ]
            ann.loraine Ann Loraine made changes -
            Status Pull Request Submitted [ 10101 ] Reviewing Pull Request [ 10303 ]
            ann.loraine Ann Loraine made changes -
            Status Reviewing Pull Request [ 10303 ] Merged Needs Testing [ 10002 ]
            ann.loraine Ann Loraine made changes -
            Status Merged Needs Testing [ 10002 ] Post-merge Testing In Progress [ 10003 ]
            ann.loraine Ann Loraine made changes -
            Resolution Done [ 10000 ]
            Status Post-merge Testing In Progress [ 10003 ] Closed [ 6 ]
            ann.loraine Ann Loraine added a comment -

            Conclusion: Decided to proceed with supporting CRAM in IGB

            nfreese Nowlan Freese made changes -
            Fix Version/s 10.0.0 Major Release [ 10900 ]

              People

              • Assignee:
                kgopu Kaushik Gopu
                Reporter:
                ann.loraine Ann Loraine
              • Votes:
                0
                Watchers:
                3
