  1. IGB
  2. IGBF-3748

Add individual genome CRAM file to IGB quickload via svn

    Details

    • Type: Task
    • Status: To-Do
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:
      None

      Description

      Situation: We have identified a publicly available Nebula Genomics CRAM file (link). Nowlan has downloaded the file and created an index, and was able to view the file in IGB. We would like to make this file available to IGB users as an example of a consumer genomics CRAM file.

      Task: Upload and deploy the CRAM file (NB72462M.cram), CRAM index, VCF file, and VCF index, and make them available through an IGB Quickload. Also deploy the IGBF-3841 data via the annots.xml.

      Note: Data are from the Personal Genome Project (PGP)
      Link to metadata in PGP: https://my.pgp-hms.org/profile/huF7A4DE
      Link to PGP search page: https://my.pgp-hms.org/public_genetic_data

        Attachments

        1. annots.xml
          2 kB
        2. quickload_V2.zip
          9 kB
        3. quickload_v3.zip
          9 kB
        4. quickload.zip
          8 kB

            Activity

            nfreese Nowlan Freese added a comment - - edited

            Ann Loraine:

            • I installed SVN on my new machine, but the subversion repository appears to be down again and is responding with svn: E200029: Couldn't perform atomic initialization
            • We had previously discussed some changes to the annots.xml, but I don't think the changes were saved/updated in the ticket (I'm pretty sure the attached zip is the old/original version). I have added an updated version with what I can remember we discussed for review.
              • Note that the url listed in the annots.xml will be updated to point at whatever documentation we create about the source of the CRAM file. We will also need to update the location of the index.

            I think we should bump this ticket down a couple of sprints. This file will not load in IGB 10.0.1 so we will need to do the IGB 10.1.0 release first. I think it would be confusing for users if a new dataset appeared in the default data for humans but wouldn't load.

            Discussed with Dr. Loraine; she suggested adding the changes to annots.xml but leaving them commented out.

            ann.loraine Ann Loraine added a comment -

            Very sorry for blocking your work. I restarted the svn site and reattached the virtual hard drive storing the data.
            Logged in using ec2-user@svn.bioviz.org, then did:

            [elastic svn ~]$ sudo su
            [root@ip-172-31-63-94 ec2-user]# ls /svn
            [root@ip-172-31-63-94 ec2-user]# mount -t ext4 /dev/xvdf /svn
            [root@ip-172-31-63-94 ec2-user]# ls /svn
            genomes  lost+found  svn-auth
            [root@ip-172-31-63-94 ec2-user]# 
            
            ann.loraine Ann Loraine added a comment -

            Follow-up regarding the suggestion of commenting out new data sets in Quickload main:

            • I am not actually sure IGB can respect comments in an annots.xml file.
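
One way to reason about this: a standard XML parser does not expose commented-out elements as elements at all, so a commented `<file>` entry would normally never be seen as a dataset. The Python sketch below illustrates that with the standard library's ElementTree. The annots.xml content is a made-up example (the first file name and both titles are hypothetical), and it does not confirm how IGB's own loader treats comments.

```python
import xml.etree.ElementTree as ET

# Hypothetical annots.xml content; entries are illustrative only.
ANNOTS_XML = """<files>
  <file name="existing_track.bed.gz" title="Existing track" />
  <!--
  <file name="NB72462M.cram" title="NB72462M - Personal Genome Project" />
  -->
</files>"""


def visible_file_names(xml_text):
    """Return the name attribute of every <file> element the parser exposes."""
    root = ET.fromstring(xml_text)
    return [f.get("name") for f in root.iter("file")]


names = visible_file_names(ANNOTS_XML)
print(names)
```

If IGB reads annots.xml through a conforming XML parser, the commented entry should be skipped in the same way, but that would still need to be checked in IGB itself.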
            nfreese Nowlan Freese added a comment - - edited

            Some additional notes:

            • Should be able to comment out the new file in the annots.xml (see stackoverflow).
            • It looks like we will need to store the index somewhere online that we can provide a full URL to. I have tried multiple versions of the quickload using bam/cram files. The only time the annots.xml index attribute appears to work is if a full URL path is provided for both the name and index attributes. This doesn't line up with our current annots.xml documentation, but does seem to line up with the requirements for the IGBF-445 ticket where the work was done (links to files and their indexes stored online in different locations). Note that the commit for this work was quite extensive, so it is unclear how easy it would be to improve the annots.xml index attribute logic. I also could not find any examples in our current quickloads where we use the index attribute.
            • See my attached annots.xml for examples of what I tested. The only files that did not throw an error regarding the index attribute were "NB72462M - Personal Genome Project" and "thing4".

            Error:

            java.lang.IllegalArgumentException: URI is not absolute
            	at java.base/java.net.URL.of(URL.java:862)
            	at java.base/java.net.URI.toURL(URI.java:1172)
            	at org.lorainelab.igb.bam.BAM.getSAMFileReader(BAM.java:106)
            	at org.lorainelab.igb.bam.BAM.init(BAM.java:160)
            	at org.lorainelab.igb.bam.XAM.getChromosomeList(XAM.java:100)
            	at com.affymetrix.genometry.quickload.QuickLoadSymLoader.loadAndAddSymmetries(QuickLoadSymLoader.java:153)
            	at com.affymetrix.genometry.quickload.QuickLoadSymLoader.loadSymmetriesThread(QuickLoadSymLoader.java:139)
            	at com.affymetrix.genometry.quickload.QuickLoadSymLoader.loadFeatures(QuickLoadSymLoader.java:119)
            	at com.affymetrix.igb.view.load.GeneralLoadUtils.loadFeaturesForSym(GeneralLoadUtils.java:768)
            	at com.affymetrix.igb.view.load.GeneralLoadUtils$2.runInBackground(GeneralLoadUtils.java:710)
            	at com.affymetrix.igb.view.load.GeneralLoadUtils$2.runInBackground(GeneralLoadUtils.java:705)
            	at com.affymetrix.genometry.thread.CThreadWorker.doInBackground(CThreadWorker.java:73)
            	at java.desktop/javax.swing.SwingWorker$1.call(SwingWorker.java:305)
            	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:317)
            	at java.desktop/javax.swing.SwingWorker.run(SwingWorker.java:342)
            	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
            	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
            	at java.base/java.lang.Thread.run(Thread.java:1583)
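
The "URI is not absolute" message is what `java.net.URI.toURL` raises when the URI has no scheme, which is consistent with a relative index path reaching `BAM.getSAMFileReader`. As a rough illustration only (this is not IGB code), the Python sketch below checks whether a `name` or `index` value is a full URL and shows what a relative value would resolve to against a Quickload genome directory. The base URL, the hosts, and the `.crai` file location are hypothetical.

```python
from urllib.parse import urljoin, urlparse


def is_absolute_url(value):
    """True when value has both a scheme and a network location (host)."""
    parts = urlparse(value)
    return bool(parts.scheme and parts.netloc)


def resolve(base_url, value):
    """Return value unchanged if it is already a full URL, else resolve it against base_url."""
    return value if is_absolute_url(value) else urljoin(base_url, value)


# Hypothetical Quickload genome directory; the trailing slash matters for urljoin.
BASE = "https://quickload.example.org/H_sapiens_Dec_2013/"

index_relative = "NB72462M.cram.crai"
index_absolute = "https://data.example.org/pgp/NB72462M.cram.crai"

print(is_absolute_url(index_relative))
print(resolve(BASE, index_relative))
print(resolve(BASE, index_absolute))
```

A check like this could be run over an annots.xml before deployment to flag entries whose index attribute would likely trigger the error above.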
            
            nfreese Nowlan Freese added a comment -

            Added quickload_v3.zip which includes the IGBF-3841 data.


              People

              • Assignee:
                ann.loraine Ann Loraine
                Reporter:
                nfreese Nowlan Freese