IGBF-1217

add Genome in a Bottle data sets to IGB Quickload

    Details

    • Story Points:
      2
    • Sprint:
      Fall 4 Oct 26 - Nov 6, Fall 5 Nov 9 - Nov 20, Fall 6 Nov 30 - Dec 11, Fall 7 Dec 14 - Dec 23

      Description

      This is a suggestion from Dr. Steve Chervitz of Personalis in Menlo Park.

            Activity

            nfreese Nowlan Freese added a comment -

            GIAB data sets feature several highly sequenced individuals/families. Each individual has sequencing from different technologies, such as Illumina, Oxford Nanopore, and PacBio, making it a rich source of different data types that are comparable.

            nfreese Nowlan Freese added a comment -

            I think we can just add the links to the annots.xml for the hg38 genome in the Quickload. Can structure the folders based on family and then datatype. I think this would be very useful for users.
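A minimal sketch of the folder idea, assuming that slashes in each `title` attribute create nested folders in IGB's data tree and that `name` may hold a full URL. The `write_annots` helper, the URLs, and the titles are placeholders for illustration, not the real GIAB entries, and the attribute names should be checked against the Quickload documentation.

```sh
# Write an illustrative annots.xml to the path given as $1.
# All URLs and titles below are placeholders, not real GIAB entries.
write_annots() {
  cat > "$1" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<files>
  <!-- Slashes in "title" are assumed to create nested folders in IGB -->
  <file name="ftp://example.org/giab/HG002_son/illumina_reads.bam"
        title="Ashkenazim Trio/Son/Illumina/Example reads" />
  <file name="ftp://example.org/giab/HG002_son/pacbio_reads.bam"
        title="Ashkenazim Trio/Son/PacBio/Example reads" />
  <file name="ftp://example.org/giab/HG004_mother/illumina_reads.bam"
        title="Ashkenazim Trio/Mother/Illumina/Example reads" />
</files>
EOF
}
```

Calling `write_annots annots.xml` produces a file whose entries would group by family, then individual, then datatype.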

            ann.loraine Ann Loraine added a comment -

            Adding to the current sprint.

            nfreese Nowlan Freese added a comment -

            This issue is blocked by IGBF-2589

            nfreese Nowlan Freese added a comment - - edited

            To test:
            Add the attached quickload (uncompress the 1217.zip) to IGB 9.1.6

            Go to the 2013 Human genome location chr1:44,786,065-44,786,698
            Check that each file in the quickload loads - note that these are very large files, do one at a time.

            Go to the 2009 Human genome location chr1:45,252,412-45,253,438
            Check that each file in the quickload loads - note that these are very large files, do one at a time.

            Note: Colors should be the same for each individual (for example, all of the Ashkenazim mother's files should be the same color).
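The color rule could also be checked before opening IGB. The sketch below assumes a hypothetical two-column, tab-separated manifest (individual, color) kept alongside the quickload; neither the manifest nor the `inconsistent_colors` helper is part of the attached files.

```sh
# Read "individual<TAB>color" lines on stdin (a hypothetical manifest) and
# print, in sorted order, every individual assigned more than one color.
inconsistent_colors() {
  awk -F '\t' '
    { key = $1 FS $2 }
    !(key in seen) { seen[key] = 1; count[$1]++ }
    END { for (who in count) if (count[who] > 1) print who }
  ' | sort
}
```

An empty result means every individual maps to exactly one color; any name printed needs its entries reconciled.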

            chirag24 Chirag Chandrahas Shetty (Inactive) added a comment -

            Getting errors for two tracks:
            1) baylor_NGMLR_all_reads: Exception in thread "AWT-EventQueue-0" java.lang.IllegalStateException: Not connected
            at sun.net.ftp.impl.FtpClient.issueCommand(Unknown Source)
            at sun.net.ftp.impl.FtpClient.issueCommandCheck(Unknown Source)
            at sun.net.ftp.impl.FtpClient.setType(Unknown Source)
            at sun.net.ftp.FtpClient.setBinaryType(Unknown Source)
            at sun.net.www.protocol.ftp.FtpURLConnection.getInputStream(Unknown Source)
            at com.affymetrix.genometry.util.LocalUrlCacher.RetrieveFromURL(LocalUrlCacher.java:460)
            at com.affymetrix.genometry.util.LocalUrlCacher.getInputStream(LocalUrlCacher.java:284)
            at com.affymetrix.genometry.util.LocalUrlCacher.getInputStream(LocalUrlCacher.java:154)
            at com.affymetrix.genometry.util.LocalUrlCacher.getInputStream(LocalUrlCacher.java:115)
            at com.affymetrix.genometry.util.LocalUrlCacher.isURIReachable(LocalUrlCacher.java:859)
            at com.affymetrix.igb.view.load.FeatureTreeView$FeatureTreeCellEditor.isReachable(FeatureTreeView.java:723)
            at com.affymetrix.igb.view.load.FeatureTreeView$FeatureTreeCellEditor.access$600(FeatureTreeView.java:655)
            at com.affymetrix.igb.view.load.FeatureTreeView$FeatureTreeCellEditor$FeatureLoadAction.actionPerformed(FeatureTreeView.java:683)
            at javax.swing.AbstractButton.fireActionPerformed(Unknown Source)
            at javax.swing.AbstractButton$Handler.actionPerformed(Unknown Source)
            at javax.swing.DefaultButtonModel.fireActionPerformed(Unknown Source)
            at javax.swing.JToggleButton$ToggleButtonModel.setPressed(Unknown Source)
            at javax.swing.plaf.basic.BasicButtonListener.mouseReleased(Unknown Source)
            at java.awt.AWTEventMulticaster.mouseReleased(Unknown Source)
            at java.awt.Component.processMouseEvent(Unknown Source)
            at javax.swing.JComponent.processMouseEvent(Unknown Source)
            at java.awt.Component.processEvent(Unknown Source)
            at java.awt.Container.processEvent(Unknown Source)
            at java.awt.Component.dispatchEventImpl(Unknown Source)
            at java.awt.Container.dispatchEventImpl(Unknown Source)
            at java.awt.Component.dispatchEvent(Unknown Source)
            at javax.swing.plaf.basic.BasicTreeUI$MouseInputHandler.mouseReleased(Unknown Source)
            at java.awt.AWTEventMulticaster.mouseReleased(Unknown Source)
            at java.awt.Component.processMouseEvent(Unknown Source)
            at javax.swing.JComponent.processMouseEvent(Unknown Source)
            at java.awt.Component.processEvent(Unknown Source)
            at java.awt.Container.processEvent(Unknown Source)
            at java.awt.Component.dispatchEventImpl(Unknown Source)
            at java.awt.Container.dispatchEventImpl(Unknown Source)
            at java.awt.Component.dispatchEvent(Unknown Source)
            at java.awt.LightweightDispatcher.retargetMouseEvent(Unknown Source)
            at java.awt.LightweightDispatcher.processMouseEvent(Unknown Source)
            at java.awt.LightweightDispatcher.dispatchEvent(Unknown Source)
            at java.awt.Container.dispatchEventImpl(Unknown Source)
            at java.awt.Window.dispatchEventImpl(Unknown Source)
            at java.awt.Component.dispatchEvent(Unknown Source)
            at java.awt.EventQueue.dispatchEventImpl(Unknown Source)
            at java.awt.EventQueue.access$500(Unknown Source)
            at java.awt.EventQueue$3.run(Unknown Source)
            at java.awt.EventQueue$3.run(Unknown Source)
            at java.security.AccessController.doPrivileged(Native Method)
            at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(Unknown Source)
            at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(Unknown Source)
            at java.awt.EventQueue$4.run(Unknown Source)
            at java.awt.EventQueue$4.run(Unknown Source)
            at java.security.AccessController.doPrivileged(Native Method)
            at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(Unknown Source)
            at java.awt.EventQueue.dispatchEvent(Unknown Source)
            at java.awt.EventDispatchThread.pumpOneEventForFilters(Unknown Source)
            at java.awt.EventDispatchThread.pumpEventsForFilter(Unknown Source)
            at java.awt.EventDispatchThread.pumpEventsForHierarchy(Unknown Source)
            at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
            at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
            at java.awt.EventDispatchThread.run(Unknown Source)

            2) BGISEQ500_PE100: org.lorainelab.igb.bam.BAM$BamIndexNotFoundException: Could not find Bam Index File.
            at org.lorainelab.igb.bam.BAM.findIndexFile(BAM.java:433)
            at org.lorainelab.igb.bam.BAM.getBamIndexUriStr(BAM.java:139)
            at org.lorainelab.igb.bam.BAM.getSAMFileReader(BAM.java:123)
            at org.lorainelab.igb.bam.BAM.init(BAM.java:157)
            at org.lorainelab.igb.bam.XAM.getChromosomeList(XAM.java:100)
            at com.affymetrix.genometry.quickload.QuickLoadSymLoader.loadAndAddSymmetries(QuickLoadSymLoader.java:153)
            at com.affymetrix.genometry.quickload.QuickLoadSymLoader.loadSymmetriesThread(QuickLoadSymLoader.java:139)
            at com.affymetrix.genometry.quickload.QuickLoadSymLoader.loadFeatures(QuickLoadSymLoader.java:119)
            at com.affymetrix.igb.view.load.GeneralLoadUtils.loadFeaturesForSym(GeneralLoadUtils.java:751)
            at com.affymetrix.igb.view.load.GeneralLoadUtils$2.runInBackground(GeneralLoadUtils.java:693)
            at com.affymetrix.igb.view.load.GeneralLoadUtils$2.runInBackground(GeneralLoadUtils.java:688)
            at com.affymetrix.genometry.thread.CThreadWorker.doInBackground(CThreadWorker.java:73)
            at javax.swing.SwingWorker$1.call(Unknown Source)
            at java.util.concurrent.FutureTask.run(Unknown Source)
            at javax.swing.SwingWorker.run(Unknown Source)
            at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
            at java.lang.Thread.run(Unknown Source)

            nfreese Nowlan Freese added a comment -

            The Baylor_NGMLR_all_reads file for the Ashkenazim son appears to have been removed from the GIAB ftp. I have removed it from the annots.xml.
            ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/AshkenazimTrio/HG002_NA24385_son/PacBio_MtSinai_NIST/Baylor_NGMLR_bam_GRCh37/all_reads.fa.giab_h002_ngmlr-0.2.3_mapped.bam

            The BGISEQ500_PE100 for NA12878 does not have an index file available. I have removed it from the annots.xml.
            ftp://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/data/NA12878/BGISEQ500/standard_library/BGISEQ_PE100_NA12878.sorted.bam
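A missing index like this one could be caught before a BAM is added to annots.xml. The sketch below derives the two conventional index locations (`file.bam.bai` and `file.bai`) and probes them with curl. Whether IGB's `BAM.findIndexFile` looks in exactly these places is an assumption, and both helper names are hypothetical.

```sh
# Print the two conventional index locations for a BAM URL, one per line.
bam_index_candidates() {
  printf '%s\n' "$1.bai" "${1%.bam}.bai"
}

# Return success if any candidate index responds. Needs curl and network
# access, and assumes the URL contains no whitespace. Treat it as a
# pre-check, not a guarantee that IGB will find the index.
has_bam_index() {
  for candidate in $(bam_index_candidates "$1"); do
    curl --silent --head --fail "$candidate" >/dev/null 2>&1 && return 0
  done
  return 1
}
```

For example, `has_bam_index "$url" || echo "no index: $url"` would flag entries to leave out.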

            nfreese Nowlan Freese added a comment - - edited

            I created a bitbucket repo with the GIAB quickload here. We can update the igbDefaultPrefs.json (attached) found in core/igb-preferences/src/main/resources/ to include the new repo. Seems to be working well with initial testing.

            Note that the url to include the quickload needs to be https://bitbucket.org/nfreese/giab-quickload/raw/main/quickload/

            Or we can add the quickload info to the DNA-Seq repo.

            [~aloraine] please advise on how you would like to proceed.

            ann.loraine Ann Loraine added a comment - - edited

            Rather than point IGB to the master branch, I think we ought to point it instead to specific branches with names like "release-0.0.1"

            This will enable us to freely modify the master branch as needed in future.
            Also, I would recommend changing the "master" branch name to "main", as it is much easier to explain to people. The audience for this repo will likely include biologists with a little bit of knowledge about git but maybe not a lot. Explaining the term "master branch" is always difficult because it sounds like we mean "master copy", but that's not what it is. It's just another branch, after all! It just happens to be what you're "on" first when you clone the repo or display it on a hosting service like GitHub or Bitbucket.

            Here is a "how-to" on renaming master to main: https://www.git-tower.com/learn/git/faq/git-rename-master-to-main/

            ann.loraine Ann Loraine added a comment - - edited

            btw this is really cool! I only wish it was a bit faster to get the data from the ftp site into IGB.

            nfreese Nowlan Freese added a comment -

            Unfortunately the ftp site does seem to be quite slow. Many of the files are also very large with a high depth of sequencing, so loading data for even a single gene can take a while.

            I have updated the branch from master to main.

            I'm not sure if having specific branches with release versions would be necessary. Using branches would force us to update the igbDefaultPrefs.json file every time we made a change to the quickload. I added a v1.0.0 tag to the GIAB commit so that it is marked as the first release.
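To make the trade-off concrete: pinning IGB to a release would change only the ref segment of the Quickload URL. The sketch below builds that URL from a branch or tag name. It assumes Bitbucket's raw path accepts a tag in the same position as a branch, which has not been verified here, and `quickload_url` is a hypothetical helper.

```sh
# Build the raw Quickload URL for a given branch or tag name.
quickload_url() {
  printf 'https://bitbucket.org/nfreese/giab-quickload/raw/%s/quickload/\n' "$1"
}

quickload_url main     # tracks whatever is currently on main
quickload_url v1.0.0   # assumed form for a URL pinned to the v1.0.0 tag
```

With the pinned form, igbDefaultPrefs.json would only need to change when a new tag is adopted, not on every commit to main.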

            ann.loraine Ann Loraine added a comment -

            Does the path to the quickload site itself include the tag name? If it is always pointing to main, then we will be unable to make any changes to main without having to test every single release of IGB that might be installed on users' computers. Doing that would be a huge pain in the neck.

            ann.loraine Ann Loraine added a comment -

            A tag is merely a pointer to a specific commit. I'm not sure that you can re-attach it to a different commit after the fact. So I'm not sure how this will work....
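On the re-attach question: git does allow moving an existing tag with `git tag -f`. The sketch below shows this in a throwaway repository; `demo_move_tag` is a hypothetical helper that needs git installed, and it prints the commit the tag points to before and after the move.

```sh
# In a throwaway repository, create a tag, add a commit, then move the tag.
# Prints the tagged commit hash before and after the move, one per line.
demo_move_tag() (
  set -e
  repo=$(mktemp -d)
  cd "$repo"
  git init -q
  commit() {
    git -c user.name=demo -c user.email=demo@example.org \
      commit -q --allow-empty -m "$1"
  }
  commit "first release"
  git tag v1.0.0
  git rev-parse v1.0.0
  commit "later fix"
  git tag -f v1.0.0 >/dev/null   # -f re-points the existing tag at HEAD
  git rev-parse v1.0.0
  cd /
  rm -rf "$repo"
)
```

Moving a tag that is already published also needs a forced push of that tag (for example `git push --force origin v1.0.0`), and clones that already fetched the old tag keep it until they update it explicitly, so a moved tag is not a stable pin.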

            nfreese Nowlan Freese added a comment -

            I have created a pull request with the changes to igbDefaultPrefs.json here:
            https://bitbucket.org/lorainelab/integrated-genome-browser/pull-requests/855

            To make the mirror I created a github repository and used the instructions here to create a git mirror.

            git clone --mirror https://github.com/exampleuser/repository-to-mirror.git
            cd repository-to-mirror.git
            git remote set-url --push origin https://github.com/exampleuser/mirrored
            git fetch -p origin
            git push --mirror

            ann.loraine Ann Loraine added a comment -

            PR is merged into IGB code base.

            ann.loraine Ann Loraine added a comment - - edited

            Moved to complete. Thank you Nowlan Freese!

            If there are additional tasks to be done (???), let's make a new Jira issue for them.

            nfreese Nowlan Freese added a comment -

            Tested on 9.1.8 jar with no issues.


              People

              • Assignee:
                nfreese Nowlan Freese
              • Reporter:
                ann.loraine Ann Loraine