Details
- Type: New Feature
- Status: Closed
- Priority: Major
- Resolution: Done
- Affects Version/s: None
- Fix Version/s: 9.1.6 Major Release
- Labels: None
- Story Points: 3
- Epic Link:
- Sprint: Fall 1: 14 Sep - 25 Sep, Fall 2: 28 Sep - 9 Oct, Fall 3: Oct 12 - Oct 23, Fall 6 Nov 30 - Dec 11
Description
Currently, IGB assumes that the genome sequence for a genome version resides physically on the same server as the Quickload metadata files and that the genome version file is named for the genome.
(For details, see the wiki documentation on IGB Quickload.)
Modify this so that a user can specify the genome sequence file using the "file" tag.
e.g.,
<file name="foo.2bit" reference="true"/>
and
<file name="https://example.com/foo.2bit" reference="True"/>
and
<file name="http://example.com/foo.2bit" reference="True"/>
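To make the proposed syntax concrete, here is a minimal Python sketch (not IGB's actual Java implementation) that scans an annots.xml document for "file" elements flagged as the reference sequence. The function name find_reference_files, the enclosing "files" root element, and the sample "genes.bed.gz" entry are assumptions made for illustration. The examples above write the flag as both "true" and "True", so the sketch assumes the value is matched case-insensitively.

```python
import xml.etree.ElementTree as ET


def find_reference_files(annots_xml: str) -> list[str]:
    """Return the name of every <file> element marked reference="true"."""
    root = ET.fromstring(annots_xml)
    names = []
    for elem in root.iter("file"):
        name = elem.get("name")
        # Assumption: the flag is compared case-insensitively.
        flag = elem.get("reference", "").strip().lower()
        if name and flag == "true":
            names.append(name)
    return names


# Hypothetical annots.xml content; only the second entry is a reference.
SAMPLE = """<files>
  <file name="genes.bed.gz" title="Genes"/>
  <file name="https://example.com/foo.2bit" reference="True"/>
</files>"""

if __name__ == "__main__":
    print(find_reference_files(SAMPLE))
```

Entries without the reference attribute are treated as ordinary data files and are skipped.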
Example 2bit files corresponding to various genomes are available from the UCSC download server (hgdownload.soe.ucsc.edu).
That location contains multiple subdirectories for different genome versions. For example, one subdirectory
contains a file named mm10.2bit (http://hgdownload.soe.ucsc.edu/goldenPath/mm10/bigZips/mm10.2bit), which contains the same data shown for IGB genome version M_musculus_Dec_2011.
Attachments
- quickload.tar.gz (2 kB)
- SARS-genome.png (81 kB)
Activity
Sure!
First, read this:
Sharing data using QuickLoad sites (IGB users guide): https://wiki.transvar.org/display/igbman/Sharing+data+using+QuickLoad+sites
Attaching "tarball" file with simple Quickload site. Example data are from the first isolate of the novel coronavirus, sequenced and published back in January 2020. Image shows what IGB looks like when you select this genome.
Select species: Sarbecovirus SARS-CoV-19
Reminder: Unpack "tarball" with:
tar -xzf quickload.tar.gz
Once this feature is implemented, IGB will load the sequence data directly from the referenced URL when the user clicks the "Load Sequence" button.
My understanding from the ticket is that the genome sequence could be in a location other than the Quickload site, so the user needs to specify that location within annots.xml. Is this correct?
Yes, this would be a new feature. If the user specifies the location of the 2bit file, then IGB will get the genomic reference sequence from there. If they have not done that, then IGB will fall back to the current behavior.
There is one other wrinkle: A given genome version may be represented in multiple Quickload sites. IGB currently tries to find the genome reference sequence 2bit file from them in an order designated in the IGB preferences JSON file. If you have any questions about how this order is determined, let me know.
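The lookup described in this comment can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not IGB's code: the function names reference_url and first_available are hypothetical, a relative name is assumed to resolve against the genome version directory, the default location is assumed to be <site>/<genome version>/<genome version>.2bit, and the site order is simply passed in as a list rather than read from the preferences file. The exists callable is a stand-in for whatever availability check the caller wants to use.

```python
from typing import Callable, Iterable, Optional, Tuple
from urllib.parse import urljoin
import xml.etree.ElementTree as ET


def reference_url(site_url: str, genome_version: str, annots_xml: str) -> str:
    """Pick the reference sequence location for one Quickload site."""
    base = site_url.rstrip("/") + "/" + genome_version + "/"
    for elem in ET.fromstring(annots_xml).iter("file"):
        name = elem.get("name")
        if name and elem.get("reference", "").strip().lower() == "true":
            # urljoin returns absolute http(s) URLs unchanged and resolves
            # relative names against the genome version directory.
            return urljoin(base, name)
    # No reference entry: fall back to the file named for the genome version.
    return base + genome_version + ".2bit"


def first_available(
    sites: Iterable[Tuple[str, str]],
    genome_version: str,
    exists: Callable[[str], bool],
) -> Optional[str]:
    """Walk sites in the given order; return the first usable location."""
    for site_url, annots_xml in sites:
        url = reference_url(site_url, genome_version, annots_xml)
        if exists(url):
            return url
    return None
```

Passing exists as a parameter keeps the sketch free of network access, so the ordering logic can be exercised with a stub.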
I am unable to load the genome sequence for the attached quickload site. Does it have something to do with the URL specified in the annots file? Update: I figured this out. I had to rename the 2bit file to match the genome version name.
Code diff - https://bitbucket.org/noorzahara/integrated-genome-browser-local1/branch/IGBF-2357#diff
Testing steps -
Uncompress the attached quickload using tar -xzf quickload.tar.gz
Add the quickload to IGB by selecting configure->Add
Download the 2bit file from http://hgdownload.soe.ucsc.edu/goldenPath/wuhCor1/bigZips/wuhCor1.2bit, rename it to match the genome version (S_SARS-CoV-2_Wuhan-Hu-1_Jan_2020.2bit), and place it within the S_SARS-CoV-2_Wuhan-Hu-1_Jan_2020 folder of the attached quickload
Select species Sarbecovirus SARS-CoV-19 and the available genome version
Load the sequence.
Repeat the above steps, but this time remove the 2bit file from the quickload and uncomment the file tag in annots.xml, i.e.:
<!-- <file name="http://hgdownload.soe.ucsc.edu/goldenPath/wuhCor1/bigZips/wuhCor1.2bit" reference="true"/> -->
The sequence should still load.
[~aloraine] - Kindly review the code and also let me know if the testing steps above make sense.
Code looks good. Noor Zahara - please submit PR.
Moving the ticket back to "Needs Testing" for functional review.
Nowlan Freese - If the testing passes, I think we should include this in 9.1.5. It will make our quickload sites much lighter.
Tested the executable by following the steps described by Noor Zahara; it works as expected.
Fixed null pointer exception - https://bitbucket.org/noorzahara/integrated-genome-browser-local1/branch/IGBF-2357-1#diff
Please submit PR.
Merged and built master branch installers.
I have tested this ticket on Mac. It works as expected for both scenarios in the testing steps described by Noor Zahara.
Because this change breaks 9.1.4, let's try a different syntax.
Instead of:
<file name="..." reference="true"/>
let's implement a new tag:
<reference file="..." />
Dr. Freese has tested the proposed change and confirmed it does not break 9.1.4.
Noor Zahara - On Tuesday would you be able to look at this?
Exception in thread "AWT-EventQueue-0" java.lang.NullPointerException
at com.affymetrix.igb.view.load.FeatureTreeView.addOrFindNode(FeatureTreeView.java:263)
at com.affymetrix.igb.view.load.FeatureTreeView.lambda$createTree$206(FeatureTreeView.java:244)
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
at com.affymetrix.igb.view.load.FeatureTreeView.createTree(FeatureTreeView.java:243)
at com.affymetrix.igb.view.load.FeatureTreeView.initOrRefreshTree(FeatureTreeView.java:162)
at com.affymetrix.igb.view.load.GeneralLoadView.refreshTree(GeneralLoadView.java:439)
at com.affymetrix.genometry.util.ThreadUtils.runOnEventQueue(ThreadUtils.java:34)
at com.affymetrix.igb.view.load.GeneralLoadView.refreshTreeView(GeneralLoadView.java:431)
at com.affymetrix.igb.view.SeqGroupView.versionNameChanged(SeqGroupView.java:440)
at com.affymetrix.igb.view.SeqGroupView.groupSelectionChanged(SeqGroupView.java:382)
at com.affymetrix.genometry.GenometryModel.fireGroupSelectionEvent(GenometryModel.java:161)
at com.affymetrix.genometry.GenometryModel.setSelectedGenomeVersion(GenometryModel.java:147)
at com.affymetrix.igb.view.SeqGroupView$InitVersionWorker.finished(SeqGroupView.java:691)
at com.affymetrix.genometry.thread.CThreadWorker.done(CThreadWorker.java:51)
at javax.swing.SwingWorker$5.run(SwingWorker.java:737)
at javax.swing.SwingWorker$DoSubmitAccumulativeRunnable.run(SwingWorker.java:832)
at sun.swing.AccumulativeRunnable.run(AccumulativeRunnable.java:112)
at javax.swing.SwingWorker$DoSubmitAccumulativeRunnable.actionPerformed(SwingWorker.java:842)
at javax.swing.Timer.fireActionPerformed(Timer.java:313)
at javax.swing.Timer$DoPostEvent.run(Timer.java:245)
at java.awt.event.InvocationEvent.dispatch(InvocationEvent.java:311)
at java.awt.EventQueue.dispatchEventImpl(EventQueue.java:758)
at java.awt.EventQueue.access$500(EventQueue.java:97)
at java.awt.EventQueue$3.run(EventQueue.java:709)
at java.awt.EventQueue$3.run(EventQueue.java:703)
at java.security.AccessController.doPrivileged(Native Method)
at java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:74)
at java.awt.EventQueue.dispatchEvent(EventQueue.java:728)
at java.awt.EventDispatchThread.pumpOneEventForFilters(EventDispatchThread.java:205)
at java.awt.EventDispatchThread.pumpEventsForFilter(EventDispatchThread.java:116)
at java.awt.EventDispatchThread.pumpEventsForHierarchy(EventDispatchThread.java:105)
at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:101)
at java.awt.EventDispatchThread.pumpEvents(EventDispatchThread.java:93)
at java.awt.EventDispatchThread.run(EventDispatchThread.java:82)
[~aloraine] please advise how you would like to proceed.
After further investigation, the issue stems from a potential bug introduced in 8.4.0, which requires that each file in the quickload annots.xml have a title. Note that the title is currently listed as optional in the IGB users guide. The quickload attached to this issue for testing does not include a title in annots.xml for the 2bit reference. This is what causes the null pointer exception and the failure to load any data in IGB 8.4.0 - 9.1.4.
If a title is added to the reference 2bit file entry in annots.xml, then IGB 9.1.4 does not throw the null pointer exception. The reference title appears in the available data and is selectable. If the user selects the reference title and clicks load data, the sequence is loaded in the coordinates track. This allows 2bit or fa reference sequence files to be specified remotely in IGB 9.1.4. If the 2bit file is available in the quickload, then clicking load sequence loads the sequence in the coordinates track.
In IGB 9.1.6, the title is ignored and the reference does not appear in the available data.
The easiest resolution to this issue would be to make no further code changes in IGB, but to update the users guide to state that the title is required. In versions of IGB earlier than 9.1.6, new quickloads that include a remote reference sequence will show a reference data option in the available data.
It may be possible to edit IGB code to add a new XML root element, but this may require some significant code changes.
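Since IGB 8.4.0 through 9.1.4 fail when a file entry lacks a title, a Quickload maintainer could check annots.xml for missing titles before publishing. The following is a minimal Python sketch of such a check and is not part of IGB; the function name files_missing_title and the sample file names are hypothetical, and the "files" root element is assumed.

```python
import xml.etree.ElementTree as ET


def files_missing_title(annots_xml: str) -> list[str]:
    """List the name of each <file> element with no non-empty title."""
    missing = []
    for elem in ET.fromstring(annots_xml).iter("file"):
        title = (elem.get("title") or "").strip()
        if not title:
            # Fall back to a placeholder when the name is absent too.
            missing.append(elem.get("name", "<unnamed>"))
    return missing


# Hypothetical annots.xml: the reference entry has no title attribute.
SAMPLE = """<files>
  <file name="genes.bed.gz" title="Genes"/>
  <file name="http://example.com/foo.2bit" reference="true"/>
</files>"""

if __name__ == "__main__":
    for name in files_missing_title(SAMPLE):
        print("missing title:", name)
```

An empty result means every entry carries a title, which is the condition the older releases need.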
A clarification question:
- If a 2bit file is provided in the quickload and the 2bit file is listed using the "file" tag with proper "name" and "title" attributes, what happens when the user clicks "Load Data"? Does the sequence get loaded along with all the other tracks with "Load Mode" set to "Region in View"?
(If yes, this almost counts as a feature, not a bug!)
Regardless of the answer to the question, I would be OK with the described behavior, since we can easily avoid breaking 9.1.4 by including the "title" attribute in our annots.xml files.
In 9.1.4:
If a 2bit file is provided in the quickload and the 2bit file is listed in annots.xml:
If the user selects the 2bit file in the available data, no track appears for the 2bit file. However, if the user then clicks load data, the sequence for the region in view is loaded in the coordinates track, along with whatever data/annotations are present.
After discussion with Dr. Loraine, the current decision is to make no further changes to the IGB code. I have updated the IGB users guide so that the title attribute is now listed as required.
Closing issue.
[~aloraine] - Could you tell me where I can read more about this in the IGB wiki, or share steps to reproduce the above? I am not totally clear about this yet.