  1. Bug Repository
  2. BUG-604

Try loading a 1000 Genomes file into IGB

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Labels: None

      Description

      TASK:

      Write a SHORT vignette (as succinct as possible) for the User's Guide describing how to access a 1000 Genomes file from AWS.

      Before you get started, spend an hour reading up on the 1000 Genomes Project: what it is, and so on.

      Also, before you get started, see this post on the IGV help group:

      https://groups.google.com/forum/#!topic/igv-help/Z6vF2n8nzSc[1-25]

      Hi everyone,

      I would like to compare my own bam file with data from the 1000 Genomes list (I usually pull the last one up through "File -> Load from Server").

      I succeeded in writing a batch script that shows my own bam file and saves a snapshot, but how can I add the track from the 1000 Genomes data to this?

      Thanks for any help with this!

      Best,
      ~Lina

      Hi,

      If you know the URL to the file, either ftp or http, you can use "Load from URL..." and paste it in, or the batch load command. For example,

      ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/data/HG00111/alignment/HG00111.chrom11.ILLUMINA.bwa.GBR.low_coverage.20111114.bam

      The same data can be loaded from the "cloud" with this URL:

      http://1000genomes.s3.amazonaws.com/data/HG00111/alignment/HG00111.chrom11.ILLUMINA.bwa.GBR.low_coverage.20111114.bam
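
      The batch approach mentioned above can be sketched as a small Python helper that writes out IGV batch commands (new, genome, snapshotDirectory, load, goto, snapshot). This is only an illustration: the function name, the local BAM path, the locus, the snapshot names, and the choice of hg19 as the genome are assumptions, not details from the thread.

```python
def igv_batch_script(track_urls, locus, snapshot_dir, snapshot_name, genome="hg19"):
    """Build the text of an IGV batch script that loads tracks and saves one snapshot.

    Each entry in track_urls may be a local path or an http/ftp URL.
    genome="hg19" is an assumption; use whichever genome matches your data.
    """
    lines = ["new", f"genome {genome}", f"snapshotDirectory {snapshot_dir}"]
    lines += [f"load {url}" for url in track_urls]
    lines += [f"goto {locus}", f"snapshot {snapshot_name}"]
    return "\n".join(lines) + "\n"


# Hypothetical usage: one local BAM plus the 1000 Genomes BAM from the cloud URL.
script = igv_batch_script(
    track_urls=[
        "/path/to/my_sample.bam",  # hypothetical local file
        "http://1000genomes.s3.amazonaws.com/data/HG00111/alignment/"
        "HG00111.chrom11.ILLUMINA.bwa.GBR.low_coverage.20111114.bam",
    ],
    locus="chr11:1,000,000-1,001,000",  # hypothetical region
    snapshot_dir="/tmp/igv_snapshots",  # hypothetical output directory
    snapshot_name="mine_vs_HG00111.png",
)
```

      The returned text can be saved to a file and run through IGV's batch-script facility; adding the 1000 Genomes track is then just one more load line.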

      In general I recommend using the cloud URLs; performance will be much better. Unfortunately, browsing the cloud dataset is not easy without a tool. However, it usually works to first find the data by browsing the ftp site at ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/data/, then replacing the first part of the URL with the cloud equivalent, i.e. replace

      ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp

      with

      http://1000genomes.s3.amazonaws.com
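
      The prefix swap described above is mechanical, so it can be sketched in a few lines of Python. The two prefixes are the ones quoted in the post; the function name to_cloud_url is made up for illustration, and the sketch does not check that the file actually exists at the cloud location.

```python
FTP_PREFIX = "ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp"
CLOUD_PREFIX = "http://1000genomes.s3.amazonaws.com"


def to_cloud_url(ftp_url):
    """Rewrite a 1000 Genomes FTP URL to its cloud equivalent by swapping the prefix."""
    if not ftp_url.startswith(FTP_PREFIX + "/"):
        raise ValueError(f"not a 1000 Genomes FTP URL: {ftp_url}")
    # Keep everything after the FTP prefix (it starts with "/") unchanged.
    return CLOUD_PREFIX + ftp_url[len(FTP_PREFIX):]


cloud_bam = to_cloud_url(
    "ftp://ftp-trace.ncbi.nih.gov/1000genomes/ftp/data/HG00111/alignment/"
    "HG00111.chrom11.ILLUMINA.bwa.GBR.low_coverage.20111114.bam"
)
```

      Applied to the FTP example earlier in the post, this yields the cloud URL given there.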

      For VCF files I recommend just downloading them; they aren't so huge, and performance is better. If you want to load them remotely, the cloud URLs are your only option, as ftp is not supported for tabix-indexed files. For example:

      http://1000genomes.s3.amazonaws.com/release/20110521/ALL.chr11.phase1_release_v3.20101123.snps_indels_svs.genotypes.vcf.gz

      Jim
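
      Remote loading of indexed files, as discussed in the reply above, depends on the viewer finding an index alongside the data file. A minimal sketch, assuming the usual naming conventions (a .tbi file next to a bgzipped VCF, a .bai file next to a BAM); the thread does not state where the indexes live, and the function name index_url is hypothetical.

```python
def index_url(data_url):
    """Guess the index URL for a remote BAM or tabix-indexed VCF.

    Assumes the index sits next to the data file under the conventional
    suffix; this is a naming convention, not something the server guarantees.
    """
    if data_url.endswith(".vcf.gz"):
        return data_url + ".tbi"
    if data_url.endswith(".bam"):
        return data_url + ".bai"
    raise ValueError(f"unsupported file type: {data_url}")


vcf_index = index_url(
    "http://1000genomes.s3.amazonaws.com/release/20110521/"
    "ALL.chr11.phase1_release_v3.20101123.snps_indels_svs.genotypes.vcf.gz"
)
```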

      (In reply to Lina Faller's post of Thursday, August 2, 2012 5:54:24 PM UTC-4, quoted above.)

    People

    • Assignee: Unassigned
    • Reporter: Ann Loraine (ann.loraine)