Details

    • Type: Support
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Labels:
      None

      Description

      From: "Moreau, Morgane" <morgane.moreau@jcu.edu.au>
      Date: Tue, Oct 27, 2015 at 2:45 AM
      Subject: Re: Chip-seq in galaxy
      To: Freese, Nowlan <nfreese@uncc.edu>

      Dear Ms Freese,

      I followed the video tutorial on ChIP-seq analysis and visualization on Galaxy.

      I am mapping my reads to the E. coli genome, as a custom build, in a Galaxy instance (GVL-QLD).

      I have two problems I was hoping you could help me with.

      First, when I map with Bowtie I end up with a SAM file in which some rows have the sequence in the quality score column and the quality scores in the OPT column, while other rows are fine. I didn’t manage to fix that, so I mapped my reads with BWA. After removing the unmapped reads I tried peak calling with MACS, and it keeps failing.

      I have 70,000 sequences in my treatment sample and 150,000 in my control (input).

      I get this error message:

      INFO @ Tue, 27 Oct 2015 17:28:06:
      # ARGUMENTS LIST:
      # name = MACS_in_Galaxy
      # format = SAM
      # ChIP-seq file = /mnt/galaxy/files/000/117/dataset_117750.dat
      # control file = /mnt/galaxy/files/000/117/dataset_117749.dat
      # effective genome size = 4.00e+06
      #

      And this one in the generation errors:

      WARNING @ Tue, 27 Oct 2015 17:28:10: Too few paired peaks (0) so I can not build the model! Lower your MFOLD parameter may erase this error.

      WARNING @ Tue, 27 Oct 2015 17:28:10: Process is terminated!

          Activity

          mason Mason Meyer (Inactive) added a comment -

          From: Freese, Nowlan <nfreese@uncc.edu>
          Date: Tue, Oct 27, 2015 at 11:38 AM
          Subject: Re: Chip-seq in galaxy
          To: "Moreau, Morgane" <morgane.moreau@jcu.edu.au>

          Hi Dr. Moreau,

          Regarding the second issue, you are correct, the error is caused by a lack of peaks being found. This could be caused by a few things.

          MFOLD - The MFOLD might be too high for your data. It's possible your peaks are quite small (the fold enrichment compared to the control). If you tried an MFOLD of 5 and began to see peaks, then I would try a value as low as 3 to see how many peaks it finds.

          P-Value - The other thing to try would be to use a less stringent p-value. It could be that some of the peaks are being removed due to a stringent p-value.

          As to the first issue, I have never heard of Bowtie inadvertently switching columns. This sounds like it may be an issue with this specific instance of Galaxy. How effective was BWA in aligning your reads? Are the 70,000 reads for the treatment before or after aligning? The issue with not finding enough peaks could be due to low coverage, assuming a genome size of 4.6 million bases and a read length of 36, but it's hard to say for sure.
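The low-coverage concern can be made concrete with a quick calculation. This is only a rough sketch: it assumes every one of the reported reads maps uniquely, and it uses the 4.6 million base genome size and 36-base read length assumed in the reply (the MACS log above reports an effective genome size of 4.00e+06 instead). The function name is illustrative.

```python
def mean_coverage(n_reads: int, read_length: int, genome_size: float) -> float:
    """Average number of reads covering each base, assuming all reads map uniquely."""
    return n_reads * read_length / genome_size


GENOME_SIZE = 4.6e6  # bases, as assumed in the reply
READ_LENGTH = 36     # bases, as assumed in the reply

treatment_cov = mean_coverage(70_000, READ_LENGTH, GENOME_SIZE)
control_cov = mean_coverage(150_000, READ_LENGTH, GENOME_SIZE)

print(f"treatment: {treatment_cov:.2f}x")  # about 0.55x
print(f"control:   {control_cov:.2f}x")    # about 1.17x
```

Under these assumptions the treatment sample covers each base less than once on average, which fits the suspicion that there may be too little signal to build the peak model.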

          I would continue trying different MFOLD values and p-values to see how that changes the number of peaks you see. If the aligner wasn't very efficient, I would also consider re-aligning to see if you can improve the overall alignment.
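One way to organise this trial and error is to list the MFOLD and p-value combinations up front. The sketch below only builds command strings and does not run anything. The file names are hypothetical, and the flag spellings (`--mfold`, `--pvalue`, `--gsize`, `--format`, `--name`) are assumed to match the MACS 1.x command line. In Galaxy the same values would be entered in the tool form, and newer MACS releases expect a `lower,upper` range for `--mfold`, so check `macs --help` for the installed version.

```python
from itertools import product


def macs_commands(treatment, control, mfolds, pvalues, gsize="4.6e6"):
    """Build one MACS command string per (MFOLD, p-value) combination."""
    commands = []
    for mfold, pvalue in product(mfolds, pvalues):
        name = f"macs_mfold{mfold}_p{pvalue}"  # hypothetical run label
        commands.append(
            f"macs -t {treatment} -c {control} --format=SAM --gsize={gsize} "
            f"--mfold={mfold} --pvalue={pvalue} --name={name}"
        )
    return commands


# Hypothetical file names; step MFOLD down and relax the p-value.
for cmd in macs_commands("treatment.sam", "control.sam", [10, 5, 3], ["1e-5", "1e-3"]):
    print(cmd)
```

Comparing the number of peaks across these runs shows which of the two settings is actually limiting the result.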

          Nowlan

          PS - Thank you for watching my videos, I hope you found them helpful!
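The column shift reported for the Bowtie output could be narrowed down by scanning the SAM file for rows whose SEQ and QUAL fields (columns 10 and 11) look wrong. The following is a minimal sketch under that assumption. The function name and the example rows are made up for illustration, and it checks only the field count, the SEQ character set from the SAM specification, and whether SEQ and QUAL have equal length.

```python
import re

# SAM specification: SEQ is either "*" or matches [A-Za-z=.]+
SEQ_PATTERN = re.compile(r"^(\*|[A-Za-z=.]+)$")


def suspicious_sam_lines(lines):
    """Yield (line_number, reason) for alignment rows whose SEQ/QUAL fields look wrong."""
    for number, line in enumerate(lines, start=1):
        if line.startswith("@") or not line.strip():
            continue  # skip header lines and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            yield number, "fewer than 11 mandatory fields"
            continue
        seq, qual = fields[9], fields[10]
        if not SEQ_PATTERN.match(seq):
            yield number, "SEQ field contains non-sequence characters"
        elif seq != "*" and qual != "*" and len(seq) != len(qual):
            yield number, "SEQ and QUAL lengths differ"


# Made-up rows: the second has an extra column before SEQ, shifting SEQ and QUAL right.
good = "r1\t0\tchr\t1\t255\t4M\t*\t0\t0\tACGT\tIIII"
bad = "r2\t0\tchr\t1\t255\t4M\t*\t0\t0\t0\tACGT\tIIII"
for number, reason in suspicious_sam_lines([good, bad]):
    print(number, reason)
```

If only some reads are flagged, comparing them with the unaffected ones (for example read names or lengths) may show whether the problem lies in the input FASTQ or in the Galaxy wrapper.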


            People

            • Assignee:
              nfreese Nowlan Freese
              Reporter:
              mason Mason Meyer (Inactive)
            • Votes:
              0
              Watchers:
              1

              Dates

              • Created:
                Updated:
                Resolved: