[TEST-23] IGB visualization of files (.bam and .bed) output by TopHat 2.0 (using Galaxy)-Human adrenal tissue (paired-end) - JIRA UNCC

Details

Type: Task
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Labels:
None
Environment:

test.galaxyproject.org & IGB 8.2 (using a Mac OS X 10.9.3)

Description

Date: June 4, 2014

Purpose: To test aspects of Galaxy/IGB integration, specifically IGB visualization of files (.bam and .bed) output by TopHat 2.0 (using Galaxy).

Files Tested: RNA-Seq data from Homo sapiens adrenal tissue (paired-end)
1) adrenal 1.fastq.fastqsanger
2) adrenal 2.fastq.fastqsanger

Workflow:

1) I uploaded .fastq files to Galaxy, [Get Data==>Upload file]. My files were uploaded to Galaxy by browsing to the file location on my computer.

2) I performed a quality control check on the data using the FASTQC tool in Galaxy [NGS: QC & Manipulation==>FASTQC:Read QC]. This allowed me to determine whether my data set had any issues that needed to be noted before analysis. I clicked on "View data" (the eyeball icon) to view the FASTQC report.
*Issue I encountered: The FASTQC tool failed to complete the task (error: Picked up _JAVA_O). It was not working correctly on the Galaxy test site on 6/3/14, but it was working correctly on 6/4/14. It works correctly on the Galaxy main site with .fastq files as well as .fastqsanger files.

4) I determined that my data set was ready for further analysis based on the report.

5) Before running TopHat2 to map the processed reads to the genome, I needed to ensure that my .fastq files had Sanger-scaled quality values with ASCII offset 33. I did this by running FASTQ Groomer [NGS: QC & Manipulation==>FASTQ Groomer] on each .fastq file: I selected the FASTQ Groomer tool, selected the .fastq file to groom, selected "Sanger & Illumina 1.8+" as the Input FASTQ quality scores type, and hit "Execute".

6) The groomed data was now in my history and I was ready to run TopHat 2.0 [NGS: RNA Analysis==>Tophat2]. I selected the tool, then selected "Paired-end" for my data set. I selected my groomed RNA-Seq FASTQ files, then selected my reference genome [Human(Homo sapiens)(b37): hg19 Canonical female] (as directed in the RNA-Seq exercise). I chose TopHat's default settings and did not specify a read group. When finished, I hit "Execute" to begin the task.
*Issue I encountered earlier but already corrected in the previous step: The .fastq file was not selectable within the Tophat2 widget. This most likely means that the .fastq file was not recognized as having the quality score type that TopHat requires (Sanger-scaled quality values with ASCII offset 33).
-To Fix: I ran the FASTQ Groomer tool [NGS: QC & Manipulation==>FASTQ Groomer] as outlined in step #5. I selected "Sanger & Illumina 1.8+" as the Input FASTQ quality scores type and hit "Execute". After that, my .fastq file was compatible with TopHat and I was able to select the groomed file within the Tophat2 widget.

7) TopHat 2.0 wrote its output files to my Galaxy history; the four tested here were "Accepted Hits" (.bam), "Splice Junctions" (.bed), "Deletions" (.bed), and "Insertions" (.bed). I clicked on the title of each file to expand it and show additional information and options. To visualize a file in the Integrated Genome Browser (IGB), I found "display in IGB View" in the expanded section and clicked on the "View" link.

8) An IGB webpage opened in a new browser window stating that my data was ready to view. IGB was properly installed on my computer and was running, so I clicked on the button labeled "Click to go to IGB".

9) This brought me to the Integrated Genome Browser interface. An empty track for my file appeared among the tracks displayed. I navigated to my chromosome/scaffold of interest using the panel on the right-hand side of the IGB window. I zoomed to my region of interest with the zoom slider across the top of the IGB window. Then I clicked on my track and hit "Load Data" in the upper right-hand corner of the IGB window.
*Issue I encountered: The data loaded, but the IGB track appeared white and empty.
-To Fix: I used the panel on the right-hand side of IGB to locate my chromosome of interest. I was unsure which chromosome that was, so I went back to my Galaxy history and clicked on "View data" (the eyeball icon) beside the "Splice Junctions" (.bed) file output by TopHat to show a table of its results. The first column listed the chromosomes that contained splice junctions, so I navigated to one of those within IGB to visualize the data.
*Issue I encountered: I could not visualize the "Insertions" (.bed) file output by TopHat in IGB. I learned that this is because an insertion is located in the space between bases, so it would technically be incorrect for IGB to draw the insertion at a position that a base occupies.

10) I went back to my Galaxy history and repeated steps #7-9 to add the rest of the TopHat output files to IGB. This let me visualize the TopHat output files simultaneously. I also exported an image so that I would have a copy of the IGB visualization.
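
As a rough illustration of the chromosome lookup described in the fix for step 9, the first column of a BED file can also be summarized with a short script. This is a minimal Python sketch, not part of the original test: the function name `chromosomes_in_bed` and the example junction records are hypothetical and are not taken from the tested data set.

```python
from collections import Counter
import io


def chromosomes_in_bed(lines):
    """Count features per chromosome (first column) in BED-formatted lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        # Skip blank lines and optional header/comment lines.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        chrom = line.split()[0]  # BED column 1 is the chromosome/scaffold name
        counts[chrom] += 1
    return counts


# Hypothetical records for illustration only (assumed, not real results).
example = io.StringIO(
    'track name=junctions description="example junctions"\n'
    "chr19\t1000\t2000\tJUNC00000001\t5\t+\n"
    "chr19\t3000\t4500\tJUNC00000002\t2\t-\n"
    "chr1\t500\t900\tJUNC00000003\t1\t+\n"
)
counts = chromosomes_in_bed(example)
for chrom, n in counts.most_common():
    print(chrom, n)
```

To use it on a real file downloaded from the Galaxy history, the `io.StringIO` object could be replaced with an open file handle. The chromosome with the most features is a reasonable first place to navigate to in IGB.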

Attachments

GalaxyIGB Integration_TopHat-Homosapien_Adrenaltissue_Chr19.png
161 kB
04/Jun/14 12:45 PM

Activity

Mason Meyer (Inactive) created issue - 04/Jun/14 12:45 PM

Mason Meyer (Inactive) made changes - 04/Jun/14 1:49 PM

Mason Meyer (Inactive) made changes - 04/Jun/14 1:58 PM

Description

Date: June 4, 2014

Purpose: To test aspects of Galaxy/IGB integration, specifically IGB visualization of files (.bam and .bed) output by TopHat 2.0 (using Galaxy).

Workflow:

1) Upload .fastq file(s) to Galaxy [Get Data==>Upload file]. Once a file is uploaded, it should appear in your Galaxy "history" on the right-hand side of the interface.

2) Perform a quality control check on the data using the FASTQC tool in Galaxy [NGS: QC & Manipulation==>FASTQC:Read QC]. This lets you determine whether your data set has any problems you should be aware of before doing further analysis. Click on "View data" (the eyeball icon) to view the FASTQC report.

3) When looking at the "Per base sequence quality" section of the FASTQC report, you may notice base positions with a low median (or bottom quartile) score, depending on your data set. It is up to the analyst to decide what counts as a "low" median score for the data, but median scores below 20 are often considered low. If you notice low scores, it may be useful to trim your data using the Trim Sequences tool [NGS: QC & Manipulation==>Trim sequences].

4) Determine whether your data set is ready for further analysis based on the report.

5) Before running TopHat2 to map the processed reads to the genome, ensure that your .fastq file has Sanger-scaled quality values with ASCII offset 33. You can do this by running FASTQ Groomer [NGS: QC & Manipulation==>FASTQ Groomer] on your .fastq file. Select the FASTQ Groomer tool, select your .fastq file to groom, select "Sanger & Illumina 1.8+" as the Input FASTQ quality scores type, and hit "Execute".

6) You will now have a new, groomed file in your history and are ready to run TopHat 2.0 [NGS: RNA Analysis==>Tophat2]. Select the tool. Select whether your data set is Single-end or Paired-end. Then select your reference genome (you can use a built-in genome or one from your Galaxy history). Select whether to use TopHat's default settings or additional parameters. Then select whether to specify the read group. When finished, hit "Execute" to begin the task.

7) TopHat 2.0 will write its output files to your Galaxy history, including "Accepted Hits" (.bam), "Splice Junctions" (.bed), "Deletions" (.bed), and "Insertions" (.bed). Click on the title of each file to expand it and show additional information and options. To visualize a file in the Integrated Genome Browser (IGB), find "display in IGB View" and click on the "View" link.

8) An IGB webpage will open in a new browser window stating that your data is ready to view. Make sure that IGB is properly installed on your computer and is running, then click on the button labeled "Click to go to IGB".

9) This will bring you to the Integrated Genome Browser interface. You will see an empty track for your file among the tracks displayed. Navigate to your chromosome/scaffold of interest using the panel on the right-hand side of the IGB window. Zoom to your region of interest with the zoom slider across the top of the IGB window. Click on your track and then hit "Load Data" in the upper right-hand corner of the IGB window. The more zoomed in you are, the quicker the data will load, because IGB loads data only within the region being viewed.

10) Go back to your Galaxy history and repeat steps #7-9 to add the rest of the TopHat output files to IGB. This makes it possible to visualize the TopHat output files simultaneously.

11) Any additional questions may be posted to the IGB help forum at http://sourceforge.net/p/genoviz/discussion/439786
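
The per-base quality check and the Sanger (ASCII offset 33) encoding described in the workflow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how FASTQC or FASTQ Groomer are implemented: the function names `phred_scores` and `low_quality_positions` and the example quality strings are hypothetical, and the threshold of 20 is the rule of thumb mentioned in the workflow.

```python
from statistics import median

SANGER_OFFSET = 33  # ASCII offset for Sanger / Illumina 1.8+ quality strings


def phred_scores(quality_line, offset=SANGER_OFFSET):
    """Decode one FASTQ quality line into a list of Phred scores."""
    return [ord(ch) - offset for ch in quality_line]


def low_quality_positions(quality_lines, threshold=20):
    """Return 1-based read positions whose median Phred score is below threshold."""
    by_position = {}
    for qline in quality_lines:
        for pos, score in enumerate(phred_scores(qline), start=1):
            by_position.setdefault(pos, []).append(score)
    return [
        pos
        for pos, scores in sorted(by_position.items())
        if median(scores) < threshold
    ]


# Hypothetical quality strings (the fourth line of each FASTQ record).
qualities = ["IIII#", "III5#", "III+#"]
print(phred_scores("I5+#"))              # [40, 20, 10, 2]
print(low_quality_positions(qualities))  # [5]
```

In this example only the last position has a median below 20, so it would be a candidate for trimming. If decoding real data with offset 33 yields scores far above the usual range (roughly 0 to 41), the file may use an older offset 64 encoding, which is the case FASTQ Groomer is meant to handle.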

Mason Meyer (Inactive) made changes - 06/Jun/14 5:55 AM

Description


Mason Meyer (Inactive) made changes - 06/Jun/14 6:11 AM

Environment

test.galaxyproject.org & IGB 8.2 (using a Mac OS X 10.9.3)

Mason Meyer (Inactive) made changes - 06/Jun/14 6:23 AM

Summary	IGB visualization of files (.bam and .bed) output by TopHat 2.0 (using Galaxy)	IGB visualization of files (.bam and .bed) output by TopHat 2.0 (using Galaxy)-Human adrenal tissue (paired-end)
An image was exported as well so that I could have a copy of the IGB visualization.

Mason Meyer (Inactive) made changes - 06/Jun/14 11:03 AM

Description

Date: June 4, 2014

Purpose: To test aspects of Galaxy/IGB integration, specifically, to test IGB visualization of files (.bam and .bed) output by Tophat 2.0 (using Galaxy).

Files Tested: RNA-Seq data from Homo sapiens adrenal tissue (paired-end)
        1) adrenal 1.fastq.fastqsanger
        2) adrenal 2.fastq.fastqsanger

Workflow:

1) I uploaded .fastq files to Galaxy, [Get Data==>Upload file]. My files were uploaded to Galaxy by browsing to the file location on my computer.

2) I performed a quality control check on the data using the FASTQC tool in Galaxy [NGS: QC & Manipulation==>FASTQC:Read QC]. This allowed me to determine whether my data set had any problems I should be aware of before doing further analysis. I clicked on "View data" (the eyeball icon) to view the FASTQC report.
    *Issue I encountered: The FASTQC tool failed to complete the task (error: Picked up _JAVA_O). It was not working correctly on the Galaxy test site on 6/3/14, but it was working correctly on 6/4/14. It works correctly on the Galaxy main site with .fastq files as well as .fastqsanger files.

4) I determined that my data set was ready for further analysis based on the report.

5) Before running Tophat2 to map the processed reads to the genome, I needed to ensure that my .fastq file had Sanger-scaled quality values with ASCII offset 33. I did this by running FASTQ Groomer [NGS: QC & Manipulation==>FASTQ Groomer] on my .fastq file: I selected the FASTQ Groomer tool, selected my .fastq file to groom, selected "Sanger & Illumina 1.8+" as the Input FASTQ quality scores type, and hit "Execute".

6) A new, groomed file appeared in my history, and I was ready to run TopHat 2.0 [NGS: RNA Analysis==>Tophat2]. I selected the tool, then selected "Paired-end" for my data set. I selected my groomed RNA-Seq FASTQ files, then selected my reference genome [Human(Homo sapiens)(b37): hg19 Canonical female] (as directed in the RNA-Seq exercise). I kept TopHat's default settings and did not specify a read group. When finished, I hit "Execute" to begin the task.
    *Issue I encountered earlier but already corrected in the previous step: The .fastq file is not selectable within the Tophat 2 widget. This most likely means that the .fastq file does not use the FASTQ quality scores type that TopHat requires (it must have Sanger-scaled quality values with ASCII offset 33).
    -To Fix: I ran the FASTQ Groomer tool [NGS: QC & Manipulation==>FASTQ Groomer] as outlined in step 5. I selected "Sanger & Illumina 1.8+" as the Input FASTQ quality scores type and hit "Execute". After doing that, my .fastq file was compatible with TopHat and I was able to select my groomed file within the Tophat 2 widget.

7) TopHat 2.0 outputs 5 files into the Galaxy history, including "Accepted Hits" (.bam), "Splice Junctions" (.bed), "Deletions" (.bed), and "Insertions" (.bed). I clicked on the title of each file to open a drop-down menu containing additional information and options. To visualize a file within the Integrated Genome Browser (IGB), I found "display in IGB View" in the drop-down menu and clicked on the "View" link.

8) An IGB webpage opened in a new browser window stating that my data was ready to view. IGB was properly installed on my computer and the program was running, so I clicked on the button that says "Click to go to IGB".

9) This brought me to the Integrated Genome Browser interface. An empty track for my file appeared among the tracks displayed. I navigated to my chromosome/scaffold of interest using the panel on the right-hand side of the IGB window. I zoomed to my region of interest by adjusting the zoom slider located across the top of the IGB window. Then I clicked on my track and hit "Load Data", located in the upper right-hand corner of the IGB window.
    *Issue I encountered: Data loads, but the IGB track appears white and empty.
    -To Fix: I used the panel on the right-hand side of IGB to locate my chromosome of interest. I was unsure of my chromosome of interest, so I went back to my Galaxy history and clicked on "View data" (the eyeball icon) beside the "Splice Junctions" (.bed) file output by TopHat to show a data table of TopHat's results for "Splice Junctions". The first column indicated the chromosomes that contained splice junctions, so this is where I navigated to within IGB to visualize the data.
    *Issue I encountered: I could not visualize the "Insertions" (.bed) file output by TopHat in IGB. I learned that this is because an insertion is located in the space between bases, so it would technically be incorrect for IGB to overlay the insertions on the positions that bases occupy.

10) I went back to my Galaxy history and repeated steps 7-9 to add the rest of the files output by TopHat to IGB. This made it possible for me to visualize the TopHat output files simultaneously. I also exported an image so that I could keep a copy of the IGB visualization.
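
As a rough cross-check outside Galaxy, the quality-encoding check from step 5 and the chromosome lookup from step 9 could be sketched in Python as follows. This is only an illustrative sketch and was not part of the tested workflow. The function names are hypothetical. The FASTQ check assumes four-line records and relies on the heuristic that characters below ASCII 59 cannot occur in the older offset-64 encodings. The BED helper assumes the chromosome name is the first whitespace-separated field and that header lines start with "track", "browser", or "#".

```python
from collections import Counter


def quality_strings(fastq_lines):
    """Yield the quality line of each record, assuming 4-line FASTQ records."""
    for index, line in enumerate(fastq_lines):
        if index % 4 == 3:
            yield line.strip()


def confirms_offset_33(fastq_lines):
    """Return True if any quality character is below ASCII 59 (';').

    Such characters cannot occur in offset-64 encodings, so finding one
    indicates ASCII offset 33. False only means "not confirmed"; it does
    not prove that the file uses offset 64.
    """
    return any(
        ord(ch) < 59
        for quality in quality_strings(fastq_lines)
        for ch in quality
    )


def chromosomes_in_bed(bed_lines):
    """Count BED records per chromosome (first column), skipping headers."""
    counts = Counter()
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and the header lines a BED file may start with.
        if not line or line.startswith(("track", "browser", "#")):
            continue
        counts[line.split()[0]] += 1
    return counts


if __name__ == "__main__":
    # Made-up sample data for illustration only, not taken from the adrenal files.
    sample_fastq = ["@read1", "ACGT", "+", "II5#"]
    print(confirms_offset_33(sample_fastq))

    sample_bed = [
        "track name=junctions",
        "chr1\t100\t200\tJUNC1\t2\t+",
        "chr1\t300\t450\tJUNC2\t5\t-",
        "chr2\t120\t260\tJUNC3\t1\t+",
    ]
    print(chromosomes_in_bed(sample_bed))
```

To apply this to real files, the same functions could be given an open file handle, since they only iterate over lines. The chromosome counts indicate which sequences are worth navigating to in IGB before hitting "Load Data".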

People

Assignee:

Mason Meyer (Inactive)

Reporter:

Mason Meyer (Inactive)

Votes:

0

Watchers:

0

Dates

Created:

04/Jun/14 12:45 PM

Updated:

06/Jun/14 11:03 AM