[IGBF-3636] Single Cell RNA dataset Backup - JIRA UNCC

Details

Type: Task
Status: Closed (View Workflow)
Priority: Major
Resolution: Done
Affects Version/s: None
Fix Version/s: None
Labels:
None

Story Points:
2
Epic Link:
Support NSF pollen grant
Sprint:
Spring 5

Description

A new dataset for the Mark Johnson lab.

Requires an FTP pull and an upload to the cluster for backup.

Time is of the essence as the sequencing center will wipe the data after 30 days.

Raja is also pulling the data and will back it up at Brown U.

Attachments

Activity

Robert Reid created issue - 06/Mar/24 10:19 AM

Robert Reid made changes - 06/Mar/24 10:19 AM

Field	Original Value	New Value
Epic Link		IGBF-2993 [ 21429 ]

Robert Reid made changes - 06/Mar/24 10:19 AM

Status

To-Do [ 10305 ]

In Progress [ 3 ]

Robert Reid added a comment - 06/Mar/24 10:20 AM

Hi Rasha,

An update for your current order is now available:

Service Line: Next Generation Sequencing
Order Type: Sequencing Only
Order Number: 30-986962648
Order Name: Rasha ALTHIAB ALMASAUD
PO Number: PUR-0147243
Project Status: Data Processing and QC, Processing
Comments:

If there are any issues with your project or samples, we will reach out to you shortly.
You can check the status of your project in real time with the link below. Moreover, we will notify you again at the next status change.

https://clims4.genewiz.com/OrderSummary/Summary?OrderId=2b3e6ea8-0836-43d4-80a2-5a034a471c2a&lobId=3&serviceId=3160

Associated data files with this project can be accessed at any time with the link below to the "My Results" section of your CLIMS account. Due to size limitations, the raw FASTQ files and bioinformatics reports (if applicable) are not posted here and are delivered via hard drive or sFTP upon project completion.

https://clims4.genewiz.com/OrderSummary/ViewResult?OrderId=2b3e6ea8-0836-43d4-80a2-5a034a471c2a&lobId=3&serviceId=3160

Please do not hesitate to contact us with any questions.

Thank you for choosing GENEWIZ/Azenta!

David King
Next Generation Sequencing
Azenta US, Inc.
111 Corporate Blvd.
South Plainfield, NJ 07080
+1-908-222-0711, ext. 1
+1-908-333-4511 (fax)
www.GENEWIZ.com
NGS@azenta.com
Specializing in Genomics Since 1999

Robert Reid made changes - 06/Mar/24 10:42 AM

Summary

Sincle Cell RNA dataset Backup

Single Cell RNA dataset Backup

Robert Reid added a comment - 06/Mar/24 12:33 PM

Azenta Project 30-986962648

Robert Reid added a comment - 06/Mar/24 12:33 PM

30 files.

Robert Reid added a comment - 07/Mar/24 10:26 AM

Data pulled to Rob's local drive.
Next step, move to cluster.

Robert Reid added a comment - 11/Mar/24 8:36 AM - edited

Data moved onto cluster:

/projects/tomato_genome/rnaseq$ cd 30-986962648-Rasha-singleCell/

MD5 sums still require checking.

I need to re-run the rsync to ensure all files made it across as well.
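
One way to do the re-run as a check rather than a blind copy is an rsync dry run with checksum comparison, which lists files whose content differs without transferring anything. This is a minimal sketch, not the command actually used on this ticket: the function name `list_unsynced` and the example source path and host are hypothetical.

```shell
# Hypothetical helper: list files whose content differs between a source
# directory and a destination, without copying anything.
#   -a  archive mode (recurse, keep attributes)
#   -n  dry run (nothing is transferred)
#   -c  compare by checksum instead of size and mtime
#   -i  itemize changes, one line per affected file
list_unsynced() {
    src=$1
    dest=$2
    # Keep only lines describing file transfers (">f..." or "<f...");
    # attribute-only differences such as timestamps are ignored.
    rsync -anci "$src"/ "$dest"/ | grep '^[<>]f' || true
}

# Example call (local path and host are placeholders, not from the ticket):
# list_unsynced ./30-986962648 user@cluster:/projects/tomato_genome/rnaseq/30-986962648-Rasha-singleCell
```

Empty output means rsync found no file that would need to be sent again. With `-c` every file is read in full on both sides, so this is slow on large FASTQ files.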

Robert Reid added a comment - 11/Mar/24 10:10 AM

Their original report:
Table 3.1 Overall Sequencing Statistics
Project	# Reads	Yield (Mbases)	Mean Quality Score	% Bases >= 30
30-986962648	1,731,259,077	519,379	33.51	70.70

Need to check the cluster files reflect this and we will be done with this ticket.
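
A possible way to get a read count from the cluster files is to count lines in each gzipped FASTQ and divide by four. This is only a sketch: the function name `count_reads` is hypothetical, and the ticket does not say how Azenta derives its "# Reads" figure (for example, whether index reads or both mates are counted), so which files to pass in is an assumption.

```shell
# Hypothetical helper: print the number of reads in each gzipped FASTQ file
# (one FASTQ record is 4 lines) followed by a grand total.
count_reads() {
    total=0
    for f in "$@"; do
        lines=$(gzip -dc "$f" | wc -l)
        reads=$((lines / 4))
        printf '%s\t%d\n' "$f" "$reads"
        total=$((total + reads))
    done
    printf 'TOTAL\t%d\n' "$total"
}

# Example call, assuming the R1 files are the ones to count:
# cd /projects/tomato_genome/rnaseq/30-986962648-Rasha-singleCell/00_fastq
# count_reads *_R1_001.fastq.gz
```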

Robert Reid added a comment - 12/Mar/24 10:35 AM - edited

The original MD5 sums reported by Azenta:
/projects/tomato_genome/rnaseq/30-986962648-Rasha-singleCell/00_fastq

One-line command to print all of the MD5 sums:
for file in *md5; do cat "$file"; done
8385565fbce53bbcc47eedc00e40a4ee ./1_S1_L001_I1_001.fastq.gz
253f30177f6ef940be21ce7344e0e1b7 ./1_S1_L001_I2_001.fastq.gz
864e999b8c444d477a0d68046da07210 ./1_S1_L001_R1_001.fastq.gz
5fdb505a254d522ed73887b8ad7bc135 ./1_S1_L001_R2_001.fastq.gz
116177f09c3b54526c002c6abb66710e ./2_S1_L001_I1_001.fastq.gz
f729927e6f889982d06de7fcf151d731 ./2_S1_L001_I2_001.fastq.gz
83f32c42be6ce90dc72691f4d84faa1d ./2_S1_L001_R1_001.fastq.gz
6309aa304d0b17104ede19c1203df495 ./2_S1_L001_R2_001.fastq.gz
fb47d7cfb8ef0ce389d6b74c693c1d8e ./3_S1_L001_I1_001.fastq.gz
88a091e3e5b31202cb57f5e97251e99f ./3_S1_L001_I2_001.fastq.gz
d37f458c2559c9c7da9f320efd9f040e ./3_S1_L001_R1_001.fastq.gz
6e95e26239fe076b42cf709b33a66493 ./3_S1_L001_R2_001.fastq.gz
a59b413031cf89ecc3c525a1bbcc7027 ./4_S1_L001_I1_001.fastq.gz
2df8c3a412d8df654f4aa4f1bcc7a0a2 ./4_S1_L001_I2_001.fastq.gz
e7def1b03b270c59107f92ee0b2a416c ./4_S1_L001_R1_001.fastq.gz
422d399ec511130cc15d0bf7c30c071f ./4_S1_L001_R2_001.fastq.gz

Next we run md5sum on each fastq file.
for file in *gz; do echo "$file"; md5sum "$file" >> summarymd5.txt; done

8385565fbce53bbcc47eedc00e40a4ee 1_S1_L001_I1_001.fastq.gz
253f30177f6ef940be21ce7344e0e1b7 1_S1_L001_I2_001.fastq.gz
864e999b8c444d477a0d68046da07210 1_S1_L001_R1_001.fastq.gz
5fdb505a254d522ed73887b8ad7bc135 1_S1_L001_R2_001.fastq.gz
116177f09c3b54526c002c6abb66710e 2_S1_L001_I1_001.fastq.gz
f729927e6f889982d06de7fcf151d731 2_S1_L001_I2_001.fastq.gz
83f32c42be6ce90dc72691f4d84faa1d 2_S1_L001_R1_001.fastq.gz
fb47d7cfb8ef0ce389d6b74c693c1d8e 3_S1_L001_I1_001.fastq.gz
88a091e3e5b31202cb57f5e97251e99f 3_S1_L001_I2_001.fastq.gz
d37f458c2559c9c7da9f320efd9f040e 3_S1_L001_R1_001.fastq.gz
6e95e26239fe076b42cf709b33a66493 3_S1_L001_R2_001.fastq.gz
a59b413031cf89ecc3c525a1bbcc7027 4_S1_L001_I1_001.fastq.gz
2df8c3a412d8df654f4aa4f1bcc7a0a2 4_S1_L001_I2_001.fastq.gz
e7def1b03b270c59107f92ee0b2a416c 4_S1_L001_R1_001.fastq.gz
422d399ec511130cc15d0bf7c30c071f 4_S1_L001_R2_001.fastq.gz

Step 3, we compare the hashes:
awk '{ print $1 }' originalmd5.txt > tmporig
awk '{ print $1 }' summarymd5.txt > tmpnew
diff tmporig tmpnew

They are all the same but one!!!

6309aa304d0b17104ede19c1203df495 ./2_S1_L001_R2_001.fastq.gz
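
A positional diff of the hash columns depends on both lists having the same files in the same order, so a missing entry in one list shifts every line after it. A comparison keyed on file name avoids that. The sketch below is one possible approach, not what was run for this ticket: the function name `compare_md5_lists` is hypothetical, and it assumes both files are in plain `md5sum` text format with file names that contain no spaces.

```shell
# Hypothetical helper: compare two md5sum-format lists by file name.
#   $1  reference list (e.g. the vendor's checksums)
#   $2  newly computed list
# Prints MISMATCH, MISSING (in the new list), or EXTRA (only in the new list).
compare_md5_lists() {
    awk '
        { name = $2; sub(/^\.\//, "", name) }     # drop a leading "./"
        NR == FNR { orig[name] = $1; next }       # first file: remember hashes
        {
            seen[name] = 1
            if (!(name in orig))       print "EXTRA", name
            else if (orig[name] != $1) print "MISMATCH", name
        }
        END { for (n in orig) if (!(n in seen)) print "MISSING", n }
    ' "$1" "$2"
}

# Example call with the two files named in this ticket:
# compare_md5_lists originalmd5.txt summarymd5.txt
```

No output means every file appears in both lists with the same hash.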

Robert Reid added a comment - 12/Mar/24 11:34 AM

It just so happens they have included an extra folder called S2.

It contains one file: 2_S1_L001_R2_001.fastq.gz

This MD5 is
6309aa304d0b17104ede19c1203df495 2_S1_L001_R2_001.fastq.gz

This is the correct one!!!!

I am moving this file so that they are all in the same folder.

Robert Reid added a comment - 12/Mar/24 11:35 AM

Time for Review!

I need the reviewer to log in to the HPC and go to this folder:

/projects/tomato_genome/rnaseq/30-986962648-Rasha-singleCell/00_fastq

Here we need to check that the files are intact via md5sum.

You can check the 2 files originalmd5.txt and summarymd5.txt.
Make sure that they match.
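
Beyond comparing the two text files, a reviewer could recompute the hashes directly against the reference list with `md5sum -c`, which reports OK or FAILED per file and exits non-zero on any failure. This is a sketch under assumptions: the function name `verify_against_list` is hypothetical, and it assumes originalmd5.txt is in `md5sum` format with paths relative to the folder, as the `./` entries in this ticket suggest.

```shell
# Hypothetical helper: recompute MD5 sums for the files named in a checksum
# list and compare them with the recorded values.
#   $1  directory holding the files
#   $2  checksum list, path relative to that directory
verify_against_list() {
    dir=$1
    list=$2
    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$dir" && md5sum -c "$list" )
}

# Example call with the folder and file named in this ticket:
# verify_against_list /projects/tomato_genome/rnaseq/30-986962648-Rasha-singleCell/00_fastq originalmd5.txt
```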

Robert Reid made changes - 12/Mar/24 11:36 AM

Status

In Progress [ 3 ]

Needs 1st Level Review [ 10005 ]

Robert Reid made changes - 12/Mar/24 11:36 AM

Assignee

Robert Reid [ robertreid ]

Molly Davis [ molly ]

Molly Davis added a comment - 12/Mar/24 3:20 PM

Review:

The md5 files seem to be the same. They don't contain much information besides the hashes and file names. I am not sure if there are other results that confirm the md5 check worked.

Molly Davis made changes - 12/Mar/24 3:21 PM

Assignee

Molly Davis [ molly ]

Robert Reid [ robertreid ]

Molly Davis made changes - 12/Mar/24 3:21 PM

Status

Needs 1st Level Review [ 10005 ]

First Level Review in Progress [ 10301 ]

Molly Davis made changes - 12/Mar/24 3:21 PM

Status

First Level Review in Progress [ 10301 ]

Ready for Pull Request [ 10304 ]