  1. IGB
  2. IGBF-3683

Update SRA to use the correct sample codes for Muday lab time course data

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:
      None

      Description

      As shown in the linked ticket, we discovered that 16 samples from the Muday lab's flavonoid time course data were assigned incorrect sample codes.

      We decided to update those records in the SRA to use the correct sample codes.

      Robert Reid is working on that.

        Attachments

          Issue Links

            Activity

            Mdavis4290 Molly Davis added a comment -

            Check for SRA update:

            • Visually checked one of the samples to see whether it has been changed correctly.
            • "SRA record SRR25478288 should be re-annotated with sample code V.34.30.9"
            • Using Run Selector, I searched for that SRR accession and found that it is still annotated as "F.34.30.9", not the corrected "V.34.30.9".
            • The SRA has not yet been updated with the re-annotated names.
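
            The spot check above could also be scripted against a metadata table downloaded from Run Selector. This is only a minimal sketch: the column names "Run" and "Library Name", the helper name find_library_name, and the one-row stand-in table are assumptions for illustration, not the actual export.

```python
import csv
import io


def find_library_name(table_text, run_id, run_col="Run", name_col="Library Name"):
    """Return the name_col value for run_id, or None if the run is not in the table."""
    for row in csv.DictReader(io.StringIO(table_text)):
        if row[run_col] == run_id:
            return row[name_col]
    return None


# Stand-in for a downloaded metadata table (assumed layout, one row only).
example_table = "Run,Library Name\nSRR25478288,F.34.30.9\n"

expected = "V.34.30.9"
observed = find_library_name(example_table, "SRR25478288")
status = "updated" if observed == expected else "not yet updated"
print(f"SRR25478288: observed {observed}, expected {expected} ({status})")
```

            With a real export, the file contents would be read from disk and passed in place of example_table.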
            robofjoy Robert Reid added a comment -

            On checking, the BioSample data is all correct.

            The metatable for SRA still contains the same 16 errors (V and F are switched in the labelling), so in their records the metatable has 16 mislabelled samples.
            I am attempting to alter the metatable to fix this.

            In OUR file that was submitted
            (https://docs.google.com/spreadsheets/d/1n4nsE4E8lykivizPtQyf17XJnL7FRELR/edit?usp=sharing&ouid=100714234126361751017&rtpof=true&sd=true )
            it is all correct!

            robofjoy Robert Reid added a comment -

            Pulling the table down from NCBI, these lines are the ones that get switched:

            robofjoy Robert Reid added a comment - - edited

            When we go from top down via the Bioproject:

            https://www.ncbi.nlm.nih.gov/bioproject/PRJNA1017816

            We see the 72 samples.
            In the SRA table, I can see the errors like so:

            I made changes and submitted to the SRA.
            Changes now look like this:

            ! Can't seem to post the image!

            SRR25478273
            SRR25478277
            SRR25478274
            SRR25478276
            SRR25478278
            SRR25478272
            SRR25478291
            SRR25478260
            SRR25478259
            SRR25478262
            SRR25478292
            SRR25478289
            SRR25478261
            SRR25478288
            SRR25478275
            SRR25478279

            robofjoy Robert Reid added a comment -

            We now wait.
            Once we get word that SRA has made changes, we can retest the download.

            robofjoy Robert Reid added a comment -

            SRA is now updated!
            According to them.

            I think we are ready to test once again if things are correct!
            Assigning to Molly for testing which will act as the review.

            Mdavis4290 Molly Davis added a comment - - edited

            After reviewing the changed names in Run Selector, it seems that the "Library Names" were changed for the samples that needed to be re-annotated, but the other columns, "replicate_code" and "Sample Name", were not. I put a picture of an example below. All of these columns will need to match before we can move forward with running the data through Nextflow again. Moving back to To Do.
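
            A rough way to flag runs whose columns disagree, sketched under some assumptions: the three columns are expected to hold the same code for each run, the column headers in the export are spelled as shown, and the two rows below are made-up placeholders rather than real records.

```python
import csv
import io

# Assumed column headers; adjust to match the actual Run Selector export.
COLUMNS = ("Library Name", "replicate_code", "Sample Name")


def mismatched_runs(table_text, columns=COLUMNS, run_col="Run"):
    """Return {run: {column: value}} for every row whose columns do not all agree."""
    problems = {}
    for row in csv.DictReader(io.StringIO(table_text)):
        values = {col: row[col].strip() for col in columns}
        if len(set(values.values())) > 1:
            problems[row[run_col]] = values
    return problems


# Placeholder rows: RUN_A is consistent, RUN_B has only its library name updated.
example_table = (
    "Run,Library Name,replicate_code,Sample Name\n"
    "RUN_A,F.28.15.9,F.28.15.9,F.28.15.9\n"
    "RUN_B,V.34.30.9,F.34.30.9,F.34.30.9\n"
)

for run, values in mismatched_runs(example_table).items():
    print(run, values)
```

            Only RUN_B is reported here, because its library name differs from the other two columns.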

            robofjoy Robert Reid added a comment -

            Uh-oh, that will get trickier.
            Those columns are not editable via the manage data tabs.

            I will send an email to the SRA to get the Replicate code and Library name columns in sync with the Sample name column.

            Rob

            robofjoy Robert Reid added a comment -

            Apparently the SRA people are not the ones to contact; the SRA data is correct.
            We need to contact the BioSample people instead: biosamplehelp@ncbi.nlm.nih.gov

            I will ping them and request the changes, as I cannot edit that metatable directly.

            Rob

            robofjoy Robert Reid added a comment -

            PRJNA1017816-changes2Biosample-reAssignSRR.xlsx

            I have sent them what I think is a simple Excel sheet highlighting these differences at the BioSample level.
            We will see how they respond.
            R

            robofjoy Robert Reid added a comment -

            Changes made!
            Time to review if all changes have truly been made.

            For review:

            Check the SRA and its various tables to see whether any of the mislabelled sample codes remain.

            Mdavis4290 Molly Davis added a comment -

            Review: I used Run Selector for SRP460750 on NCBI to check the newly re-annotated runs. I reviewed each of the 16 runs that needed to be re-annotated, based on the check-SRA Markdown document that was created.

            • SRR25478288 was re-annotated with sample code V.34.30.9
            • SRR25478289 was re-annotated with sample code V.28.30.9
            • SRR25478291 was re-annotated with sample code V.34.15.9
            • SRR25478292 was re-annotated with sample code V.28.15.9
            • SRR25478276 was re-annotated with sample code V.34.75.9
            • SRR25478277 was re-annotated with sample code V.28.75.9
            • SRR25478278 was re-annotated with sample code V.34.45.9
            • SRR25478279 was re-annotated with sample code V.28.45.9
            • SRR25478272 was re-annotated with sample code F.34.30.9
            • SRR25478273 was re-annotated with sample code F.28.30.9
            • SRR25478274 was re-annotated with sample code F.34.15.9
            • SRR25478275 was re-annotated with sample code F.28.15.9
            • SRR25478259 was re-annotated with sample code F.34.75.9
            • SRR25478260 was re-annotated with sample code F.28.75.9
            • SRR25478261 was re-annotated with sample code F.34.45.9
            • SRR25478262 was re-annotated with sample code F.28.45.9

            After checking on SRA each of the samples that needed to be re-annotated, I see no problem with the data. Assigning the ticket back to Dr. Reid to finalize and move to Done! Once this ticket is closed, new tickets can be made to rerun the re-annotated data through the pipeline.
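
            For future re-checks, the 16 expected codes listed above could be kept as a lookup table and compared against whatever codes are observed. This is only a sketch: the helper name check_codes and the way observed codes are supplied (a plain dict) are assumptions, and the single mismatch shown simulates the state of SRR25478288 before the fix.

```python
# Expected sample code per run, copied from the review list in this ticket.
EXPECTED = {
    "SRR25478288": "V.34.30.9", "SRR25478289": "V.28.30.9",
    "SRR25478291": "V.34.15.9", "SRR25478292": "V.28.15.9",
    "SRR25478276": "V.34.75.9", "SRR25478277": "V.28.75.9",
    "SRR25478278": "V.34.45.9", "SRR25478279": "V.28.45.9",
    "SRR25478272": "F.34.30.9", "SRR25478273": "F.28.30.9",
    "SRR25478274": "F.34.15.9", "SRR25478275": "F.28.15.9",
    "SRR25478259": "F.34.75.9", "SRR25478260": "F.28.75.9",
    "SRR25478261": "F.34.45.9", "SRR25478262": "F.28.45.9",
}


def check_codes(observed, expected=EXPECTED):
    """Return {run: (observed_code, expected_code)} for missing or differing runs."""
    return {
        run: (observed.get(run), code)
        for run, code in expected.items()
        if observed.get(run) != code
    }


# Simulated input: everything correct except one run still carrying its old code.
observed = dict(EXPECTED)
observed["SRR25478288"] = "F.34.30.9"

print(check_codes(observed))
```

            An empty result means every one of the 16 runs carries its expected code.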


              People

              • Assignee:
                robofjoy Robert Reid
                Reporter:
                ann.loraine Ann Loraine
