IGB / IGBF-3246

Generate list of EBSeq-HMM clusters for varieties and conditions

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Done
    • Affects Version/s: None
    • Fix Version/s: None
    • Labels:
      None

      Description

      Run EBSeq-HMM on SL5 gene count files and report the lists of genes within each cluster.

      • Create a new git repository within hotpollen for the clustering code
      • Create a new merged table with results from "seedlingPollen" and "mark-2022-timeseries" in the Charlotte cluster file system
      • Run EBSeq-HMM on the merged file
      • Produce human-readable text files with the results, plus tracking information that records the source of the data and the version of the code used
      • Share with collaborators: results files and markdown report
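
The tracking information described above could be written as comment lines at the top of each results file. The following is only a minimal sketch in Python (the project's own code is an R Markdown file); the function names, the header keys, the two-column layout, and the placeholder code version are all assumptions made for illustration.

```python
import hashlib
import io


def provenance_header(source_name, source_bytes, code_version):
    """Build comment lines that record the data source and the code version."""
    digest = hashlib.sha256(source_bytes).hexdigest()
    return [
        f"# source_file: {source_name}",
        f"# source_sha256: {digest}",
        f"# code_version: {code_version}",
    ]


def write_results(handle, header_lines, rows):
    """Write the provenance header, then a tab-separated cluster/gene table."""
    for line in header_lines:
        handle.write(line + "\n")
    handle.write("cluster\tgene\n")
    for cluster, gene in rows:
        handle.write(f"{cluster}\t{gene}\n")


# Hypothetical inputs: placeholder file content, commit label, and gene name.
header = provenance_header(
    "mark-timeseries-SL5-salmoncounts.txt",
    b"placeholder file content",
    "example-commit-hash",
)
buffer = io.StringIO()
write_results(buffer, header, [("UP-UP-UP", "gene_A")])
output = buffer.getvalue()
```

Storing a checksum of the input next to the code version lets a reader confirm later that a results file was produced from a specific copy of the count table.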

        Attachments

          Issue Links

            Activity

            ann.loraine Ann Loraine created issue -
            ann.loraine Ann Loraine made changes -
            Field Original Value New Value
            Epic Link IGBF-2993 [ 21429 ]
            ann.loraine Ann Loraine made changes -
            Link This issue relates to IGBF-3242 [ IGBF-3242 ]
            ann.loraine Ann Loraine made changes -
            Link This issue relates to IGBF-3244 [ IGBF-3244 ]
            Mdavis4290 Molly Davis made changes -
            Description: marked the items "Create a new merged table ..." and "Run EB-Seq on the merged file" in green ({color:#14892c}); the wording is unchanged.
            Mdavis4290 Molly Davis made changes -
            Description: fixed the spacing of the list markup; the wording is unchanged.
            Mdavis4290 Molly Davis added a comment - - edited

            Update:

            • Dr. Reid created a new merged table with results from "seedlingPollen" and "mark-2022-timeseries" in the Charlotte cluster file system.
              File: mark-timeseries-SL5-salmoncounts.txt
            • I ran EB-Seq on the merged file and produced cluster path text files.
            • I am still trying to figure out how to create human-readable files. A spreadsheet is difficult because each cluster label has a different number of genes, so the rows or columns would be of unequal length.
            • I am also trying to make Venn diagrams to compare the cluster files.

            Next Step: Make directory and share with collaborators.
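
Before drawing Venn diagrams, the overlaps between cluster files can be computed directly as set intersections. The sketch below is an illustration in Python rather than the project's R code; the group labels and gene identifiers are hypothetical, and it assumes each cluster file has already been read into a set of gene names.

```python
from itertools import combinations


def pairwise_overlaps(gene_sets):
    """Return {(name_a, name_b): shared genes} for every pair of named gene sets."""
    overlaps = {}
    for (name_a, genes_a), (name_b, genes_b) in combinations(sorted(gene_sets.items()), 2):
        overlaps[(name_a, name_b)] = genes_a & genes_b
    return overlaps


# Hypothetical gene sets for one cluster path in three sample groups.
cluster_genes = {
    "group_A": {"gene_1", "gene_2", "gene_3"},
    "group_B": {"gene_2", "gene_3", "gene_4"},
    "group_C": {"gene_3", "gene_5"},
}

overlaps = pairwise_overlaps(cluster_genes)
shared_by_all = set.intersection(*cluster_genes.values())

for pair, genes in sorted(overlaps.items()):
    print(pair, sorted(genes))
print("shared by all:", sorted(shared_by_all))
```

The pairwise sizes and the all-group intersection are the same numbers a three-circle Venn diagram would display, so they can also serve as a check on the plotted figure.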

            Mdavis4290 Molly Davis made changes -
            Assignee Molly Davis [ molly ]
            Mdavis4290 Molly Davis made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            ann.loraine Ann Loraine added a comment - - edited

            Result from today's discussion:

            1) Run the clustering algorithm on the following sample groups:

            • Heinz stress, MP, 30 min, 60 min, 120 min
            • Heinz control, MP, 30 min, 60 min, 120 min
            • Tamaulipas stress, MP, 30 min, 60 min, 120 min
            • Tamaulipas control, MP, 30 min, 60 min, 120 min
            • Nagcarlang stress, MP, 30 min, 60 min, 120 min
            • Nagcarlang control, MP, 30 min, 60 min, 120 min

            This will create some EBSeq-HMM objects corresponding to clusters, where each cluster has a name like "UP-UP-UP".

            2) Next, write them out into one tab-separated file with the following columns:

            • cluster name (e.g., UP-UP-UP)
            • SL5 gene name
            • genotype (e.g., Heinz, Nagcarlang or Tamaulipas)
            • temperature (e.g., C.28, S.34)
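
A long-format table with one row per gene avoids the problem of clusters having different numbers of genes. The following Python sketch shows one way to produce the four columns listed above; it is not the project's R implementation, and the column header names, the nested-dictionary input, and the gene identifiers are assumptions for illustration.

```python
import csv
import io

COLUMNS = ["cluster", "gene", "genotype", "temperature"]


def to_long_rows(results):
    """Flatten {(genotype, temperature): {cluster: [genes]}} into one row per gene."""
    rows = []
    for (genotype, temperature), clusters in sorted(results.items()):
        for cluster, genes in sorted(clusters.items()):
            for gene in sorted(genes):
                rows.append({
                    "cluster": cluster,
                    "gene": gene,
                    "genotype": genotype,
                    "temperature": temperature,
                })
    return rows


def write_tsv(rows, handle):
    """Write the rows as a tab-separated table with a header line."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


# Hypothetical clustering output; the gene names are placeholders.
results = {
    ("Heinz", "S.34"): {"UP-UP-UP": ["gene_B", "gene_A"], "DOWN-UP-UP": ["gene_C"]},
    ("Heinz", "C.28"): {"UP-UP-UP": ["gene_A"]},
}

rows = to_long_rows(results)
buffer = io.StringIO()
write_tsv(rows, buffer)
table_text = buffer.getvalue()
```

Because every row carries its own cluster, genotype, and temperature, the file can be filtered or pivoted in a spreadsheet without padding short columns.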
            Mdavis4290 Molly Davis made changes -
            Attachment EBSeq_tomato.Rmd [ 17670 ]
            Mdavis4290 Molly Davis made changes -
            Attachment all_data_clusters.csv [ 17671 ]
            Mdavis4290 Molly Davis added a comment -

            Made commit to Bitbucket: https://bitbucket.org/hotpollen/time-series/commits/cd23c7c2c4688e470925d063a11c0b225800e07c

            Includes the R Markdown file, the count file, and the final data frame as a CSV file:

            • EBSeq_tomato.Rmd
            • mark-timeseries-SL5-salmoncounts.txt
            • all_data_clusters.csv
            Mdavis4290 Molly Davis made changes -
            Description: removed the {color:#14892c} markup from the "Create a new merged table ..." and "Run EB-Seq on the merged file" items; the wording is unchanged.
            Mdavis4290 Molly Davis made changes -
            Assignee Molly Davis [ molly ]
            Mdavis4290 Molly Davis made changes -
            Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
            Mdavis4290 Molly Davis added a comment - - edited

            Update:

            • [~RobertReid] reviewed the files.

            Need to do:

            • Change print commands to just include head() of tables.
            • Add files to drive for tomato group. Drive location: https://drive.google.com/drive/folders/1_I6PLSoBg4H3Y1MzY2qhxjk-FBmM4cmQ?usp=share_link
            • Update Bitbucket with the new changes to the Rmd file in a commit.
            Mdavis4290 Molly Davis made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            Mdavis4290 Molly Davis made changes -
            Status First Level Review in Progress [ 10301 ] To-Do [ 10305 ]
            Mdavis4290 Molly Davis made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            Mdavis4290 Molly Davis made changes -
            Assignee Molly Davis [ molly ]
            Mdavis4290 Molly Davis made changes -
            Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
            Mdavis4290 Molly Davis made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            Mdavis4290 Molly Davis made changes -
            Status First Level Review in Progress [ 10301 ] Ready for Pull Request [ 10304 ]
            Mdavis4290 Molly Davis made changes -
            Status Ready for Pull Request [ 10304 ] Pull Request Submitted [ 10101 ]
            Mdavis4290 Molly Davis made changes -
            Status Pull Request Submitted [ 10101 ] Reviewing Pull Request [ 10303 ]
            Mdavis4290 Molly Davis made changes -
            Status Reviewing Pull Request [ 10303 ] Merged Needs Testing [ 10002 ]
            Mdavis4290 Molly Davis made changes -
            Status Merged Needs Testing [ 10002 ] Post-merge Testing In Progress [ 10003 ]
            Mdavis4290 Molly Davis added a comment - Commit to Bitbucket: https://bitbucket.org/hotpollen/time-series/commits/eb84ceac06f5ef1c2e89d7960019d94a3be3c73e
            Mdavis4290 Molly Davis made changes -
            Resolution Done [ 10000 ]
            Status Post-merge Testing In Progress [ 10003 ] Closed [ 6 ]

              People

              • Assignee: Mdavis4290 Molly Davis
              • Reporter: ann.loraine Ann Loraine
              • Votes: 0
              • Watchers: 2
