  1. IGB
  2. IGBF-3236

Interpret nucleotide symbols when searching by residue

    Details

    • Story Points:
      3
    • Sprint:
      Summer 3 2023 June 12, Summer 4 2023 June 26, Summer 5 2023 July 10

      Description

      Situation: Under the Advanced Search tab, the search can be set to Residues. This is extremely useful because a user can search for primer locations or motifs. While the Residues search does allow regular-expression wildcards (. [] *), it does not appear to understand nucleotide ambiguity symbols such as R (A or G) or Y (C or T). As a result, a motif found in a paper, such as CACRTS, does not work correctly under the Advanced Search for Residues in IGB.

      Task: Expand the logic for IGB Advanced Search for Residues so that IGB can understand Nucleotide Symbols.

      R A or G
      Y C or T
      S G or C
      W A or T
      K G or T
      M A or C
      B C or G or T
      D A or G or T
      H A or C or T
      V A or C or G
      N any base

      For example, if a user currently uses the Advanced Search for Residues in IGB to look for the motif RYSNATCG, IGB cannot find it, because IGB does not understand what R, Y, S, and N refer to. New logic needs to be added to IGB so that, when searching, IGB understands that R can match either A or G, Y can match either C or T, and so on.
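One way to picture the requested behavior is to translate each symbol in the table above into a regular-expression character class before the search runs. The following is a minimal sketch under that assumption, not the code that was merged into IGB; the class name `IupacExpander` and the method `toRegex` are hypothetical names chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of IGB): expands nucleotide symbols into regex classes.
class IupacExpander {
    // Symbol table copied from the issue description.
    private static final Map<Character, String> CODES = new HashMap<>();
    static {
        CODES.put('R', "[AG]");
        CODES.put('Y', "[CT]");
        CODES.put('S', "[GC]");
        CODES.put('W', "[AT]");
        CODES.put('K', "[GT]");
        CODES.put('M', "[AC]");
        CODES.put('B', "[CGT]");
        CODES.put('D', "[AGT]");
        CODES.put('H', "[ACT]");
        CODES.put('V', "[ACG]");
        CODES.put('N', "[ACGT]");
    }

    // Replace every ambiguity symbol with its character class; copy everything else unchanged.
    public static String toRegex(String motif) {
        StringBuilder sb = new StringBuilder();
        for (char c : motif.toCharArray()) {
            String cls = CODES.get(Character.toUpperCase(c));
            sb.append(cls != null ? cls : String.valueOf(c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String regex = toRegex("RYSNATCG");
        System.out.println(regex); // prints [AG][CT][GC][ACGT]ATCG

        // Case-insensitive search over a made-up sample sequence.
        Matcher m = Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher("ttACGTATCGtt");
        if (m.find()) {
            System.out.println("match at " + m.start() + ": " + m.group());
        }
    }
}
```

This sketch ignores regex escapes: a symbol that appears after a backslash or inside a `\Q...\E` quoted span would also be expanded, so a real implementation would need to skip those regions.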

        Attachments

        1. escape_working.png (35 kB)
        2. pattern_error.png (86 kB)
        3. regex.png (3 kB)
        4. result.png (101 kB)
        5. with_escape_character.png (31 kB)
        6. with_escape_character.png (31 kB)
        7. without_escape_character.png (40 kB)
        8. without_escape_character.png (40 kB)

          Issue Links

            Activity

            nfreese Nowlan Freese created issue -
            nfreese Nowlan Freese made changes -
            Field Original Value New Value
            Epic Link IGBF-1765 [ 17855 ]
            nfreese Nowlan Freese made changes -
            Priority Major [ 3 ] Minor [ 4 ]
            Sprint Spring 2 2023 Jan 16 [ 162 ]
            Labels beginner
            nfreese Nowlan Freese made changes -
            Sprint Summer 2 2023 May 29 [ 171 ]
            nfreese Nowlan Freese made changes -
            Assignee Kaushik Gopu [ kgopu ]
            ann.loraine Ann Loraine made changes -
            Sprint Summer 2 2023 May 29 [ 171 ] Summer 4 2023 June 26 [ 173 ]
            ann.loraine Ann Loraine made changes -
            Sprint Summer 4 2023 June 26 [ 173 ] Summer 3 2023 June 12 [ 172 ]
            nfreese Nowlan Freese made changes -
            Description Added the RYSNATCG example paragraph after the symbol table; the Situation, Task, and symbol table text is unchanged.
            nfreese Nowlan Freese made changes -
            Description Inserted a comma after "when searching" in the example paragraph; no other change.
            nfreese Nowlan Freese made changes -
            Description Extended the last sentence of the example paragraph with ", Y matches C or T, etc."; no other change.
            nfreese Nowlan Freese made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            kgopu Kaushik Gopu made changes -
            Attachment result.png [ 17904 ]
            nfreese Nowlan Freese made changes -
            Link This issue relates to IGBF-3370 [ IGBF-3370 ]
            ann.loraine Ann Loraine made changes -
            Sprint Summer 3 2023 June 12 [ 172 ] Summer 3 2023 June 12, Summer 4 2023 June 26 [ 172, 173 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            kgopu Kaushik Gopu made changes -
            Attachment pattern_error.png [ 17906 ]
            kgopu Kaushik Gopu made changes -
            Attachment escape_working.png [ 17907 ]
            kgopu Kaushik Gopu made changes -
            Attachment without_escape_character.png [ 17909 ]
            Attachment with_escape_character.png [ 17910 ]
            kgopu Kaushik Gopu made changes -
            Attachment without_escape_character.png [ 17911 ]
            Attachment with_escape_character.png [ 17912 ]
            kgopu Kaushik Gopu made changes -
            Comment [ *How I handled the "N" case:*

            I created one regex: ((?<![\\])(?<![Q]))[Nn]((?<![\\])(?<![E]))
            *Explanation of the above regex:*

            If the search sequence contains the character "N", the regex checks whether it is surrounded by the escape sequences \Q and \E. If it is, no substitution is made; otherwise the N is replaced with the corresponding symbols.

            *Breakdown of the regex:*
            (?<![\\]): not \
            (?<![Q]): not Q
            [Nn]: N or n (since the search is case-insensitive)
            (?<![\\]): not \
            (?<![E]): not E
            Overall: do nothing if N is surrounded by \Q and \E; otherwise replace it.

            Any regex can be tested with the online tool [https://regex101.com/] (switch to Java 8 before testing). As of now it works fine, but I want to spend some more time testing it; after that I'll push the changes to the remote.

            Please see the attached images below for the results.

             !without_escape_character.png|thumbnail! !with_escape_character.png|thumbnail!
            ]
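The comment above aims to leave an N untouched when it sits inside a `\Q...\E` quoted span. As written, the two leading lookbehinds inspect only the single character immediately before the N, and the two trailing groups are also lookbehinds, so they inspect the N that was just matched and always succeed. An N deeper inside a quoted span (for example `\QAN\E`) would therefore still be substituted. A small scanner is one alternative way to get the stated behavior. The sketch below is an illustration under that assumption, not the code that was merged; `QuoteAwareExpander` and `expand` are hypothetical names, and only three symbols are included to keep it short.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not part of IGB): expands symbols but skips escaped regions.
class QuoteAwareExpander {
    // Shortened symbol table for illustration; a full table would list all eleven symbols.
    private static final Map<Character, String> CODES = new HashMap<>();
    static {
        CODES.put('N', "[ACGT]");
        CODES.put('R', "[AG]");
        CODES.put('Y', "[CT]");
    }

    public static String expand(String pattern) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < pattern.length()) {
            if (pattern.startsWith("\\Q", i)) {
                // Copy the whole \Q...\E span verbatim (to end of input if \E is missing).
                int end = pattern.indexOf("\\E", i + 2);
                int stop = (end < 0) ? pattern.length() : end + 2;
                out.append(pattern, i, stop);
                i = stop;
            } else if (pattern.charAt(i) == '\\' && i + 1 < pattern.length()) {
                // Copy a backslash escape together with the character it escapes.
                out.append(pattern, i, i + 2);
                i += 2;
            } else {
                // Ordinary character: expand it if it is a known symbol.
                char c = pattern.charAt(i);
                String cls = CODES.get(Character.toUpperCase(c));
                out.append(cls != null ? cls : String.valueOf(c));
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(expand("ANT"));         // prints A[ACGT]T
        System.out.println(expand("\\QAN\\EN"));   // prints \QAN\E[ACGT]
    }
}
```

Scanning left to right keeps the quoting rules in one place, at the cost of a few more lines than a single replace call.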
            kgopu Kaushik Gopu made changes -
            Attachment regex.png [ 17913 ]
            kgopu Kaushik Gopu made changes -
            Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
            nfreese Nowlan Freese made changes -
            Assignee Kaushik Gopu [ kgopu ] Nowlan Freese [ nfreese ]
            nfreese Nowlan Freese made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            nfreese Nowlan Freese made changes -
            Status First Level Review in Progress [ 10301 ] Needs 1st Level Review [ 10005 ]
            nfreese Nowlan Freese made changes -
            Comment [ {code}((?<![\\])(?<![Q]))[Nn]((?<![\\])(?<![E])){code} ]
            nfreese Nowlan Freese made changes -
            Assignee Nowlan Freese [ nfreese ] Kaushik Gopu [ kgopu ]
            nfreese Nowlan Freese made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            nfreese Nowlan Freese made changes -
            Status First Level Review in Progress [ 10301 ] To-Do [ 10305 ]
            nfreese Nowlan Freese made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            nfreese Nowlan Freese made changes -
            Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
            nfreese Nowlan Freese made changes -
            Assignee Kaushik Gopu [ kgopu ] Nowlan Freese [ nfreese ]
            nfreese Nowlan Freese made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            nfreese Nowlan Freese made changes -
            Assignee Nowlan Freese [ nfreese ] Kaushik Gopu [ kgopu ]
            nfreese Nowlan Freese made changes -
            Status First Level Review in Progress [ 10301 ] To-Do [ 10305 ]
            ann.loraine Ann Loraine made changes -
            Sprint Summer 3 2023 June 12, Summer 4 2023 June 26 [ 172, 173 ] Summer 3 2023 June 12, Summer 4 2023 June 26, Summer 5 2023 July 10 [ 172, 173, 174 ]
            ann.loraine Ann Loraine made changes -
            Rank Ranked higher
            kgopu Kaushik Gopu made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            kgopu Kaushik Gopu made changes -
            Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
            nfreese Nowlan Freese made changes -
            Assignee Kaushik Gopu [ kgopu ] Nowlan Freese [ nfreese ]
            nfreese Nowlan Freese made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            nfreese Nowlan Freese made changes -
            Assignee Nowlan Freese [ nfreese ] Kaushik Gopu [ kgopu ]
            nfreese Nowlan Freese made changes -
            Status First Level Review in Progress [ 10301 ] Ready for Pull Request [ 10304 ]
            kgopu Kaushik Gopu made changes -
            Status Ready for Pull Request [ 10304 ] Pull Request Submitted [ 10101 ]
            nfreese Nowlan Freese made changes -
            Assignee Kaushik Gopu [ kgopu ]
            ann.loraine Ann Loraine made changes -
            Status Pull Request Submitted [ 10101 ] Reviewing Pull Request [ 10303 ]
            ann.loraine Ann Loraine made changes -
            Assignee Ann Loraine [ aloraine ]
            ann.loraine Ann Loraine made changes -
            Status Reviewing Pull Request [ 10303 ] To-Do [ 10305 ]
            ann.loraine Ann Loraine made changes -
            Assignee Ann Loraine [ aloraine ] Kaushik Gopu [ kgopu ]
            kgopu Kaushik Gopu made changes -
            Status To-Do [ 10305 ] In Progress [ 3 ]
            kgopu Kaushik Gopu made changes -
            Status In Progress [ 3 ] Needs 1st Level Review [ 10005 ]
            kgopu Kaushik Gopu made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            kgopu Kaushik Gopu made changes -
            Status First Level Review in Progress [ 10301 ] Needs 1st Level Review [ 10005 ]
            kgopu Kaushik Gopu made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            kgopu Kaushik Gopu made changes -
            Status First Level Review in Progress [ 10301 ] Needs 1st Level Review [ 10005 ]
            kgopu Kaushik Gopu made changes -
            Status Needs 1st Level Review [ 10005 ] First Level Review in Progress [ 10301 ]
            nfreese Nowlan Freese made changes -
            Status First Level Review in Progress [ 10301 ] Ready for Pull Request [ 10304 ]
            nfreese Nowlan Freese made changes -
            Status Ready for Pull Request [ 10304 ] Pull Request Submitted [ 10101 ]
            nfreese Nowlan Freese made changes -
            Assignee Kaushik Gopu [ kgopu ]
            ann.loraine Ann Loraine made changes -
            Status Pull Request Submitted [ 10101 ] Reviewing Pull Request [ 10303 ]
            ann.loraine Ann Loraine made changes -
            Assignee Ann Loraine [ aloraine ]
            ann.loraine Ann Loraine made changes -
            Status Reviewing Pull Request [ 10303 ] Merged Needs Testing [ 10002 ]
            ann.loraine Ann Loraine made changes -
            Assignee Ann Loraine [ aloraine ]
            nfreese Nowlan Freese made changes -
            Status Merged Needs Testing [ 10002 ] Post-merge Testing In Progress [ 10003 ]
            nfreese Nowlan Freese made changes -
            Assignee Nowlan Freese [ nfreese ]
            nfreese Nowlan Freese made changes -
            Assignee Nowlan Freese [ nfreese ] Kaushik Gopu [ kgopu ]
            nfreese Nowlan Freese made changes -
            Resolution Done [ 10000 ]
            Status Post-merge Testing In Progress [ 10003 ] Closed [ 6 ]
            nfreese Nowlan Freese made changes -
            Link This issue relates to IGBF-3403 [ IGBF-3403 ]
            nfreese Nowlan Freese made changes -
            Fix Version/s 9.1.12 Major Release [ 10800 ]
            nfreese Nowlan Freese made changes -
            Fix Version/s 10.0.0 [ 10900 ]
            Fix Version/s 9.1.12 Major Release [ 10800 ]

              People

              • Assignee:
                kgopu Kaushik Gopu
                Reporter:
                nfreese Nowlan Freese
              • Votes:
                0
                Watchers:
                3

                Dates

                • Created:
                  Updated:
                  Resolved: