6 Replies Latest reply on Mar 4, 2016 9:19 PM by Jeff D

    REGEXP_REPLACE - Call Matched String ?

    Darin Coulter

      Hello,

       

      I'm struggling to identify the proper syntax for replacing every occurrence for the string pattern in the REGEXP_REPLACE function in Tableau Desktop.

       

      I've attached a sample workbook that has like data and my current status of that function ([Delimited Version]).

       

      I've got a string of characters in my dataset that I need to insert a - or a | between every 2 alphanumeric characters.

       

      So - I want FN0317BKAP to be replaced to FN-03-17-BK-AP.  This will allow me to look for certain combinations like the 17 with another calc field:

       

      CONTAINS([Delimited Version], "17").

       

      My current formula for the replace function is REGEXP_REPLACE([Description Codes], '(.{2}(?!$))', '\1-').

       

      If I did this in HP Vertica - this works fine as the \1 indicates a matched pattern.   I don't know how to get Tableau Desktop to do the same thing?  Right now I'm ending up with:

      1-1-1-1-AP if we use my previous example.

       

      I appreciate the help.

       

      Thanks,

      Darin

        • 1. Re: REGEXP_REPLACE - Call Matched String ?
          pooja.gandhi

          Hi Darin!

           

          I saw this thread yesterday, but didn't have access to Tableau at the time. What you need is REGEXP_EXTRACT_NTH because you want the substring matched to the nth capturing group. So the syntax would be something like this:

           

            REGEXP_REPLACE(REGEXP_EXTRACT_NTH([Description Codes], '(\w{2})', 1)+ '-'

          + REGEXP_EXTRACT_NTH([Description Codes], '(\w{2})(\w{2})', 2)+ '-'

          + REGEXP_EXTRACT_NTH([Description Codes], '(\w{2})(\w{2})(\w{2})', 3)+ '-'

          + REGEXP_EXTRACT_NTH([Description Codes], '(\w{2})(\w{2})(\w{2})(\w{2})', 4) + '-'

          + IFNULL(REGEXP_EXTRACT_NTH([Description Codes], '(\w{2})(\w{2})(\w{2})(\w{2})(\w{2})', 5) + '-', " ")

          + IFNULL(REGEXP_EXTRACT_NTH([Description Codes], '(\w{2})(\w{2})(\w{2})(\w{2})(\w{2})(\w{2})', 6) + '-', " ")

          + IFNULL(REGEXP_EXTRACT_NTH([Description Codes], '(\w{2})(\w{2})(\w{2})(\w{2})(\w{2})(\w{2})(\w{2})', 7) + '-', " ")

          + IFNULL(REGEXP_EXTRACT_NTH([Description Codes], '(\w{2})(\w{2})(\w{2})(\w{2})(\w{2})(\w{2})(\w{2})(\w{2})', 8) + '-', " ")

          + IFNULL(REGEXP_EXTRACT_NTH([Description Codes], '(\w{2})(\w{2})(\w{2})(\w{2})(\w{2})(\w{2})(\w{2})(\w{2})(\w{2})', 9), " "), '(-)\s', " ")

           

          Looks like all the strings have atleast 4 groups. Some strings have 5 and some 6 etc hence the use of IFNULL()  for those groups, so that it throws a space for those strings that do not have groups 5-9. You basically want to keep going until you have covered all groups. Although I am 110% sure there is a much better way of doing this (I am not sure though). The outer REPLACE function is taking off that extra hyphen that gets placed at the end of the resulted string. Hope this at least gets you started.

           

           

          Pooja.

          2 of 2 people found this helpful
          • 2. Re: REGEXP_REPLACE - Call Matched String ?
            Darin Coulter

            Thank you Pooja!

             

            It's disappointing to me, though, that what I can do in Vertica with 1 line of code - is going to require 20 here because there are between 1 and 20 codes possible rather than just the 9.

             

            I'll take this and run with it - as it does what I need.  I just wish there was a more 'global' option to say "every time this pattern occurs, whether it is the 1st or 20th - replace it with 'XYZ'".

             

            Thanks for your contribution here and in the Community in general - very much appreciated!

             

            Darin

            • 3. Re: REGEXP_REPLACE - Call Matched String ?
              pooja.gandhi

              Hi Darin!

               

              Found the right 1 line syntax:

               

              It is REGEXP_REPLACE([Description Codes], '(\w{2})', '$1-')

               

              But I can not get rid of the last hyphen even if I use a nested replace. Not sure why!

               

              2 of 2 people found this helpful
              • 4. Re: REGEXP_REPLACE - Call Matched String ?
                Darin Coulter

                Perfect!

                That one liner is:  REGEXP_REPLACE([Description Codes], '(.{2}(?!$))', '$1-')

                 

                I knew there had to be a much easier way!! I really just needed to replace my original \1 with $1.

                 

                Thank you so much!

                 

                Darin

                • 5. Re: REGEXP_REPLACE - Call Matched String ?
                  pooja.gandhi

                  YES! Exactly! Well, glad I could help

                  1 of 1 people found this helpful
                  • 6. Re: REGEXP_REPLACE - Call Matched String ?
                    Jeff D

                    Hi Pooja and Darin, here's a nested replace that should work

                     

                    REGEXP_REPLACE(REGEXP_REPLACE([Description Codes], '(\w{2})', '$1-'), '(.*)-','$1')
                    

                     

                    (you may have to scroll to see the whole line)

                     

                    2016-03-04_211456.png

                    1 of 1 people found this helpful