1 2 Previous Next 24 Replies Latest reply on Mar 22, 2017 11:31 AM by Vinay Kumar

    Extract only numbers from a text field

    Bernard Chang

      Hi guys, I am just looking for a simple formula to extract the numeric characters from a phone number field.  Please see the picture for the desired output.


      I have tried regular expression REGEXP_EXTRACT([Phone Number],"([0-9]+)") but that only gives me the first set of the number.  I can write a nested replace function to do the job but that is just UGLY.  Just wondering if there is a neater way to do it. 

        • 1. Re: Extract only numbers from a text field
          KK Molugu

          Bernard Chang


          Unfortunately, I don't have 9.0 right now, but T is using standard Regex patters and try (\d+) as your pattern and see if it pulls all numbers in the given string..



          • 2. Re: Extract only numbers from a text field
            Jim Wahl

            Doesn't look like Tableau supports the global modifier to return all matches. A bit strange that REGEX_REPLACE would be more constrained than the regular REPLACE().



            or an EXTRACT_ALL. ...

            I'm curious what you come up with and hopefully it's not something like


            REGEXP_EXTRACT([Phone], '([0-9]+)') +

            IFNULL(REGEXP_EXTRACT([Phone], '[0-9]+[^0-9]+([0-9]+)'), '') +

            IFNULL(REGEXP_EXTRACT([Phone], '[0-9]+[^0-9]+[0-9]+[^0-9]+([0-9]+)'), '') +

            IFNULL(REGEXP_EXTRACT([Phone], '[0-9]+[^0-9]+[0-9]+[^0-9]+[0-9]+[^0-9]+([0-9]+)'), '') +

            IFNULL(REGEXP_EXTRACT([Phone], '[0-9]+[^0-9]+[0-9]+[^0-9]+[0-9]+[^0-9]+[0-9]+[^0-9]+([0-9]+)'), '')

            1 of 1 people found this helpful
            • 3. Re: Extract only numbers from a text field
              Rody Zakovich

              Hi Jim,


              I discovered this as well, it appears that RegEx in Tableau will not "loop" through and return all matches, but rather the first match it finds in the string, for example when using (/d+).


              Hopefully this can be expanded in later releases.




              1 of 1 people found this helpful
              • 4. Re: Extract only numbers from a text field

                The help text in the Calculated Field window (screenshot below) gives the following example:

                REGEXP_REPLACE('abc  123', '\s','-') = 'abc--123'

                which seems to imply that it finds all instances to replace, but I don't think that example is correct.


                The help pages at http://onlinehelp.tableau.com/v9.0/pro/online/windows/en-us/help.htm#functions_functions_additional.html?Highlight=regex…

                have a different example

                REGEXP_REPLACE('abc 123', '\s', '-') = 'abc-123'

                As you said, not ideal, but it was looking like running a chain of five calculated fields

                accomplished the goal

                Calc1: REGEXP_REPLACE([Phone#],"[\D]","")

                Calc2: REGEXP_REPLACE([Calc1],"[\D]","")

                Calc3: REGEXP_REPLACE([Calc2],"[\D]","")



                • 5. Re: Extract only numbers from a text field
                  Bernard Chang

                  I came up with something like this eventually. It is far from ideal but it does the job for me for the time being.

                  Hopefully Tableau can beef up the Regexp function in the next version.


                  replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(replace(lower([Phone]),"-","")," ",""),".",""),"#",""),"/",""),"e",""),"x",""),"t",""),"(",""),")",""),":",""),"@",""),"+","")

                  • 6. Re: Extract only numbers from a text field



                    You could something like this maybe. The other helpers gave a nice solution, but this will avoid having to use 5 calcs to reach a solution.


                    REGEXP_REPLACE(REGEXP_REPLACE([String], '[\D]{1,5}',""), '[\D]' , "")


                    The internal regex loop will take care of replacing {1,5} repetition of a non -digit character so that takes care of any 1st or 1 to 5 non digits simultaneously. You are then left with only one non-digit character in each string which is taken care by the outer regex replace loop.



                    • 7. Re: Extract only numbers from a text field
                      Jim Wahl

                      Pooja --


                      I think your calc is just going to replace the first two non-digit occurrences?  Also out of curiosity why limit the first group to between 1 and five chars?


                      It works for

                      (888) 1234567



                      but (I'm guessing) not

                      (888) 123 4567

                      (888) 123-4567

                      +31 (0)6 12 34 56

                      +31 (0)6 12 34 56

                      So you'd need to build it out







                                              REGEXP_REPLACE([phone], '[\D]+',"")

                                          , '[\D]+' , "")

                                      , '[\D]+' , "")

                                  , '[\D]+' , "")

                              , '[\D]+' , "")

                          , '[\D]+' , "")

                      , '[\D]+' , "")


                      (And now it resembles a layout container on a dashboard that's been moved twice three times. )


                      In javascript, postgres, ... you can use the 'g' option to return all matches instead of just the first group. I've also seen a REPLACE_ALL(). But I don't see anything similar in Tableau.

                      • 8. Re: Extract only numbers from a text field

                        Hi Jim,


                        The {1,5} was specifically for the \sExt. Part in the 4th row. You are right that why limit the characters to {1,5} but this was just specific to the problem asked. To keep it more open, you can actually just use {1,}. With your example included the non-digit characters are 7 in one particular string (+31 (0)6 12 34 56) and they are not back to back unlike the \sExt. so the {1,5} is not required because with each outer loop a non-digit character is replaced. The {1,} would work if that is the max number of non-digit characters back to back which in this case it was. The inner loop in my example took off the \sExt. (5 non-digit characters at once). All you were left with was then only 1 non-digit character which was take care by the outer loop.

                        That would just avoid 5 calcs as previously suggested by other helpers. My point was just having a reduction in calculated steps (I am a fan of less steps ) to achieve the results that was being seeked and did not bother to think outside the box




                        • 9. Re: Extract only numbers from a text field
                          Jim Wahl

                          Makes sense -- I'm not a regexpert, and I wasn't sure if I was missing something.

                          • 10. Re: Extract only numbers from a text field

                            Neither am I Thanks for your input!

                            • 11. Re: Extract only numbers from a text field
                              Jim Wahl

                              FYI -- created an idea here to add support for REGEXP_REPLACEALL OR REGEXP_EXTRACTALL.


                              • 13. Re: Extract only numbers from a text field
                                KK Molugu

                                +1... voted.. would be nice to inherit all standard regex paterns than new functions.. or if there is a param to replace globally...

                                • 14. Re: Extract only numbers from a text field
                                  Ross Bunker

                                  Hi everyone,

                                  Wanted to update this thread with some internal info.  REGEXP_REPLACE was supposed to work globally.  It does so for many implementations (postgres, oracle, impala, etc...).  Data engine extracts were the _only_ case where we messed up and it wasn't global.  We have actually fixed this in the forthcoming 9.1 release.  We haven't fixed it in a 9.0 maintenance release out of an abundance of caution.  We didn't want to break workbooks that relied on the old behavior in a maintenance release.  Esp. since there was a workaround that people had been using.


                                  That said, I see that a lot of people were calling for a separate version of the function to handle the ALL case.  Since the basic version will now do ALL, i'm wondering if there is demand out there for a ONE/FIRST version that only replaces the first occurrence.  I don't know that we could support it on all platforms, but I'm just curious if the lack of a FIRST implementation will cause hardship for anyone.


                                  If so, I'd suggest creating a new Idea to vote on for the ONE/FIRST implementation.




                                  1 2 Previous Next