4 Replies Latest reply on Dec 31, 2012 6:18 AM by Xavier P

    Simple data extraction question

    Xavier P

      To all data rockstars out there who spend time cleaning data, I have free text information from marketing that I need to extract.


      For example, in some rows I have an email address.


      McDonald's | Ron McDonald's | ron@bigmac.com


      Whenever the email is there, I want to extract it.  It is always the last part of the field and is always preceded with a vertical dash.


      How would you go about extracting this?


      Check presence of @ and exclude everything to the left of | ?


      Sounds like a simple calc but not sure how to go around it.


      Thank you and happy new year!

        • 1. Re: Simple data extraction question
          Shawn Wallwork

          Xavier, I'm a little unclear about the actual form of your data. What sort of file holds this data? You seem to be saying you have rows of data that contain long strings separated by the piping character. Are these text files? You know you can bring these into Excel as 'other' delimited text files:



          If this is an option, it's your easiest route, because now all you have to do is trim the space off the front of the email address in either Excel or Tableau. If this is not an option, you're still better off trying to manipulate these strings in Excel rather than Tableau, because Excel supports the wildcard * (all) character and has robust find/replace functionality. We could probably come up with a way to do this in Tableau, but it would almost certainly be a tortured formula.


          Always best to manipulate, shape, reshape your data before you bring it into Tableau, then let Tableau do what it does best -- visualize your data.



          • 2. Re: Simple data extraction question
            Jonathan Drummey

            In case this wasn't clear from Shawn's explanation, Tableau can import with the pipe (vertical bar) delimiter, there are some caveats: Spaces between the delimiter can be preserved in both column names and values, so you'd need to manually trim column names by renaming them in the Dimensions & Measures windows in Tableau, and then create calculated fields using the TRIM() function to clean up the necessary dimensions and measures.



            1 of 1 people found this helpful
            • 3. Re: Simple data extraction question
              Xavier P

              Shawn, thanks ever so much, that is why I like posting here, clever cookies like yourself can think out of the box, this worked a treat!!

              • 4. Re: Simple data extraction question
                Xavier P

                Thanks Jonathan, was responding to Shawn whilst you replied, his solution was fine so am going to stick with that one, but thanks for the alternative option.  Happy new year to you Both!