8 Replies Latest reply on Feb 17, 2017 1:57 PM by Peter Galimutti

    Best way to parse a string field and compare unique values over time

    Emile Petrone

      I have a field that has a string with values delineated with a "|", example a tags field with values like "blue|round|old". As the field can contain anywhere between 1-5 tags, splitting doesn't seem to be the right solution as it creates 5 new columns where each record has n values filling those new fields.

       

      What is the best way to:

       

      * Get the unique list of values

      * Compare those values over time

       

      I have been trying Parameters but that seems to show only 1 parameter at a time. With the unique values, I was trying to create a parameter for each tag, then I can filter based upon contains the parameter. However I'm stuck as to how to get those to compare against each other on a single graph with multiple overlaying lines.

        • 1. Re: Best way to parse a string field and compare unique values over time
          Peter Galimutti

          Hi Emile,

          Can you please attach a small sample of data with date field and the string in question.

           

          Thanks,

          Peter

          • 2. Re: Best way to parse a string field and compare unique values over time
            Joe Oppelt

            The best way to do this is to reshape your data outside Tableau.  Separate rows for each value.

             

            Otherwise, as you noted, 5 separate calcs to hold the 5 values (or NULL when there are less than 5.)

             

            Parameters are only single-value creatures as you have found out.

             

            But you can have 5 parameters.

             

            (Question:  Are the 5 tags always going to contain a limited list of possible tag values?  For instance, you listed "blue|round|old".  Let's say that another row has "blue|round|old|tall|scary".  Are those the only 5 possible tag values?  Or can your row have up to 5 tags, but those values can be ANYTHING.  Do the positions within the string have significance?  Or can "blue" be in any position in the string list?)

             

            I asked all of that because the way you are using this will influence how your set up those parameters.

             

            For example, I have a chart with three lines on it.  Users have 3 separate parameters with two choices:  Y or N.  They can turn lines on or off independently.

             

            So you can have 5 parameters that let the user specify values.  And then you would have 5 calcs that correspond to the 5 parameters and do CONTAINS on the string using the corresponding parameter.  And then you would run as many of the measures on your chart as the user specifies.

             

            This is how it looks on my dashboard:

             

            • 3. Re: Best way to parse a string field and compare unique values over time
              Emile Petrone
              Original TitleCast
              Jurassic WorldChris Pratt|Bryce Dallas Howard|Irrfan Khan|Vincent D'Onofrio|Nick Robinson
              Mad Max: Fury RoadTom Hardy|Charlize Theron|Hugh Keays-Byrne|Nicholas Hoult|Josh Helman
              InsurgentShailene Woodley|Theo James|Kate Winslet|Ansel Elgort|Miles Teller
              Star Wars: The Force AwakensHarrison Ford|Mark Hamill|Carrie Fisher|Adam Driver|Daisy Ridley
              • 4. Re: Best way to parse a string field and compare unique values over time
                Emile Petrone

                Thanks Joe for the help. To answer your questions

                 

                * My first thought was to clean the data outside of Tableau, but as a new Tableau user, I didn't know if there was an elegant Tableau solution

                * Yes we are talking about a limited set of values (fewer than 100 unique values)

                * Position of values does not matter, they can be random

                • 5. Re: Best way to parse a string field and compare unique values over time
                  Joe Oppelt

                  And what do you want this data to do?  How do you need to compare "string 1" against "string 2"?

                   

                  Maybe a sample workbook would save us some time.  Otherwise I'm just going to be playing 20 questions here. 

                  • 6. Re: Best way to parse a string field and compare unique values over time
                    Emile Petrone

                    I added a sample set of data above to show you what I'm looking at. Essentially I have thousands of films (too large to upload the whole thing), and I want to visualize trends in actors, companies, over time. So it would require parsing those string values to then correspond to the release year for plotting over time.

                     

                    The tricky part is parsing the strings to then correspond back to the date field with the year the film was released.

                     

                    Examples of things this data would show:

                    * How a studio grew over the years (showing the number of films they make each year over the last 100 years)

                    * An actor's career over time (showing the films they were in each year - the rise and natural fall over their career)

                    • 7. Re: Best way to parse a string field and compare unique values over time
                      Joe Oppelt

                      Create your 5 parameters.  I would just make them STRING data type.  Accept any value.  User types in a value. 


                      For each parameter, have a corresponding calc that does your CONTAINS([string to parse],[parameter-1])

                       

                      What, exactly you do with a TRUE condition is where you have to get creative.  The calc will evaluate for all rows in the database.  Whatever you do, it will only be recorded for the actors or companies the user requests.  All the rest of the rows will just have NULL.

                       

                      For instance:

                       

                      IF not ISNULL([parameter-1]) and (CONTAINS([string to parse],[parameter-1]) then 1 END

                       

                      Each row that the actor shows up in will have a 1 for this calc, and all the rest will be NULL.  Then, SUM([Calc 1]) will show how many times he showed up in the time period your sheet shows.  (If done by yeas of movie date, then it will show how many times he was in a movie that year, for example.)

                       

                      Then you would have 5 of these.

                       

                      If a parameter is null, then all rows in the database will get a null for that calc.

                       

                      You can run 5 (or as many as you want) lines on a chart by using the MeasureNames/MeasureValues method.  It's a standard technique, so you can fine a lot of examples out there with a google search if you don't know how to do that.  (Or I would show you if you made a workbook.)

                       

                      As for uploading an example, it doesn't take your full data set.  100 rows.  Even 20 rows.

                       

                      Here is a demo on how to do something like that:

                       

                      Anonymize your Tableau Package Data for Sharing

                      • 8. Re: Best way to parse a string field and compare unique values over time
                        Peter Galimutti

                        Do you have a database or and ETL tool to work with? if so which one.

                         

                        I don't think Tableau will have a easier and straightforward way to normalize de-normalized data. The ETL capabilities of Tableau are still limited.

                         

                        Here is my take on preparing the data:

                         

                        1. First split all the values, like you did in tableau

                        2. Unpivot them

                         

                        for Example

                         

                        Jurassic world,2012,Chris Pratt

                        Jurassic world,2012,Bryce Dallas Howard

                        Jurassic world,2012,Irrfan Khan

                        Jurassic world,2012,Vincent D'Onofrio

                        Jurassic world,2012,Nick Robinson

                         

                        Now you can reload this data in Tableau and do necessary analysis.

                         

                        HTH,

                        Peter