8 Replies Latest reply on Mar 7, 2017 6:48 PM by Bill W

    Continuous Text parsing until no more occurrences.

    Bill W



      I am trying to write out a text parser to parse a series of 2 digit numbers out of a single string dimension, then total them up with the others included on the view.


      I only have 9.2, so please post suggested solutions in Tableau 9.2.


      First step, was to create the initial parse - left(mid([Assets 1 text to parse],FIND([Assets 1 text to parse],"bps")-2),2), The dimension 'Asset 1 text to parse' contains the text string. This parses out the first 10 which is correct, but I need to figure out a way for it to parse out the 50 and 25 and display it on next rows related to Team A.


      Ideally, all said it done. I'd want the output to look like this so I can total it across the board. Additionally, probably will need to put in a condition for if it doesn't contain bps, but it appears its working from that standpoint.


      Please see my workbook attached and the solution I am trying to obtain.



        • 1. Re: Continuous Text parsing until no more occurrences.
          Joe Oppelt

          I'm looking at this.



          I'm not really sure what you are looking to do.


          If you want to see if a string has some substring in it, use the CONTAINS() function.


          It returns a Boolean value.


          IF CONTAINS([string you want to look in], [string you want to find]) THEN (do what you want to do if true)  END


          If you want to know where the string starts in the target string, use the FIND() function.  That returns a value for the position it starts in.  (Returns 0 if it's not in there at all.)


          Or are you looking to see how many of the search string occurrences exist in the string you are searching?


          I do that by using the REPLACE() function, which will replace all the occurrences of the string, and I replace it with "" (nothing).  Then I test the new length against the old length and divide the difference by the length of the string I was looking for.  The result is the number of occurrences in there.

          • 2. Re: Continuous Text parsing until no more occurrences.
            Bill W

            Evening Joe,


            Please see the solution image. That's ultimately what I am trying to produce.


            Contains() function won't work I believe, the Find() function as I was describing above would probably be the best course of action. Additionally, I am not trying to replace() anything, simply find, return and repeat until it reads the entire string.


            What I am trying to do is search the string where "bps" appears then return the 2 digit number to the left of it. However, I want to do it for each occurrence where "bps" appears in the string. So, for this example I'd like to display 50 and 25 in separate rows as the both come before the "bps".


            Hopefully, this explanation helps . Sorry if I was unclear.



            • 3. Re: Continuous Text parsing until no more occurrences.
              Joe Oppelt

              There is no iterative mechanism like a DO-Loop in Tableau.

              I am beginning to get what you are looking for now.  It just wasn't clear to me in the first description.


              If there can be an unlimited number of "bps" in the string, you're going to find this very difficult.  I wouldn't do this parsing in Tableau.


              If there is a fixed number of occurrences (or even a maximum number of occurrences), you will need a separate calc for each one.  I have some ideas, but I want to aim for the actual end goal and not tailor what I suggest to the specific text here.  What will your actual data be like when you parse it?

              • 4. Re: Continuous Text parsing until no more occurrences.
                Joe Oppelt

                And will the number to the left of "bps" always be 2 digits?  Could you ever have a value of 5, for instance?  Or 100?  We can accommodate a varying length there.  If always 2 digits, it's easier to hard-code that, but now is the time to consider that possibility.

                • 5. Re: Continuous Text parsing until no more occurrences.
                  Jamieson Christian



                  You've strayed solidly into the realm where an ETL solution is more appropriate than trying to use Tableau to do this. You will not get Tableau to process an arbitrary number of substrings; however, if you know the upper limit of numbers that you may need to parse out, you can get by with something like this:


                  1. Create a REGEXP_REPLACE() calculation to distill the text down to a simple comma-delimited string of numbers.


                  2. Create a calculated field to parse out N numbers from the above-generated string, and add them together. (In my example, I assumed no more than 6 numbers that need parsed out.


                  3. Here is the result:


                  You can then SUM that field with your other Asset fields to get the grand total for Team A (if that's what you were looking to do).


                  Workbook attached, but it's in version 10.13, so you may not be able to open it (sorry).

                  1 of 1 people found this helpful
                  • 6. Re: Continuous Text parsing until no more occurrences.
                    Bill W

                    Wow, Jamieson....


                    That's some slick coding. Question, how might I break up the 'Parsed numbers' into individual rows? Like, how I want to see it per my solution image.


                    Thank you very much.


                    Jason, I have no other option but to do the parsing in Tableau per my limited resources.


                    Additionally, it'll always be a 2 digit number and additionally per my image I attached that's what I wanted the data to come out like.

                    • 7. Re: Continuous Text parsing until no more occurrences.
                      Jamieson Christian



                      Now you're even more solidly into the realm of ETL.


                      In general, Tableau does not generate rows. If only generates columns (i.e. calculated fields). Generating data rows is quite definitely an ETL thing — Tableau just works with the rows it has.


                      If you want to generate rows, you'll have to resort to a scaffolding approach. At a high level, it looks like this:


                      1. Create a scaffolding data source that can be JOIN'd to your main data. The scaffolding would literally just consist of a "Row" column and N rows containing the values 1, 2, 3, 4… up to the maximum number of possible values you might find in you parsed string. (In my earlier example, I would create scaffolding with 6 rows.)
                      2. JOIN the scaffolding data source to your main data source. Leave the JOIN criteria empty. This will cause every row in your main data source to replicate N times.
                      3. Create a calculated field that uses just one of the SPLIT() calls from my earlier formula, but replace the last parameter (which was 1, 2, 3, 4… in the original example) with the "Row" value from the JOIN'd scaffolding.
                      4. Apply a data source filter to exclude NULLs in this calculated field, which will ensure that you don't have extraneous rows if the actual values parsed out of a given string is less than N.


                      I can put together an example of this approach in 10.1.4, but I don't know if you'll be able to view it. Let me know if you want an example.


                      Also… I was a bit confused by your sample data, give this requirement. Cell D3 in your snapshot shows "Assets 2" = "25", but it equals "30" in the two rows above. This would not happen as a result of automatic row generation. Maybe it was just a bug in the mockup you were doing in Excel, but it gave me pause.

                      • 8. Re: Continuous Text parsing until no more occurrences.
                        Bill W

                        Thanks Jameison, your answer was correct. I was able to do a parse and then utilize the split to break it out. I used a First() and a Previous_Value() function in order to break then out into separate rows.


                        My apology for the delay.