1 2 Previous Next 17 Replies Latest reply on Aug 3, 2017 2:59 AM by kettan

    Cross-Join, Legacy Connection & Custom SQL: How can I use 'LIKE' in a RegEx-like way to unpivot my data?

    hassan.dar

      Tableau Workbook: Cannot provide (due to IT restrictions)

      Tableau Version: 9.2.4 (unable to upgrade due to IT restrictions)

      Data Connection: Excel File (Legacy Connection) - data is inside an Excel Table linked to a SharePoint list

       

      Pagingkettan as he must know the Jet SQL 'LIKE' syntax pretty well by now

       

      Hello,

       

      TL;DR: For a Legacy connection, how can I use SQL 'LIKE' in a RegEx-like way, in combination with Cross-Join #5 to un-pivot the following string value into multiple rows (while also removing the garbage characters (e.g. ;#85)):

       

      D-RTJ-HD-001;#30;#D-RTJ-SUKU-001;#32;#D-RTJ-SUKU-002;#33;#D-RTJ-SUKU-003;#34;#D-RTJ-SUKU-004;#35;#D-RTJ-SUKU-005;#36;#D-RTJ-MERK-001;#1;#D-SOLAR-004;#40;#D-SOLAR-014;#114;#D-FLARE-007;#85;#D-FLARE-009;#140;#D-FLARE-010;#141;#D-RTJ-AXIO-001;#4
      

       

      Note: Transforming the data at source via Excel formula or VBA or via RegEx & Split in Tableau are not options (see below for more details)

       

       

       

       

       

       

       

      ------------------------------------------------------------ Lengthier Version ------------------------------------------------------------------

       

      I am sourcing SharePoint list data into Tableau via Excel. One of the fields is a SharePoint Lookup column, which allows multiple choice.

       

      When exporting to Excel the multiple-choice look-up field is flattened into a string in the format:

       

      Single Value: (VALUE1);#(LOOKUPID)
      Multiple Values: (VALUE1);#(LOOKUPID);#(VALUE2);#(LOOKUPID)
      

       

      Note the extra ';#' which is also used as a delimiter between multiple instances of a Value-LookID pair.

       

      Some example data:

       

      (Single Value selected by user)

      D-FLARE-001;#79
      

       

      (Multiple Values selected by user)

      D-RTJ-HD-001;#30;#D-RTJ-SUKU-001;#32;#D-RTJ-SUKU-002;#33;#D-RTJ-SUKU-003;#34;#D-RTJ-SUKU-004;#35;#D-RTJ-SUKU-005;#36;#D-RTJ-MERK-001;#1;#D-SOLAR-004;#40;#D-SOLAR-014;#114;#D-FLARE-007;#85;#D-FLARE-009;#140;#D-FLARE-010;#141;#D-RTJ-AXIO-001;#4
      

       

      I am trying to use technique #5 (as I can get all of the known values) in the Cross-Join collection to un-pivot this data (and also remove the garbage characters e.g. ';#36') - after exhausting other possible solutions (see below) it seems Custom SQL & 'Like'  are my

       

      Other notes:

       

      • The known values AND the multiple string values are both separate columns in the same table (because SharePoint is acting as the "Lookup Middle-man" in this case, and handles the relationship between the data prior to exporting)
      • The solution needs to be automated- the data set is going to be updated regularly from SharePoint and my users are not sophisticated enough to transform the data themselves prior to import into Tableau
        • I have tried to do this beforehand using an Excel formula but so far haven't been able to create one that works
        • I need to avoid VBA as Macros are disabled
      • Users will ultimately consume the information via Tableau Server and do not have the ability to edit Workbooks
      • Using the split functions is not an option as I need to actually un-pivot the data dynamically for the above mentioned reasons

       

      Message was edited by: Hassan Dar (Clarifying that I would like to remove garbage characters from the string also - apologies for not making this clear)

        • 1. Re: Cross-Join, Legacy Connection & Custom SQL: How can I use 'LIKE' in a RegEx-like way to unpivot my data?
          kettan

          TABLEAU

          If Tableau could pivot a calculated field, a regexp_replace formula could be used to create a "SPLIT" character to use with Tableau's split function, such as this one:

           

          REGEXP_REPLACE([STRING],";#([A-Z])",";|#$1")

           

          But it can't pivot calculations according to  Pivot Data from Columns to Rows.

          The same document recommends UNION for "pivoting" two date dimensions.

          This is probably the same as saying that pivot is only featured for measures.

           

          EXCEL

          This limits the options to Excel and Custom SQL ... as you already knew 

          From Tableau 10.2 this cross join can be done inside Tableau join dialogue!

           

          Here is a formula that generates a split character to use with custom SQL:

           

          =SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A2;";";";|";2);";";";|";4);";";";|";6);";";";|";8);";";";|";10);";";";|";12);";";";|";14);";";";|";16);";";";|";18);";";";|";20);";";";|";22);";";";|";24)

           

          or even this (if all records contain the pattern ;#D in the "split area"):

           

          =SUBSTITUTE(A2;";#D";";|#D")

           

          CUSTOM SQL

          As for custom SQL, the method is #4 (Unpivot Column With Multiple Unknown Items).

          Hopefully it will perform.

          1 of 1 people found this helpful
          • 2. Re: Cross-Join, Legacy Connection & Custom SQL: How can I use 'LIKE' in a RegEx-like way to unpivot my data?
            hassan.dar

            kettan,

             

            Many thanks for the quick response, just realised I should have been more clear - I would also like to remove the garbage the characters & ID (#;20) so that I'm left with pure values (e.g. D-CPF-CA-001|D-FIN-004) and it is these values only that I would like to have each on separte rows.

             

            Comments in-line:

             

             

            kettan wrote:

             

            TABLEAU

            If Tableau can split a calculated string into rows - I am in doubt if it can - a regexp_replace formula could be used to create a "SPLIT" character to use with Tableau's split function, such as this one:

             

            REGEXP_REPLACE([STRING],";#([A-Z])",";|#$1")

            [HD] I have actually already tried this method using the following formula - but as you say this does not actually split into separate rows, just separate fields nor is it automated / dynamic.

             

            Note: Your RegExp did not work for me (it simply changed the values to "D-FLARE-001;#30|" (i.e. added a pipe), in my version I used the following, which replaces the garbage characters with a single semi-colon:

            REGEXP_REPLACE([Dependent Deliverable],";#\d+;*#*", ";")

             

             

            EXCEL

            If not, you could create a split character with an Excel formula like:

             

            =SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A2;";";";|";2);";";";|";4);";";";|";6);";";";|";8);";";";|";10);";";";|";12);";";";|";14);";";";|";16);";";";|";18);";";";|";20);";";";|";22);";";";|";24)

             

            [HD] This formula is not valid when I put it into Excel (were some of the semi-colons actually meant to be commas (,)?

             

            or even this (if all records contain the pattern ;#D in the "split area"):

             

            =SUBSTITUTE(A2;";#D";";|#D")

             

            [HD] This formula is also not valid, but if I try and put it into the correct syntax for SUBSTITUTE (i.e. =SUBSTITUTE(A2,";#D",";|#D")) I get values as:
            D-CPF-CA-001;#30;|#D-CPF-CTS-001;#32;|

            Which doesn't look correct (would have expected a single de-limiter separating values and remove the numbers and '#' ';')

             

            CUSTOM SQL

            If custom SQL is needed, it would be method 4 (Unpivot Column With Multiple Unknown Items) - a performance wise expensive method, I guess.

            [HD] Is it not possible to use Method 5 and get 'SQL LIKE' to act like a RegEx and extract each string excluding the garbage characters? I seem to have seen others doing this online, but the syntax seems to vary depending on the version of SQL you are using, and there doesnt seem to be much documentation for Jet SQL?

             

            I don't have many rows of data (probably 200-250 rows, 18 columns) so perhaps Method 4 would be better, but would I not still need to clean-up the data to prevent the garbage characters being included?

             

            • 3. Re: Cross-Join, Legacy Connection & Custom SQL: How can I use 'LIKE' in a RegEx-like way to unpivot my data?
              kettan

              Here is a pure custom SQL solution.

               

              STEP 1:  Add a lookup column (method 4)

               

              STEP 2:  Open Excel with Legacy Connector

               

              STEP 3:  Use Custom SQL.

               

              SELECT mid$(d.[STRING], l.[Pos], instr(l.[Pos], d.[STRING] + ';#', ';#') - l.[Pos]) as [String 2]

              , d.[STRING]

              FROM [data$] d, [lookup$] l

              WHERE l.[Pos] <= len(d.[STRING])

              AND mid$(';#' + d.[STRING], l.[Pos], 2) = ';#'

              AND mid$(d.[STRING], l.[Pos], instr(l.[Pos], d.[STRING] + ';#', ';#') - l.[Pos]) not like '[0-9]%'

               

              Everything but the last line is more or less a copy of method 4 in  The Cross Join Collection.

              The differences are:

              1. Two characters are used as row separator (;#)
              2. The last line filters out rows beginning with letter 0 to 9

               

              In case your real data can begin with a numeric value, we need to check it all for being a number or not.

              If I knew a function to use, I would have used it.  (I have tried ISNUMERIC, ISNUMBER, VAL)

               

              VAL() works as a column, but not as a filter in SQL. Interestingly, it works as a filter in Tableau!

              So if necessary, you could add this column to the SQL and thereafter filter it to 0 inside Tableau:

               

              , VAL(mid$(d.[STRING], l.[Pos], instr(l.[Pos], d.[STRING] + ';#', ';#') - l.[Pos])) as [StringVal]

               

               

              WARNING:  I am not sure, but think Jet SQL only handles strings with fewer than approx. 255 character.

              Your example data has 242 characters and thus close to this limit.

              But as said, I am not sure if this really is the limit.

               

               

              Attached Workbook Version:  9.0

              • 4. Re: Cross-Join, Legacy Connection & Custom SQL: How can I use 'LIKE' in a RegEx-like way to unpivot my data?
                Jonathan Drummey

                Yes, JET SQL is limited to columns of 255 characters for text data types. That was one of a number of reasons why Tableau introduced the "new" connector in v8.3 for Excel & text files.

                 

                Jonathan

                1 of 1 people found this helpful
                • 5. Re: Cross-Join, Legacy Connection & Custom SQL: How can I use 'LIKE' in a RegEx-like way to unpivot my data?
                  kettan

                  The comma-semicolon differences are most likely due to our different OS language settings.

                   

                  As for Method 5, you need to cross join against as many rows as are needed to generate.

                  Therefore it is named known values.

                   

                  ALTERNATIVE METHOD

                   

                  Assuming you don't have a list of limited know values, a better alternative would be known number of rows.

                  The beauty of this method is that you can use a normal inner join in Excel to generate needed rows!

                   

                  Step 1:  Create a calculated field in Excel that shows how many rows are needed.

                  Step 2:  Create a LOOKUP table that generates the needed rows (see example at the end)

                  Step 3: Join the DATA table with the LOOKUP.

                  Step 4:  Have fun doing the necessary calculations in Tableau.

                   

                  An example of the LOOKUP table:

                   

                  Rows_TotalRow_No
                  11
                  21
                  22
                  31
                  32
                  33
                  41
                  42
                  43
                  44
                  etcetc
                  • 6. Re: Cross-Join, Legacy Connection & Custom SQL: How can I use 'LIKE' in a RegEx-like way to unpivot my data?
                    kettan

                    THE KNOWN NUMBER OF ROWS METHOD

                     

                     

                    Step 1:  Add a field for number of rows in your DATA table

                     

                    See more in attached xlsx workbook ( copying formulas has international setting issues )

                     

                    Step 2:  Create a LOOKUP table

                     

                     

                    Step 3:  Join DATA and LOOKUP in Tableau

                     

                    This join is a normal join that gives you one row per item in string!

                     

                     

                     

                    Step 4:  Create a calculated field for extracting the item

                     

                    Here is my second and shorter version.

                    A simpler and longer version was used to make this and is available in the attached workbook.

                     

                    String v2

                    IF [Row_No] = 1 THEN MID([STRING],1,FIND([STRING],";")-1)

                    ELSE MID([STRING],FINDNTH([STRING],";",[Row_No]*2-2)+2,FINDNTH([STRING],";",[Row_No]*2-2+1)-FINDNTH([STRING],";",[Row_No]*2-2)-2)

                    END

                     

                    The use of REGEX functions may be less difficult to understand than the formula above.

                     

                     

                    Step 5:  Build view

                     

                    Actually I built the view before the calculated field 

                     

                     

                    Attached Workbook Version:  9.0

                    1 of 1 people found this helpful
                    • 8. Re: Cross-Join, Legacy Connection & Custom SQL: How can I use 'LIKE' in a RegEx-like way to unpivot my data?
                      hassan.dar

                      kettan - I'm not sure the solutions above will give me what I need, I don't think I have managed to articulate properly what I'm trying to achieve so please see below:

                       

                      This is a simplistic example of how my data currently sits (there are more columns but that is mostly descriptive data & of course more dependent deliverables than just three implied from the data below)

                       

                      Database Row ID
                      Deliverable ID
                      Dependent Deliverable(s)
                      1D-CPF-FUND-001
                      2D-CPF-FUND-002
                      27D-CPF-PMO-002D-LCT-001;#97
                      79D-STRAW-001D-LCT-002;#98
                      80D-STRAW-002D-CPF-FUND-002;#2;D-LCT-002;#98
                      97D-LCT-001D-STRAW-001;#79
                      98D-LCT-002
                      142D-STRAW-011D-STRAW-001;#79;#D-STRAW-002;#80;#D-LCT-001;#97

                       

                      Note that the values in column 3 (Dependent Deliverable) will always be a sub-set of the values in column 2 (Deliverable ID) -- therefore I know all of the values (they just happen to be stored in the same table, rather than a lookup table)

                       

                      I would like the data above to be transformed into the following (note the split into separate rows & removal of ';#xx' )

                       

                      Database Row ID
                      Deliverable ID
                      Dependent Deliverable(s)
                      1D-CPF-FUND-001
                      2D-CPF-FUND-002
                      27D-CPF-PMO-002D-LCT-001
                      79D-STRAW-001D-LCT-002
                      80D-STRAW-002D-CPF-FUND-002
                      80D-STRAW-002D-LCT-002
                      97D-LCT-001D-STRAW-001
                      98D-LCT-002
                      142D-STRAW-011D-STRAW-001
                      142D-STRAW-011D-STRAW-002
                      142D-STRAW-011D-LCT-001

                       

                      I have tried Cross-Join #5 (and Cross-Join #4) but unfortunately neither provides the data in the above format. I'm yet to try your "KNOWN NUMBER OF ROWS" method, as I am unsure as to how to calculate via formula the Lookup Row_Count & Row_No...

                       

                      Is what I want possible to do?

                      1 of 1 people found this helpful
                      • 9. Re: Cross-Join, Legacy Connection & Custom SQL: How can I use 'LIKE' in a RegEx-like way to unpivot my data?
                        kettan

                        The Known Number of Rows Method should work fine ... with a reliable formula of course 

                        However, with your new data example, method 5  is less complicated, I think 

                         

                         

                        Step 1:  Add a Dummy Join Column

                         

                        If you had version 10.2 er newer, you wouldn't need this dummy column.

                         

                         

                        Step 2:  Self join on Dummy Join

                         

                        This creates a cross join of your data.

                        The cross product of this is seen when no filter is applied.

                         

                         

                        Step 3:  Create a filter (the real "join")

                         

                        ( ISNULL([Dependent Deliverable(s) (data$1)]) AND [Database Row ID] = [Database Row ID (data$1)] ) OR

                        ( CONTAINS([Dependent Deliverable(s)],[Deliverable ID (data$1)]) AND [Deliverable ID] <> [Deliverable ID (data$1)] )

                         

                         

                        Step 4:  Build View

                         

                         

                        Attached Workbook Version:  9.0

                        2 of 2 people found this helpful
                        • 10. Re: Cross-Join, Legacy Connection & Custom SQL: How can I use 'LIKE' in a RegEx-like way to unpivot my data?
                          hassan.dar

                          kettan, superb work! Marking this as correct as you've solved the immediate issue I asked about here - many thanks!

                           

                          Unfortunately I've hit another stumbling block ... With my newly created separate rows for the Dependent Deliverable, I needed to cross-reference the Dependent Deliverables with another data source (separate excel file) to get Start and End Dates for each row. I thought I could do this with Data Blending, but it doesn't seem like the calculation in the "Filter" field will play nice ... I suspect Cross-Database Joins in Tableau 10 would solve this, but unfortunately not an option for me ... is it worth opening a new Question for this specific issue? Or is there another way around this that would allow me to use a Data Blend to link to the other data source?

                          • 11. Re: Cross-Join, Legacy Connection & Custom SQL: How can I use 'LIKE' in a RegEx-like way to unpivot my data?
                            kettan

                            Thanks. I was also quite happy about it! 

                             

                            The reason it was relatively easy for me to answer your previous question was because you had shared a perfect pairing of source data and expected output. If it was possible, I would have stamped (or tagged) that individual comment as an example for other questioners on how to ask questions!

                             

                            As for other ways to solve this, it is hard to say if it is possible without seeing and trying - and for that representative source data and expected output is vital 

                             

                            I would not limit the options to whatever is possible in Tableau 9.0/9.3, but also ponder what can be done outside Tableau ... as previously done with the now not needed "Known Number of Rows Method

                             

                            By what you wrote, I assume it is not an option to have all data in the same Excel workbook.

                            Could you confirm if this is correctly understood?

                             

                            As for a complete solution, my feeling is that it will be done with custom SQL and the legacy connector. With it we can also connect to other Excel workbooks! To do this we need to include the path of the other Excel file. Here is an example:

                             

                            SELECT *

                            FROM [C:\Users\jemjs\Documents\Method 5 thread 243240 Tableau 9.xlsx].[data$]

                             

                            Ps. I only have an expired Tableau 9.0 license and thus limited to whatever is featured there. This said, it doesn't seem any important feature needed for this task was released in 9.1 and 9.2.

                            • 12. Re: Cross-Join, Legacy Connection & Custom SQL: How can I use 'LIKE' in a RegEx-like way to unpivot my data?
                              hassan.dar

                              kettan,

                               

                              • Yes, I can get both sets of data into the same workbook
                              • But is there a way (perhaps with custom SQL) to get 'true' separate rows (rather than using the calculated field Filter method?) - as I sometimes seem to be a bit limited in terms of what I can achieve with the filter (plus it duplicates the deliverable itself as a 'dependency')

                               

                              Following on from my previous example, taking my 'transformed' deliverable data (plus one extra column as indicated):

                               

                              [Deliverables (with Dependent Deliverables as separate Rows)]

                              Database Row ID
                              Deliverable ID
                              Deliverable Name
                              Dependent Deliverable(s)
                              1D-CPF-FUND-001CPF-FUND Deliv #1
                              2D-CPF-FUND-002CPF-FUND Deliv #2
                              27D-CPF-PMO-002CPF-PMO Deliv #2D-LCT-001
                              79D-STRAW-001STRAW Deliv #1D-LCT-002
                              80D-STRAW-002STRAW Deliv #2D-CPF-FUND-002
                              80D-STRAW-002STRAW Deliv #2D-LCT-002
                              97D-LCT-001LCT Deliv #1D-STRAW-001
                              98D-LCT-002LCT Deliv #2
                              142D-STRAW-011STRAW Deliv #11D-STRAW-001
                              142D-STRAW-011STRAW Deliv #11D-STRAW-002
                              142D-STRAW-011STRAW Deliv #11D-LCT-001

                               

                              I would like to look-up the values in green against those in green in the below table (again only a sub-set of columns shown for Brevity - this table is larger: 900+ rows & 51 columns) - note this data does not have a single column which acts as a primary key

                               

                              [Project Plan]

                              Deliverable IDTask IDStart DateEnd Date
                              D-LCT-001T-LCT-00101-05-201709-09-2017
                              D-LCT-001T-LCT-00129-03-201822-04-2019
                              D-LCT-001T-LCT-00113-06-201724-07-2019
                              D-CPF-FUND-002T-CPF-FUND-00122-08-201703-12-2019
                              D-CPF-FUND-002T-CPF-FUND-00215-11-201713-08-2018
                              D-STRAW-001T-STRAW-00112-12-201815-03-2019
                              D-STRAW-001T-STRAW-00211-11-201711-11-2019
                              D-STRAW-00201-07-201719-08-2017
                              D-CPF-FUND-001
                              D-FIN-001T-FIN-00122-08-201709-09-2018
                              D-FIN-001T-FIN-00217-08-201704-04-2019

                               

                               

                              The output data that would give me what I need below (i.e. the ability to construct a Gantt chart) - I am pretty comfortable doing this with calculated fields in Tableau, as long as I can get to the data:

                               

                              Dependent Deliverable(s)
                              [Deliverables]
                              Deliverable Name
                              [Deliverables]

                              MIN() of Start Date

                              [Project Plan]

                              MAX() of End Date

                              [Project Plan]

                              Date Diff

                              Max(End) - Min(Start)

                              D-LCT-001LCT Deliv #101-05-201724-07-2019814
                              D-CPF-FUND-002CPF-FUND Deliv #222-08-201703-12-2019833
                              D-STRAW-001STRAW Deliv #111-11-201711-11-2019730
                              D-STRAW-002STRAW Deliv #201-07-201719-08-201749
                              • 13. Re: Cross-Join, Legacy Connection & Custom SQL: How can I use 'LIKE' in a RegEx-like way to unpivot my data?
                                kettan

                                Here it is, and probably much simpler, since all "joins" and calculations are made in the custom SQL except the DATEDIFF which I forgot about     Since it is so easy to do that calculation, I leave the attached and explanations below as they are.

                                 

                                 

                                Step 1:  I have added the Project Plan as a sheet in the Excel workbook.

                                 

                                Step 2:  Connect with the Legacy Connecter

                                 

                                Step 3:  Create custom SQL

                                 

                                select

                                  d1.[Database Row ID]

                                , d1.[Deliverable ID]

                                , d1.[Dependent Deliverable(s)]

                                , d2.[Database Row ID] AS [2 Database Row ID]

                                , d2.[Deliverable ID] AS [2 Deliverable ID]

                                , d2.[Dependent Deliverable(s)] AS [2 Dependent Deliverable(s)]

                                , ( select min(pp.[Start Date]) from [Project Plan$] pp where pp.[Deliverable ID] = d2.[Deliverable ID]) AS [PP Min Start Date]

                                , ( select max(pp.[End Date]) from [Project Plan$] pp where pp.[Deliverable ID] = d2.[Deliverable ID]) AS [PP Max End Date]

                                from [data$] d1, [data$] d2

                                where (

                                ( d2.[Dependent Deliverable(s)] IS NULL AND d1.[Database Row ID] = d2.[Database Row ID] ) OR

                                ( d1.[Dependent Deliverable(s)] LIKE '%' + d2.[Deliverable ID] + '%' AND d1.[Deliverable ID] <> d2.[Deliverable ID] )

                                )

                                 

                                Step 4:  Build

                                 

                                 

                                Attached Workbook Version:  9.0

                                1 of 1 people found this helpful
                                • 14. Re: Cross-Join, Legacy Connection & Custom SQL: How can I use 'LIKE' in a RegEx-like way to unpivot my data?
                                  hassan.dar

                                  kettan strikes once again! I've been able to build the exact view I needed, so thank you very much for that.

                                   

                                  In addition to your instructions above I also Left join of "Deliverables" against the custom SQL (Deliverable ID -> Deliverable ID) and then another left join to a duplicate of "Deliverables" against the custom SQL (this time "2 Deliverable ID" -> Deliverable ID). This finally gives me access to all of the data I need!

                                   

                                  Tell me how I can buy you a beer by way of thanks!

                                  1 of 1 people found this helpful
                                  1 2 Previous Next