9 Replies Latest reply on Jan 8, 2014 12:27 PM by Jonathan Drummey

    Using Data Blending for Segmentation

    James Morse

      Hi All -

       

      Our marketing team has executed an engagement campaign in hopes that it would generate new customers as well as stimulate incremental spend among existing customers.  I've been given a list of customer IDs in Excel of customers exposed to the campaign.  In addition, I have my hardened Tableau data source for revenue transactions (think the full Superstore sample data set).

       

      What I'd like to do is run some numbers that benchmark the customer revenue trend using a dimension of whether or not the customer was exposed to the marketing campaign.  I am struggling to create this dimension for campaign exposure via data blending where the customer's presence in the Excel file defines exposure.

       

      Can this be accomplished?  It seems a series of table calcs could be used here as a solution, but I was hoping for something simpler and more elegant.

       

      -James

        • 1. Re: Using Data Blending for Segmentation
          Jonathan Drummey

          Hi James,

           

          There are a few ways to approach this.

           

          - Work with your DBA to import the Excel data into the same database used for your transaction data (this can be done as an import into a table, or a linked table), then you can set up Tableau with a Multiple Tables connection or a custom view or query where the campaign customers data is left-joined to the transaction data. This would give you a first-class dimension for use in Tableau, and you could do things like create a calculated dimension such as NOT ISNULL([Campaign Customer ID]) to use in your analyses. I find that when I need to repeat an analysis, the extra effort to set this up is worth it because then the work in Tableau goes much more smoothly. In general, if I have the time this my preferred approach.

           

          - Tableau's Primary Groups feature is another way to get at this. Here's the KB article: Creating Primary Groups from a Secondary Source Using Data Blending | Tableau Software. I also set up an example in the attached, where I took a random set of Customer IDs from Superstore Sales to create a Campaign Customer IDs data source. Then I created a calculated field in the Campaign Customer IDs data source to flag those as Yes, then created the primary group. Once I had the IsCampaignCustomer (group) in the Supertore Sales, I edited the group to bucket everything else into the Other, then renamed Other. You can use group as a dimension in your view, for example the "Sales from Primary Group" worksheet shows this. There are some complications with this route: First, you cannot use the group in a calculated field***. Also, depending on the # of Customer IDs you're working with this might be slow due to how Tableau works with ad hoc groups. Finally, the primary group is static - every time you have a new or updated batch of Campaign Customer IDs you'll have to regenerate the Primary Group. The advantages to Primary Groups are that you can do it all in Tableau, can use the group quite easily in your views, and don't need your DBA or have to know table calcs.

           

          *** The workaround you can use in this case is to use a data source that supports Sets in calculated fields (i.e. not a JET source), make a Set on the IsCampaignCustomer (group), and select the "Yes" member for the Set.

           

          - Use a blend and table calculations. I set up an example in the attached, there are a few tricky bits. The first is that we need to create an aggregate calculation in the primary^^^, for the IsCampaignCust (blend) measure I used the formula IIF(NOT ISNULL(MIN([Campaign Customer IDs].[Customer ID])),"Yes","No"). This is an aggregate calculation because all calculations using fields from blended sources have to be aggregates. Second, once the IsCampaignCust (blend) field is in a view, you'll need to click on the pill and uncheck "Ignore in Table Calculations", so table calcs can be partitioned on that measure. Finally, because you'll need the Customer ID in the view for this to work and create a finer-grained level of detail then you need, you'll need to use a table calculation for most measures to reduce the level of detail. I set up a simple one for sales using IF FIRST()==0 THEN WINDOW_SUM(SUM([Sales])) END with a Compute Using of the Customer ID, you can see the "Sales from Blend Workout" and then a cleaned up view in "Sales from Blend". This technique can run into performance problems when there are a lot (more than 10s of thousands or maybe 100s of thousands) of Customer IDs, you can speed that up by doing things like extracting the Excel data. It also requires attention to the view and knowledge of table calcs that not everyone has. If I'm doing a quick & dirty analysis, I'll tend to use this over the Primary Groups because it does have the advantage of being dynamic and I'm comfortable with table calcs. However, if the data source is going to be used by someone less facile than myself I'll do the work for the first step to just include this in the data source.

           

          ^^^ Technically we could just use the IsCampaignCustomer dimension I set up in the secondary data source, however when brought into a view in the primary that only has values of "Yes" and Null, and there's no way to alias the Null to something else. (See the Null/Alias problem worksheet). I like to use explanatory text for labels in my views instead of having to explain what Null means, so we go to the longer route.

           

          Hope this helps!

           

          Jonathan

          1 of 1 people found this helpful
          • 2. Re: Using Data Blending for Segmentation
            Matt Lutton

            Hi Jonathan Drummey

             

            I was just reviewing your work in this thread, as usual, to try and pick up some new tricks.  Just a quick question -- I was able to right click on Null in the "Null/Alias problem" sheet, and change it to "No" by using Edit Alias.  Were you referring to something else?

             

            Edit Alias Null.png

            1 of 1 people found this helpful
            • 3. Re: Re: Using Data Blending for Segmentation
              Joe Mako

              Building from the Null/Alias worksheet after apply the Alias as Matthew describes, we can use 'Data Blending 2' features that allows the blend to be at one level and the aggregation at another. This is effectively the same operation as the Table Calculations route, but without the complexity of building and without the fragility.

               

              The Customer ID pill is removed from the active shelves and the blue 'broken link' icon in the secondary data source is clicked to make it red again, so the field is being used in the group by clause to both data sources, and used to blend, but Tableau takes care of the aggregating to another level for us automatically.

              1 of 1 people found this helpful
              • 4. Re: Re: Using Data Blending for Segmentation
                Matt Lutton

                Very nice.  Love this thread!  I kind-of hate that we mark questions as "correct" so quickly, when often, there are so many more approaches we could learn from!  It would be nice to have "Answered, but still looking for more options" or something similar as it makes learning and study so much easier when everything is related to a specific use case (I guess the "Helpful" option could represent this). Similarly, I wish the KB articles would present more than one method for solving a problem, along with details related to when each should be considered, etc.

                 

                I've often thought it would be nice to have a "Improve my Viz" area where people can post visualizations and specifically ask for help in improving their work.  That would have a similar impact--multiple helpers could provide enhancements, suggestions, etc. and there wouldn't need to be a "Correct" answer.

                 

                Anyway, that's all for now.   Thanks for posting, Joe!

                • 5. Re: Re: Re: Using Data Blending for Segmentation
                  Jonathan Drummey

                  Joe - I feel like I should have thought of that option, thanks!

                   

                  Matthew - You pointed out some "interesting" behavior:

                   

                  The IsCampaignCustomer field in the original Campaign Customer IDs data source has only one value, "Yes", so when that data source is selected and we are working from the Data window there is no Null value to assign an alias to:

                  2014-01-08 13_41_36-Tableau - blend in cohort data jm edit.png

                   

                  In the blended view (such as the original Null/Alias problem worksheet), when IsCampaignCustomer is in the view, I can click on the IsCampaignCustomer dimension pill, choose Edit Alias... and get that same dialog with only the "Yes" option. This is what I'd tested and always thought, which led to my statement in the earlier post about not being able to set the alias.

                   

                  However, as you found, if I right-click on the Null value for IsCampaignCustomer, we get a little Edit Alias dialog:

                  2014-01-08 13_45_35-Tableau - blend in cohort data.png

                   

                  And it works as you said. But if I go back to the pill and chose Edit Alias, it's still only showing the Yes value.

                   

                  That we can change the alias for Null makes sense in that the data blended IsCampaignCustomer dimension does have the Null value, and Tableau will let us set an Alias for it. However, we don't have access to unset the Alias for the Null value of the blended dimension the way we do with the dialog - for example, if we change the Alias to something else and then "back" to Null, that 2nd Null isn't actually a Null value, but a "Null" string. I set up an example by copying the crosstab and pasting it into Excel for the three options (default, set alias to No, set alias to Null). So to me there's a one-way bit of behavior (once you change the alias away from Null you can't go back again) that ideally shouldn't be happening, but I'm not sure of a user interface solution that wouldn't introduce more complexity, since a single dimension & data source can be used in multiple blends in the same workbook.

                   

                  Thanks for looking at this, Matthew!

                  1 of 1 people found this helpful
                  • 6. Re: Re: Re: Using Data Blending for Segmentation
                    Matt Lutton

                    No problem at all--I look at pretty much everything you post...    If only it all made sense to me on the first go round...

                     

                    Good point about editing the alias and it resulting in a string--I hadn't thought of that.

                     

                    Cheers.

                    • 7. Re: Re: Using Data Blending for Segmentation
                      James Morse

                      Jonathan, Joe, Matthew -- Thanks so much for the contributions here and the detailed workbook examples showing the steps!  I have been able to tackle this problem with my proprietary data set.  I tried both the Primary Group route as well as the aliased Secondary Group.  Each work, but the Secondary Group has the benefit of being dynamic.

                       

                      One observation / limitation that I noticed with the aliased Secondary Group is that you are unable to change the sort order of the members between the "Yes" and the NULL/"No".

                      • 8. Re: Re: Using Data Blending for Segmentation
                        Matt Lutton

                        That is interesting.  Thanks for pointing that out, as well!

                        • 9. Re: Using Data Blending for Segmentation
                          Jonathan Drummey

                          That's a limitation on dimensions from secondary sources, the workaround is

                          to set up another dimension that has the right alphanumeric sort and then

                          use that to the left of the blended dimension on Rows or Columns, then turn

                          off Show Headers.

                           

                           

                          On Wed, Jan 8, 2014 at 2:50 PM, James Morse <