There are a few ways to approach this.
- Work with your DBA to import the Excel data into the same database used for your transaction data (this can be done as an import into a table, or a linked table), then you can set up Tableau with a Multiple Tables connection or a custom view or query where the campaign customers data is left-joined to the transaction data. This would give you a first-class dimension for use in Tableau, and you could do things like create a calculated dimension such as NOT ISNULL([Campaign Customer ID]) to use in your analyses. I find that when I need to repeat an analysis, the extra effort to set this up is worth it because then the work in Tableau goes much more smoothly. In general, if I have the time this my preferred approach.
- Tableau's Primary Groups feature is another way to get at this. Here's the KB article: Creating Primary Groups from a Secondary Source Using Data Blending | Tableau Software. I also set up an example in the attached, where I took a random set of Customer IDs from Superstore Sales to create a Campaign Customer IDs data source. Then I created a calculated field in the Campaign Customer IDs data source to flag those as Yes, then created the primary group. Once I had the IsCampaignCustomer (group) in the Supertore Sales, I edited the group to bucket everything else into the Other, then renamed Other. You can use group as a dimension in your view, for example the "Sales from Primary Group" worksheet shows this. There are some complications with this route: First, you cannot use the group in a calculated field***. Also, depending on the # of Customer IDs you're working with this might be slow due to how Tableau works with ad hoc groups. Finally, the primary group is static - every time you have a new or updated batch of Campaign Customer IDs you'll have to regenerate the Primary Group. The advantages to Primary Groups are that you can do it all in Tableau, can use the group quite easily in your views, and don't need your DBA or have to know table calcs.
*** The workaround you can use in this case is to use a data source that supports Sets in calculated fields (i.e. not a JET source), make a Set on the IsCampaignCustomer (group), and select the "Yes" member for the Set.
- Use a blend and table calculations. I set up an example in the attached, there are a few tricky bits. The first is that we need to create an aggregate calculation in the primary^^^, for the IsCampaignCust (blend) measure I used the formula IIF(NOT ISNULL(MIN([Campaign Customer IDs].[Customer ID])),"Yes","No"). This is an aggregate calculation because all calculations using fields from blended sources have to be aggregates. Second, once the IsCampaignCust (blend) field is in a view, you'll need to click on the pill and uncheck "Ignore in Table Calculations", so table calcs can be partitioned on that measure. Finally, because you'll need the Customer ID in the view for this to work and create a finer-grained level of detail then you need, you'll need to use a table calculation for most measures to reduce the level of detail. I set up a simple one for sales using IF FIRST()==0 THEN WINDOW_SUM(SUM([Sales])) END with a Compute Using of the Customer ID, you can see the "Sales from Blend Workout" and then a cleaned up view in "Sales from Blend". This technique can run into performance problems when there are a lot (more than 10s of thousands or maybe 100s of thousands) of Customer IDs, you can speed that up by doing things like extracting the Excel data. It also requires attention to the view and knowledge of table calcs that not everyone has. If I'm doing a quick & dirty analysis, I'll tend to use this over the Primary Groups because it does have the advantage of being dynamic and I'm comfortable with table calcs. However, if the data source is going to be used by someone less facile than myself I'll do the work for the first step to just include this in the data source.
^^^ Technically we could just use the IsCampaignCustomer dimension I set up in the secondary data source, however when brought into a view in the primary that only has values of "Yes" and Null, and there's no way to alias the Null to something else. (See the Null/Alias problem worksheet). I like to use explanatory text for labels in my views instead of having to explain what Null means, so we go to the longer route.
Hope this helps!
blend in cohort data.twbx 464.0 KB
1 of 1 people found this helpful
I was just reviewing your work in this thread, as usual, to try and pick up some new tricks. Just a quick question -- I was able to right click on Null in the "Null/Alias problem" sheet, and change it to "No" by using Edit Alias. Were you referring to something else?
Building from the Null/Alias worksheet after apply the Alias as Matthew describes, we can use 'Data Blending 2' features that allows the blend to be at one level and the aggregation at another. This is effectively the same operation as the Table Calculations route, but without the complexity of building and without the fragility.
The Customer ID pill is removed from the active shelves and the blue 'broken link' icon in the secondary data source is clicked to make it red again, so the field is being used in the group by clause to both data sources, and used to blend, but Tableau takes care of the aggregating to another level for us automatically.
blend in cohort data jm edit.twbx 468.4 KB
Very nice. Love this thread! I kind-of hate that we mark questions as "correct" so quickly, when often, there are so many more approaches we could learn from! It would be nice to have "Answered, but still looking for more options" or something similar as it makes learning and study so much easier when everything is related to a specific use case (I guess the "Helpful" option could represent this). Similarly, I wish the KB articles would present more than one method for solving a problem, along with details related to when each should be considered, etc.
I've often thought it would be nice to have a "Improve my Viz" area where people can post visualizations and specifically ask for help in improving their work. That would have a similar impact--multiple helpers could provide enhancements, suggestions, etc. and there wouldn't need to be a "Correct" answer.
Anyway, that's all for now. Thanks for posting, Joe!
1 of 1 people found this helpful
Joe - I feel like I should have thought of that option, thanks!
Matthew - You pointed out some "interesting" behavior:
The IsCampaignCustomer field in the original Campaign Customer IDs data source has only one value, "Yes", so when that data source is selected and we are working from the Data window there is no Null value to assign an alias to:
In the blended view (such as the original Null/Alias problem worksheet), when IsCampaignCustomer is in the view, I can click on the IsCampaignCustomer dimension pill, choose Edit Alias... and get that same dialog with only the "Yes" option. This is what I'd tested and always thought, which led to my statement in the earlier post about not being able to set the alias.
However, as you found, if I right-click on the Null value for IsCampaignCustomer, we get a little Edit Alias dialog:
And it works as you said. But if I go back to the pill and chose Edit Alias, it's still only showing the Yes value.
That we can change the alias for Null makes sense in that the data blended IsCampaignCustomer dimension does have the Null value, and Tableau will let us set an Alias for it. However, we don't have access to unset the Alias for the Null value of the blended dimension the way we do with the dialog - for example, if we change the Alias to something else and then "back" to Null, that 2nd Null isn't actually a Null value, but a "Null" string. I set up an example by copying the crosstab and pasting it into Excel for the three options (default, set alias to No, set alias to Null). So to me there's a one-way bit of behavior (once you change the alias away from Null you can't go back again) that ideally shouldn't be happening, but I'm not sure of a user interface solution that wouldn't introduce more complexity, since a single dimension & data source can be used in multiple blends in the same workbook.
Thanks for looking at this, Matthew!
No problem at all--I look at pretty much everything you post... If only it all made sense to me on the first go round...
Good point about editing the alias and it resulting in a string--I hadn't thought of that.
Jonathan, Joe, Matthew -- Thanks so much for the contributions here and the detailed workbook examples showing the steps! I have been able to tackle this problem with my proprietary data set. I tried both the Primary Group route as well as the aliased Secondary Group. Each work, but the Secondary Group has the benefit of being dynamic.
One observation / limitation that I noticed with the aliased Secondary Group is that you are unable to change the sort order of the members between the "Yes" and the NULL/"No".
That is interesting. Thanks for pointing that out, as well!
That's a limitation on dimensions from secondary sources, the workaround is
to set up another dimension that has the right alphanumeric sort and then
use that to the left of the blended dimension on Rows or Columns, then turn
off Show Headers.
On Wed, Jan 8, 2014 at 2:50 PM, James Morse <