If both tables are in the same data source, you may want to consider the Multiple Tables option when connecting to your data source.
Generally, when using a data blend, the field you are joining on is the level of detail you what to display, but with custom table calculations, you can perform the data blend at one level, and aggregate at another.
Here is a KB article with some details on the differences between a Join and a Data Blend in Tableau:
For performing a join between tables in the same data source see:
I'm having a lot of trouble with data blending myself. I find myself running into the exact same problem Catherine was.
I have two tables that I have previously uses a join on. The reason I want to switch to data blending is because one of the tables is a table of Transactions for all time, the other summaries members transactions in the past year -- we want to be able to set up an incremental extract for the Transactions_all_Time table, and a full refresh extract for the Members_Past_Year table.
(The tables are both in a SQL Server database).
I've attached much a truncated versions of the Tables:
MembershipID, Film, Year
MembershipID, Visits in Year, Language_Code
We want to be able to get the distinct count of MembershipID from Transactions_All_Time (because we often want more then just the past year) and filter on things like Language_Code, Film, and a lot more fields I haven't included.
The Member_Past_Year table has a record for every single MembershipID, Transactions_All_Time has multiple entries for MembershipID's.
Any help or suggestions is most appreciated.
One option is like the attached image for Create Primary Group, the setup is the following:
1. From a new worksheet, select All Time as your data source, and place Membership ID as a discrete dimension pill
2. Change to the year Table data source, and place Language Code as a discrete dimension pill
3. right-click the pill for Language Code and select Create Primary Group
4. rename this Ad Hoc Group to "Language Code"
repeat those steps for every other dimension that you want to have available to filter on that only exists in the Year Table.
Once complete, you no longer need the Year Table, as in the attached.
Another option instead of the Ad Hoc group is a CASE statement because Ad Hoc Groups cannot be referenced by other calculated fields. Also you would need to regenerate this Ad Hoc Group and/or CASE statement for each data refresh. With that in mind, this may not be the optimal method for your situation.
The best option in my opinion is to join this data before Tableau.
When you use a join before Tableau, you are joining your data BEFORE aggregation.
When you use a data blend, you are joining your data AFTER aggregation.
Also a data blend is only a LEFT join.
A data blend cannot filter on dimensions that do not exist in the Primary data source. The "Create Primary Group" operation is bringing the dimension from the secondary into the primary, but it is static, and does not update when your data updates. When you data changes, you need to redo the "Create Primary Group" operation.
Thanks for the workbook and detailed response Joe!
It does look like I'll have to stick with the join, rather then data blending, meaning we can't do incremental extracts. (Our join was a Left join anyway so this wasn't a concern).
We have a good 33 million records and growing in the Transactions (All time) table and a good 3 million members (Year Table). Creating all 7 Primary Groups we want would take a considerable amount of time. The other problem is making a sheet with the 3 million members causes Tableau to display the 'out of memory' error.
I did try the method you presented with a truncated 10,000 transactions table and it does function.
While I think your right and it isn't a solution that is viable for what I am trying to achieve here, I am curious about the CASE statement you mentioned.
The closest I got was a non-functioning (as in it returned the same 'canot aggregate error') statement that looked like the folowing:
CASE ATTR([MembershipID]) WHEN ATTR([YearTable].[MembershipID]) THEN ATTR([YearTable].[Language_Code])
This also only returns a value when you do put the MemberID (linking field) on one of the rows/columns. (This is because of the use of ATTR, which seems to be required).
I mean a case statement that does the same thing as the ad hoc group, removing the need to join/blend. something like:
CASE [MembershipID] WHEN "S1" THEN 1 WHEN "S2" THEN 2 WHEN "S3" THEN 1 WHEN "S4" THEN 1 WHEN "S5" THEN 2 WHEN "S6" THEN 1 WHEN "S7" THEN 2 WHEN "S8" THEN 1 WHEN "S9" THEN 1 WHEN "S10" THEN 1 END
Ah. Thanks for the clarification. Given the situation, definetly not feasible.
I was mistaken in my comment about the CASE statement I supplied just returning 'values' (wich probably wasn't very clear). I had provided Language_Code with an alias and was expecting to see the Alias not the code.
I sure all of this will come in useful in the future.