I read it. Was very useful. but still have a doubt if data blending still uses inner join or it by default it uses secondary data source as outer join.
You can doubt all you want, but Matt used to be a Zen Master (before going to work for Tableau), Jonathan Drummey (also a Zen Master) has done many conference presentations about data blending, and Bora Beran, Program Manager at Tableau, has also indicated that Blends are in fact modified left joins. If you still think otherwise, you're going to need to post a packaged workbook that proves all these folks are wrong.
2 of 2 people found this helpful
Thanks for the vote of confidence, Shawn!
Data blending is a variation of a left join:
- Data blending occurs after aggregation in each source. Left-joins are at record level.
- Data blending occurs after densification in the primary source. For example, we can use Show Missing Values to pad out a date range and then blend a secondary source onto those padded dates. We'd have to do this in a data source by joining in an additional dates table, using a cross product, or something else.
- Data blending occurs after aliases are assigned in each source. This would require additional joins or calculated fields in a join.
- Data blending does not increase the granularity of the result set the way a left join can. For example when using source with the grain of Category as the primary and blending in a source that has the grain of Category & Sub-Category when we try to bring in the Sub-Category dimension we will get * for every Category that has multiple Sub-Categories. A left join would show all Sub-Categories.
- Finally, one key difference is that data blending can be configured per-worksheet, whereas a left join is defined in the source.
Given the above differences in data blending from a left join, we can approximate an inner join by filtering out Null values from the secondary. We can approximate a right join by "flipping the blend" to make the secondary source the primary and vice versa.