Make a third sheet with all the depts. use it as the primary data source.
Blend the other two with it.
Thanks for your suggestion Alexander. Sorry I am still new to Tableau, how do you "make a third sheet"? I have left joined Worksheet 1 and Worksheet 2. But not sure how to proceed from there.
You mean something like this? What makes this complicated is the fact that you have records on both tables you want to keep which requires a full outer join, which is not supported either by JET or the new Excel connectors. Getting the full domain via union in SQL is a possible workaround.
Here is my solution that you can download here: | Tableau Public
1.Create an excel sheet with all the Dept names
2.Load it to Tableau as the third data source
3.Use it as the primary data source and blend with 2 other sources
4.Create the chart
5.Add label thru adding reference lines per cell, show value, format label....
I see that you use PO when you left join CustomSQL with Worksheet1 and Worksheet 2. What if my data does not have PO number. My real data would be Department Name and Paid Amount in one table (Worksheet 3) and Responsible Dept (Worksheet 4) and Invoice Amount in another table. The only linkage is by the name. However if I used that to do my left join, I ended up with duplicate entries and my numbers are off. Do you have any idea?
PO_2.twbx 23.1 KB
1 of 1 people found this helpful
It is not a matter of join type. Even if you did INNER JOIN you'd get such replication. There is no way for the query to identify which marketing department entry matches with what marketing department entry so it will join with all occurrences. You need a reliable unique identifier for things you don't want to replicate.
If there is no other way out, you can do some tricks in calculations to avoid skewed results. Some aggregations won't get impacted e.g. MAX, MIN, AVG. For the ones that will get skewed it is possible but a really hacky workaround.
You can add two more Custom SQL pills into the join diagram that tell you how many occurrences of a department came from both tables.
[Worksheet2$].[Responsible Dept] AS [Responsible Dept], COUNT(*) as [cnt]
FROM [Worksheet2$] group by [Worksheet2$].[Responsible Dept]
[Worksheet1$].[Department Name] AS [Department Name], COUNT(*) as [cnt]
FROM [Worksheet1$] group by [Worksheet1$].[Department Name]
If dept A appears 2 times in one table and 2 times in the other, you know you have to divide it by 2 since it will replicate twice.
If it happens 2 times in table source but 1 or fewer in the other, you don't need to divide it by anything to deal with the skewed results since there will be no replication.
So you need a few IF statements before you aggregate.
Or you can pre-aggregate in the query to get rid of replication (which gives you the same effect as blending but without need for creating multiple data sources) but with the down side of having to bake in the aggregation. So it is a trade-off based on what kind of flexibility you want.
1 of 1 people found this helpful
this is the same as my solution above.
Joining is the Better Approach than Blending when two tables located in the same Database & Same Schema, Since
1) If we implement Data Blending, Can't use the "Show Relevant Value option (Cascading Quick Filter), If we use Join then it easier.
2) If we implement Data Blending, We can't apply the Global Filter when different reports from Different Databases, If we use Join then it easier.
3) Blending woks like Left Outer Join. Joining can support all kind joins (Inner, Left, Right,----)