In fact (if you can solve your performance issue), joining the Datasources could work well with Tableau 10.0.
This version allows you to Remove Null Values from Filters Based on Secondary Data Sources.
I would say this Quick Star article would be pretty helpful in your case as well: http://onlinehelp.tableau.com/current/pro/desktop/en-us/help.htm#qs_cross_datasource.html
Let me know if it helped!
No, this does not solve the problem. The first link is to remove null values which is the opposite of what I want. I want to INCLUDE values that do not exist in the second data source.
That quick start article does not help either.
Sometimes when people are working with Tableau they want to build a data source that can answer every question, so they are trying to include everything from every table and start running into design and/or performance problems. I suspect that might be the case here. What you're asking for is a specific type of analysis and that could very well require a specifically built data source to answer those questions.
Also I'm presuming here that you're on a live connection and the tables and database have been evaluated for performance and things like indexes and partitioning have been configured as necessary. If they haven't then please look at that, I've gotten 10x performance improvements simply by handing my DBAs the query Tableau issued (from the logs or Performance Analyzer) and then the DBAs doing their magic.
Anything like a cross database filter or a join is going to be run every time Tableau refreshes the data and given 8M customers and probably a much larger version of orders that means that Tableau has to get a distinct list of the customers from the orders table and then join that into the customers, so having some performance issues is not too surprising.
Since this analysis explicitly doesn't need orders I'd wonder why bother bringing them into Tableau at all?
I've got a few suggestions:
1) In Tableau create a Custom SQL that is a SELECT CustomerID FROM Orders GROUP BY CustomerID and then use that as the right-side of a join to your Customers table, and add a data source filter for Orders.CustomerID is Null.
1a) If that isn't fast enough then create a Tableau data extract. That should be very fast.
2) Create a database view or stored procedure that uses the SELECT CustomerID FROM Orders GROUP BY CustomerID and use that as the right-side table of a join to your Customers table, and add data source filter for view.CustomerID Is Null.
2a) If that isn't fast enough create a Tableau data extract.
3) Create a database view or stored procedure that does all of #2.
3a) If that isn't fast enough create a Tableau data extract.
Perhaps this is not something I will be able to do in Tableau. The reason I am bringing in the orders is because the end user has requested functionality to choose which item (using the Superstore example) and use certain filters to bring them to a smaller customer list that they can target. This would be fine if they were always targeting the same item but this is not the case.
"I want to find everyone who hasn't purchased a stapler and has the last name of Thomas." (I know this is a super silly example!)
"I now want to start over and find everyone who is part of a 500+ company, hasn't purchased furniture, but also is right handed."
I am not doing a live query at the moment. I tried a half version of 1-3a. I made an extract of a two column table in custom SQL but did not apply the data source filter of nulls.
What I'm aiming to do next is to all of the work in the two column table to get all the customers who did not purchase and list them for every item. It won't be pretty but it'll accomplish the functionality the end user is looking for until I can best figure out how to proceed from there.
1 of 1 people found this helpful
You may want to try using Filter Actions (with a twist).
For that you have to have Customers on Details of the Source view,
and build the initial Filter Action by simply filter those Customers in the Target,
but then go edit the Action Filter on the Target to Exclude those Customers.
With that in place, your users are simply filtering
whatever Customers they want to Exclude
and finally Click a Dummy pill Header to Filter the Target.
Please find the attached as an example.
Not sure about the actual performance in your case.
What I ended up doing was creating a second data source with a cross database join on all the customers and all the product types with 1,0 flags so the filter was always an include option. The exclude filter was taking far too long to render. It's pretty fast with 30+m rows.
Glad you've got it working,
one way or another.