Thanks for your help on this. I really have to slowly go through the answer to ensure that I understand it fully.
I was interested in your point "outside of doing a bunch of work in the query to pre-calculate stuff". I have been looking at connecting data with Tableau and it appears to me that I have to do more calculation work in my database before sending the tables to Tableau. Looking at the following blog that this is sub-optimal and you should look to do as much calculations in Tableau rather than in your database. I think this undermines the value of Tableau but is this the best way to proceed.
1 of 1 people found this helpful
The answer to what is best for you is, "it varies". Here are some factors:
- complexity of query in the database
- # of rows returned from database to Tableau (along with speed of network between database server and Tableau, etc.)
- are the columns that the query is using indexed, is the query using a database view, is it fully optimized for that data source, etc.?
- when a multiple tables connection is used, are all the foreign keys properly set up so Tableau can do join culling and not query all of the tables when necessary?
- if a Tableau data extract is in use, then the Custom SQL vs. multiple table questions largely go away because the only time they impact Tableau is when the extract is created/updated.
The specific situation that Russell Christopher was describing in the blog you referenced was using Custom SQL vs. a multiple tables connection. While multiple tables will be faster, an alternative to Custom SQL is to put the query into your data source as a database view, so the query can be optimized/materialized as much as possible by the database. In practice, I pretty much never use Custom SQL, beyond what's already been mentioned a couple of others are that we almost exclusively use extracts, so multiple tables vs. Custom SQL doesn't really matter, and that for maintenance purposes I prefer to do data manipulation in as few places as possible, so I keep it to the database query and Tableau calculated fields.
The "do a bunch of work in the query to pre-calculate stuff" I mentioned relates to the question of "where do we want to do what level of aggregation?" We can pre-aggregate in the data source (either by queries/views, Custom SQL, or aggregating an extract), and then do aggregates of those aggregates in Tableau via SUM, MIN, etc, and use table calcs beyond that, or return the detail level data to Tableau, then aggregate that via SUM, MIN, etc. and do further aggregates of aggregates via table calcs, and nest table calcs as necessary. Usually what I'll do is set up a data source to give me the level of detail that I need with as little aggregation as possible, and do the rest in Tableau. The only things that have pushed me to do more aggregation in the data source are when I can't get the responsiveness/performance that I need out of Tableau on a given dashboard (and I recognize that I'm torturing Tableau with some of my dashboards), and one case where I couldn't do multiple levels of ranking and aggregate ranking (I'm hoping that 8.1 will solve that, since 8.1 has new rank and percentile functions).
Thanks for your insights. I find that its great to get best practice information from people with experience like yourself.
It would appear that setting your data up correctly before you bring it into Tableau is where a major strength in the software lies and I think that some more on demand webinars on this subject would be really helpful.
When you say "what I'll do is set up a data source to give me the level of detail that I need with as little aggregation as possible" are views in databases the best way forward?
Please accept my apologies if the answer is "it varies".
I'm not sure you'll have to do as much query-building as you might think... For me, the use cases for queries/views before Tableau cover the following areas:
- Unions to unpivot data - basic unpivot to make data tall vs. wide, or unpivoting for measure names & values to do more advanced formatting
- Cartesian join/cross products where we need to pad out the data, for example for census/throughput analyses where we need a row for every date combination.
- WHERE clauses with join criteria - useful to make those unions & cross products not return too many rows.
- Doing pre-processing before join, for example to massage some textinto a real date field that we can then use to join on some other table.
- Once in a while there's a need for an aggregate subquery prior to joins.
- Flattening a star schema when a given Tableau data source will be used for multiple workbooks, to save time in building the connection - note this case can be met by using a Tableau Saved Data Source instead of a query, I just got into the (probably bad) habit of using queries when I started using Tableau.
- Some combination of the above.
As to your question about the "best way forward", my answer to that one is "I don't know." There are too many variables between database, environment, the data itself, how the data is used in the views, user profiles, etc. to be able to specify a single best practice. For example, I'll live with all sorts of inconsistencies and hiccups when I have a view that is just for me for exploratory analysis, whereas when the data is going onto a dashboard for new users then I'll put a lot more effort into tuning Tableau and the data source, making sure names and aliases are cleaned up, all displayed numbers have the right units, etc. What I'm building now are some data sources to be used by Tableau Server users via web authoring, so there I'm having to think more about setting default aggregation levels, what fields are truly necessary for users to build their analyses, etc.
Does that help?
That is a great help.
Being a new user I think you expect to get a silver bullet answer for every question whereas making a start and then realising that there may be a better way to proceed after trial and error is probably an effective way forward.
I do believe that the SQL Views will be very useful, I had not been aware of that initially.
Great, and good luck building your views!