If you can get what you want with a single table or multiple table connection then it is certainly better to do that than to use custom SQL. With those types of connections Tableau understands the structure of the data, so can construct it's queries to be as efficient as possible. With a custom SQL connection, on the other hand, Tableau treats the entire custom SQL connection as an embedded query and wraps all the clauses it needs to filter or group or aggregate the results around the outside. So Tableau is much more at the mercy of the behaviour of the underlying database's query optimiser to determine the performance of the queries.
But having said that, there are lots of things that you can do with custom SQL that you can't achieve with the other connection types, and in many cases the performance is just fine. It all depends on the custom SQL statement and how the resulting queries are processed by the underlying database. In some cases a custom SQL connection will work absolutely fine with one database but be completely unworkable with a different database. In cases like this it's good to have a friendly DBA to help unless you have very strong SQL knowledge.
I am using Custom SQL and extracting data. Does this reduce the performance of dashboard compared to extracting data from single table directly?
I already loaded the data into packaged workbook, does the mode of extraction impacts the performance of dashboard? If so can you please explain it.
1 of 1 people found this helpful
If you are extracting the data, it will not impact the performance of the dashboard. Once extracted, it makes no difference to Tableau how the data was stored or processed previously. It only makes a difference when you come to refresh the extract.
Tableau gives a warning message to review the query, you can ignore the prompt if the query is select statement. If any SQL statement or custom SQL query includes an update query, you might want to revise that syntax to be sure nothing unexpected can occur.
Review the below link for more information.
As per my understanding, tableau validates Custom SQL query even though I extract data. Considering that as fact, tableau takes more load when loading the dashboard for the first time. Correct me if I am wrong.
Please let me know if avoiding Custom SQL queries reduces the loading time of dashboard when we open it for first time?
1 of 1 people found this helpful
I'm sure that Ben's earlier response was absolutely correct - having extracted the data it doesn't matter how the extract was created, it won't be going back to the database.
I agree that the custom SQL warning is confusing in this case.
If you want to be absolutely convinced, you could unplug your PC from the network before you open the workbook (assuming the extract is on a local hard drive). I don't think you will see any difference in response time.
By the way, apologies for not replying to your earlier question. Unfortunately an upgrade of the forums software a few years ago lost all of the links for who should get notified about posts on a thread, so I don't get notified about many of my old threads these days. Pinging me as you did today obviously still works. :-)
Thanks Richard. I have gotten the answer for the question
Really depends on the backend and how good it's optimizer is.
- Tableau in some cases is able to optimize queries against multi-table relations and not query all the tables (i.e. SELECT A.C1 FROM A instead of SELECT A.C1 FROM A JOIN B JOIN C)
- Tableau queries only columns that are needed for the viz, i.e. query "SELECT A.C1, SUM(B.C15) FROM A JOIN B" when wrapped in Custom SQL might look like this: SELECT CustomSQL.C1, SUM(CustomSQL.C15) FROM (SELECT A.C1, A.C2, A.C3, A.C4, A.C5, A.C6, B.C11, B.C12, B.C13, B.C14, B.C15 A JOIN B) "CustomSQL" - if database is not smart enough it will actually compute all the column that will be thrown away in outer select.
Main issue with Custom SQL comes from it's flexibility. It's very tempting to start writing complex joins, expensive calculations and unreadable logic which will be executed every time Tableau has to issue a query - regardless how much data is actually requested.
Hi Richard Leeke ,
Sorry to bother you again. Though I am convinced with your explanation, I would like to know if tableau has internet access, does it connect to database and validate the query? In case of a tableau dashboard published on tableau server with embedded credentials of database, tableau has the access to database. In this case, does Tableau try to connect to database and validate SQL? which may consume extra time to load for every user opening the dashboard.
Please explain me this concept of checking the SQL in spite of extracting data. I am trying to understand if I need to create a dedicated view for dashboard to avoid extra loading time on tableau server.
I think that Tableau choose to warn about the custom SQL even when you have an extract so that you are informed of what it does in case you choose to refresh your extract. But I'm just guessing.
I have just created and published a packaged workbook with an extract derived from a custom SQL connection while watching all SQL activity on the database with a SQL profiling tool. I am now absolutely 100% sure that it doesn't access the D/B after creation of the extract (until you refresh it). No database access when you open the workbook in Desktop, publish or access views on server.
2 of 2 people found this helpful
I know this is an old thread, but I'm going to chime in here because it came up while I was looking for something completely unrelated.
Using custom SQL does not by definition improve or deteriorate the speed of the load. How fast it works is ENTIRELY a function of the quality of the SQL you write. When you use Tableau's interface to import data, it's simply creating a SQL statement to send off to the source server to execute. It tends to do this in the most straight forward way possible, and in some cases that is efficient and in some cases it is not.
I can almost always get better performance by writing my own SQL, but that is based on deep understanding of SQL and knowledge about the databases I'm hitting up. There is no "always" in the answer to this question.
What I can say is that anyone who is not reasonably advanced at SQL can accidentally write queries that do not perform, as there are all kinds of "newbie mistakes" that one can make. In cases of live connections, this can be devastating to the performance of a dashboard, or in cases of extracts - load time. When you need to write higher performing queries, it's impossible to manipulate the SQL syntax using Tableau's GUI, and that's where you have the need to write your own. Similarly, understanding how Tableau writes SQL means that often times I can write a query that out-performs the GUI-generated SQL that Tableau sends off.
Hey Justin Larson
Thanks for chiming in! this helps.
Even after reading all answers on this post, I am still unclear about one bit. From I have understood so far - that even if it is a huge/complicated query (i.e. joining about 10 different tables, with a bunch of sub-queries & case statements, & not a particularly optimized or an advanced query as such) in the custom SQL as long as we have created an extract and scheduled that extract on tableau server, and our the dashboards are running off of that extract, it wouldn't affect the performance/loading of the dashboard on server? Have I understood that correctly ?
The reason why I ask that, is I have a custom sql that's pretty huge, and I do publish the datasource on the tableau server and schedule an extract. and all my reports run off of that extract, I was thinking of moving away from custom SQL and have a materialized view created in my SQL server DB (that's the database source for my tableau reports) and use that materialized_view as my data source as an extract , publish that, and run dashboards off of that ... will that improve performance of loading of dashboards on server? or will it stay the same ? would it even make sense to do that ?
I just don't want tableau to any query processing/ talking to DB etc. on its end, and I am not sure if it is doing that every time it loads the view on server or does it only reach out to my DB only when creating an extract ?
Please do suggest.
2 of 2 people found this helpful
sorry, I literally just got this message.
Yes, you've got it right. Whether your sql is custom sql in Tableau or a materialized view, it will only effect the load time of the extract. Performance of the dashboard would be the same against the extract.
One additional thought that I didn't include in previous comment - if you are not using an extract, and you have, say, 10 tables in your query. If you join all 10 tables in a single custom sql, it will have to execute the whole block, even if your interaction with the canvas only needs a field from one of the tables. Better to break it apart and do the joining in Tableau so that it can cull off the tables that are not necessary to return reliable results. This is only a consideration with non-extracted connections though.