You can try the following steps to help diagnose which part(s) of the data extraction process take longest:
1. From Tableau Desktop, choose Help > Start Performance Recording.
2. Right-click the data connection you have already set up and choose Extract > Refresh.
3. Wait the 45 minutes or so for the extract to refresh.
4. Choose Help > Stop Performance Recording.
5. A new window will open with your performance recording, showing a breakdown of your data extraction process.
You haven't provided enough information. Are you talking about Tableau Server or Desktop? What version? Are you building an extract or using a live connection, or something else? How is the data being pulled: custom SQL, multiple tables, separate data sources? We need details.
Tableau IS a big data tool, but the person building the report(s) affects how fast it performs.
Apparently many people are having this problem, specifically with large data sources. When I use the performance recorder, all it does is keep executing the query for 900 seconds and then terminate it with the message 'Operation Timed Out'.
In my case, I am using Tableau Desktop 9.3 connected to a data source published on Tableau Server 9.3. The data source is a live connection, not an extract, and is simply a join of 1 fact table and about 9 dimension tables. Data traffic in the database is about 500,000 rows per day.
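To make the setup concrete, here is a minimal sketch (Python with sqlite3, and hypothetical table and column names, not the actual schema) of the kind of star-schema join such a live connection re-runs on every viz refresh:

```python
import sqlite3

# Stand-in for a fact table joined to dimension tables, as in the live
# data source described above. All names here are hypothetical.
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
CREATE TABLE fact_sales (sale_date TEXT, product_id INTEGER,
                         region_id INTEGER, amount REAL);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_group TEXT);
CREATE TABLE dim_region  (region_id INTEGER PRIMARY KEY, region_name TEXT);
INSERT INTO fact_sales VALUES
  ('2016-05-01', 1, 1, 100.0),
  ('2016-05-01', 2, 1, 250.0),
  ('2016-05-02', 1, 2,  75.0);
INSERT INTO dim_product VALUES (1, 'Widgets'), (2, 'Gadgets');
INSERT INTO dim_region  VALUES (1, 'East'), (2, 'West');
""")

# With a live connection, a join like this hits the database on every
# refresh; at ~500,000 new fact rows per day, it grows heavier over time.
rows = cur.execute("""
    SELECT p.product_group, r.region_name, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product p ON f.product_id = p.product_id
    JOIN dim_region  r ON f.region_id  = r.region_id
    GROUP BY p.product_group, r.region_name
    ORDER BY p.product_group, r.region_name
""").fetchall()
print(rows)
```

The same query shape applies with 9 dimension tables instead of 2; each added join gives the database optimizer more work on every refresh.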
For those who don't have Tableau 9.3, I am also attaching an image. If I click on either of the bars, I do not get any query.
Any help is highly appreciated.
PerformanceRecording.twb 78.2 KB
There is a common misconception about BI tools, Tableau included, that "<name of BI tool> is taking a long time to run a query". BI tools do not run queries; databases do. ALL BI tools send a query (SQL or otherwise) to a data source (relational, cube, etc.) and wait for a response. When you see an "operation timed out" message, it is almost always the database closing the connection, not Tableau.
That said, I have seen Tableau build TEMP tables in certain data sources in order to satisfy the query (MSTR does this on the fly as well). This behavior is very specific to 1) the query itself, 2) the structure of the data source (tables, join types, etc.), and 3) the brand of database (Teradata, Oracle, etc.), and it should not be the first place you look. In my case, it took Tableau Support examining the log files to see that temp tables were being built. The query was then restructured to eliminate the need for those temp tables, and all was well.
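As an illustration of that kind of restructuring (a toy sqlite sketch, not the actual query from the support case): a correlated subquery, which some databases satisfy by spooling intermediate temp results, can often be rewritten as a single-pass GROUP BY that returns the same answer.

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO orders VALUES (1, 10, 5.0), (2, 10, 7.0), (3, 20, 3.0);
""")

# Generated-style query: a correlated subquery evaluated per row; on some
# databases the optimizer satisfies this by building an intermediate
# (temp) result behind the scenes.
generated = cur.execute("""
    SELECT o.customer_id, o.amount
    FROM orders o
    WHERE o.amount = (SELECT MAX(amount) FROM orders i
                      WHERE i.customer_id = o.customer_id)
    ORDER BY o.customer_id
""").fetchall()

# Restructured query: one pass with GROUP BY, no per-row subquery.
restructured = cur.execute("""
    SELECT customer_id, MAX(amount)
    FROM orders
    GROUP BY customer_id
    ORDER BY customer_id
""").fetchall()

print(generated == restructured)  # same answer, simpler plan
```

This is the general idea behind "restructure the query to eliminate the temp tables": keep the result identical while removing the construct that forces intermediate materialization.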
As for BA S's comment about Tableau and Big Data, it's all about the database. If the database can supply Tableau with a bazillion rows of data, Tableau will handle it just fine. The problem is that the database server that holds those bazillion rows is choking and can't supply Tableau with the data it asked for. However, the user is using <BI tool of choice> as the UI, so naturally, it looks like the BI tool's fault. No BI tool has an error message that says "sorry, your puny database server can't handle your request so there's nothing we can do".
Some database types, namely flavors of Hadoop, are notoriously slow at querying, even with various query accelerators. You will simply never get the query performance that you would get from the likes of Oracle, SQL Server, Teradata, HANA, etc. Timeouts for large query result sets are very common. That's why Tableau recommends creating an extract from Hadoop sources rather than using a live connection.
When diagnosing such symptoms, always start small and work up until it breaks. E.g., start by restricting the data source (often with a data source filter) so that it produces a small query result set, then create your viz and note the query time(s). Gradually relax the filter until you are querying all the data (and getting the timeout error). Then repeat these tests, but create an extract instead. The extract creation step will eventually hit a breaking point as well (because it queries the same database), but you will find very good viz query performance with the previously successful extracts. Now compare query times of the smaller data sets against the smaller extracts. If Tableau were having issues querying large amounts of data, you would see poor performance with both live and extract queries. However, you will not; extracts are faster. This proves that it is the database that is running the query (and timing out), not Tableau. I.e., how could Tableau time out on 500,000 rows of live data, yet run like a champ on 500,000 rows in an extract? Because it is the database that succeeds or fails to run the query, not Tableau (or any other BI tool, for that matter).
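The start-small loop can be sketched as a simple timing harness (Python/sqlite3 stand-in with hypothetical table and column names; a real test would run the same viz query against your own database while widening your own filter):

```python
import sqlite3, time

# Build a stand-in fact table; in practice you would connect to the real
# database and reuse the query Tableau generates for the viz.
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE fact (day INTEGER, amount REAL)")
cur.executemany("INSERT INTO fact VALUES (?, ?)",
                [(d % 30, float(d)) for d in range(100_000)])
con.commit()

# Widen the data source filter step by step and note each query time.
for days in (1, 7, 30):  # e.g. one day, one week, the full month
    t0 = time.perf_counter()
    cur.execute("SELECT day, SUM(amount) FROM fact WHERE day < ? GROUP BY day",
                (days,)).fetchall()
    print(f"{days:2d} days: {time.perf_counter() - t0:.4f}s")
# Whichever step first slows down (or times out) tells you the data volume
# at which the database, not the BI tool, stops keeping up.
```

Running the same loop twice, once against the live connection and once against an extract built from the same rows, gives you the live-vs-extract comparison described above.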
As for your Perf Recording workbook, you'll need to save it as .TWBX format. The one you attached has no data, as it was saved as .TWB format.
Hi Amit. I am using Tableau Desktop/Server 10.5 with Hyper, and it has taken 16 hours to run the simplest of queries on its way to creating an extract. This is simply unacceptable. Tableau needs to do something about it soon or they are going to start losing their edge.
I have found the same slow performance problem mentioned by others in the post.
You are partly correct, but Tableau is not totally blameless.
Yes, the database executes the query. But if the BI tool generates an inefficient query, it will take much longer than a well-written one.
Tableau creates the queries and rewrites them based on the filters, and that is often where the problem starts. I've seen queries that are more complex than necessary, and that often ignore filters that should have been applied.
The user has limited control over how the queries are rewritten.
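The cost of an ignored filter shows up directly in the query plan. A small sqlite sketch (hypothetical table and index names) contrasts a distinct-values query without the filter, which must scan the whole table, against the same query with the filter pushed down, which only searches an index range:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE status_log (event_date TEXT, category TEXT)")
cur.execute("CREATE INDEX idx_date ON status_log (event_date)")

def plan(sql):
    # The first row of EXPLAIN QUERY PLAN output describes the table access.
    return cur.execute("EXPLAIN QUERY PLAN " + sql).fetchone()[-1]

# Distinct list built while ignoring the existing date filter: full scan.
scan_plan = plan("SELECT DISTINCT category FROM status_log")
# The same list with the filter applied: index range search only.
pushdown_plan = plan("SELECT DISTINCT category FROM status_log "
                     "WHERE event_date >= '2018-01-01'")
print(scan_plan)      # a SCAN of the whole table
print(pushdown_plan)  # a SEARCH using idx_date
```

On a table with >100 million rows, the difference between those two plans is exactly the difference between a quick filter that populates instantly and one that triggers a full table scan.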
Here is my scenario, and I have had this problem many times on similar setups. I have a single table in the datasource and 10 quick filters on the sheet.
In the datasource, I have applied a filter to limit the date and a few other items. The resulting dataset in the datasource has 1 million records.
The filtered data after applying the quick filter values has 720 records, which are then grouped and aggregated.
I can execute the exact same aggregation directly in the database (group by the date field, sum 8 measures), and the query executes in 1 second.
1. Through Tableau, the same query takes quite a few minutes.
2. When adding an extra filter to the datasource, the other existing filters are completely ignored. Tableau queries the entire table (>100 million records) to populate a distinct list, instead of applying the existing datasource filters and only checking the 1 million record subset.
3. Queries for quick filters ignore the predefined normal filter ranges and the other quick filter values, even when "relevant values only" is set. This results in a full table scan to determine distinct values, instead of, e.g., only the distinct values for this month.
4. I've seen instances where Tableau populates a normal filter with two values, yet takes nearly a minute to return the list. There are no active processes in the database for the last 50 seconds, so the database is not the cause. I have no idea what Tableau is doing for that time. The db is local, so it's not network latency either.
5. Fields are cast to types in the WHERE conditions instead of being used natively. This can cause index suppression and slow the query execution, for example CAST("status_summary"."product_group" AS TEXT) = 'All' instead of simply "status_summary"."product_group" = 'All'. Tableau knows the data type is text, so why cast it again?
6. With an extract in Tableau 10.5, I found Tableau ran multiple queries simultaneously, one for the data and one for each quick filter.
7. Looking at the database processes when opening the report, a full table query like the one in point 2 is run against the entire table for each filter, ignoring the datasource filters. It should at least apply the other filter conditions as well. This problem occurs both with live queries and when refreshing an extract. To force Tableau to hit only the subset, I have to manually create a custom SQL query with the filters hard-coded into it. This could certainly be improved.
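Point 5 is easy to reproduce in most databases. A small sqlite sketch (the table name echoes the example in point 5, but the schema is hypothetical) shows the redundant CAST suppressing the index:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute('CREATE TABLE status_summary (product_group TEXT)')
cur.execute('CREATE INDEX idx_pg ON status_summary (product_group)')

def plan(sql):
    # The first row of EXPLAIN QUERY PLAN output describes the table access.
    return cur.execute("EXPLAIN QUERY PLAN " + sql).fetchone()[-1]

# The redundant CAST: wrapping the column in an expression hides it from
# the optimizer, so the index on product_group cannot be used.
cast_plan = plan("SELECT * FROM status_summary "
                 "WHERE CAST(\"product_group\" AS TEXT) = 'All'")
# The same predicate written natively is satisfied through the index.
native_plan = plan("SELECT * FROM status_summary "
                   "WHERE \"product_group\" = 'All'")
print(cast_plan)    # a SCAN: the index is suppressed
print(native_plan)  # a SEARCH using idx_pg
```

The exact plan wording varies by database, but the pattern is general: an indexed column wrapped in a function or cast usually loses its index, which is why the generated CAST matters at scale.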
There have been improvements in the past few years. Hopefully Tableau can address these issues as well.