3 Replies Latest reply on Mar 7, 2018 9:46 AM by satish.parvathaneni

    Tableau Extract and Performance

    Venu Tummala

      Tableau 9 and Tableau 10 brought performance enhancements such as parallel queries from dashboards, query fusion, and external query caching. For a data set with a typical record size, after how many million records in a Tableau data extract do you see query response times and dashboard interactivity degrade? Is there a benchmark for the number of records, or a best practice? Knowing this will help us design our data set(s) (for example, creating multiple data sets instead of a single huge data set with a lot of fields).

       

      Thanks

      Venu

        • 1. Re: Tableau Extract and Performance
          Mahfooj Khan

          Hi,

           

          A few questions!

           

          What is the question you want to answer from your dashboard?

          Instead of visualizing every possible figure and combination of variables in the dataset, your job is to give the end-user easy access to the answers to his or her questions - often less is more.

           

          Who is the interactor/viewer of the report?

          As the production and customization of your reports are easier than ever, utilize it! Don't make one master dashboard/workbook to service the entire organization. The combination of all the organization's data sources and all the different questions you need to answer is bound to give you poor performance.

           

          What's possible with the data at hand?

          Take a good look at the data and the data structure. What are your options Tableau-wise with the data at hand? Plan your analysis before you start and make the necessary adjustments to the data, or talk to your DBA and ask him/her to help you out. This way you won't end up down some narrow dark alley where your only way out is extremely complex table calcs or crazy parameter filter combinations.

           

          DON'Ts

          Avoid reproducing old Excel reports in Tableau. Use the Tableau functionalities to reinvent the report and give the end user easier access to insights. Reproducing old Excel reports is often a hassle and can force you into solutions that are less than efficient and tough on performance.

          Don't make Tableau a data extracting tool where your only wish is to make a list or a table of all your records for the past month, quarter or year.

          Don't build Tableau in Tableau. If the end user should have every conceivable option to combine and analyze data with no forethought put into the report, then the end user needs a Tableau Desktop license and not a dashboard with all the Desktop functionality built in. Give him or her the fantastic experience of data discovery with Tableau Desktop. But enough of me hearing myself talk about Tableau philosophy - now to the technical part of improving your workbook performance.

          If you have already considered and implemented the points above and your workbook is still not performing as it should, there are a couple of things you should consider:

           

          Your workbook will never perform better than the underlying database

          Working with big datasets will often mean you are connected to a database and not just Excel files. With a live connection, Tableau does not import the data but queries the database and gets an answer back. If the database is not responding fast enough you'll end up with a slow workbook. Try running a query directly in the database against the same view/table (or ask your database guy to do it) and see if the response there is just as slow. If that's the case, your performance issue should be solved in the database and not in Tableau. The issue could be related to an extensive number of joins, or tables not optimized for joining. The solution could be indexing the tables or creating a new table instead of the view with the underlying joins.
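
          To illustrate the "query the database directly" check above, here is a minimal Python sketch. It is not tied to Tableau in any way: it assumes an in-memory SQLite database standing in for your real source, and the table `sales`, its columns, and the helper `time_query` are all hypothetical names made up for the example. It times the same query before and after adding an index on the filtered column.

```python
import sqlite3
import time


def time_query(conn, sql, params=()):
    """Run a query once and return (rows, elapsed_seconds)."""
    start = time.perf_counter()
    rows = conn.execute(sql, params).fetchall()
    return rows, time.perf_counter() - start


# Hypothetical data: an in-memory SQLite table standing in for the real source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, order_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        (f"region_{i % 50}", f"2017-01-{i % 28 + 1:02d}", float(i))
        for i in range(200_000)
    ],
)

query = "SELECT SUM(amount) FROM sales WHERE region = ?"

# Time the query against the unindexed table.
rows_before, secs_before = time_query(conn, query, ("region_7",))

# Index the column the query filters on, then time the same query again.
conn.execute("CREATE INDEX idx_sales_region ON sales (region)")
rows_after, secs_after = time_query(conn, query, ("region_7",))

print(f"without index: {secs_before:.4f}s, with index: {secs_after:.4f}s")
```

          If the query is slow here too, the workbook is not the problem. The actual timings depend entirely on your database and hardware, so treat the numbers as a comparison only.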

           

           

          Data connection

          If your database is performing but your workbook is not you should take a look at your data connection. A couple of suggestions for better data connections could be:

          If you are not already using an extract, do it! A Tableau Data Extract (.tde) can really improve performance compared to a live connection for many big datasets.

          If possible, always use native connections.

          Limit or avoid the use of custom SQL.

          Are you blending? Data blending in Tableau is very powerful, but can also be a performance killer. Be aware of the level at which you are blending (you want to blend at as high a level as possible) and also of the amount of data. If possible, you should always choose joining over blending.

           

          Data size

          We all love huge amounts of data and Tableau is a great tool to handle it but if you experience performance issues you should consider eliminating unnecessary data. Keep only the data that you need to answer your questions by making partial extracts or by adding a data source filter. Do you really need all 100 columns? And all the records for the past 10 years?
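
          The same idea, trimming columns and history before the data ever reaches the extract, can be sketched outside Tableau as well. The snippet below is only an illustration under assumptions: the CSV content, the column names, and the helper `trim_rows` are hypothetical, and in practice you would more likely do this with an extract filter, a data source filter, or a database view.

```python
import csv
import io
from datetime import date


def trim_rows(reader, keep_columns, min_date, date_column="order_date"):
    """Yield only the needed columns for rows dated on or after min_date."""
    for row in reader:
        if date.fromisoformat(row[date_column]) >= min_date:
            yield {col: row[col] for col in keep_columns}


# Hypothetical source with more columns and history than the dashboard needs.
raw = io.StringIO(
    "order_date,region,amount,internal_note\n"
    "2009-05-01,North,10.0,legacy\n"
    "2017-11-15,South,25.5,ok\n"
    "2018-02-03,North,40.0,ok\n"
)

trimmed = list(
    trim_rows(
        csv.DictReader(raw),
        keep_columns=["order_date", "region", "amount"],
        min_date=date(2017, 1, 1),
    )
)
print(trimmed)
```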

           

          Performance recording

          Tableau has a built-in performance recorder that you can utilize. The performance recording outputs a Tableau report showing you the query time for all your different elements. This is a good start when diagnosing your workbook.

           

          Filters and Parameters

          We all love quick filters, but use them wisely. Don't plaster your dashboard with quick filters - it makes your viz more complicated, it makes it slower and it doesn't look good. As a rule of thumb, just try to keep your visualizations simple and informative and use dashboard actions to improve interaction, insights and performance. The same goes for parameters - use them when needed and not just because you can.

           

          Calculations

          Are you making very complex calculated fields? On huge datasets this could affect your performance. Especially string calculations can drain performance, but also complicated date manipulations and calculations. You should consider making some of your complex calculations in the data layer before the data enters Tableau. Another solution could be to utilize the Tableau Data Extract and the optimize option that "saves" the value of your calculations for future queries.
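
          As a rough sketch of "making the calculation in the data layer", the example below computes a string cleanup and a date truncation once, when a prepared table is built, so that a reporting tool would read plain stored columns instead of recalculating them per query. It assumes an in-memory SQLite database; the tables `orders` and `orders_prepared` and their columns are hypothetical, and your own database will have its own functions for this.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (customer_name TEXT, order_date TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("  alice SMITH ", "2018-01-15", 120.0),
        ("Bob Jones", "2018-02-20", 80.0),
    ],
)

# Compute the string and date calculations once, while building the table,
# instead of repeating them in every query that the dashboard sends.
conn.execute(
    """
    CREATE TABLE orders_prepared AS
    SELECT
        UPPER(TRIM(customer_name)) AS customer_key,
        strftime('%Y-%m', order_date) AS order_month,
        amount
    FROM orders
    """
)

prepared = conn.execute(
    "SELECT customer_key, order_month, amount "
    "FROM orders_prepared ORDER BY customer_key"
).fetchall()
print(prepared)
```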

           

          Layout

          Keep your dashboards simple. Not only for performance but also for the user experience. Stick to a maximum of 4 sheets on your dashboard, keep the number of quick filters and parameters down and use action filters. Consider the complexity of your graphs. Do you really need a scatter plot with 100,000 marks? Does it give the end user valuable insights? Probably not, and it's bad for your performance.

           

          Implementing the points above, or at least keeping them in mind, will hopefully improve performance for your workbooks, but also improve the insights that you deliver to your end users, which of course is the ultimate goal.

          6 Tips to Make Your Dashboards More Performant | Tableau Software

           

          Mahfooj

          • 2. Re: Tableau Extract and Performance
            Venu Tummala

            I heard that we generally see performance degradation with Tableau extract data sets with more than 30 million records.

             

            Venu

            • 3. Re: Tableau Extract and Performance
              satish.parvathaneni

              Hey Venu - I know this thread is pretty old.

               

              I have a basic question. Do you have a process set up and shared internally with your Analytics or BI team on how to create optimized extracts?