5 Replies Latest reply on May 15, 2016 7:16 PM by Jonathan Drummey

    Performance Issues with tabcmd

    Jeffrey Jacob

      Hi All,


      I use tabcmd to print a PDF of the only dashboard in my workbook.  When I first implemented the tabcmd functionality, I was not able to script more than 30 PDFs (1 PDF for each territory - 150 total territories).  After reviewing other discussions, I converted the "Live" Excel data connections to extracts.  This worked like a charm - all 150 PDFs were generated.  The Vizqlserver performed just fine.


      I recently implemented some conditional formatting within one of the worksheets for the dashboard, and it required the use of dual cross-tabs - one to display the measured value, the other to apply a background color for the "cell" if the value was above or below a percentage of the baseline value.  As soon as this was implemented, I noticed the rendering time of the dashboard in Tableau Server increased significantly.  When I attempted to generate the PDFs with the new conditional formatting feature, it would generate approximately 30 files before the process began to fail (and fail consistently).  Previously when this failed, Tableau Server would allow navigation through the different folders but would not render any views.  This time, navigation is still available, as well as rendered views.


      My first attempt to optimize the workbook was to convert all data sources to database tables.  This resulted in minimal gains.  My second attempt involved the conversion of a blended table to a joined table.  This did improve performance.  I am now able to generate 60 PDF files.  This, however, requires too much babysitting.  Once the process fails, subsequent restarts of the script result in 1, 5, maybe 10 more PDF files before it consistently fails again.  If, on the other hand, I restart Tableau Server, it will generate another 60 before it begins to fail.


      Is there anything that I can do/review to further optimize this process?  Ideally, I could set it and forget it - all 150 PDF files would get generated without manual intervention.  My planned work-around is to script a Tableau Server restart (with the appropriate timeout to ensure Tableau Server is back on-line) after every 50 reports.  This, however, is not very elegant (or clean).
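A minimal sketch of that planned work-around, for illustration only. It assumes a prior `tabcmd login`, a hypothetical URL filter named `Territory`, a hypothetical view path, and that `tabadmin restart` is the restart command on this Tableau Server version; the batch size and wait time are placeholders, not tested values.

```python
import subprocess
import time
from urllib.parse import quote


def batches(items, size):
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def export_command(view_path, territory, out_dir="."):
    """Build one tabcmd PDF export; the Territory filter name is hypothetical."""
    url = f"{view_path}?Territory={quote(territory)}"
    return ["tabcmd", "export", url, "--pdf", "-f", f"{out_dir}/{territory}.pdf"]


def plan(territories, view_path, batch_size=50):
    """Return the full command list, with a restart between batches only."""
    groups = list(batches(territories, batch_size))
    commands = []
    for index, group in enumerate(groups):
        for territory in group:
            commands.append(export_command(view_path, territory))
        if index < len(groups) - 1:
            # Assumption: tabadmin is the admin CLI for this server version.
            commands.append(["tabadmin", "restart"])
    return commands


def run(commands, restart_wait=300):
    """Execute the plan, pausing after each restart so the server can come back."""
    for command in commands:
        subprocess.run(command, check=True)
        if command[0] == "tabadmin":
            time.sleep(restart_wait)
```

Building the plan separately from running it makes it possible to print and review the command list before anything touches the server.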


      Please let me know if I am missing something.





        • 1. Re: Performance Issues with tabcmd
          Russell Christopher

          As you've discovered, Tableau isn't a great report bursting engine - just wasn't really built for this work. I suspect you may always be somewhat challenged by this scenario, but it seems you've already identified the main problem as a workbook which is designed in a less-than-efficient manner for the type of work you want to do.


          Can you actually identify which Tableau Server process is failing and what its resource consumption looks like right before the failure? You mentioned TWO crosstabs - how big are these suckers? (How many "marks" does each one render?) Big crosstabs can cause lots of RAM (and CPU for that matter) to be utilized while rendering. Tableau also has a built-in "watchdog" which will actually kill processes which leverage "most" resources on the machine (90% over the course of either 30 or 60 minutes as I recall)...so if you're rendering PDFs for a long period of time, and VizQL is using 90%-ish (?) of the RAM or CPU over the course of that time, Tableau Server ITSELF may be killing the VizQL, assuming that something has gone wrong...


          Cycling Tableau Server is doing a couple things for you:


          • Releasing all the resources held by processes...you said that restarting the process after a failure doesn't do a lot of good...which indicates that you may simply have a process which is using just about all the resources on the machine (what SORT of machine are you running on? How much RAM?)
          • It'll reset the clock on the "watchdog".


          If this is indeed a resource-management issue, you're either going to have to modify your vizzes so that they are less resource intensive to render (crosstabs and tables generally = BAD in this regard), increase RAM on the box, or just do the cycling you're thinking about...
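One way to capture the resource picture asked about above is to sample the Windows `tasklist` output while the PDFs are generating. This is only a sketch: it assumes a Windows host, the process image name `vizqlserver.exe`, and the English-locale memory format (for example `1,532,880 K`); other locales may format the number differently.

```python
import csv
import io
import subprocess


def parse_tasklist(csv_text, image_name="vizqlserver.exe"):
    """Return (pid, memory_kb) pairs for rows whose image name matches."""
    matches = []
    for record in csv.reader(io.StringIO(csv_text)):
        # Expected columns: image name, PID, session name, session#, mem usage.
        if len(record) < 5 or record[0].lower() != image_name.lower():
            continue
        memory_kb = int(record[4].replace(",", "").replace("K", "").strip())
        matches.append((int(record[1]), memory_kb))
    return matches


def snapshot(image_name="vizqlserver.exe"):
    """Run tasklist once (CSV format, no header) and parse the matching rows."""
    output = subprocess.run(
        ["tasklist", "/FO", "CSV", "/NH"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_tasklist(output, image_name)
```

Calling `snapshot()` every minute or so and logging the result with a timestamp would show whether memory climbs steadily until the failures begin.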

          • 2. Re: Performance Issues with tabcmd
            Jonathan Drummey

            In addition to Russell's excellent points (especially the fact that Tableau is not a report bursting engine, as much as some of us might try to make it into one; see http://github.com/tableau/VizAlerts), what version of Tableau are you on?


            The reason that I ask is that when we upgraded Tableau Server from v8.3 to v9.2, the time to render for many of my daily reports (mostly cross tabs, unfortunately) was cut in half. From what I've read, there were a number of performance improvements in v9.


            Another question is what happens if you spread out the load more, like do 10 at a time, wait 1/2 an hour to an hour for Tableau to do whatever housekeeping it needs to, then do 10 more, etc.?



            • 3. Re: Performance Issues with tabcmd
              Jeffrey Jacob

              Hi Jonathan,


              We are using Tableau Server 9.3.  If I include a timeout for any period of time (e.g., 10 minutes, 30 minutes, etc.) after n reports, it behaves similarly to an execution without timeouts - it begins to fail after report #60.  Only a Tableau Server restart allows for successful execution across all 170+ reports.



              • 4. Re: Performance Issues with tabcmd
                Jeffrey Jacob

                Hi Russell,


                I've posted other questions in this forum to ensure I have built the workbook in the most efficient manner, and based on the lack of responses as well as existing solutions posted in the community, I feel I do have the most efficient workbook that can be built within Tableau.


                For example (and to answer your question regarding cross-tabs), I have built the following view:

                Each row is a different measured value.  Each row was built using an "ATTR container".  For those with colored cells, they are the result of using dual axis cross-tabs - one to display the measured value, the other to apply a color based on a calculated field which compares the measured value to a baseline.  Tableau does not do a great job with conditional formatting.  I understand it is not Excel, but this should be a very basic function since it does provide a quick and easy understanding of what the data represents.  Yet, to apply conditional formatting across a column of different measures, "ATTR containers" with dual axis cross-tabs are the best solution I've found in the forums.  Each container has at most 2 marks.


                The implementation of the conditional formatting is what increased cycle time significantly.  Prior to it, I could generate all 170+ reports without a Tableau Server restart.


                Is there a better way to implement?


                Back to your response, I will execute the process without the restarts.  When it begins to fail, I will provide a snapshot of Task Manager with details around all processes, specifically Vizqlserver.


                Thanks for your response.



                • 5. Re: Performance Issues with tabcmd
                  Jonathan Drummey

                  Hi Jeffrey,


                  There are a few directions to go in here:


                  1) Deal with the tabcmd problems. Given that you are on 9.3, I don't have much to suggest except that you might try using trusted tickets and direct URL downloads instead of tabcmd. However, I don't have enough experience to know whether that would work, since what you're seeing is more on the Tableau Server rendering side.
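A rough sketch of what the trusted-ticket route could look like, untested against this setup. It assumes trusted authentication is already configured for the calling machine, that the server returns `-1` when it declines to issue a ticket, and that adding `.pdf` to the view URL triggers a PDF export; the server address, workbook, view, and `Territory` filter names are hypothetical.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def request_ticket(server, username):
    """POST the username to /trusted and return the single-use ticket."""
    data = urlencode({"username": username}).encode("ascii")
    with urlopen(f"{server}/trusted", data=data) as response:
        ticket = response.read().decode("ascii").strip()
    if ticket == "-1":
        raise RuntimeError("Tableau Server declined to issue a trusted ticket")
    return ticket


def pdf_url(server, ticket, workbook, view, filters=None):
    """Build the ticket-redeeming URL for a PDF of one filtered view."""
    url = f"{server}/trusted/{ticket}/views/{quote(workbook)}/{quote(view)}.pdf"
    if filters:
        # quote_via=quote encodes spaces as %20 rather than '+'.
        url += "?" + urlencode(filters, quote_via=quote)
    return url
```

Each ticket can be redeemed only once, so a loop over territories would request a fresh ticket per download.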


                  2) Try to speed up your existing workbook. When you are using the multiple-axis crosstab then there's an overhead for rendering every single axis and Tableau does have limits. I've found that using ATTR() can be 1/2 as fast as using MIN(), MAX(), or AVG(). The reason why is that ATTR() is computed in Tableau and requires a MIN() and MAX() to be computed, so by directly using MIN() or MAX() we cut the number of calcs in Tableau and Tableau doesn't have to do any computation itself.
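To make the ATTR() point concrete, here is an illustrative Python model of the comparison described above. It is not Tableau's implementation, only a sketch of the logic "return the value if MIN and MAX agree, otherwise `*`", next to the single aggregation that MIN() needs.

```python
def attr(values):
    """Model of ATTR(): two aggregations plus a comparison."""
    lowest, highest = min(values), max(values)
    return lowest if lowest == highest else "*"


def min_only(values):
    """Model of MIN(): a single aggregation and nothing else."""
    return min(values)
```

When every row in a cell holds the same value, both functions return the same result, which is why swapping ATTR() for MIN() or MAX() is safe in that situation.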


                  Besides that there are several alternative ways to set things up that could lead to significantly higher performance.


                  a) Precompute more of the values outside of Tableau so there are fewer computations that need to be done in Tableau. Depending on what you are doing there can be order-of-magnitude increases here.


                  b) Use a Tableau data blend where there is a "scaffold" source for the layout that is used as a primary source and then your data as a secondary source, where the scaffold source has the dimensionality necessary to get those 24 rows in the view and a set of calculations to get the desired results. See TDT: Data Scaffolding with Joe Mako for an example where he took a dashboard that took 90+ seconds to render down to <10 seconds.


                  c) Restructure your data to be in a "tall" format that has the dimensionality that you need. Sometimes the Tableau Pivot feature can be of use here, or you can do that pivoting in your source.
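As an illustration of option (c), this sketch reshapes "wide" rows (one column per measure) into a "tall" layout (one row per measure) before the data reaches Tableau. The column names `Territory`, `Sales`, `Calls`, `Measure`, and `Value` are hypothetical.

```python
def to_tall(rows, key_columns, measure_columns):
    """Turn each wide row into one tall row per measure column."""
    tall = []
    for row in rows:
        keys = {column: row[column] for column in key_columns}
        for measure in measure_columns:
            tall.append({**keys, "Measure": measure, "Value": row[measure]})
    return tall


wide = [{"Territory": "North", "Sales": 100, "Calls": 7}]
tall = to_tall(wide, ["Territory"], ["Sales", "Calls"])
```

With the data in this shape, a single `Measure` dimension on Rows could replace a stack of separate per-measure containers.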


                  3) Fundamentally you are trying to use Tableau like it's MS Excel, and Tableau is not MS Excel. It's a different tool with a different set of capabilities and unfortunately it's easy to think that they are more equivalent than they are. These kinds of kitchen-sink-of-metrics cross tabs are common in Excel because they are relatively easy to build in Excel, but they are very often not so good at actually meeting the needs of the business. My suggestion here is to take a step back, work with your users to evaluate the kinds of questions they are trying to answer, and then work to that, taking advantage of Tableau's capabilities.


                  [plug alert] I'm going into more detail on this very topic at the Boston Tableau User Group next week. If you're in the area, the meeting is full but there's a waitlist; you can register at BTUG @ Fenway Tickets, Mon, May 23, 2016 at 4:30 PM | Eventbrite. I'll also be posting the content after the session. [/plug alert]


                  I'll respond on the other thread about the lack of response to your other questions. I think you have had expectations of the forums that are not accurate.