9 Replies Latest reply on Feb 15, 2014 6:30 AM by Russell Christopher

    Distributed Install

    Tableau Person

      Is there an official step-by-step guide to doing a distributed install of Tableau Server, either with every component on a different machine or with components grouped and separated by functionality?

       

      Also, is there a guide to performing benchmark and load testing?

        • 1. Re: Distributed Install
          Aman Lamba

          I think this link might be able to help you

           

          Administrator Guide

           

          To record performance in Tableau, open Tableau Desktop and go to Help -> Start performance recording!

           

          Hope it helps!

          • 2. Re: Distributed Install
            Tableau Person

            I guess there is no other guide with practical cases and best practices. Sure, I could follow this, but it would be great to know which layout performs best in a distributed environment. I have read that having the Data Engine and Repository on separate servers is good practice, along with VizQL on a third and the gateway on a fourth, but should the backgrounder, data server and application server reside together on one machine, or should they be split up as well?

             

            Also are there any recommended BIOS settings for optimal performance?

            • 3. Re: Distributed Install
              Aman Lamba

              I think the Tableau pros should be able to answer this question of yours

               

              Matt Lutton Jonathan Drummey

              • 4. Re: Distributed Install
                Matt Lutton

                Hi Aman,

                 

                I'm certainly not a Tableau pro, and I know nothing about this topic.  Sorry.

                • 5. Re: Re: Distributed Install
                  Jonathan Drummey

                  I'm not familiar with this, sorry. Here's an updated link to the online help:

                   

                  http://onlinehelp.tableausoftware.com/current/server/en-us/help.htm#perf_extracts_view.htm

                  And an older KB article: http://kb.tableausoftware.com/articles/knowledgebase/optimizing-tableau-server-performance

                   

                  Someone who might help is Russell Christopher. However, I can say from being a reader of Tableau performance questions that the original poster hasn't given any detail with which to give specific recommendations.

                   

                  Jonathan

                  • 6. Re: Re: Distributed Install
                    Michael Perillo

                    I have built out a distributed/HA platform based on recommendations from the Server Administrator Guide. There is no one single answer as each Tableau environment will have different circumstances.

                     

                    The Administrator Guide Performance Section lists out some good suggestions to follow that could be interpreted as best-practice.

                     

                    Unless you have existing metrics that tell you where to place workers with certain processes (When to Add Workers & Reconfigure), I would recommend starting basic and monitoring your environment from there. You'll want to have monitoring tools to capture server performance. If you do not have third-party tools doing this, you can follow the steps outlined in Monitoring Tableau Server Performance.

                     

                    As you start to understand how your users interact and use the system for extracts, visualizations and such, you'll be able to reconfigure processes to Improve Server Performance.

                     

                    Some other resources to consider:

                    • 7. Re: Re: Distributed Install
                      Russell Christopher

                      Generally, the processes that will take up the most resources are the backgrounders (when refreshing extracts), data engines (when reading extracts), and VizQL processes.

                       

                      Where possible, keep the backgrounders segregated on a distinct machine/VM (unless your refreshes happen when no users are on the system, in which case "who cares").

                       

                      Segregating data engines can be useful if you do a lot of extract reading.

                       

                      If you are dealing with 8 cores or fewer, don't split them up just because you can. By splitting things up in an effort to "tune ahead", you could unintentionally starve your VizQL processes of CPU that they could otherwise use all the time.

                       

                      Since Tableau makes it pretty fast/easy to move components around, most folks just go to production with the basic configuration. Then they begin experimenting in an iterative fashion when/if necessary.

                      • 8. Re: Distributed Install
                        Tableau Person

                        Thank you Russell.

                         

                        Your answer makes sense and gives me the insight I needed beyond the Administrator Guide and Online Documentation around this topic. Should you have further hard-knocks experience, please feel free to share. I would be particularly interested in learning about a hardware whitelist that works well with Tableau, if there is such a list, so I can plan and configure a large deployment accordingly across geographical regions, all managed from a centralized point of origin for ease of use.

                        • 9. Re: Distributed Install
                          Russell Christopher

                          Tableau Server was engineered to run well on "low-end", commodity hardware. There generally is no need to worry about specific "brand names and models" of hardware. Just give us CPU, Disk, RAM, and we are happy.

                           

                          That being said:

                           

                          • The better the CPU you can give us (Xeon better than i3, etc.), the better we'll perform for you
                          • RAM is cheap, give us more if you can (32 GB vs 16 GB - especially if running the 64-bit version)
                          • If virtualized, make sure your VM solution isn't throttling disk IO. If your VM solution gives us poor disk access, then we'll perform poorly, just like any other solution would perform poorly: SQL Server, Oracle, etc.
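
One rough way to sanity-check the disk IO point above is to time a forced-to-disk write on the VM and compare it with the same run on a machine you trust. The sketch below is only an illustration using the Python standard library. The function name and the default sizes are made up for this example, and it is not a Tableau tool or a substitute for a real IO benchmark.

```python
import os
import tempfile
import time


def write_throughput_mb_s(total_mb=64, chunk_mb=4, directory=None):
    """Write roughly total_mb of data to a temp file and return MB/s.

    The sizes are arbitrary illustration values, not recommendations.
    """
    chunk = os.urandom(chunk_mb * 1024 * 1024)
    chunks = max(1, total_mb // chunk_mb)
    with tempfile.NamedTemporaryFile(dir=directory) as f:
        start = time.perf_counter()
        for _ in range(chunks):
            f.write(chunk)
        f.flush()
        # fsync pushes the data to the disk so the OS cache
        # does not hide a slow or throttled volume.
        os.fsync(f.fileno())
        elapsed = time.perf_counter() - start
    return (chunks * chunk_mb) / elapsed


if __name__ == "__main__":
    print(f"sequential write: {write_throughput_mb_s():.1f} MB/s")
```

A single number from a test like this means little on its own. It is the comparison between the VM and known-good hardware, repeated a few times, that hints at throttling.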

                           

                          You'll need to test a distributed, geographically dispersed deployment very carefully. In general, most people will avoid this approach. Here's why:

                           

                          Scenario:

                           

                          3 data centers:

                           

                          • New York: Primary Tableau Server
                          • New York: Worker 1 (Data Engine #1, Backgrounders, App Servers)
                          • New York: Worker 2 (VizQLs)
                          • Pune: Worker 3 (Data Engine #2, VizQLs, App Servers)
                          • London: Worker 4 (VizQLs)

                           

                          (I randomly put resources on the machines above - don't try and read anything into WHERE components are located)

                           

                          When a user in the London office logs into Tableau Server and requests a report, there is no way to FORCE the report to be rendered by a "local" VizQL in London. The report could be rendered in NY, Pune, or London...and when the report is rendered in any one of those 3 places, we don't know which data engine will be used (NY or Pune).

                           

                          So there is quite a bit of stuff flying around to different locations, and if connectivity between data centers isn't really good and fast, you could see poor performance as a result of high latency between the components working together in different places. It is also typical for these data centers and offices to be on different subnets, which makes for more problems and more latency. And where is your data source? Do you really want a user in Pune having to connect to a SQL Server or Vertica box or Redshift instance in the US? Slow!
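
To make the scenario above concrete, here is a small sketch that enumerates every VizQL / data engine combination a request could land on and counts how many cross-data-center hops each one involves. The site lists come from the example placement above. The function names are invented for this illustration, and the "fully local" fraction assumes every combination is equally likely, which the thread does not claim (it only says the choice cannot be forced).

```python
from itertools import product

# Placement copied from the example scenario (not a recommendation).
VIZQL_SITES = ["New York", "Pune", "London"]   # Workers 2, 3 and 4
DATA_ENGINE_SITES = ["New York", "Pune"]       # Workers 1 and 3


def request_paths(user_site):
    """List (vizql_site, engine_site, cross_site_hops) for every combination."""
    paths = []
    for vizql, engine in product(VIZQL_SITES, DATA_ENGINE_SITES):
        # One hop if the user and VizQL differ, one more if VizQL and engine differ.
        hops = int(user_site != vizql) + int(vizql != engine)
        paths.append((vizql, engine, hops))
    return paths


def fully_local_fraction(user_site):
    """Share of combinations with no cross-site hop (assumes equal likelihood)."""
    paths = request_paths(user_site)
    return sum(1 for _, _, hops in paths if hops == 0) / len(paths)


if __name__ == "__main__":
    for vizql, engine, hops in request_paths("London"):
        print(f"VizQL in {vizql}, data engine in {engine}: {hops} cross-site hop(s)")
    print("fully local fraction for London:", fully_local_fraction("London"))
```

In this layout a London user never gets a fully local path, because London has no data engine, and a user in New York or Pune gets one in only one of the six combinations. The database the workbook connects to would add a further hop that this sketch does not model.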

                           

                          You should talk to your sales consultant or professional services guys, but we often recommend that one just drop in 3 distinct deployments in a scenario like the one above...then keep them in sync with some sort of low-tech mechanism.