Yeah, this is a really big question. I'll put key stuff in bold throughout.
Rather than trying to figure this out at a "global" level (which can be next to impossible if you have many vizzes which are all of a sudden "slow"), I'd pick one to focus on.
- First, define what is "way slower". How fast was the viz in question running before? (This is where historical baseline numbers are really helpful). A couple of seconds is probably not worth chasing.
- Try to use a Viz which uses an Extract as a data source to eliminate trips out on your network for data. If you can't do this, you'll also need to monitor your database's performance too, which makes life pretty complicated
- Install Tableau Desktop on your Server (Worker1) and run the Viz interactively. How much faster/slower is it? It'll typically be a little faster in Desktop, but not "way faster".
- While logged on at the console of Tableau Server (Worker 1), hit localhost and see if the viz runs any faster or slower. The goal here is to rule out the network as a cause of the behavior. This'll be difficult for you since you have a multi-node setup and therefore will almost NEED to go out over the network.
- Does your workbook use a Data Server data source? Kill it and try an embedded data source instead to see if it makes a difference
Global things to look at:
- Is your CPU averaging 80% or higher during periods of slowness? If so, you're probably just "out of Tableau" and need more cores if your goal is to deliver the same level of performance during peak usage time.
- What does RAM utilization look like? Do you have at least 8 GB of RAM per core on your machines?
- Look at disk (TabMon doesn't track the one specific counter that is really useful - you need to add it): Do you see disk queues of > 2 on the drive that Tableau is installed on? Does Average Disk sec / Transfer (disk latency) go over 20-30 ms more than every once in a while? If so, your disk may not be responding to the OS (and the OS to Tableau) fast enough
- Are you running on a VM? Talk to the guys who own the VM configuration and find out exactly what settings they used to deploy it. VM nerds love to use settings which "share" Tableau's resources with other hosts, and Tableau Server does not like this
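If it helps, the global checks above can be written down as a rough sketch. This is not anything Tableau or TabMon ships: the `HostSample` fields and the `flag_bottlenecks` helper are hypothetical names, the numbers you feed in would come from your own monitoring, and I'm assuming the 20-30 latency figure is in milliseconds (the upper end, 30, is used as the default).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class HostSample:
    """Averages for one host during a slow period (hypothetical field names)."""
    cpu_percent: float        # average CPU utilization, 0-100
    ram_gb: float             # total RAM on the machine
    cores: int                # number of cores on the machine
    disk_queue_length: float  # average disk queue on the Tableau install drive
    disk_latency_ms: float    # Avg. Disk sec/Transfer, converted to milliseconds


def flag_bottlenecks(sample: HostSample,
                     cpu_limit: float = 80.0,
                     ram_gb_per_core: float = 8.0,
                     queue_limit: float = 2.0,
                     latency_limit_ms: float = 30.0) -> List[str]:
    """Return the names of the checks that the sample fails."""
    flags = []
    if sample.cpu_percent >= cpu_limit:
        flags.append("cpu")
    if sample.ram_gb / sample.cores < ram_gb_per_core:
        flags.append("ram")
    if sample.disk_queue_length > queue_limit:
        flags.append("disk_queue")
    if sample.disk_latency_ms > latency_limit_ms:
        flags.append("disk_latency")
    return flags


# Made-up numbers, purely for illustration.
healthy = HostSample(cpu_percent=35, ram_gb=64, cores=8,
                     disk_queue_length=0.5, disk_latency_ms=5)
struggling = HostSample(cpu_percent=85, ram_gb=32, cores=8,
                        disk_queue_length=3, disk_latency_ms=45)
print(flag_bottlenecks(healthy))
print(flag_bottlenecks(struggling))
```

An empty list doesn't prove the host is fine (the VM question isn't something counters like these will answer), but a non-empty one tells you where to dig first.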
Thanks for pointing out some things for me to look into. I am going to work on gathering some of this information. While I do, I know you mentioned the network... I have also been getting emails about the Cluster Controller process going up and down, on both my Worker 1 and Worker 2 separately. I have gotten 10 emails about this happening in the past 4 days. I believe this is network related, correct? Is this cause for concern, and potentially the cause of my 'slowness'?
Here is some information on a specific View which has gone up 2X (in the past 14 days) in terms of load times. I will now start looking into some of the global items you mentioned.
- In regards to defining ‘slowness’…Below is a screenshot of a specific URL which has been raised. This workbook is tapping into the Server tables and is something created by Tableau when looking at our environment some time ago. The bottom chart shows the Avg Elapsed Time going up. Elapsed Time is the Completed timestamp minus the Created timestamp. This is only going back a few weeks as we do a full refresh, but even within this time we can see it has doubled.
- The workbook above is using an extract
- I ran a performance recording when I looked at the same Workbook/View on Worker1 in Tableau Desktop. Creating the view above took less than 5 seconds. Opening the whole workbook took 70 seconds, with 58 of those spent on the ‘Open Workbook’ event.
- I was unable to hit the local host on Worker1 because according to someone at my company, ‘it doesn’t have a web server on it’. I did log into my primary machine and hit the local host. When I did this it took almost 2 minutes for the visualization to come up.
- This does not use a Data Server data source. It is SQL Server connection.
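For anyone who wants to reproduce the Elapsed Time number outside of Tableau, here is a rough sketch of the same calculation (Completed minus Created, averaged per day). The input shape, a list of (created, completed) timestamp strings, and the function name `avg_elapsed_by_day` are assumptions for illustration; the sample rows are made up, not data from our server.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def avg_elapsed_by_day(rows):
    """rows: iterable of (created, completed) ISO-style timestamp strings.

    Returns {"YYYY-MM-DD": average elapsed seconds}, keyed by the day
    the request was created, in chronological order.
    """
    buckets = defaultdict(list)
    for created, completed in rows:
        start = datetime.fromisoformat(created)
        end = datetime.fromisoformat(completed)
        # Elapsed Time = Completed - Created
        buckets[start.date().isoformat()].append((end - start).total_seconds())
    return {day: mean(values) for day, values in sorted(buckets.items())}


# Illustrative rows only.
sample_rows = [
    ("2016-11-01 09:00:00", "2016-11-01 09:00:20"),
    ("2016-11-01 10:00:00", "2016-11-01 10:00:30"),
    ("2016-11-14 09:00:00", "2016-11-14 09:00:50"),
]
daily = avg_elapsed_by_day(sample_rows)
for day, seconds in daily.items():
    print(day, seconds)
```

Comparing the first and last days in the result is a quick way to see whether the average really has doubled.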
Ok Russell...I think I have gotten most of the information you requested. Looking forward to seeing where I need to go from here...
Global things to look at:
- For the machine with the VizQL processes, Worker1, the CPU is way below that. This is data from the last 2 weeks from TabMon. Worker2 CPU is much higher, which we expect as we have lots of extracts running on this machine.
- Below are RAM utilization charts for last 2 weeks for both of my workers. We do have at least 8 GB of RAM per core.
- I am not quite sure where to find info about disk latency…are you saying this is what I need to add?
- We are running on a VM…I am currently trying to track down the nerds.
A couple thoughts:
- Worker 1 does have a Gateway process, so the person who told you that you can't access it via localhost is incorrect. You should be able to login to the console (via RDP) on worker1 and execute http://localhost. Worker1 has all the components necessary to render your viz for you.
- It doesn't look like CPU / RAM is an issue. You clearly have lots of headroom there.
- You'll need to modify TabMon's config files to add the perfmon counter I mentioned earlier (to capture disk latency). I think that TabMon captures disk queuing by default, however.
- I'm not 100% sure what "work" is encompassed by the "Open Workbook" event, but it sounds like this is the part of the rendering process where we pull a BLOB out of the database (which represents the XML "design" of the workbook) and potentially parse it. If our metadata repository (PostgreSQL) takes a long time to respond because it is getting hammered by other processes (like your reports which query it directly), things would slow down.
- You mentioned VMs and it is not unusual for VM admins to give Tableau "bad storage" - meaning they see Tableau as "only reporting software" that therefore doesn't need high disk throughput like an RDBMS would. That is a mistake. We include an RDBMS (PostgreSQL) and we write lots of temp files so we need fast disk. Drill down on this with your VM peeps
- Zookeeper freaking out is not uncommon if the network is twitchy. However, since all the components necessary to render your viz (vizql, data engine, repo) live on the same machine, it is likely NOT the network that is causing this issue. We'll typically see the network causing an issue in terms of sending info to the browser and/or Tableau Server being able to connect to a remote database in a timely fashion. You're using an extract, so there's no db connection going on. While it would be nice to test hitting localhost from a local browser on Worker1, I think you've already identified the issue (the Open Workbook event).
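If editing TabMon's config is a hassle, one alternative for a quick look at disk latency is the Windows `typeperf` command, which can log the same perfmon counter to CSV. Below is a rough sketch that summarizes such a log. The `latency_summary` helper is a hypothetical name, the 30 ms threshold is my assumption about units, and the sample text is made up to resemble typeperf's CSV layout (one timestamp column, one value column in seconds).

```python
import csv
import io

# One way to capture the counter on Windows (run on the Tableau host):
#   typeperf "\LogicalDisk(*)\Avg. Disk sec/Transfer" -si 5 -sc 60 -f CSV -o disk_latency.csv


def latency_summary(csv_text, threshold_ms=30.0):
    """Summarize a single-counter latency log whose values are in seconds."""
    samples_ms = []
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) < 2:
            continue  # blank lines and status messages
        try:
            samples_ms.append(float(row[1]) * 1000.0)  # seconds -> milliseconds
        except ValueError:
            continue  # header row or an empty sample
    if not samples_ms:
        return {"samples": 0, "max_ms": None, "share_over": None}
    over = sum(1 for value in samples_ms if value > threshold_ms)
    return {
        "samples": len(samples_ms),
        "max_ms": max(samples_ms),
        "share_over": over / len(samples_ms),
    }


# Made-up sample in the assumed layout, for illustration only.
sample_log = r'''
"(PDH-CSV 4.0)","\\HOST\LogicalDisk(C:)\Avg. Disk sec/Transfer"
"01/15/2017 10:00:00.000","0.004"
"01/15/2017 10:00:05.000","0.045"
"01/15/2017 10:00:10.000","0.012"
"01/15/2017 10:00:15.000","0.038"
'''
summary = latency_summary(sample_log)
print(summary)
```

The `share_over` value is the interesting one: an occasional spike is normal, but if a large share of samples sits above the threshold, that is worth taking to the VM/storage team.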
Russell has given some great advice as usual. But just curious, what version of Tableau Server are you running? Also, in regards to the slow performance, were things running fine for a while and then it suddenly happened, or did the slowdown appear gradually over time? Lastly, not that it would be a solution, but to possibly provide temporary relief: has Tableau Server been restarted recently? A lot of our customers restart Tableau Server after, say, applying Windows patches, and it would be interesting to see if things are still slow after a restart or if there's any noticeable improvement in rendering afterwards.
Yes Russell gave me some great things to look into and I am in the process of talking with our VM people to see if we can identify any issues.
We are currently running 9.3.4 Server. I don't have great stats on the slowdown as we do a full refresh of our workbook with this info, but it did appear to happen relatively quickly. We did a restart on our Server yesterday after Windows patches and things do appear to be running quicker. I am going to let the data gather for a couple of days before calling it good. Any explanation for why this restart helps?
It's hard to say for certain but a restart means that all resources are getting refreshed. For the 9.2.x branch I know there was a vizql memory related issue that got addressed in 9.2.10 but I'm not certain that same issue existed in the 9.3.x branch.
I realize for most organizations it's not necessarily easy to upgrade, but if you have a non-Production environment it might be worth deploying 9.3.9 there. In particular, that release includes this one fix:
Loading some complex workbooks, with many dashboards and data sources, was significantly slower after upgrading to Tableau Server 9.3.
Plus I believe there were some security fixes included along the way, and you probably already know that our releases are cumulative, so you would get all the fixes from 9.3.5 up through and including 9.3.9.
Hi Nick, I had similar view render slowness. But mine happened right after v10 upgrade, see my post V10 Server upgrade workbook renders slower than V9
Although we may have completely different root causes, there are 3 things that I look at to know whether the server has slowed down or not:
1. What is the avg elapsed time for all views by percentage? This performance view is actually from V10 Admin - Server Status - Performance of View. This view alone may not tell you for sure that the server is slow, but it gives you a pretty good indication. If you do not have v10 yet, you can easily build this view, as I saw you already had the correct table joins in your post above.
2. What is the avg elapsed time for all views over time? Is it moving up or not? You can also use mean of elapsed time.
3. What is the number of sessions or total views rendered within an hour (also available from V10 Admin - Server Status - Performance of View)? This gives you an idea of your workload.
The combination of those 3 things can tell you for sure whether the server has slowed. If you see more red in #1 above, the avg or mean moves up, and your sessions are about the same or more, it means that your server is slower than before.
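To make the combination concrete, here is a rough sketch of the three checks applied to two equal-length periods. The `server_slowed` helper and the 10-second cutoff for a "red" (slow) view are my assumptions for illustration, not values taken from the V10 admin view, and the load times are made up.

```python
from statistics import mean


def server_slowed(before, after, slow_seconds=10.0):
    """Compare view load times (seconds) from two equal-length periods.

    before/after: non-empty lists with one entry per rendered view.
    """
    def share_slow(times):
        return sum(1 for t in times if t > slow_seconds) / len(times)

    signals = {
        # 1. A larger share of views lands in the slow ("red") bucket.
        "more_slow_views": share_slow(after) > share_slow(before),
        # 2. The average elapsed time moved up.
        "mean_went_up": mean(after) > mean(before),
        # 3. The workload is about the same or higher.
        "workload_same_or_more": len(after) >= len(before),
    }
    signals["slower"] = all(signals.values())
    return signals


# Illustrative numbers only.
last_month = [2, 3, 4, 12]
this_month = [5, 8, 11, 15, 20]
verdict = server_slowed(last_month, this_month)
print(verdict)
```

Requiring all three to be true is deliberately strict: if the averages went up but the workload dropped, something other than server capacity is probably going on.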
Thank you to everyone on here for your help! A restart seemed to solve my issue and I now have some views built which will get me the historical data I need to look into this issue next time!
Would you be able to explain a little bit more what you mean by "VM nerds love to use settings which 'share' Tableau's resources with other hosts"? Is there a specific setting on VM we should be looking at? According to my VM people VMs are shared by definition.