Thanks for sharing your insights here. Have you happened to compare thread consumption? I'm not sure whether this is enabled by default in the tabjolt reports, but it is available through the built-in Windows perfmon counter "\System\Processor Queue Length", which is one way to measure the number of threads waiting. I am interested to know whether 10.4 parallelizes more and therefore uses more threads, which may cause contention, which in turn may cause CPU to go up faster.
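If it helps with the comparison, here is a minimal sketch of how the counter could be summarized once it has been logged to CSV (for example with perfmon, or with something like `typeperf "\System\Processor Queue Length" -si 1 -sc 60 -f CSV -o queue.csv`). The function name, the assumed two-column layout (timestamp, then counter value), and the sample values are all hypothetical and are not measurements from this thread.

```python
import csv
import io
from statistics import mean


def summarize_queue_length(csv_text):
    """Return (average, maximum) of the first counter column.

    Assumes a typeperf-style CSV: one header row, then rows of
    timestamp followed by the counter value.
    """
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row (timestamp column + counter path)
    samples = []
    for row in reader:
        if len(row) < 2 or not row[1].strip():
            continue  # skip blank or incomplete samples
        samples.append(float(row[1]))
    return mean(samples), max(samples)


# Hypothetical samples for illustration only.
SAMPLE = r'''"(PDH-CSV 4.0)","\\HOST\System\Processor Queue Length"
"10/24/2017 10:00:01.000","2.000000"
"10/24/2017 10:00:02.000","6.000000"
"10/24/2017 10:00:03.000","4.000000"
'''

avg, peak = summarize_queue_length(SAMPLE)
print(f"average queue length: {avg:.1f}, peak: {peak:.0f}")
```

Running the same summary on a log from each version would give two averages and two peaks to compare side by side.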
We're currently on 10.3 and haven't moved to the newer releases yet. The last time we ran tabjolt, I believe we were on version 10.1, and we saw CPU start to climb at about 30+ concurrent users, along with the error rate. We attributed it to thread contention; we were running an environment similar to yours (16 hyperthreaded cores, which allows for 32 concurrent threads). This is a good reason to scale out horizontally across workers: even when CPU is not 100% consumed, thread contention may cause CPU usage to increase artificially. That is why I ask the question above.
In practice, it all depends on how concurrent users are defined. We don't run into an issue most of the time, even though we have far more than 32 active users on prod. We have hundreds of users logged in at the same time, but not all of them are hitting Enter at the same moment. And when there is thread contention, it's usually minimal and short-lived.
Does this help at all? Let me know.
It does help build some confidence in migrating to a new version. We are still looking into this, though.
We ran a new test with a few additional counters to get more data.
Checking the Processor Queue Length, we don't see any significant difference (the queue length actually seems slightly higher with Tableau 10.1).
We also checked the Thread Count for each Tableau process and came up with the following observation:
- Tableau Server 10.4 uses fewer threads for most processes, except httpd, which uses twice as many as in Tableau Server 10.1.
However, only the VizQL and Data Server processes seem to use more CPU in Tableau Server 10.4.
Most other processes use as much CPU or less; httpd is among those that use less.
We also noted that our test seems to trigger a lot of cache hits, though we don't have clear metrics on that yet.
We reran the test after setting the number of httpd threads back to the number we observed in 10.1, and we got better results: tabadmin set gateway.threads 550
Tableau Server 10.4 still uses more CPU, but the overall results were comparable to 10.1 (the average response time is the same; the 95th percentile response time is higher with Tableau 10.4 when there is too much concurrency).
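To illustrate why the average and the 95th percentile can tell different stories, here is a small sketch using the nearest-rank percentile definition. The helper name and both sets of response times are made up for illustration; they are not our tabjolt results, and tabjolt may compute its percentiles differently.

```python
import math
from statistics import mean


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: the smallest value such that at least
    pct percent of the samples are less than or equal to it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical response times in seconds, 20 requests per run.
run_a = [2.0] * 20               # uniform response times
run_b = [1.5] * 18 + [6.5] * 2   # mostly faster, with a slow tail

for name, run in (("run A", run_a), ("run B", run_b)):
    p95 = percentile_nearest_rank(run, 95)
    print(f"{name}: mean={mean(run):.2f}s p95={p95:.2f}s")
```

Both hypothetical runs have the same mean, but the second one has a much higher 95th percentile because a few requests are slow, which is the kind of pattern that shows up under heavy concurrency.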
We opened a ticket with Tableau Support to find out whether this is a valid approach to help Tableau Server 10.4 scale better when mostly a single node is used for the Gateway and VizQL processes.