Hardware recommendations are always generic, unless we have (or can collect) data on your environment's actual usage.
In your scenario, you have an 8-core license. Given that, and your estimated number of concurrent users, I'd suggest:
1) A single, 8-core server with *at minimum* 32GB of RAM, and perhaps 64GB if possible. The free space Tableau recommends (50GB) should be fine given your scenario, but as your usage grows, you might find yourself wanting more. Since disk is so inexpensive, I tend to recommend ~250GB or more as a rule of thumb. Please keep in mind that the disks need to be fast, so preferably solid state or several 15k RPM spindles in a high performance RAID configuration.
2) Two 8-core servers with the same specifications for high availability to serve as your worker nodes, and one at half-spec for your primary. Keep in mind that you'll want additional disk space on your primary to contain backups.
2a) I do not recommend having two 4-core workers for high availability. From my perspective, two 4-core servers would be giving too much of your total licensing capabilities to other processes such as OS and management, rather than to Tableau, which is what you paid for. That can add up.
Here's a whitepaper that a few coworkers and I wrote for Tableau on Hardware Planning and Server Configuration. You may find it useful, though it may be a bit more info than you need.
Thank you very much for your quick response. The white paper explained very well everything one needs to know about hardware configuration for Tableau.
For now, given my requirements, we will go ahead with a single-node installation with 250GB of free disk space.
We plan to review this setup after 6 months and upgrade if necessary.
Thank you very much for your support and time!!
Joyfully & With Gratitude,
Brad has good recommendations so all I can add is that I recommend 64GB RAM minimum. We are running with 8 cores in a heavy extract environment with over 1,000 active users, 77 published data sources, and over 1,500 individual views. We are now on a 1TB hard drive and planning on increasing capacity to 3TB total.
It is useful to know your server load and configuration. Thank you!!
I am pitching to go to an enterprise server with 8 cores and am looking at similar numbers to yours, i.e. around 1,000 users. Tableau's recommendations suggest that I would need 16 cores to handle that many users at a concurrency of around 100. How have you found performance with that many users on 8 cores? What is your concurrency like?
Thanks in advance for any advice. I'm putting all this effort into pitching, and if I get approval I'm going to look pretty silly if it runs like a dog.
We've had 19 concurrent users at most during a 5-minute period. Typically it's lower, fewer than 10; however, it is pretty much constant -- there's always someone or some process (extract, subscription) consuming server resources. What is hard to establish is concurrency -- how many users are actually using the Tableau Server, making it do work. Also, measuring concurrency at 15-minute intervals is very different from 5-minute intervals, which is different again from 1-minute intervals (see where I'm going?). This very specific topic is often spoken about but, elusively, never defined.
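To illustrate why the measuring interval matters, here is a minimal Python sketch. It assumes you already have a list of (timestamp, user) activity records; the sample data, the `origin` value, and the function name `peak_distinct_users` are all made up for illustration and are not taken from the Tableau repository. It counts distinct users per fixed-width bucket and reports the busiest bucket.

```python
from datetime import datetime

def peak_distinct_users(events, bucket_minutes, origin):
    """Return the largest number of distinct users seen in any fixed-width bucket.

    events: iterable of (timestamp, user) pairs (hypothetical sample shape).
    bucket_minutes: width of each measuring interval, in minutes.
    origin: datetime that marks the start of the first bucket.
    """
    width = bucket_minutes * 60
    buckets = {}
    for ts, user in events:
        # Integer bucket index, counted from the origin.
        key = int((ts - origin).total_seconds() // width)
        buckets.setdefault(key, set()).add(user)
    return max((len(users) for users in buckets.values()), default=0)

# Made-up activity log: five users spread over fifteen minutes.
origin = datetime(2024, 1, 1, 9, 0, 0)
events = [
    (datetime(2024, 1, 1, 9, 0, 10), "alice"),
    (datetime(2024, 1, 1, 9, 0, 40), "bob"),
    (datetime(2024, 1, 1, 9, 3, 0), "carol"),
    (datetime(2024, 1, 1, 9, 7, 0), "dave"),
    (datetime(2024, 1, 1, 9, 12, 0), "erin"),
    (datetime(2024, 1, 1, 9, 14, 0), "alice"),
]

for minutes in (1, 5, 15):
    print(minutes, peak_distinct_users(events, minutes, origin))
# 1-minute buckets peak at 2 users, 5-minute at 3, 15-minute at 5.
```

The same activity gives three different "concurrent user" figures depending only on the bucket width, which is why any quoted number needs its interval stated alongside it.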
Our issue is with many extracts running 24/7 (predominantly North America but some users in ROTW (Rest Of The World)). A lot depends on how good the developers are. Based upon my experience with what we have, I think 16 cores is aggressive. 8 would be safe, but if you can get 12 then you are definitely safe and have future capacity to boot. Make sure you get 64GB RAM at the very minimum (again, my personal recommendation); 128GB would be great.
For a production environment, a minimum of 8 cores per box is recommended. You can install with fewer cores, but that's for non-production testing purposes only. Run production with fewer than 8 cores per box and you will probably encounter performance issues. I like Toby's 128GB RAM minimum, too. Also keep in mind that minimum specs = minimum performance.
Hi Toby! I was reminded of this post today, and I remembered that I wanted to chime in on concurrency.
I take my idea of concurrency calculations from an operations management course I accidentally wandered into once. Basically, when an activity starts, it adds one to the count of concurrently running activities. When the activity ends, it subtracts one. Taking that very simple explanation and tearing it apart, we can build a query against the repository that lets us understand concurrency. First things first, though:
1) For something to be concurrent, it has to have a start time and an end time. To that end, there's no concept of "concurrent users" because users don't start and end. It's a bit pedantic, but it helps when considering the next bits. I tend to count "concurrent sessions" or "concurrent tasks" because those are things with start and end times.
2) There's an http_requests table in the repository that shows us created_at and completed_at columns. If we pivot each row and create an event column and time column, we can then use a running calculation to determine the effect on concurrency (each event "start" would be a +1, and each event "end" would be a -1). Of course, this would need to be sorted by time to make the calculation make any sense.
3) If you just used the http_requests table, you'd get "concurrent tasks" which is NOT the same as "concurrent sessions" -- a user can have 10 tasks going at once in a single session. If you want concurrent sessions, you'd need to get the MIN(created_at) and the MAX(completed_at) grouped by the session_id.
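The three points above can be sketched in a few lines of Python. This is only an illustration of the +1/-1 running calculation, not the actual repository query: the task tuples are made-up sample data with integer seconds standing in for created_at and completed_at, and the function names are hypothetical.

```python
def concurrency_timeline(intervals):
    """Turn (start, end) pairs into a sorted list of (time, running_count)."""
    events = []
    for start, end in intervals:
        events.append((start, +1))  # each "start" event adds one
        events.append((end, -1))    # each "end" event subtracts one
    # Sort by time. On ties, -1 sorts before +1, so a task that ends at the
    # same instant another begins is not counted as overlapping (a choice
    # made for this sketch).
    events.sort()
    running = 0
    timeline = []
    for time, delta in events:
        running += delta
        timeline.append((time, running))
    return timeline

def session_intervals(tasks):
    """Collapse tasks to one (MIN(created_at), MAX(completed_at)) per session_id."""
    spans = {}
    for session_id, created_at, completed_at in tasks:
        lo, hi = spans.get(session_id, (created_at, completed_at))
        spans[session_id] = (min(lo, created_at), max(hi, completed_at))
    return list(spans.values())

# Made-up rows: (session_id, created_at, completed_at), times in seconds.
tasks = [
    ("s1", 0, 4),
    ("s1", 1, 3),
    ("s1", 2, 6),
    ("s2", 5, 9),
    ("s3", 20, 25),
]

task_timeline = concurrency_timeline([(c, d) for _, c, d in tasks])
session_timeline = concurrency_timeline(session_intervals(tasks))

peak_tasks = max(count for _, count in task_timeline)
peak_sessions = max(count for _, count in session_timeline)
print(peak_tasks, peak_sessions)  # 3 concurrent tasks, 2 concurrent sessions
```

In this sample, session s1 alone accounts for three overlapping tasks, so task concurrency peaks at 3 while session concurrency peaks at 2. Note that collapsing to MIN/MAX treats a session as active for its whole span, including any idle gaps between its tasks.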
Here's a handy dandy Excel workbook that walks through how to take data with created_at and completed_at columns and turn it into ACTUAL concurrency.
PS - Alright, I also added a query for you; the query has starting and ending concurrency for the site level and the system level. The start concurrency is task concurrency including the task that just started; the end concurrency is task concurrency excluding the task that just finished. Site and system concurrency are self-explanatory; keep in mind that system concurrency will be equal to or greater than the sum of the sites -- the difference will be in things that aren't related to sites (like logins).
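For anyone who wants to see the start/end and site/system distinction without the query in front of them, here is a rough Python sketch of the idea. It is not the query itself: the site names, the use of `None` for a request that belongs to no site, and the integer times are all assumptions made for illustration.

```python
from collections import defaultdict

def concurrency_rows(tasks):
    """Return one row per start/end event with site and system concurrency.

    tasks: iterable of (site, created_at, completed_at); site is None for
    requests not tied to any site (for example a login). Hypothetical shape.
    """
    events = []
    for site, created_at, completed_at in tasks:
        events.append((created_at, +1, site))
        events.append((completed_at, -1, site))
    # Sort on (time, delta) only, so None and string site names are never compared.
    events.sort(key=lambda e: (e[0], e[1]))

    system = 0
    per_site = defaultdict(int)
    rows = []
    for time, delta, site in events:
        # The counts are updated before the row is recorded, so a "start" row
        # includes the task that just started and an "end" row excludes the
        # task that just finished.
        system += delta
        if site is not None:
            per_site[site] += delta
        site_count = per_site[site] if site is not None else None
        rows.append((time, "start" if delta > 0 else "end", site, site_count, system))
    return rows

# Made-up tasks: two site requests and one login that belongs to no site.
tasks = [
    ("sales", 0, 5),
    ("finance", 1, 4),
    (None, 2, 3),
]

rows = concurrency_rows(tasks)
for row in rows:
    print(row)
```

At time 2 the system count is 3 while the two sites add up to only 2; the gap is the login, which matches the point that system concurrency is equal to or greater than the sum of the sites.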
Brad, I'm glad you wandered into that class.
Stepping through your Excel presentation helped me visualize and understand your explanation.
I understand what you mean by session vs. task. Also, I think sessions could be tied to users: isn't "session" just a different name for "user", as denoted by "user_ip" in your SQL statement? From my troubleshooting forays I know that an IP address can be linked to a user ID. Here's an interesting question now: can a user have multiple sessions? For example, if I log on to my Tableau Server through two separate browser sessions but still use the same Tableau logon credentials, would that just be considered a single session?
Since Tableau broadly uses concurrency at the system level -- which makes sense -- I think having Site in your query adds unnecessary data. Anyway, we still need to determine what the maximum and average concurrency are, which means we need a viz for it.
Nice work, it got me thinking.
P.S. Oh, and what's with this unnest(array[... code? I had to look that up. Your SQL Kung-fu is strong, Shifu!
Regarding users vs sessions, I think that when people say "concurrent users" they mostly mean "concurrent sessions." That said, to your point, it IS possible for a single user to be running multiple simultaneous sessions. Having 100 concurrent users each with 3 simultaneous sessions would be significantly more load than 100 concurrent users with only one session each. That's why I favor "concurrent sessions" as a measure. I favor sessions over tasks because a 16-core server can truly only ever do 16 things at once (waiting on things not included). Concurrent tasks tends to be bound nearer to the number of cores you have than concurrent sessions, which can go significantly higher. Also, if I walked into a room full of people, and each of them had a Tableau Server viz on their screen, I'd consider each of them a "concurrent user" (whatever that means) whether they were causing TS to do a task or not. That falls in line with the "session" thought.
Regarding site concurrency, I agree. I can't remember what caused me to add that site concurrency -- maybe I was trying to hunt down how each site contributed to the number of concurrent tasks as part of a chargeback model?
At any rate, I'm glad the spreadsheet and query helped; it's definitely not fully hashed out, which is why I've just been holding onto it for the past couple of years. In hindsight, I probably should have shared it sooner.
Brad Fair wrote:
Regarding users vs sessions, I think that when people say "concurrent users" they mostly mean "concurrent sessions." That said, to your point, it IS possible for a single user to be running multiple simultaneous sessions. Having 100 concurrent users each with 3 simultaneous sessions would be significantly more load than 100 concurrent users with only one session each. That's why I favor "concurrent sessions" as a measure...
Perfect! Now it makes sense. Thank you for sharing your knowledge!