Adding myself as a watcher...
Did you monitor the IO load during extract generation, and the temp folders used by Tableau Server's processes?
There is a thread questioning the rationale behind a RAM disk solution; a lot of settings are posted there. Take a look here: Would a RAM disk improve Server performance?
I'll just copy/paste a few comments:
Tableau Server uses the folder :\ProgramData\Tableau\Tableau Server\data\tabsvc\temp
- In a Tableau Server environment, it’s important to make sure that the backgrounder has enough disk space to store existing Tableau extracts as well as refresh them and create new ones. A good rule of thumb is the size of the disk available to the backgrounder should be two to three times the size of the extracts that are expected to be stored on it.
- Tabcmd (a command-line utility) can be used to refresh extracts, as well as to publish TDEs to Tableau Server.
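The 2-3x rule of thumb above is easy to turn into a quick sizing check. A minimal sketch (the function name, extract sizes, and multiplier are my own illustration, not anything from Tableau):

```python
# Rough disk-sizing helper for the backgrounder rule of thumb above:
# disk available to the backgrounder should be 2-3x the total size of
# the extracts it is expected to store. Purely illustrative numbers.

def backgrounder_disk_needed(extract_sizes_gb, multiplier=3):
    """Return the recommended backgrounder disk size in GB."""
    return sum(extract_sizes_gb) * multiplier

# e.g. three extracts of 10, 25 and 5 GB -> 40 GB total, 120 GB recommended
print(backgrounder_disk_needed([10, 25, 5]))  # 120
```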
I suppose that vizqlserver.exe and/or data engine (tdeserver64.exe) will need temporary disk space to be able to push correct data back to clients.
The list of all Tableau Server processes: http://onlinehelp.tableausoftware.com/current/server/en-us/help.htm#processes.htm%3FTocPath%3DAdministrator%2520Guide|Troubleshooting|Work%2520with%2520Log%2520Files|_____1
Data Engine: Stores data extracts and answers queries.
The data engine's workload is generated by requests from the VizQL Server process. It is the component that loads extracts into memory and performs queries against them. Memory consumption is primarily based on the size of the data extracts being loaded. The 64-bit binary is used as the default on 64-bit operating systems, even if 32-bit Tableau Server is installed. The data engine is multi-threaded to handle multiple requests at a time. Under high load it can consume CPU, I/O, and network resources, any of which can become a performance bottleneck. At high load, a single instance of the data engine can consume all CPU resources to process requests.
VizQL Server: Loads and renders views, computes and executes queries.
Consumes noticeable resources during view loading and interactive use from a web browser. Can be CPU bound, I/O bound, or network bound. Process load can only be created by browser-based interaction. Can run out of process memory.
The Tableau stack uses ~20 folders to store logs!
And here is the math that tells us how much free space we need in the temp folder:
If using extracts, consider the space needed by the Temp directory during an extract refresh. The Temp directory, which is where an extract is stored to during a refresh, may require up to the square of the final file size of the extract. For example, a 12 GB extract may take up 144 GB of disk space to complete the refresh.
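That "square of the final file size" rule is worth spelling out as a calculation, since it grows very fast. A quick sketch (the function name is mine; the rule squares the numeric GB value, per the quote above):

```python
# Worst-case Temp directory usage during an extract refresh, per the
# rule quoted above: up to the square of the final extract size.

def temp_space_worst_case_gb(extract_size_gb):
    """Worst-case temp space in GB for a refresh of the given extract."""
    return extract_size_gb ** 2

print(temp_space_worst_case_gb(12))  # 144, matching the 12 GB example
```

So a 20 GB extract could, in the worst case, need 400 GB of temp space during a refresh.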
26 GB of RAM is a lot of bits...
In your case I would buy two good SSDs (256 or 512 GB; prices are decent now), put them in RAID 1 (mirroring), and point the temp/tmp folders at that volume.
Hope this helps.
If I run the SQL that the extract refresh is generating (sourced from the Tableau log file) in a query tool like Toad, I get results in 10 minutes. But the actual refresh process on Tableau Server clocks in at 104 minutes, on average.
You are comparing apples to oranges: what you see in TOAD is raw records, versus a .TDE, where data is stored in columns, indexed, and compressed. To create a .TDE, which is in fact a database, a lot of work is done in the background, so to decrease the time needed to create that .TDE you should carefully monitor IO load, CPU, and memory usage during the process.
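One cheap piece of that monitoring is watching free space on the volume that holds Tableau's temp folder while the refresh runs. A minimal sketch using only the standard library (the path, sample count, and interval are assumptions; point it at your actual tabsvc temp folder):

```python
# Sample free space on the volume holding a temp directory at fixed
# intervals, e.g. while an extract refresh is running.
import shutil
import time

def sample_free_space(temp_dir, samples=3, interval_s=1.0):
    """Return free-space readings (in GB) taken at fixed intervals."""
    readings = []
    for _ in range(samples):
        usage = shutil.disk_usage(temp_dir)
        readings.append(usage.free / 1024**3)
        time.sleep(interval_s)
    return readings

# Example: watch the current directory's volume for a moment.
print(sample_free_space(".", samples=2, interval_s=0.1))
```

If the readings dip toward zero mid-refresh, the temp volume is your bottleneck.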
Another option is to offload this task to a dedicated machine, or to purchase ETL software able to export data as .TDE; see this thread: third-party tools able to export/import data to/from the Tableau Data Engine
Yes, I concede that TOAD vs. Tableau Server extract is not a one-for-one comparison. I only throw the TOAD information out there because it verifies that Tableau Server isn't creating some horrendous, unmanageable SQL on its own and it should also be roughly the amount of time it takes Tableau Server to return the same raw data in the early stages of the extract refresh.
Aside from decreasing the amount of data, I'm searching for what we can do to speed up the "columnarization" that Tableau is doing. We are good on all the basics (disk space, etc.) but it would be extremely beneficial to know whether there are any hardware or configuration changes that result in tangible benefits to extract refresh performance.
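To make the "columnarization" cost concrete: building a columnar store means reorganizing and encoding every column, which is CPU work on top of just fetching the rows. A toy illustration (this is not Tableau's actual algorithm, just simple run-length encoding of one column to show the kind of per-column work involved):

```python
# Toy illustration of columnar compression: run-length encode a single
# column into (value, run_length) pairs. Real engines do this kind of
# encoding, plus indexing, for every column, which is where CPU goes.

def run_length_encode(column):
    """Compress a column into (value, run_length) pairs."""
    encoded = []
    for value in column:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return [tuple(pair) for pair in encoded]

region_column = ["EMEA", "EMEA", "EMEA", "APAC", "APAC", "US"]
print(run_length_encode(region_column))
# [('EMEA', 3), ('APAC', 2), ('US', 1)]
```

This is also why extract refresh time doesn't track raw query time: the encoding pass scales with data volume and is bound by CPU and IO on the Tableau Server side, not the database.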
I skimmed the RAM disk thread and it sounded like the consensus was that there wasn't enough of a performance boost to justify making the change.
The issue with the RAM Disk approach is that you need a LOT of RAM.
Explore the SSD path. There are three zero-cost applications that can evaluate your system and provide recommendations based on your IO load and disk access patterns:
- ioturbine profiler http://get.fusionio.com/ioturbine-profiler
- HGST profiler http://www.hgst.com/software/HGST-profiler
- StorTrends iDATA Tool (All Flash Array & Hybrid Storage)
Hope this helps.