Well, my goal was to publish a Story in .twbx form, but I neglected to remove the more sensitive fields from my data before extracting. Now I have no way to obfuscate the data and share publicly without a lot of extra work. So you'll have to be content with pictures.
Config and HW of test system:
Adding the RAM disk and altering configuration:
Overall results for extract refreshes ("in" = HDD, "out" = RAM):
- Incremental extract refreshes showed a larger improvement than full refreshes (60% improvement on some of our incremental Splunk extracts). This makes sense, as incrementals theoretically reduce the time spent running the query on the source data server and transferring the results over the network, so a larger share of the refresh is local disk work. If we supported incrementals that handled Updates and Deletes, I suspect the RAM disk approach would really be worth pursuing.
- Large workbooks with many extracts also saw a better than average benefit.
- Rendering speed tests done over 100+ vizzes during two nights showed no benefit to viz load times.
- I halved the temp file expiry time to decrease the amount of data we retain there, which lowers the risk of running out of space (we get up to the mid-20 GB range during peak extract refresh times).
- During a backup or restore operation, you must revert your change to the service.temp.dir, as you will very likely not have as much room in RAM as Tableau thinks it might need to perform them, and the operation will fail as a result. Because the config change would require a restart, you'd be taking the server down for a few minutes each night, which is not possible for us.
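For anyone who wants to guard against the out-of-space risk mentioned above, here is a minimal sketch of a free-space check on the temp location. The function name `has_room`, the safety factor, and the 25 GB figure in the usage lines are my own assumptions for illustration (the 25 GB is just the rough peak we saw); none of this is a Tableau API.

```python
import shutil

def has_room(temp_dir: str, needed_bytes: int, safety_factor: float = 1.5) -> bool:
    """Return True if temp_dir has at least needed_bytes * safety_factor free.

    safety_factor is an arbitrary margin chosen for this sketch.
    """
    free = shutil.disk_usage(temp_dir).free
    return free >= needed_bytes * safety_factor

if __name__ == "__main__":
    # Hypothetical usage: "." stands in for the RAM disk path,
    # and 25 GB approximates the peak temp usage seen during refreshes.
    peak_bytes = 25 * 1024**3
    if not has_room(".", peak_bytes):
        print("Not enough headroom on the temp volume; skip or reschedule.")
```

Something like this could run before a refresh window or a backup, but it only tells you about free space at that moment, not what a job will end up needing.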
We won't be using this, but for extract-heavy environments running Incrementals all the time, it might be an option.
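To illustrate why incrementals would benefit more than full refreshes, here is a rough sketch. All of the timings and the 4x disk speedup are hypothetical numbers I made up to show the shape of the effect; they are not measurements from our tests.

```python
def improvement(disk_seconds: float, other_seconds: float, disk_speedup: float) -> float:
    """Fraction of total refresh time saved when only the disk portion gets faster."""
    before = disk_seconds + other_seconds
    after = disk_seconds / disk_speedup + other_seconds
    return 1 - after / before

# Hypothetical full refresh: long source query and network transfer, same local disk work.
full = improvement(disk_seconds=50, other_seconds=100, disk_speedup=4)

# Hypothetical incremental refresh: much less query and network time.
incremental = improvement(disk_seconds=50, other_seconds=10, disk_speedup=4)

print(f"full: {full:.1%}")                # full: 25.0%
print(f"incremental: {incremental:.1%}")  # incremental: 62.5%
```

The smaller the non-disk portion, the more of the RAM disk gain shows up in the total, which matches what we saw in direction if not in exact numbers.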
Looking into a RAID-0 SSD array dedicated to temp might be worth pursuing, as Cristian suggested: you should be able to maintain plenty of space for temp and still get a bit of a performance boost (in particular, I'm interested in how backup / restore / ziplogs might behave).
All this investigation work is great, though I continue to wonder if we're barking up the wrong tree, and whether the focus should instead be on initial load rendering time and getting data readily accessible within cache. I sense that the RAM disk solution will greatly enhance the "execute query" step (awesome), but many times when I look at performance recordings (?:record_performance=yes), the query step isn't the pain point. Case in point is the perf recording viz below: the "execute query" step is 5 seconds out of the total 28. This is why I put the idea (shameless plug) out there. I sense that the product will evolve toward smart pre-caching, though there should be a way to force it ahead of time. http://community.tableau.com/ideas/4370
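To put numbers on that, here is a quick sketch using only the two figures from the recording (5 seconds of "execute query" out of 28 total). The function name is just something I made up for the example.

```python
def share_of_total(event_seconds: float, total_seconds: float) -> float:
    """Return the fraction of total load time spent in one event."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    return event_seconds / total_seconds

query_seconds = 5
total_seconds = 28

print(f"execute query share: {share_of_total(query_seconds, total_seconds):.0%}")  # 18%
# Even if the query step took no time at all, the rest would remain.
print(f"time left without the query step: {total_seconds - query_seconds} s")  # 23 s
```

So even a perfect fix for the query step would leave roughly 23 of the 28 seconds untouched in this recording.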
You wrote: "Rendering speed tests done over 100+ vizzes during two nights showed no benefit to viz load times"
Isn't it weird?
If VizQL and the other services really do use the RAM disk, that result would mean the cache service is working at 100% effectiveness, and I am not quite sure that is realistic...
Not necessarily. It could simply mean that the amount of data (or frequency of access) is so small for a view render operation that moving that portion of the work off the HDD just doesn't improve things much. That's what I would suspect, and that's why I agree with Jeffrey that this probably isn't the bottleneck in our own viz load time. What does generally take up that time is something I would like to better understand.
Could you capture the same performance recording from the Desktop side?
I see that for one non null workbook (Geographic) you have both dashboard and worksheet marked as null... ???!!!
If I remember right, the perf recording puts a lot of detail in the .twbx but displays only a portion of it. Maybe you and Matthew should engage the support team on this issue.
You are correct that by default the perf recording only displays a portion and leaves a lot of whitespace. But if you download it, unexclude "other", and add in event names, then you can see all the detail to your heart's content.
Here's the perf recording when running the same workbook in desktop. Looks a little different, but still deficient in terms of time to render.
And I do not control the null values for dashboard and worksheet. These are Tableau internals...
I found three applications that are similar, but not identical, to Arsenal Image Mounter, so I decided to post the links here.
1. drive bender - Drive Bender (presents multiple hard drives as a single pool of storage, either as one or more drive letters, or a network shared drive)
2. drive pool - https://stablebit.com/DrivePool (Combines multiple physical hard drives into one large virtual drive)
3. flexraid - http://www.flexraid.com/ (RAID over File System including storage pooling)
Where do I get RAM disk if I want to do a trial? On the website, it makes me fill out a form and wait for a reply.
For testing, try SoftPerfect RAM Disk (high-performance RAM disk for Windows).
- RAMDisks Roundup and testing
- 12 RAM Disk Software Benchmarked for Fastest Read and Write Speed • Raymond.CC