Is very long custom SQL being used at all? All custom SQL is logged every time it is used, so I have seen logs filling up at a rate of over 7GB a day, depending on how many times the query was run. Otherwise, I'd want to know if the issue occurs on later versions of Tableau Desktop.
Thanks for posting,
When you say "long SQL" do you mean a long SQL string, or bringing back a large dataset? I haven't used any large custom SQL code, but I have brought in some fairly large datasets (nothing that I thought was particularly problematic, though).
Here's something I found, by Anthony Krinsky, that is of interest:
An interesting thread on auditing came through this morning. Here are some highlights:
This customer uses Tableau to display dashboards with real-time operational metrics on big LCD panels throughout their organization. These panels would cycle through a list of dashboards, swapping every 30 seconds. The result is they have a sustained request rate for dashboards of one every couple of seconds (this is before you add the traffic from human users via iPads and PCs). This means there are a LOT of entries going into their _http_requests table, making it grow and grow. Over the course of a week it seems their server writes about ~25GB of data to the _http_requests table. So initially the PostgreSQL DB grew and stabilised around 25GB (remember, by default we purge all records from this table that are older than 7 days when we do a backup).
Then they had a very busy week that resulted in a lot more data in the _http_requests table. The DB file grew to ~34GB, at which point C:\ was full and the fun started. They tried to run a backup to trigger a purge of the _http_requests table, but that didn’t make the DB files smaller. It turns out that PostgreSQL is like many other DBMSs in that it dynamically grows its storage files, but when you delete records it doesn’t shrink them – there is a specific command you need to run as an admin (VACUUM FULL) to reclaim the space. Running this brought the DB file back to ~25GB and got their server back up.
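For reference, reclaiming that space means connecting to the Tableau repository database as an admin and running something along these lines (the table name is from the thread; connection details and required privileges will vary by install, so treat this as a sketch):

```sql
-- Rewrites the table into new, compacted storage files and returns the
-- freed disk space to the OS. Note it takes an exclusive lock on the
-- table while it runs, so do this during a maintenance window.
VACUUM FULL _http_requests;
```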
Also, much of the information that we drop into _http_requests now also lands in the new v8 historical_events table – and unlike _http_requests, we don’t clean historical_events up on a backup or with tabadmin cleanup. Instead we keep these records for 6 months out of the box, so you might want to drop this value down a bit on a really active system:
# How long to keep historical_events records before deleting them, expressed in days
- wgserver.audit_history_expiration_days: 183
This is why you should run a tabadmin cleanup, which will flush HTTP request records older than 7 days as well as the log files. I usually ask customers to do a weekly backup over the weekend (tabadmin backup) and a tabadmin cleanup at the same time. They can write a simple bat script and schedule it via Windows Task Scheduler.
You need to run the cleanup twice because, depending on whether the server is started or stopped, it can't clean the same things (log files vs. HTTP request records).
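As a rough sketch of what that scheduled bat script could look like (the install path and backup name here are hypothetical; check the tabadmin documentation for your version before using anything like this):

```bat
@echo off
rem Hypothetical weekly maintenance script for Tableau Server (classic tabadmin).
rem Adjust the bin path and backup name for your installation.
cd /d "C:\Program Files\Tableau\Tableau Server\8.0\bin"

rem Backup also triggers the 7-day purge of _http_requests
tabadmin backup ts_weekly -d

rem Cleanup with the server running clears old database records
tabadmin cleanup

rem Cleanup with the server stopped clears the log files
tabadmin stop
tabadmin cleanup
tabadmin start
```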
Toby, thanks! In general terms, the SQL that I use there is pretty light (I generally do the heavy SQL work in SQL itself, so that I'm just bringing in a table). It sounds like that article was about maintenance, but I was seeing it scale to 30GB in half a day. Hopefully I'll get some feedback from that other board; if not, I'll shoot a note to support.
Fortunately I'm in a documenting phase so I'm not using T for a bit
Did you ever figure out what was happening here with the log files from Tableau Desktop? We use a shared terminal server at my company with several Tableau users logged on to one machine and each person has their own Windows user folder on the C drive where My Tableau Repository lives for each user. We were recently asked by IT to perform some housekeeping to clean up these logs and free up space on that machine, but I'm curious as to what generates the logs from Tableau Desktop and if they are necessary to keep besides those times when they might need to be sent to Tableau Support for troubleshooting an error. We're considering creating a script that would clean up the logs after a given length of time, but don't want to purge these files if they need to be kept for the software to run properly.
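For the cleanup script you're considering, here is a minimal sketch in Python. It assumes the logs are purely diagnostic and safe to delete when old (worth confirming with Tableau Support first, as you say), and that you point it at each user's My Tableau Repository\Logs folder; the function name and age threshold are my own invention:

```python
import time
from pathlib import Path

def purge_old_logs(log_dir, max_age_days=30):
    """Delete .log/.txt files in log_dir older than max_age_days.

    Hypothetical helper -- point it at a user's
    'My Tableau Repository/Logs' folder. Files Tableau still holds
    open are skipped (deleting them raises PermissionError on Windows).
    """
    cutoff = time.time() - max_age_days * 86400
    removed = []
    for path in Path(log_dir).glob("*"):
        if path.suffix.lower() not in {".log", ".txt"}:
            continue  # leave anything that isn't a log file alone
        if path.is_file() and path.stat().st_mtime < cutoff:
            try:
                path.unlink()
                removed.append(path.name)
            except PermissionError:
                pass  # file in use by a running Tableau session
    return removed
```

You could run this per user folder from Task Scheduler, e.g. `purge_old_logs(r"C:\Users\alice\Documents\My Tableau Repository\Logs", 30)`, and keep the threshold generous so recent logs are still around if Support asks for them.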