Hey Darin. How long do these extracts take to increment using Desktop? If you can reproduce there, you can check the logs and see if the time is really being spent on the query itself, or if it's related to the Data Engine rebuilding the extract. The background task data used by that viz won't break down the sub-tasks it needs to do to complete the extract refresh successfully, but the logs will.
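If it helps, here is a rough Python sketch for totalling up where the time goes once you have a log to look at. It assumes each log line is a JSON object with an event name under "k" and a payload under "v" that sometimes carries an "elapsed" value in seconds. Those field names are an assumption on my part, so check them against a few lines of your own log before trusting the output. The function name is just something I made up for illustration.

```python
import json
import sys
from collections import defaultdict


def summarize_elapsed(lines):
    """Total the 'elapsed' seconds reported per event key.

    Assumes JSON-lines input with an event name in "k" and a dict
    payload in "v" (assumed field names, verify against your logs).
    """
    totals = defaultdict(lambda: [0, 0.0])  # key -> [count, total seconds]
    for line in lines:
        try:
            entry = json.loads(line)
        except ValueError:
            continue  # skip anything that is not valid JSON
        if not isinstance(entry, dict):
            continue
        payload = entry.get("v")
        if not isinstance(payload, dict):
            continue  # plain-text messages carry no timing
        elapsed = payload.get("elapsed")
        if isinstance(elapsed, bool) or not isinstance(elapsed, (int, float)):
            continue
        bucket = totals[str(entry.get("k"))]
        bucket[0] += 1
        bucket[1] += float(elapsed)
    return {key: (count, total) for key, (count, total) in totals.items()}


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python summarize.py path/to/log.txt
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        summary = summarize_elapsed(handle)
    # Largest total first, so the bottleneck floats to the top.
    for key, (count, total) in sorted(summary.items(), key=lambda kv: -kv[1][1]):
        print(f"{total:10.2f}s  {count:6d}x  {key}")
```

If the query-related events account for most of the total, the source database is the place to look. If they don't, the time is going somewhere else in the refresh.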
Thanks Matt - that was basically my question to Tableau Support but related to the Server Logs. Can I not retrieve the same information from Server logs in terms of the bottleneck being query execution or rebuilding the extract on server? I'll run a local version of the incremental off of the test SQL Server environment and see what I can dig out of the desktop logs. I appreciate your time.
You sure can. There are several benefits to reproducing in Desktop with this issue:
I find the Desktop logs a little easier to peruse--they're all in the same folder, and you can easily make them more legible by deleting them all first, then opening Desktop again and reproducing the issue.
You are reproducing in a separate environment, which in this case removes environment issues as a variable and also removes any adverse effects you might visit on your production Server environment (I know, in this case risk is basically 0, but it's a good habit to have).
If this doesn't reproduce, we've identified a discrepancy between Desktop and Server, which narrows down the issue to environmental factors, and code differences between the two applications. Sets us up for next steps in troubleshooting nicely.
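For the "clean logs" step, a minimal Python sketch like the one below would do it. Note that it moves the existing files into a timestamped subfolder rather than deleting them, which is a bit safer than what I described above. The log folder is passed in as an argument, since its location depends on your install; the function name is hypothetical. Close Desktop before running it so no log file is held open.

```python
import shutil
import sys
import time
from pathlib import Path


def archive_logs(log_dir):
    """Move every file directly inside log_dir into a new timestamped subfolder.

    Returns the archive folder, or None if there was nothing to move.
    Subfolders (including earlier archives) are left untouched.
    """
    log_dir = Path(log_dir)
    files = [p for p in log_dir.iterdir() if p.is_file()]
    if not files:
        return None
    target = log_dir / time.strftime("archive-%Y%m%d-%H%M%S")
    target.mkdir(exist_ok=True)
    for path in files:
        shutil.move(str(path), str(target / path.name))
    return target


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python archive_logs.py "<path to your Desktop Logs folder>"
    moved_to = archive_logs(sys.argv[1])
    print(f"Archived to {moved_to}" if moved_to else "No log files found")
```

After that, open Desktop, run the incremental refresh once, and the folder will contain only the entries from that reproduction.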
I have not done a performance recording. I'm going to get up to speed on a lot of troubleshooting skills that I've been fortunate (or maybe not so) enough not to have to master up to this point.
I have two things I need to do for sure - which is to produce clean logs from Desktop while reproducing an Incremental Refresh - and also do a performance recording of the same process.
This requires coordination from Data Arch to copy some views and change some filters so I can actually accomplish the reproduction on a dev/test environment- but I'm going to do these things and see what I can learn from the results.
Thanks Toby for the nudge.
I've submitted a new case with Tableau Support. While waiting for their assistance we've also excluded the data directory of Tableau Server from McAfee Weekly Scans. I'm not the anti-virus guru but I'm also going back and forth with that group, internally, to exclude the entire E:\ from ALL scans (OnAccess, Continuous Threat, Passive, etc) - not just the weekly scan.
After 1 day of doing the exclusion on the data directory though - those incrementals did finish in under 4 hours - which is still about 3.5 hours too long - but it wasn't running up to the 12 hour max.
That's where I'm at now - still not sure what other poisons are in the water but - I've provided Tableau Support with a couple performance records of interacting with published dashboards that were nearly inoperable while those 2 extract jobs ran all day.
Thanks again gentlemen.
Upgraded to 9.0.1 this morning.
Excluded the entire E:\ that Tableau is installed on from all McAfee OnAccess scanning.
Latency has been improved - but jobs are still running much longer than they should.
The same jobs that run on the DEV environment (Dev Tableau box - Dev Sql Server) finish in under 5 minutes, loading the same incremental data for "yesterday", whereas the same incremental jobs on Prod take many hours. Granted, Prod has the last 13 months in the entire data source whereas Dev has only 2015 data, but still - Dev is performing as I'd expect. The source database is returning results in an appropriate time, so we can't tell what's taking the jobs so long to finish on Prod.
We installed a sql client - squirrel, I think, on the Tableau Server box to query the source database from there using the same view - and data for these extracts is returned in the appropriate time. So we've ruled out the source database view SQL, McAfee, TServer version 8.2.5.
Haven't heard back from Support since level 1 support pushed to the Server queue Weds.
Today I came to find the entire repository down on the 9.0.1 prod environment.
No content, no process status, eventually got a server login screen with 'login failed'.
And the rain came pouring down.
Arg--sorry to hear you're having further issues, Darin. Typically a restart fixes that particular issue with 9.0, I found--but please try and grab Server logs again and send them in as a new case, because it needs to be fixed.
Here's what we have identified as the root cause:
A 3rd party Java program was not playing nicely with Tableau Server.
Solarwinds has a Log and Event Manager program (a.k.a. TriGeo) that our security team uses to analyze events/logs.
Tableau Support found in some Tableau Server logs that there were some Java related exceptions/alerts/errors - and suggested we remove that program.
After doing so - the Repository has not crashed, all 25 extracts are finishing in an acceptable amount of time and this has been the case since Thursday.
I still have to figure out how TriGeo and Tableau Server can co-exist as that's a requirement of the Security team - but everything is getting better.
Thanks for following up with the answer! That's definitely a tricky one to have figured out...I'm glad to hear you were able to get to root cause, and get it fixed.