    Incremental Refreshes Running Stupid Long

    Darin Coulter

      Our data source is SQL Server 2012 views and we're publishing extracts on a single node Tableau Server 8.2.5 instance.  I know there are endless possibilities for what may be in the water here... but I'm not sure where else to turn.


      I realize this question probably has the answer of - Check with Tableau Support.  Which I've done and they said that my DBA should be able to optimize the queries being used and they can't really do anything for me.


      I've gone back and forth with our Data Arch team since our most recent promotion of new data sources to production - and they've optimized according to query plans as best they can.


      Yet - loading 1 day of no more than 250K records runs for more than 4 hours consistently with these 2 problem extracts.


      The other full Refresh Extract jobs here are re-loading our rolling 13 month warehouse in under 30 minutes for each of those views.


      I'm looking for suggestions from the Server Community if anyone else has experienced this issue with incrementals taking forever?  Just pointers for "Have you checked this or that?" "Are you using a Date field as the key for the incremental refresh?" "Is that key field included in a non-clustered index in SQL Server?"


      One of these queries, if I hit the view directly in management studio returns full results of the query to get data for '2015-05-04' in 3 minutes.  So how can I troubleshoot what is causing a similar query from Tableau Server to take 5 hours?


        • 1. Re: Incremental Refreshes Running Stupid Long
          Matt Coles

          Hey Darin. How long do these extracts take to increment using Desktop? If you can reproduce there, you can check the logs and see if the time is really being spent on the query itself, or if it's related to the Data Engine rebuilding the extract. The background task data used by that viz won't break down the sub-tasks it needs to do to complete the extract refresh successfully, but the logs will.

          • 2. Re: Incremental Refreshes Running Stupid Long
            Darin Coulter

            Thanks Matt - that was basically my question to Tableau Support but related to the Server Logs.  Can I not retrieve the same information from Server logs in terms of the bottleneck being query execution or rebuilding the extract on server?  I'll run a local version of the incremental off of the test SQL Server environment and see what I can dig out of the desktop logs.  I appreciate your time.

            • 3. Re: Incremental Refreshes Running Stupid Long
              Matt Coles

              You sure can. There are several benefits to reproducing in Desktop with this issue:


              I find the Desktop logs a little easier to peruse--they're all in the same folder, and you can easily make them more legible by deleting them all first, then opening Desktop again and reproducing the issue.


              You are reproducing in a separate environment, which in this case removes environment issues as a variable and also removes any adverse effects you might visit on your production Server environment (I know, in this case risk is basically 0, but it's a good habit to have).


              If this doesn't reproduce, we've identified a discrepancy between Desktop and Server, which narrows down the issue to environmental factors, and code differences between the two applications. Sets us up for next steps in troubleshooting nicely.

              • 4. Re: Incremental Refreshes Running Stupid Long
                Toby Erkson

                Darin, you didn't mention if you've done a Workbook Performance Recording.  Have you?

                Server Performance Recording.

                • 5. Re: Incremental Refreshes Running Stupid Long
                  Darin Coulter

                  I have not done a performance recording.  I'm going to get up to speed on a lot of troubleshooting skills that I've been fortunate (or maybe not so) enough not to have to master up to this point.


                  I have two things I need to do for sure - produce clean logs from Desktop while reproducing an Incremental Refresh - and also do a performance recording of the same process.


                  This requires coordination from Data Arch to copy some views and change some filters so I can actually accomplish the reproduction on a dev/test environment- but I'm going to do these things and see what I can learn from the results.


                  Thanks Toby for the nudge.



                  • 6. Re: Incremental Refreshes Running Stupid Long
                    Darin Coulter



                    I've submitted a new case with Tableau Support.  While waiting for their assistance we've also excluded the data directory of Tableau Server from McAfee Weekly Scans.  I'm not the anti-virus guru but I'm also going back and forth with that group, internally, to exclude the entire E:\ from ALL scans OnAccess, Continuous Threat, Passive, etc - not just the weekly scan.


                    After 1 day of doing the exclusion on the data directory though - those incrementals did finish in under 4 hours - which is still about 3.5 hours too long - but it wasn't running up to the 12 hour max.


                    That's where I'm at now - still not sure what other poisons are in the water but - I've provided Tableau Support with a couple of performance recordings of interacting with published dashboards that were nearly inoperable while those 2 extract jobs ran all day.


                    Still learning.


                    Thanks again gentlemen.

                    • 7. Re: Incremental Refreshes Running Stupid Long
                      Darin Coulter

                      Upgraded to 9.0.1 this morning.


                      Excluded the entire E:\ that Tableau is installed on from all McAfee OnAccess scanning.


                      Latency has been improved - but jobs are still running much longer than they should.


                      The same jobs that run on the DEV environment (Dev Tableau box - Dev Sql Server) finish in under 5 minutes, loading the same incremental data for "yesterday".  Whereas the same incremental jobs on Prod take many hours.  Granted Prod has the last 13 months in the entire data source - whereas Dev is only 2015 data - but still, Dev is performing as I'd expect.  The source database is returning results in an appropriate time - so we can't tell what's taking the jobs so long to finish on Prod.


                      We installed a sql client - squirrel, I think, on the Tableau Server box to query the source database from there using the same view - and data for these extracts is returned in the appropriate time.  So we've ruled out the source database view SQL, McAfee, TServer version 8.2.5.
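
                      For anyone who wants to repeat that kind of check from a script instead of a SQL client, here is a rough sketch in Python. It times how long the first batch of rows takes to come back versus how long it takes to fetch everything, which helps separate query execution time from transfer time. Assumptions: the helper name time_query and the demo table demo_view are made up, and the demo runs against an in-memory SQLite database only so that it is runnable as-is. Against SQL Server you would pass in a connection from a DB-API driver of your choice and your real view name.

```python
import sqlite3
import time


def time_query(conn, sql, params=(), batch_size=10000):
    """Return (seconds_to_first_batch, seconds_total, row_count) for one query.

    Works with any DB-API 2.0 connection; rows are fetched in batches so the
    full result set is never held in memory at once.
    """
    cursor = conn.cursor()
    start = time.perf_counter()
    cursor.execute(sql, params)
    rows = cursor.fetchmany(batch_size)
    first_batch = time.perf_counter() - start
    count = 0
    while rows:
        count += len(rows)
        rows = cursor.fetchmany(batch_size)
    total = time.perf_counter() - start
    cursor.close()
    return first_batch, total, count


# Demo against an in-memory SQLite table standing in for the real view.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_view (load_date TEXT, amount REAL)")
demo_rows = [("2015-05-04", float(i)) for i in range(1000)] + [("2015-05-03", 1.0)]
conn.executemany("INSERT INTO demo_view VALUES (?, ?)", demo_rows)

first_batch, total, count = time_query(
    conn, "SELECT * FROM demo_view WHERE load_date = ?", ("2015-05-04",)
)
print(f"rows={count} first_batch={first_batch:.4f}s total={total:.4f}s")
```

                      Running this on the Tableau Server box itself, as we did with the SQL client, keeps the network path the same as the one the extract job uses. Note the "?" placeholder style is driver-specific, so check what your driver expects.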


                      Haven't heard back from Support since level 1 support pushed to the Server queue Weds.



                      • 8. Re: Incremental Refreshes Running Stupid Long
                        Darin Coulter

                        Today I came to find the entire repository down on the 9.0.1 prod environment.


                        No content, no process status, eventually got a server login screen with 'login failed'.


                        And the rain came pouring down.

                        • 9. Re: Incremental Refreshes Running Stupid Long
                          Matt Coles

                          Arg--sorry to hear you're having further issues, Darin. Typically a restart fixes that particular issue with 9.0, I found--but please try and grab Server logs again and send them in as a new case, because it needs to be fixed.

                          • 10. Re: Incremental Refreshes Running Stupid Long
                            Darin Coulter

                            Here's what we have identified as the root cause:


                            A 3rd party Java program was not playing nicely with Tableau Server.

                            Solarwinds has a Log and Event Manager program (a.k.a. TriGeo) that our security team uses to analyze events/logs.


                            Tableau Support found in some Tableau Server logs that there were some Java related exceptions/alerts/errors - and suggested we remove that program.


                            After doing so - the Repository has not crashed, all 25 extracts are finishing in an acceptable amount of time and this has been the case since Thursday.


                            I still have to figure out how TriGeo and Tableau Server can co-exist as that's a requirement of the Security team - but everything is getting better.


                            Thanks again.


                            • 11. Re: Incremental Refreshes Running Stupid Long
                              Matt Coles

                              Thanks for following up with the answer! That's definitely a tricky one to have figured out...I'm glad to hear you were able to get to root cause, and get it fixed.