8 Replies Latest reply on Apr 24, 2015 11:22 AM by cory.tak.0

    How large can an extract be before impacting performance?

    Jack May

      I published a SQL query to Server and it crashes the dashboards that access it. The query is complex but as efficient as it can be; it runs against the database in 5 minutes and returns 64,000 rows. That response time makes it inadvisable to publish a live connection to the data source using the SQL (right?).

       

      I created an extract and published it to Server, but the extract isn't helping performance. The dashboard it feeds takes up to 15 minutes to load; even with the TDE, any action within the dashboard or workbook (e.g., clicking a filter) takes between three and ten minutes to respond. I also get frequent out-of-memory errors when loading the dashboard.

       

      Any help is appreciated.

       

      On a related note, is it true that publishing the EXTRACT to the server means I can't schedule a refresh?

       

      Thanks,

      Jack May

      IDEXX Laboratories.

        • 1. Re: How large can an extract be before impacting performance?
          cory.tak.0

          Jack,

           

          At my organization, I have seen fast response times from extracts that have tens of millions of rows, so I'm guessing you're okay with 64,000. However, extracts with many columns (think wide tables) seem to perform poorly compared to tall, thin extracts, so it's worth trimming the query down to just the columns the dashboard uses (rough sketch below the link). One thing you may want to try, to see where the performance issues are, is Tableau's performance recording:

           

          Create a Performance Recording

           

          This should show you what's taking the most time when your dashboard renders (computing the visualization, getting the data, etc.).
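
          On the wide-vs-thin point, here's a minimal sketch of what trimming a custom SQL query might look like; the table and column names are made up for illustration, so substitute your own:

              -- Instead of pulling every column into the extract...
              -- SELECT * FROM lab_results
              -- ...select (and aggregate) only what the dashboard needs (hypothetical names):
              SELECT
                  result_date,                       -- feeds the date filter
                  test_code,                         -- feeds the grouping
                  region,                            -- feeds the map
                  SUM(billed_amount) AS total_billed -- the measure the views use
              FROM lab_results
              WHERE result_date >= DATE '2015-01-01'
              GROUP BY result_date, test_code, region

          Aggregating to the level of detail the views actually use keeps the extract narrow and small, which usually helps load times.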

           

          I hope this helps you!

           

          Cory

          • 2. Re: How large can an extract be before impacting performance?
            Toby Erkson

            Jack May wrote:

             

            1)  I published a SQL query to Server and it crashes the dashboards that access it. The query is complex but as efficient as it can be; it runs against the database in 5 minutes and returns 64,000 rows. That response time makes it inadvisable to publish a live connection to the data source using the SQL (right?).

             

            2)  On a related note, is it true that publishing the EXTRACT to the server means I can't schedule a refresh?

             

            1) Correct.

             

            2)  False.  You can refresh the extract.

             

            What Cory states, doing a performance recording, is a good idea, too.

            • 3. Re: How large can an extract be before impacting performance?
              Jack May

              Follow-up: the issue was NOT the data source. I created a new dashboard using the same data source and it loaded and filtered in a quite reasonable amount of time. The issue turned out to be a calculated field we'd created with an if-then-else formula that had about 50 ELSEIF clauses. I moved that logic into the custom SQL statement and it fixed everything. Thanks for all your help; recording performance was useful because it showed that the data source was not the issue.
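
              For reference, the change was roughly of this shape; the column and category names below are invented for illustration, not our real ones:

                  -- The ~50-branch calculated field became a CASE expression
                  -- inside the custom SQL (hypothetical names):
                  SELECT
                      t.*,
                      CASE
                          WHEN t.test_code = 'CBC'  THEN 'Hematology'
                          WHEN t.test_code = 'CHEM' THEN 'Chemistry'
                          WHEN t.test_code = 'UA'   THEN 'Urinalysis'
                          -- ...and so on for the remaining branches...
                          ELSE 'Other'
                      END AS test_category
                  FROM lab_results t

              Doing the bucketing once in the database query, rather than in a long calculated field, is what fixed it for us.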

               

              I still have a question about refreshes, though. I've published several data sources to Tableau Server. When I look at them on the server, the "Type" column shows that most of them are Oracle data sources; there is one whose type is "Tableau Data Extract". If I check the box next to that one, the "Scheduled Tasks" option greys out. For this reason, I believe I can't schedule it to refresh.

               

              Moreover, I created this extract in Desktop by connecting to an Oracle database and selecting the "Extract" option, which, of course, prompted me to save the *.tde. I then opened a new workbook, connected to the *.tde, and hit "Publish". When I did this, there was no option to embed credentials (to the Oracle database). That's the other reason I think it can't be refreshed (how would it connect?).

               

              Thanks,

              -Jack May

              • 4. Re: How large can an extract be before impacting performance?
                cory.tak.0

                Hey Jack,

                 

                Glad to hear you had success! I believe what's happening is that your workbook in Tableau Desktop is pointing to a local .tde (on your local hard drive) and that the file is being saved as a .twb, which does not package the data with the report specifications. Try saving your workbook as a .twbx, which packages the data along with the report. When you select Server > Publish, the option to embed your credentials and set up a schedule should appear.

                 

                Cory

                • 5. Re: How large can an extract be before impacting performance?
                  Toby Erkson

                  Jack May wrote:

                   

                  Follow-up: the issue was NOT the data source. I created a new dashboard using the same data source and it loaded and filtered in a quite reasonable amount of time. The issue turned out to be a calculated field we'd created with an if-then-else formula that had about 50 ELSEIF clauses. I moved that logic into the custom SQL statement and it fixed everything. Thanks for all your help; recording performance was useful because it showed that the data source was not the issue.

                  ...

                  This is why many ask for the actual workbook, so it can be examined. Knowing about such a crazy IF...THEN would've been helpful.

                  • 6. Re: How large can an extract be before impacting performance?
                    Toby Erkson

                    Jack May wrote:

                     

                    ...

                    I still have a question about refreshes, though. I've published several data sources to Tableau Server. When I look at them on the server, the "Type" column shows that most of them are Oracle data sources; there is one whose type is "Tableau Data Extract". If I check the box next to that one, the "Scheduled Tasks" option greys out. For this reason, I believe I can't schedule it to refresh.

                     

                    Moreover, I created this extract in Desktop by connecting to an Oracle database and selecting the "Extract" option, which, of course, prompted me to save the *.tde. I then opened a new workbook, connected to the *.tde, and hit "Publish". When I did this, there was no option to embed credentials (to the Oracle database). That's the other reason I think it can't be refreshed (how would it connect?).

                     

                    Thanks,

                    -Jack May

                    My guess would be that it's greyed out because you are not the Owner.  Can you show us a screen shot of what you're seeing?

                    • 7. Re: How large can an extract be before impacting performance?
                      Jack May

                      You're right, it's better to send a .twbx, but some of the data was confidential, and since I thought the data source was the problem, I didn't think I could swap in dummy data.

                       

                      As for the data source, I'm the owner (and a sysadmin). The only data source in the screen capture below that causes "Scheduled Tasks" to grey out is also the only one of type "Tableau Data Extract".

                       

                      DataSources.JPG

                      • 8. Re: How large can an extract be before impacting performance?
                        cory.tak.0

                        Jack,

                         

                        Are you able to republish that extract and choose a refresh schedule? I wonder whether, when no refresh schedule is selected, Tableau simply doesn't include the database connection information as part of the published data source.


                        I noticed a similar thing among data sources on our server. If I connect in Desktop to the data sources that show as Oracle and then right-click to 'Create a Local Copy', it lets me view the SQL/connection information. However, if I download the data source that's listed as a Tableau Data Extract, it only shows me the path to the TDE on my file system.