9 Replies Latest reply on May 30, 2018 11:24 AM by Barb Reiser

    Scheduled Refresh of Extract

    Barb Reiser

      Is there anyway to avoid manually refreshing an extract without publishing the data source? My work has disabled scheduled extract refreshing in order to maintain server performance. I'm wondering if there is another way. The one extract I have takes almost 3 hours to refresh.

        • 1. Re: Scheduled Refresh of Extract
          Okechukwu Ossai

          Hi Barb,

           

          Have you looked at why the extract takes too long to refresh? How many rows of data do you have? Is your data stored in a SQL database? Are you joining multiple data sources in Tableau or is your workbook connected to a single database table/view?

           

          I once had an extract created in Tableau by joining 5 SQL database tables and views. These tables had multiple many to many relationships. The original database tables have up to 20 million rows of data but a data source filter was applied in Tableau reducing the number of rows in the extract to just over over 1 million. It took about 3 hours to refresh.

           

          However, once we created a materialized view in our database and then connected Tableau to the materialized view, the extract refreshed in under 1 minute. It was unbelievable. Sometimes, investigating why the extract is taking too long is the best place to start.

           

          Another thing to check is your version of Tableau. You could suggest upgrading to Tableau 10.5 or higher to your boss, if you are not yet using those versions.

           

          Hope this helps.

          Ossai

          • 2. Re: Scheduled Refresh of Extract
            Chris McClellan

            If you're not scheduling a refresh the only 2 other options are manually refreshing or republishing.

             

            You can fire a manual refresh by using tabcmd though

             

            tabcmd is a utility on your computer and use the refreshextracts option to refresh the data source.

             

            You could create a small batch file, then schedule that for a particular time.  Your basically emulating the scheduled refresh, but doing it from your computer rather than from the server.

            • 3. Re: Scheduled Refresh of Extract
              Barb Reiser

              Hi Chris,

               

              I'm not seeing any tabcmd on my computer... Is there a certain place I should be looking?

              • 4. Re: Scheduled Refresh of Extract
                Barb Reiser

                Hi Ossai,

                 

                The database is very large, it is sql. I know there's over 20 million rows from what the extract window had been displaying. I filtered the data down to just one month and that didn't help. I am using 10.5.

                 

                What do you mean by creating a materialized view?

                 

                Thanks,

                Barb

                • 5. Re: Scheduled Refresh of Extract
                  Okechukwu Ossai

                  Hi Barb,

                   

                  A view is a virtual table created by a select query. However, this virtual table is not stored anywhere in the database. The query will need to be re-run each time you want to retrieve the data and depending on the complexity of query and size of the underlying tables, performance may be compromised.

                   

                  However, you can 'materialize' this view or 'virtual table' by storing the result of the query in the database as an actual table. So,each time the data is needed you can simply retrieve records from this table without having to re-run the query. This is why performance is improved by using materialized views.

                   

                  Ossai

                  • 6. Re: Scheduled Refresh of Extract
                    Barb Reiser

                    How would I go about storing the result of the query as a table? Is this something I can do somehow or would I need my data team to create that?

                     

                    • 7. Re: Scheduled Refresh of Extract
                      Okechukwu Ossai

                      Ask your data team to create the materialized view for you. You will have to give them the SQL query or show them how you are joining the data sources in Tableau. Best way will be to convert your Tableau data join to custom SQL and send the script to them.

                      • 8. Re: Scheduled Refresh of Extract
                        Chris McClellan

                        Sorry .. you need to download and install it:

                         

                        tabcmd

                        • 9. Re: Scheduled Refresh of Extract
                          Barb Reiser

                          Hi there,

                           

                          After doing some research I found that I could create a custom sql query to only pull in specific data from the tables I was using vs using the whole table. By doing this I was able to reduce the size of the extract and it refreshes in about 3 min vs 3 hours.

                           

                          Barb