2 Replies Latest reply on Feb 22, 2018 12:43 PM by Marcus Kahla-Dunn

    Schedule extract should not run if data is not updated in database

    Dhrati Acharya

      Hi,

       

      Can we create a dependency for Scheduled extract to run only if data is available into the database. For example if the database downtime is going on during our scheduled extract timing then our extract will refresh at its scheduled time and no data gets updated on report due to database downtime. In short our schedule extract fails, is there a way we can create a dependency to run the schedule only if database is up or have data.

       

      I am facing a scenario where if my Talend (ETL tool) job fails and no data gets updated into the database but my Tableau extract still run. I would want a condition to be implemented where if my Talend job fails then Tableau extract refresh should not run.

       

      Looking forward for your keen help.

       

      Best Regards,

      Dhrati Acharya

        • 1. Re: Schedule extract should not run if data is not updated in database
          Yuriy Fal

          Hi Dhrati,

           

          Re-framing your problem slightly,

          you may want to run your Extract Refresh Schedule in TS

          right after your Talent job has been successfully finished.

           

          This way, you'd like to call a tabcmd command at the end of your Talend job.

          This involves a bit of coding, but fortunately, Tamas Foldi wrote

          a double-part article on the very topic:

           

          Trigger Tableau Schedule from Talend -Databoss

           

          Hope this could help.

           

          Yours,

          Yuri

          • 2. Re: Schedule extract should not run if data is not updated in database
            Marcus Kahla-Dunn

            We have had similar instances where a scheduled extract refresh has removed all data from a dashboard/data connection because there isn’t any data in the underlying database table.

             

            Last week we tested and proved that we can implement some basic conditional extract refresh logic in Tableau Server. The solution involves putting a logical test in the ‘Initial SQL’ that either a) runs (to initiate the refresh) or b) fails to compile / returns an error in the DB and aborts prior to the TDE refresh commencing.

             

            The test can be as simple as:

             

            CREATE TEMPORARY TABLE myTest AS SELECT 1/COUNT(*) FROM myTable;

            /* where myTable is the source objects … obviously using appropriate database syntax for your environment*/

             

            Where there are rows in the underly myTable that we are refreshing from, the Temporary Table will be created successfully and then the extract will commence.  But … if there are no records in myTable, a critical ‘divisor by zero’ error will fire in the database and Tableau will abort prior to the refresh commencing (and hence will not truncate the data connection).  As an example, we do a full refresh of some customer data every day and expect there to be records in the source database table.

             

            Last week a database process aborted and left the source table with no records.  We replicated the same with the Initial SQL test and proved that that the Tableau Refresh attempt was aborted and the TDE maintained the previous day’s data (avoiding the embarrassment of an empty dashboard).  The database Temporary Table then is dropped by the DB when the Tableau connection is dropped.  The data connection publisher then received an email highlighting that the extract failed due to a ‘divisor by zero’ error – which is our translation of the preconditions not being met. (NB.  My Tableau support team are comfortable with this solution given there will be rare instances of extract refresh failures)

             

            The concept can be extended to more complex refresh requirements (eg. # of new rows, currency of latest data, logs entry when  process completion).  Main deficiency is the solution only runs the tests on the existing Tableau schedule.  External scripting is definitely a more flexible solution.

             

            MKD

            3 of 3 people found this helpful