3 Replies Latest reply on Jul 9, 2018 11:23 AM by Bill Li

    Initial SQL Statement to just select data for a certain date range

    Bill Li

      Hi Everyone,


      I am not sure if I am correct. From my understanding, what to input in Initial SQL is the original SQL queries? the same as the queries we used in SQL Server Management Studio or MySQL?


      Given the data is too big for Tableau to load, we are trying to reduce the data size without deleting on the database end. I have used the following statements to just select the data for a certain date range.


      SELECT [CaseID]


      FROM [Table]

      where DateTime between '2017-01-01' and '2018-01-01'


      I have questions on:

      1. Is my statement correct or feasible for Tableau Initial SQL?

      2. After I used Initial SQL for one table, and then, I'll have to join another table, would I be able to simply drag and drop or I have to rewrite the statement for the join as well?


      Thanks in advance if anyone can help.



        • 1. Re: Initial SQL Statement to just select data for a certain date range
          Ken Flerlage

          I don't think you need this as an initial SQL. Initial SQL is typically only used in some special cases, such as when you may need to create a temp table (see Run Initial SQL). In this case, you can just use this as a custom SQL for your data source (see Connect to a Custom SQL Query). Another option would be to use an extract filter (see Filter Data from Your Data Source).

          1 of 1 people found this helpful
          • 2. Re: Initial SQL Statement to just select data for a certain date range
            Steve Martin

            Hi Bill,


            Ken's reply is correct, initial sql is used to set certain connection parameters typical for Teradata and that custom SQL is the source of choice - I use this for all of my connections 1st for performance - I know that my query has been designed for the data and secondly, for flexibility - it is far quicker to only send the data over the pipe that is required rather than have to initially load a table / model and then choose which fields are required.

            Also, custom SQL allows for parameterised querying - using parameters in the where clause.


            Incidentally, how much data are we talking here? You see, the problem with Tableau is that when it comes to filters, unless they are wildcard types, Tableau must perform distinct lookups in order to build the final set which on large data can take time.

            As a result, limiting the amount of data Tableau has access to in the first instance is paramount to improving performance. And, a properly tuned source is essential to fast querying.


            For perspective, I connect to smallish warehouse tables each typically more than 1 billion records (which needs to be turned around daily which in turn plays havoc with the db table stats) and require the reports to be able to open in 12 seconds and turn-around in 3-ish seconds.


            If real-time is not possible, and extracts take too long to build, you could consider a tool like Alteryx which uses the api (there are other tools to Alteryx that can do this - you could even write your own); using the api, you can create Tableau repositories typically faster than Tableau can, and by automating them using something like SSIS to handle deployment too you could link the creation to when the data refresh has completed.


            Additionally, you could also consider the new Tableau Prep which would help you build your aggregated data-model which being aggregated, might produce a considerably smaller set than your first thought.



            1 of 1 people found this helpful
            • 3. Re: Initial SQL Statement to just select data for a certain date range
              Bill Li

              Hi Steve,


              Thanks for your reply. I have another question then - when I used Custom SQL Query and I was trying to join another source(an Excel file outside of the SQL server). Would that be possible?