1 Reply Latest reply on Apr 28, 2018 3:01 AM by Simon Runc

    How does Tableau query data from a datasource

    Daniel Jung

      Hi. I want to know how often Tableau queries data from a data source.

       

      If I have a Live data source on BigQuery and I use a custom sql to generate my own table in Tableau, I'm assuming Tableau sends this query to BigQuery and BigQuery will first query this sql statement and then send it over to Tableau (am I incorrect?). How often does this happen if the data is live? Will I be raking up query costs on BigQuery if I leave the connection live?

       

      If I created an extract then I assume it will only query once every time the extract is refreshed. This is more ideal but I may have use cases where I need to have live data which use custom SQL, I just want to know the frequency of these queries for live data sources.

       

      Thanks!

        • 1. Re: How does Tableau query data from a datasource
          Simon Runc

          hi Daniel,

           

          So yes when you first create the CustomSQL connection, Tableau will send a query to generate the data. After that (in general*) every-time you do "something" it will send that query to BQ (e.g. If you drag "Region" and "SUM([Sales])" onto the canvas Tableau will send a SELECT Region, SUM(Sales) FROM YourQuery GROUP BY Region...if you then filter to a single region this will be send as a "WHERE" clause). This is the same for a user, if they click on a Region, say, to filter other visuals, it will send that as a WHERE to get back the results. I think of Tableau as a SQL engine, where you are generating the SQL by dragging and dropping pills, rather than writing SQL, and the results are returned as "render instruction" rather than a table of results.

           

          *I say in general, as in some cases Tableau can use a cache so it doesn't have to go right back to the source data (but we can't really control this) and is more related to server.

           

          One way you can see what is being sent is by using the "Performance Recorder"...GIF of Live connection to SQL Server below

           

          Performance Recorder.gif

           

          You are correct in that if you extract the data to a TDE/Hyper the only query that is sent to BQ is to generate the table, and then all subsequent queries are run against the extract (TDE/Hyper) which doesn't have a cost per query.

           

          Personally I've not found BQ that expensive, as you only pay for the columns and rows queried...but of course this depends on the size of your data and complexity of the queries sent (and number of columns use).

           

          Hope that helps.

          2 of 2 people found this helpful