5 Replies Latest reply on Dec 9, 2016 3:47 AM by siyang.teo

    Return only aggregated table from database


      I hope I can explain myself clear on this:


      1. I have about 20 million rows of data for a day but it is aggregated to about 100,000 data points for visualization.
      2. Aggregation has to be set in Tableau as it is dynamic, e.g., aggregate by day, week, month or year. Therefore, it cannot be created as a fixed view table in the database
      3. However, Tableau is unable to load the visualization. I suspect it is because Tableau retains the underlying data of 20 million (since one can view the underlying data for an aggregation under 'View Data > Full Data').
      4. The scary thing is that I might need to query more than a day. I can't imagine pulling one month worth of data (20 mil x 30 days). However, it shouldn't be an issue if the table returning from database is always pre-aggregated to 100,000 data rows.


      So I have two questions:


      • Is my assumption correct for point 3?
      • Is it possible to set Tableau such that it passes the aggregation parameters to the database so that the database eventually only returns a table of aggregated values?


      I am using Vertica database, and Tableau 10


      Thank you!

        • 1. Re: Return only aggregated table from database
          Mahfooj Khan

          Hi Siyang,


          Why don't you create an aggregated view of your system table at database level? Then connect to tableau, it will be more efficient. You can think of like that.



          • 2. Re: Return only aggregated table from database
            Andrew Watson

            You could create a view in the database pre-aggregating for all of the time periods then doing a union to bring all of the pre-aggregated results into one view. As long as you have a column defining the level of pre-aggregation - i.e. day, week, etc, you will be able to create a parameter allowing the user to select the level of aggregation they want then feed that back into your data connection.


            For example you data could be:


            Sales     Date     Period

            100     Jan 2016     Month

            150     Feb 2016     Month

            4000     2016     Year


            And you connection in tableau:


            select sales, date

            from table

            where period = [ParameterPeriod]


            Also Vertica is supposed to be very fast to connect to so you might be surprised how quick database calls are against that - meaning no pre-aggregation is required. Worth a try.

            1 of 1 people found this helpful
            • 3. Re: Return only aggregated table from database

              Hi Andrew,


              Can I clarify that your statement on:


              select sales, date

              from table

              where period = [ParameterPeriod]

              This is to be entered in the custom SQL in Tableau right?



              Unfortunately, 20mil rows seems too much though. The query took more than 3 mins in Tableau before I can visualise. It does seem strange that the bottle-neck is coming from the query since Vertica is supposed to be very fast.


              I'm a little confused about how things work with a Tableau aggregation query too. If Vertica is processing the aggregation which is defined in Tableau, I would expect the result coming back is aggregated too. But the base table (datasource) is still the original (20mil).

              Any thoughts on this?

              • 4. Re: Return only aggregated table from database
                Andrew Watson

                Yes, entered in custom SQL. See this link Connect to a Custom SQL Query


                If Tableau is performing the aggregation it will require all data in order to be able to perform the aggregation, I expect that's why you're seeing the 20m records.

                • 5. Re: Return only aggregated table from database

                  Yes, I'm familiar with Custom SQL, just checking to be sure


                  Hmm, your 2nd part is the one which confuses me. It was stated in Tableau  documentation Create Efficient Calculations that aggregate calculations in Tableau are sent to the database for processing. So it would appear that Tableau is not the one performing the aggregation.


                  Why does it take eons to return the query back to Tableau though?