8 Replies Latest reply on Nov 11, 2013 11:23 AM by Joshua Milligan

    Aggregating: database or Tableau?

    Alexander Severinsen

      Hi

       

      I have some 200 million observations. The records are registered every minutes. I need to produce some report where my data are aggregated on a weekly level. Off course, I can do this in Tableau. However, I can also make aggregate up a table at the database. I´ve experimented a bit, and to me it seems like Tableau is just as fast working directly with minutes, versus having Tableau work with my already aggregeted data from the database. What do you guys think? Should I make the extra effort aggregating at the database, or is it just a waste of time?

       

      Kind regards

      Alexander Severinsen

        • 1. Re: Aggregating: database or Tableau?
          Joshua Milligan

          Alexander,

           

          My first thought is that if you can reduce it to a table of 10 to 50 million it should be faster, but consider these points to help clarify which approach might be best:

          • How many rows would the aggregate table have?  If it's 190 million, there probably isn't much point.  If it dramatically reduces the count, maybe.
          • What is the database?  Some databases are more efficient than others at performing aggregations (though almost all of them are designed to do this efficiently and Tableau leverages that capability).  If in a POC, you've seen Tableau work just as efficiently with the minute level of detail, then it's likely you are already benefiting from some good efficiency.  Often, simple indexing can make a dramatic difference.
          • What are the aggregations?  If you are simply doing SUMs and COUNTs, then creating an aggregate table will work.  But if you want AVG, MEAN, etc... then you'll get the right values per week, but additional aggregations of those won't be correct.
          • Are there any other important dimensions that are dependent on being at a minute grain that will be lost if you aggregate to a week?  Even if you are not using them now, you might want to later on.
          • Are you live or using an extract?  I'm assuming you're live from your question, but keep in mind that with an extract, you can roll-up everything to an aggregate in the extract -- so minimal work to get the aggregate table.
          • How much gain is required to make the level of effort worth it?  Going from 5 seconds for a rendering to 4 seconds is probably not perceptible -- unless you have lots of views and it starts to add up.  Going from 5 minutes to 4 seconds would be fantastic!

           

          • How long would it take to pre-aggregate the data? Sometimes the time necessary for pre-aggregation is too large given new data arriving.
          • How much CPU, bandwidth, memory, and disk would be required for the pre-aggregation? If the database server or Tableau are busy pre-aggregating, that's not available for users.

           

          Regards,

          Joshua

          1 of 1 people found this helpful
          • 2. Re: Aggregating: database or Tableau?
            Ramon Martinez

            Hi Joshua,

             

            Excellent thoughts!!!! Thanks Joshua for sharing them. I totally support them.

             

            Best,

            Ramon

            • 3. Re: Aggregating: database or Tableau?
              Alexander Severinsen

              Joshua, thanks for sharing. I really appreciate it. Let my clarify a bit. I am working with MySqI and Tableau server (both hosted in the cloud). I am currently not live. Maybe that is why I am experiencing about the same speed whether I use aggregates at the database or at Tableau? I am a bit uncertain how Tableau works "under the hood". So, what you are saying is that if I publish a dashboard/view already aggregated within Tableau desktop they will be roll-up when published. and not when requested at the server side? In that case it will be inefficient to do the aggregates at the database...

               

              Thanks.

              Kind regards

              Alexander

              • 4. Re: Aggregating: database or Tableau?
                Joshua Milligan

                Alexander,

                 

                When you extract the data, you can specify that the extract should be aggregated to the level of visible dimensions (meaning any dimensions visible in the left data window).  You can either hide the fields from the data window using the right-click context menu, or you can hide all dimensions not used in any views in the workbook by clicking "Hide All Unused Fields" on the Extract Data screen.

                 

                You are correct, that using the extracts is likely why you are not seeing much difference.  Extracts use a columnar format so aggregation is very efficient.  Still, you might see some performance increase and also shrink the size of the extract if you aggregate it.

                 

                Regards,

                Joshua

                 

                0.png

                • 5. Re: Aggregating: database or Tableau?
                  Alexander Severinsen

                  Thank you for this. Very helpful. Much appreciated.

                   

                  Kind regards

                  Alexander

                  • 6. Re: Aggregating: database or Tableau?
                    Jonathan Drummey

                    This is great, Joshua!

                     

                    A couple of other factors to add to the list are resource questions:

                     

                    - How long would it take to pre-aggregate the data? Sometimes the time necessary for pre-aggregation is too large given new data arriving.

                    - How much CPU, bandwidth, memory, and disk would be required for the pre-aggregation? If the database server or Tableau are busy pre-aggregating, that's not available for users.

                     

                    Jonathan

                    • 7. Re: Aggregating: database or Tableau?
                      Joshua Milligan

                      You're welcome Alexander!

                      Jonathan & Ramon, thanks for the additional comments!

                      • 8. Re: Aggregating: database or Tableau?
                        Joshua Milligan

                        Jonathan,

                        I've added your points to the list.  Thanks!