2 Replies Latest reply on Feb 21, 2013 7:49 AM by Tim Quayle

    Incremental refresh of a dataset aggregated to a date/time field

    Tim Quayle

      We have a very large Oracle dataset that collects observations with a date/time dimension all the way down to the second. We are adding roughly 200,000 rows to the dataset each day. In an effort to manage the size of the dataset and improve performance, we'd like to create an extract that aggregates to the hour in the date/time dimension. However, when we select the date/time field for aggregation, the check box for selecting an incremental refresh is grayed out, forcing us to perform a full refresh whenever we want new data. We're publishing this extract to Tableau Server with the intention of performing a refresh every day - a full refresh involving tens of millions of records gets to be too big of an operation. Is there any way to work around this and set up an incremental refresh of a dataset that's aggregated to a date/time field?

        • 1. Re: Incremental refresh of a dataset aggregated to a date/time field
          Cristian Vasile

          Tim,

           

          I would keep the aggregated dataset in the Oracle database, in a separate tablespace, and connect Tableau live to those tables. The data volume and rate of change are fairly large, and keeping both the original data and the summaries in Oracle gives you a lot of control over how and when you refine the aggregated dataset.

           

          Maybe next month you will be forced to aggregate to a 4-hour or 8-hour window, or to drop some columns because your preliminary analysis tells you they are redundant from a business-logic perspective. Creating the new dataset is very easy when you have things under control.
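To make the idea concrete, here is a minimal sketch of an hourly (or N-hour) rollup that can be updated incrementally. It is written in plain Python rather than Oracle SQL, purely as an illustration; the function names (`bucket_start`, `rollup`, `incremental_refresh`) and the `(timestamp, value)` row shape are hypothetical and not taken from the actual dataset. It assumes the measures are additive (a count and a sum) and that each batch contains only rows that have not been loaded before.

```python
from collections import defaultdict
from datetime import datetime


def bucket_start(ts, hours=1):
    """Truncate a timestamp to the start of its N-hour window.

    Assumes `hours` divides 24 evenly (1, 2, 4, 8, ...).
    """
    return ts.replace(hour=ts.hour - ts.hour % hours,
                      minute=0, second=0, microsecond=0)


def rollup(rows, hours=1):
    """Aggregate (timestamp, value) rows into {bucket: (count, total)}."""
    acc = defaultdict(lambda: [0, 0.0])
    for ts, value in rows:
        bucket = acc[bucket_start(ts, hours)]
        bucket[0] += 1
        bucket[1] += value
    return {key: (count, total) for key, (count, total) in acc.items()}


def incremental_refresh(summary, new_rows, hours=1):
    """Fold only the new rows into an existing summary.

    Buckets that already exist are added to rather than replaced, so an
    hour that was only partly loaded in the previous run stays correct.
    """
    for bucket, (count, total) in rollup(new_rows, hours).items():
        old_count, old_total = summary.get(bucket, (0, 0.0))
        summary[bucket] = (old_count + count, old_total + total)
    return summary


# Hypothetical sample data: two loads that touch the same hour.
first_load = [(datetime(2013, 2, 20, 9, 15, 0), 2.0),
              (datetime(2013, 2, 20, 9, 45, 30), 3.0)]
second_load = [(datetime(2013, 2, 20, 9, 59, 59), 5.0),
               (datetime(2013, 2, 20, 13, 0, 1), 1.0)]

summary = incremental_refresh({}, first_load)
summary = incremental_refresh(summary, second_load)
```

Passing `hours=4` or `hours=8` switches to the wider window without changing the rest of the logic. Non-additive measures such as averages would need to be derived from the stored count and total rather than summed directly.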

           

          Regards,

          Cristian.

          • 2. Re: Incremental refresh of a dataset aggregated to a date/time field
            Tim Quayle

            Thanks. It turns out that our dataset has a ton of superfluous fields. After we hid the unused fields and re-ran the extract, it reduced the size by nearly half! Performance is now much better. Hopefully we won't have to go down the route of aggregating to 4-hour periods.