1 Reply Latest reply on Aug 16, 2016 8:41 AM by Kent Sloan

    Reshaping data to improve performance (large data tables)?

    Lance Dacey

      The database that I am working with is fairly large. There are about 2 million new rows coming in per calendar date and I try to keep some detailed data for a couple of months to a quarter (still tweaking this).


      The current shape of the data is extremely wide though, I have tables that are 200-500 columns (I did not create this database, but I can help reshape it for Tableau).




      • Data is per 30 minute interval (24 in a day)
      • There could be 10-30 channels per employee per interval
      • There are many thousands of employees (so now we are at 10-30 channels times 24 intervals times 30,000+ employees)
      • Most of the columns (minimum of 50+) are either about call counts or time duration (seconds)


      Will I gain performance if I reshape this to something like:




      My concern is that this will cause a massive number of rows, although it will be a much narrower table.


      Most of my formulas are something like SUM(talk_time) + SUM(hold_time) / SUM(staff_time). Switching to a tall format would require me to add a lot of IF statements to refer to the category column. Will this perform well or slow things down even more?

        • 1. Re: Reshaping data to improve performance (large data tables)?
          Kent Sloan

          Hi Lance,


          In general Tableau runs much faster when using tall narrow data sets over short and wide ones. Having fewer columns also makes it easier to use in Tableau. The first thing I would do before flipping all the content is check to see what you actually need to use in your reports. Leaving unnecessary content behind can save space and help speed things up. Looking at the calculation you need to use I would lean towards keeping those columns side by side rather than summing up separate chunks of the same column to divide against each other. I would also consider if there are columns that can be combined before bringing in Tableau or if the data set can be rolled up to the hour or day level. Without seeing more of the data set all I can honestly give are general guidelines. 



          Kent Sloan


          1 of 1 people found this helpful