2 Replies Latest reply on Jul 25, 2013 6:04 PM by Chris Moseley

    Improving Performance of Table Calculations (Blending)

    Chris Moseley

      This is a follow-up to a thread started by my esteemed colleague, Joe Cole, pertaining to changing user perspective without needing huge data joins (most recently: community.tableausoftware.com/message/215326).

       

      We want to allow the user to select any one of a number of locations as "My Location" and then see the world from that perspective.  Our initial approach was to use data joins to create a unique data set for each location.  Unfortunately, this approach is unsustainable due to a relatively large data set combined with hundreds of potential locations.

       

      Thanks to the help from Jonathan Drummey and Robin Kennedy in the thread referenced above, we were able to use a filter from a secondary data source and table calcs to measure the distance from "my location" to all other locations without increasing the core data size. This approach also worked well for me on a different criterion: "only view job titles I employ".
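      To illustrate the idea outside of Tableau, here is a hypothetical Python sketch (made-up location names and coordinates, not our actual data): instead of pre-joining every pair of locations, keep one row per location and compute the distance from the selected "my location" on the fly.

```python
from math import radians, sin, cos, asin, sqrt

# Hypothetical data: one row per location -- no pairwise join required.
locations = {
    "A": (47.61, -122.33),
    "B": (45.52, -122.68),
    "C": (37.77, -122.42),
}

def haversine_miles(p1, p2):
    """Great-circle distance between two (lat, lon) points, in miles."""
    lat1, lon1, lat2, lon2 = map(radians, (*p1, *p2))
    a = sin((lat2 - lat1) / 2) ** 2 \
        + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 3959 * 2 * asin(sqrt(a))  # 3959 ~= Earth's radius in miles

def distances_from(my_location):
    """Distance from the selected location to every location, computed at
    query time rather than stored as N x N joined rows."""
    origin = locations[my_location]
    return {name: haversine_miles(origin, pt) for name, pt in locations.items()}
```

      The data stays at one row per location no matter how many locations could be chosen as "My Location"; only the computation changes with the selection.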

       

      However, when I combined these two attributes (i.e., provided the ability to filter on both distance from my location and titles I employ), the dashboards became unusably slow.   Instead of taking ~5 seconds, as it did with just one filter, it takes 30-45 seconds with both.

       

      I have attached a .twbx with fake data where the user can change perspective of their location and then filter on both distances from and titles they employ at the selected location.  It works really well in the attached demo, presumably because of the relatively small data set and only a handful of locations. But this same approach comes to a grinding halt with my real data set and calculation-heavy viz.

       

      I am trying to determine if I am running into limitations of Tableau and table calcs or if there is a more elegant and efficient way to combine these two filters.  Again, it works great with just one "Compute Using" variable, but something about having the table calcs address two variables brings my real viz to its knees.

       

      Any thoughts are appreciated on what might be wrong, or even on ways to troubleshoot what is causing my slowdown.

       

      - ctm

        • 1. Re: Improving Performance of Table Calculations
          Robin Kennedy

          Hi Chris,

           

          I think your question should say "Improving Performance of Blending", rather than of Table calcs.

           

          When you use blending, Tableau makes separate queries and then combines the data, post-query, within the tool. I kinda think of it as a bit like a VLOOKUP in Excel... It compares each value that you're blending on across the whole data set and returns the result from the 'lookup' result set. This can sometimes mean slow performance if you have high cardinality (a large number of unique values) in the field that you're blending on.
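          As a rough, hypothetical sketch of that lookup behavior (this is an analogy, not Tableau's actual implementation):

```python
# Toy model of a blend: the secondary source's results are matched to
# each primary row on the linking field, VLOOKUP-style.
primary = [
    {"job_id": 1, "location_id": 10},
    {"job_id": 2, "location_id": 20},
    {"job_id": 3, "location_id": 10},
]
# Hypothetical secondary result set: location_id -> distance from "my location".
secondary = {10: 4.2, 20: 17.8}

# Every primary row probes the secondary result set on the linking field.
blended = [
    {**row, "distance": secondary.get(row["location_id"])} for row in primary
]
# The matching work grows with the cardinality of the linking field: the
# secondary query has to return one row per distinct value being matched.
```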

           

          I tried your example workbook and fleshed out the data to a million jobs and 50 locations (i.e. 2,500 combinations), and although there was a small amount of computing time it was only ever a couple of seconds... How many records are you working with? In your real dataset, are you using Location IDs and Title IDs like in the sample? (If you're blending on text fields this will be slower.) Are you using a 'modern' computer to run Tableau? I also saw a small improvement by changing the formula of # of Jobs from COUNTD to just COUNT... since every job ID was already distinct (hopefully your real data is too!!)
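          The COUNT vs. COUNTD point can be shown with a toy sketch (hypothetical IDs, just to illustrate why the distinct count costs more):

```python
# COUNTD has to remember every value seen so far to detect duplicates;
# COUNT only keeps a running tally.
job_ids = [101, 102, 103, 104]

count = len(job_ids)                # COUNT: O(n) time, O(1) extra memory
count_distinct = len(set(job_ids))  # COUNTD: O(n) time, O(n) extra memory

# When every job ID is already unique, the two agree -- so the cheaper
# aggregate gives the same answer.
assert count == count_distinct
```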

           

          Sorry that doesn't really directly answer your problem. I don't think you are doing anything wrong by using these table calculations...

          • 2. Re: Improving Performance of Table Calculations
            Chris Moseley

            Robin,

             

            First of all, thank you for taking the time to play with my Workbook. It is greatly appreciated. 

             

            To answer a few of your questions, my data set is not overwhelmingly large.  There are ~40 columns (fields) but the total number of rows is only something like 65K.  And yes, I'm using a new computer with plenty of horsepower.  

             

            I'll try incorporating your suggestions and see if I see any improvements.  In the real data set, "distance from my location" was using location names rather than location IDs, so I started converting that over.  It did not seem to have a huge impact on the first few calculations and tables I converted, but I'll make the universal conversion to location IDs and then take a fresh look. 

             

            It may be that some of the presented metrics on the real page are a little complex.  Nothing crazy, but things that require a few calculations in succession, such as "% change in market share over the last week".  I'm going to try removing these one at a time and see where I get my biggest bang for the buck.  I did try the performance recording tools in Desktop 8.0 but did not see meaningful improvement when I removed what that tool flagged as the biggest "offenders."  

             

            The part that keeps getting me is that the page is quite fast when either of the two blends (Distance or My Titles) is incorporated independently, but when I put both on the page together it gets dramatically slower, taking at least ten times longer to render.  I would expect a drop in performance with this blending approach, but not like this. 
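            A back-of-the-envelope sketch of why two linking fields might hurt so much more than one (purely illustrative numbers, borrowing the 50-location figure from your test):

```python
# Hypothetical cardinalities for the two blend fields.
n_locations = 50
n_titles = 40

# Two independent blends: each secondary query returns one row per
# distinct value of its own linking field.
one_field_each = n_locations + n_titles       # 90 rows to match

# A blend addressing both fields at once may have to cover every
# combination of the two, so the matching work grows multiplicatively.
combined = n_locations * n_titles             # 2,000 combinations
```

            If that model is roughly right, the slowdown would scale with the product of the cardinalities rather than their sum, which would fit the "ten times longer" behavior.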

            I'm still hopeful that I will find a way or develop an alternate approach.

             

            Anyway, thanks again for your thoughts.  Like I said, it is incredibly helpful and greatly appreciated.  I'll work on this tonight and let you know if I see any improvements.

             

            Any additional thoughts or insights from past experience are welcome.

             

            - ctm