3 Replies Latest reply on Nov 27, 2012 9:11 AM by Jonathan Drummey

    Performance degradation in calculated fields

    Ivan Maffioli

      Hello Tableau people,

      a simple question about performance on calculated field:

       

      I have to build a dashbord to show a set of kips with values and colors.

      An example of kpi calculated field is:

       

      IF (SUM([%ANOM_OK_n])/SUM([%ANOM_OK_d])) > 1 THEN 1

      ELSE

         (SUM([%ANOM_OK_n])/SUM([%ANOM_OK_d]))

      END

       

      where [%ANOM_OK_n] and [%ANOM_OK_d] are values of my dataset.

       

      The color rating is determided by a 5 values, from green to red (gray for null value) by this

      calculated field:

       

      if [%ANOM_OK] >= 0 and [%ANOM_OK] < 0.30

      then  'Red'

      else

          if [%ANOM_OK] >= 0.30 and [%ANOM_OK] < 0.60

          then 'Orange'

          else

              if [%ANOM_OK] >= 0.60 and [%ANOM_OK] < 0.90

              then 'Yellow'

              else

                  if [%ANOM_OK] >= 0.90 and [%ANOM_OK] <= 1.0

                  then 'Green'

                  else 'Gray'

                  end

              end

          end

      end

       

      I've noticed a fast Tableau degradation in performance with 10 kpis like this after only 5000 dataset rows.

       

      My question is: is the [%ANOM_OK] expression calculated once in the color rating expression above, or 8 times?

      It seems that Tableau makes a text substitution: in the expression editor Tableau shows the text expression

      for [%ANOM_OK].

       

      Is there a way to optimize this. (if only I could use a variable assignment...!)

       

      Thank you all,

      Ivan

        • 1. Re: Performance degradation in calculated fields
          Jonathan Drummey

          As far as I know, Tableau calculates the %ANOM_OK only once in your example. The only thing I can definitively see that could improve performance is to just have one IF statement, like:

           

          if [%ANOM_OK] >= 0 and [%ANOM_OK] < 0.30 then 'Red'

          elseif [%ANOM_OK] >= 0.30 and [%ANOM_OK] < 0.60 then 'Orange'

          elseif [%ANOM_OK] >= 0.60 and [%ANOM_OK] < 0.90 then 'Yellow'

          elseif [%ANOM_OK] >= 0.90 and [%ANOM_OK] <= 1.0 then 'Green'

          else 'Gray'

          end

           

          There are a lot of reasons why your performance could be suffering. Having very wide data (hundreds of columns) can slow things down, having too many worksheets in a workbook, network performance, layout of worksheets in a dashboard, etc. You'd need to provide more details to be able to give you more information about what might be going on.

          1 of 1 people found this helpful
          • 2. Re: Performance degradation in calculated fields
            Ivan Maffioli

            Hello Jonathan,

            thank you for your advice.

             

            My datasource is a view on SQLServer that builds a single table with essentially an id, few descriptive fields,

            a date and 15 couples of float values that are used to compute 15 kpi values like %ANOM_OK.

            These values are collected on a monthly basis for 300 locations.

            My dataset now is 3 month of values  rows that are loaded entirely in tableau.

             

            If you want, here there is a workbook:

            https://dl.dropbox.com/u/232315/DemoKPIv3b.twbx

             

            Ivan

            • 3. Re: Performance degradation in calculated fields
              Jonathan Drummey

              Hello Ivan,

               

              When I open the workbook to the KPI_Sul_Territorio_1 dashboard that takes 20+ seconds to load, and when I click on the map the refresh takes 10-15 seconds. Is this dashboard where you are seeing the performance degradation? If so, I can think of a few reasons why. The biggest is that you are asking Tableau to draw 16 different worksheets. Up through Tableau v7, Tableau performs the computations in serial, so if each worksheet takes 1 second then the dashboard will take 16 seconds to be calculated, and that's what I'm seeing. In v8, Tableau will perform the computations in parallel, so that would improve performance. Even with that, though, you're asking Tableau to make 16 different queries to the database vs. two queries (one for each column). A workaround here is to use fewer worksheets, and use Measure Names/Values to draw a table instead of a whole bunch of tables. I see that you're using a different color indicator for each KPI, in that case you'd probably need to do a multiple axis crosstab, see http://public.tableausoftware.com/views/conditionalformattingv4/Introduction for details.  Alternatively to that, reshaping your data to be "tall" rather than "wide", so each KPI would be a different row instead of a different column, though that would be more work to set up in the first place it would let you keep much of the existing layout and make color each cell per the appropriate KPI easier.

               

              A couple of other points:

               

              - changing from lots of IF statements to a single IF statement using ELSEIF clauses should have some impact on performance

              - in the ATM_NOCARTA calc you have ROUND(AVG([ADM_NOCARTA_n])) used 12 times. I'm not sure whether Tableau is smart enough to calculate that once, I do know that performance advice from Tableau staff is to take re-used functions like that and turn them into their own calculated field so I'm guessing Tableau isn't smart enough. The same goes for some of the other calcs.

              - You didn't specify whether you were using extracts or a live connection or not, what you posted has extracts. Any time you can reduce the number of columns (like for unused fields), you'll get faster. Almost any time you can use an extract vs. SQL Server, you'll get faster.

               

              Hope this helps!

               

              Jonathan