3 Replies Latest reply on Aug 16, 2016 4:08 PM by Jonathan Drummey

    How to do linear spline in Tableau?

    Matthew Beason

      Fellow Tableau Users,

       

      I'm using Tableau Desktop 9.2.2.

       

      I'm creating a workbook which calculates the TTL for a given hardware pool in OpenStack. Right now, I use a simple run rate using the following calculated field:

       

      TTL

      INT(([Max_CU] - [Used_CU]) / 7)

       

      Max_CU

      INT(MIN( MIN( MIN([Max Disk] / 40, [Max_CPU_Prvt] / 4), [Max Ram] / 2048), [Max Instances] / 1))

       

      Used_CU

      INT(MAX(MAX(MAX([Pct_Disk], [Pct_CPUS]), [Pct_Ram]), [Pct_Instances]) * [Max_CU])

       

      I've included the other calculated fields for reference.

       

      I'm trying to align my report and calculations with what I'm see from the OpenStack Team. When I asked them about their calculation, they said they are using a linear spline.

       

      Is there a way to use the same method in Tableau?

       

      Thanks in advance for the assistance!

       

      Matt Beason

        • 1. Re: How to do linear spline in Tableau?
          Jonathan Drummey

          Hi Matt,

           

          Can you post some sample data and a mock-up of the output?

           

          Jonathan

          • 2. Re: How to do linear spline in Tableau?
            Matthew Beason

            Jonathan,

             

            Absolutely. The attachment, "MB Sample Data", contains several weeks worth of sample data for a given openstack zone.

             

            The Cloud Team is currently using a python script to generate the forecasts for TTL for a given zone.

             

            They are using this program to do four things:

            1) Smoothing

             

            # Smooth out the data set
              ewma = pandas.stats.moments.ewma
              fwd = ewma( Ys, span=42# take EWMA in fwd direction
              bwd = ewma( Ys[::-1], span=42# take EWMA in bwd direction
              c = numpy.vstack(( fwd, bwd[::-1] )) # lump fwd and bwd together
              c = numpy.mean( c, axis=0# average

             

            2) Extrapolating

            k = 1 # linear
               # k = 2 # parabolic
               # k = 3 # cubic
              extrapolator = UnivariateSpline( Xs[:,0], smooth, k=k )

             

            3) Interpolated/predication

             

            # Predict
              newX = []
              last_timestamp = Xs[len(Xs) - 1][0]
              limit = 180 # Always predict 6 months min(len(Xs), num_days)
               while len(newX) < limit:
              series = last_timestamp + 86400
              newX.append([last_timestamp, 0])
              last_timestamp = series
              newX = numpy.asarray(newX, dtype=float)
              _newX = []
               for x in newX[:,0:1]:
              _newX.append(x[0])
              newY = extrapolator( _newX

             

            4) Calculating TTL

             

              sz_max = Max[-1]
              units  = Ys[-1:]
              days  = 0
               for y in newY:
               if days >= 180 or units >= sz_max:
               break
              units = y
              days  = days + 1

             

             

            This outputs several charts that show the forecasted TTL up to 180 days for a given zone. I'd like to be able to do this in Tableau.

            • 3. Re: How to do linear spline in Tableau?
              Jonathan Drummey

              Hi Matt,

               

              To get exact results my first recommendation is to use the given Python code to pre-process your data, given that there are 4 steps and the data is forecasted out beyond the range of the data this would be extremely complicated to implement in Tableau. The challenge is that fundamentally there's data set A that goes through a set of transformations to add data set B (the forecast), and I'm guessing that the final displayed view(s) are a combination of A and B. I went through an example of this sort of problem awhile back in Not a funnel cake, nor funnel charts, it’s funnel plots! | Drawing with Numbers.

               

              Here's a step-by-step way of looking at it:

               

              0. We start out with N data points.

              1. The data is smoothed with a  look forward/back in step 1 above. This can be done with a WINDOW_AVG() function in Tableau.

              2. In step 2 above the data has a linear spline created. I'm not sure about the results of this Python function, I'm guessing that the extrapolating values step is adding records to the data between the smoothed values. In Tableau the only way to do that is with data densification and we can't do that on table calculation results (like the results of the WINDOW_AVG() calculation). Technically we could densify the data prior to step 1 but that means the WINDOW_AVG() calculation has to deal with the densified data and that adds complexity to that step and this step. Doing a linear spline in Tableau would require additional table calculations built on top of the WINDOW_AVG().

              3. The interpolation/prediction calculation and TTL calculations are certainly adding rows to the data beyond what exists in the data -- creating the data source B. This is not something Tableau does at this time. There are ways we can work around it such as by "re-using" the original records in the data (which is what I did in the funnel plot workbook above) or a second technique where we  add one or more dummy records to the data or do a self union that would enable us to (prior to step 1) to further densify the data in Tableau such that we have enough data to do build the forecast. This data prep is sometimes not possible and as I'd mentioned *all* the calculations have to deal with the densified data to accurately return results, plus since this is all done in Tableau at the time of refresh depending on your data volumes it can be memory and/or processor intensive. Also, I'm not 100% sure of the calculation logic used in the array (I'm not that good at reading Python), there are cases of nested iterative computations that are not feasible in Tableau (though they can sometimes be pushed into R using Tableau's R integration).

               

              For any of us to help further we'd most likely need a next level of clarification of exactly what is happening at each step of the Python sample code you gave in order to translate that into Tableau-speak. In addition having intermediate and final outputs of the Python code on a sample data set would be helpful in validating the numbers at each step of the way.

               

              Jonathan