
1. Re: How do you calculate the standard deviation of a product category?
Jonathan Drummey May 28, 2012 6:53 AM (in response to Fabio Annovazzi)Hi Fabio,
An exact answer is going to depend on what dimensions you have in the view you are constructing, I've set up a basic example using the superstore sales data. To calculate the variance at the product level, you are going to need the Product in the view. A simple STDEV(Sales) should get that for you. Then, to create an aggregation of that, we turn to table calculations. So, a table calculation such as:
IF FIRST()==0 THEN
WINDOW_AVG(STDEV([Sales]), 0, IIF(FIRST()=0,LAST(),0))
END
with the Compute Using set to Product should get you the results you are looking for. The IF/IIF statements in the above calc are there to improve performance and eliminate overlapping text using the method described by Richard Leeke:
Cheers,
Jonathan

2. Re: How do you calculate the standard deviation of a product category?
Fabio Annovazzi May 30, 2012 5:41 AM (in response to Jonathan Drummey)Hi Jonathan,
thanks a lot, this worked fine. And it is also very quick.
One more question.
I tried to apply Richard's trick to another workbook that takes 5 minutes to load, and has 4 table calculations with window_avg.
It looks for the fan pages that have had the largest increase in number of new fan (a so might have done something smart) by comparing ration of average number of new likes in the last seven days/ average number of new fans in the last month.
My original formula was
WINDOW_AVG(sum([Likes per week]),1,0)
and
WINDOW_AVG(sum([Likes per week]),30,0).
In order to make them compute only once I changed them into
IIf([last record],WINDOW_AVG(sum([Likes per week]),1, IIF(FIRST()==1,0,1)),0)
and
IIf([last record],WINDOW_AVG(sum([Likes per week]),30, IIF(FIRST()==30,1,30)),0),
which I thought would make it calculate only starting from the row of the most recent date.
There was no improvement of performance. Any idea why?
thanks
fabio

3. Re: How do you calculate the standard deviation of a product category?
Jonathan Drummey May 30, 2012 4:52 PM (in response to Fabio Annovazzi)Hi Fabio,
I'm going to ping Richard Leeke on this one, from looking at it I'm not sure, I do have a couple of ideas:
First of all, reducing the amount of data that Tableau has to grab from the DB can have a big performance improvement, so if you haven't set up a filter to grab the minimal number of records you need (maybe the last month?), try that first.
You don't say what the [last record] value is, depending on that calculation it may be acting on all rows and slowing things down. Also, if [last record] is returning TRUE for more than just the last record, then the calc will of course be running more often. That's something to test out, I have a suggestion on a method at the bottom of this post.
Something I read in a Tableau performance doc said that IF was faster than IIF, and in this case where we only want one result then we definitely don't want to return 0 for every noncalculated row, leaving it as NULL would work better. So, something that might work better would be:
IF LAST()==0 THEN
WINDOW_AVG(SUM([Likes per week]),1,IIF(LAST()==0,0,1))
END
A way to test this is to grab a smaller batch of data and set up a text table where all your discrete dimensions are on rows, with Measure Names on Columns and Measure Values on text. For an example, see the attached. This way you can validate each component of the table calc to make sure it's returning the right values for the right row(s) in your data. You might try this on your data set to see what calcs work the fastest. And hopefully Richard will respond as well!
Jonathan

sum on last.twbx.zip 1.1 MB
