
1. Re: Modified ZScore? (Requires Median Average Deviation)
Brad Llewellyn Mar 20, 2013 9:35 AM (in response to Nathan Lee)Nathan,
You can use WINDOW_MEDIAN(). I blogged about creating ZScores in Tableau. You can alter my code to use WINDOW_MEDIAN instead of WINDOW_STD.
http://breakingbi.blogspot.com/2013/03/conducting2sampleztestintableau.html
Enjoy,
Brad Llewellyn
Associate Consultant
Mariner, LLC
brad.llewellyn@marinerusa.com

2. Re: Modified ZScore? (Requires Median Average Deviation)
Nathan Lee Mar 20, 2013 1:33 PM (in response to Brad Llewellyn)I don't think that's it, but maybe I'm missing something. The denominator of the ZScore is the sample standard deviation. The denominator of the Modified ZScore is the Median Absolute Deviation, not just a median of the samples, which is what you appear to be suggesting.

3. Re: Modified ZScore? (Requires Median Average Deviation)
Brad Llewellyn Mar 21, 2013 5:58 AM (in response to Nathan Lee)1 of 1 people found this helpfulNathan,
I think you may be right that this is impossible. If you were to attempt to put [X] on a chart, you would have to aggregate it somehow. Therefore, if you wanted to compare [X] to its median, you would need use a WINDOW_MEDIAN(). However, when you get to the denominator, you would need to take another median. Tableau limits you to only use two aggregations. See the below "attempt" at coding:
( MAX( [X] )  WINDOW_MEDIAN( MAX( [X] ) ) /
WINDOW_MEDIAN( ABS( MAX( [X] )  WINDOW_MEDIAN( MAX( [X] ) ) ) )
To conclude, I do not believe this is possible in Tableau due to the number of iterative functions that are needed. You may have to perform this operation before you pull the data into Tableau.
Thanks,
Brad Llewellyn
Associate Consultant
Mariner, LLC
brad.llewellyn@marinerusa.com

4. Re: Modified ZScore? (Requires Median Average Deviation)
Nathan Lee Mar 21, 2013 7:24 AM (in response to Brad Llewellyn)That is disappointing, but I believe you are correct.

5. Re: Modified ZScore? (Requires Median Average Deviation)
Brad Llewellyn Mar 21, 2013 7:30 AM (in response to Nathan Lee)On the bright side, there is an idea on the Tableau forums to add R functionality.
Thanks,
Brad Llewellyn
Associate Consultant
Mariner, LLC
brad.llewellyn@marinerusa.com

6. Re: Modified ZScore? (Requires Median Average Deviation)
Jim Wahl Mar 22, 2013 2:06 PM (in response to Nathan Lee)What about
Zm =
0.6745 * (ATTR([value])  WINDOW_MEDIAN(ATTR([value]))) /
WINDOW_MEDIAN(ABS(ATTR([value])  WINDOW_MEDIAN(ATTR([value]))))
For the table calc, you need to compute using a value that is at the row level. I.e., one row_id / value. As a check, ATTR() will return * if there is more than one value.

mad_v2.zip 19.8 KB


7. Re: Modified ZScore? (Requires Median Average Deviation)
Brad Llewellyn Mar 22, 2013 2:04 PM (in response to Jim Wahl)Jim,
Your denominator has a Window median of a window median, is that valid?
Thanks,
Brad Llewellyn
Associate Consultant
Mariner, LLC
brad.llewellyn@marinerusa.com

8. Re: Modified ZScore? (Requires Median Average Deviation)
Jim Wahl Mar 22, 2013 2:20 PM (in response to Brad Llewellyn)Brad,
I think it's valid.
I did another version where Zm was a nested table calc  I calculated an separate [value median] and took the median of this value in [Zm]. It looked like a "normal" nested table calc, but I'll admit that I don't use these often.
In Zm v2, i just pasted the formula for [value median] into [Zm v2]. This removes the nested table calc, but gives the same answer. ...
Jim

9. Re: Modified ZScore? (Requires Median Average Deviation)
Nathan Lee Mar 29, 2013 8:50 AM (in response to Jim Wahl)Jim, it looks like you did create a working formula.
It is my understanding that the ATTR() function returns the value of the argument if there is only one value. In which case, a WINDOW_MEDIAN of ATTR() would always equal ATTR(), rendering the median ineffective. I don't know that a Modified ZScore calculated on a fully homogeneous series is useful.
(ATTR([data series]) = x only if [data series] = {x, x, x,...x}. Otherwise, if for instance [data series] = {x, y, x, x}, ATTR([data series]) = "*".)  tell me if I'm wrong on that.
Brad, it's good to know R may be in Tableau's future. I'm constantly running into limitations in the calculation functionality. In the mean time, it may be a good idea to learn R, and use Tableau as a visualization tool.

10. Re: Modified ZScore? (Requires Median Average Deviation)
Jim Wahl Apr 2, 2013 12:06 AM (in response to Nathan Lee)Hi Nathan,
I think the modified Zscore is working as expected across a nonhomogenous data set?
value = [1, 1, 2, 2, 4, 6, 9]
median = 2
absolute deviation (AD) = value  median = [1, 1, 0, 0, 2, 4, 7] = [0, 0, 1, 1, 2, 4, 7]
Median AD (MAD) = 1
From the twbx in the above message:
As you said, ATTR() returns only one value, but it operates at the level of detail specified in the table calculationin this case my "row" variable, which is a unique ID per row. Therefore ATTR() is returning one value / row and WINDOW_MEDIAN(ATTR(value)) = 2.
Since there is only one value / row, you could also use min(), max(), avg(), ..., and get the same result as ATTR(), but ATTR() has an "error checking" advantage that if the table calc compute using field is not properly specified and there is more than one value / row, ATTR() will return null / * and the table calc will be null.
Another approach to this problem is to use custom SQL and do the stats / calculations with SQL. You can probably find SQL examples of common stat functions. Custom SQL has the advantage that if others are using your workbook, you avoid the complexity of table calcs. Note that this works best on real databases. If you're connecting to an Excel or text file, the Microsoft Jet database is limited and, for example, doesn't have a MEDIAN function.
Yet another approach, especially if you're using Excel or CSV files, is to just run the files through an R script to add or modify the necessary columns. Or, with Tableau 8, you could use the Python Tableau API to load a Tableau Data Extract into Python, add whatever stats, and then output an updated TDE.
Doing basic stats in R is much easier than table calcs. Here's a fourline R script "mad_process.r":
# Input = csvFile with "value" column
# Output = csvFile with new "value_Zm" column
# Zm = 0.6745*(value  median(value) / MAD(x)
# MAD(x) = median absolute deviation = median(x  median(x))
#
# For variable assignment, R traditionally uses "<" but you can also use "="
data < read.csv("C:/Users/jim/Desktop/mad.csv")
dv < data$value
data$value_Zm < 0.6745*(dv  median(dv)) / median(abs(dv  median(dv)))
# With one extra line, we could also add standard Z scores
data$value_Z < scale(dv) #equivalent to (dv  mead(dv)) / sd(dv)
write.csv(data, "C:/Users/jim/Desktop/mad_updated.csv", row.names = FALSE)
Install R from http://cran.rproject.org/bin/windows/base/ (You should also install RStudio if you're going to use R interactively.)
Add the R install directory\bin to your Windows path, and you can run the above script from the command line with:
>rscript mad_process.r
Jim

11. Re: Modified ZScore? (Requires Median Average Deviation)
Nathan Lee Apr 5, 2013 12:58 PM (in response to Jim Wahl)Thanks, Jim. You have presented some fine options, which I will investigate as time allows. For the time being, I will consider this impossible within Tableau's own calculation capabilities.