Thanks Joe, nicely explained. I especially like this bit of thinking:
I think of ATTR(<field>) as shorthand for:
IF MIN(<field>) = MAX(<field>) THEN MIN(<field>) ELSE "*" END
Makes sense. Also:
I believe with awareness of Dimensions and Measures nearly everything in Tableau can be derived and understood. Without this awareness, Tableau can be a confusing and frustrating application because it will seem to behave in unexpected ways, producing results that sometimes seem to be right, but in other times, like when the data changes, return incorrect results.
And it's threads like this that will help us all raise our awareness. Thanks again.
1 of 1 people found this helpful
I thought I'd take a swing at your question to try to practice explaining this. I don't see this view as Tableau "knows" to separate May and June, it's about the data, dimensions, and measures, and I'll try to walk through how Tableau does this. In this particular view, there are headers generated by dimensions (CV Code and Job) and headers generated by measures (ATTR(Year) and ATTR(Month)). The values of those measures are determined by the dimensions in the view.
This is the critical bit to understand... To put this another way, even though we're seeing Month and Year, they don't exist as independent values anymore because they aren't dimensions anymore, we've wrapped them in an aggregation and turned them into measures. (These are not the droidsMonth and Year you are looking for). The values of ATTR(Year) and ATTR(Month) are dependent on the level of detail created by the dimensions in the view.
This is roughly the computation that Tableau does for ATTR:
IF ISNULL(foo) THEN
ELSEIF MIN(foo) == MAX(foo) THEN
This alternative layout can help explain what ATTR(Month is doing). I put the dimensions on the left, and the aggregations on the right, along with a COUNTD(Month) to show how many months were being returned. Anytime there is more than one value of Month, ATTR(Month) is going to return an *:
What's happening is that in those first views, Tableau is aggregating (using ATTR) Year and Month for the distinct values of CV Code and Job (the dimensions). There's only one year being returned, 2013, so that is returned in the ATTR(Year). There's only one value of month for CV Code CV02 and Job J100, so that returns a value for July for ATTR(Month). There's only one value of Month for CV Code CV02 and Job J103, so that returns June for ATTR(Month). For CV Code CV00 and Job J100, there are multiple months (June and July), so ATTR(Month) returns * for that (and every other combination of CV Code and Job that has multiple months).
When you put the datetrunc version of MONTH(OPEN_TIME) in the view, that's adding the dimension to the view. With that, now there's a fine enough granularity that there's only one month per CV Code/Job/MONTH(OPEN_TIME), and ATTR(Month) always returns the month and not *:
The "trick" here is to look at the pill and understand whether it's a dimension or measure, and then to have enough knowledge of the underlying data to know its granularity and relationships to be able to decipher what Tableau has done. So, for example, in a one to many hierarchy such as the Product hierarchy in Superstore Sales we can bring out ATTR(Department) for a Category and it will work fine:
But if we use ATTR(Category) for Department then we get a bunch of *:
Does that help?
I have been away from my computer all day but following this thread closely with great interest via my email. I have to say this is one very helpful group of people and I commend all of you for taking the time to explain this in such great detail. It is an extraordinary learning experience and one that I will have to review many more times before I completely understand it all. As soon as I have the time I will deconstruct this myself.
Yes very much, thanks Jonathan! It also points out that setting up a viz like this creates what to my mind is a "fragile" viz. Because if the data is updated and a job number appears twice in the same month that will then produce a * for that cv_code/job/month combination. Correct?
It seems certain assumptions are being made that may not hold as data is updated, so Joe's encouragement that we raise our "awareness" is doubly critical, so we can understand the limits of certain techniques. And realize the underlying assumptions we are making that could affect the outcome down the road.
Shawn, I fully agree that without an understanding of all the moving pieces on this worksheet, it is a fragile visualization, and could easily produce incorrect results with pills in the wrong place or configured differently.
"if the data is updated and a job number appears twice in the same month that will then produce a * for that cv_code/job/month combination. "
This will not happen as long as Month(Open Time) is a Dimension pill. Dimension pills set the level of detail. While Month(Open Time) is a Dimension pill, ATTR(Month) will never return "*". If we remove the dimension pill Month(Open Time), then the view will break because the level of detail has changed, and even in this small sample data set, ATTR(Month) will return "*" in some cases.
The key awareness is the Dimension pills on this worksheet. Secondary is data densification, what shelves these pills are on, the compute using, and then using Discrete Measure pills to achieve the desired layout.
The fact that a pill is a Dimension is not impacted by the color of the pill, the color tells us Discrete or Continuous, the color of the pill does not indicate if the pill is a Dimension or Measure.
This will not happen as long as Month(Open Time) is a Dimension pill. Dimension pills set the level of detail. While Month(Open Time) is a Dimension pill, ATTR(Month) will never return "*".
The tumblers finally clicked in place, I got! Thanks Joe & Jonathan.