The reference line is showing the average of the averages for F and M.
If you change the partitioning to Compute Using Age and then change the reference line definition to show Max or Min of Average Age you will see that.
In terms of your real question (can I do what I really want simply and flexibly) I fear the answer is no. Reference lines work within the Table/Pane/Cell structure - so the only way I can think to achieve this would be to have multiple calculated fields, each returning the value for a dimension value (such as "Male") for which you want a reference line.
Joe may be able to come up with some tricks to make it slightly less painful - but I doubt that even he can produce exactly what you want here in terms of flexibility for drag and drop.
... or at least not with actual reference lines. Maybe there's a trick with pseudo-reference lines made from dual axis charts and table calcs lurking in here, I'll have a ponder...
I was hoping that the panes would incorporate the Gender dimension so that this would all happen automatically.
How did you change the reference line definition to show max or min of average age? When I try to set the formula to MAX(AVG[Age])) I can an error message advising me that the argument to MAX is already an aggregation and cannot be further aggregated.
Although it may not be much of a problem to set up two distinct calculated fields - one for M and one for F - for this specific situation, the more general scenario I had in mind is that I'd like to be able to drag different dimensions to the Color shelf, set up quick filters for them, and not have to create or edit calculated fields each time. Some of the dimension have dozens of values, with the user selecting a few via a quick filter. Having to define a separate calculated field for every dimension value would not be practical.
I was also hoping that the reference lines themselves would automatically be colored, the same as the chart lines. Perhaps I'm asking for too much!
You select the aggregate within the definition of the reference line itself. Reference lines effectively give you an additional layer of aggregation (calculated on the client side by Tableau) on the already aggregated measures returned from the database. This is actually a nasty trap to watch out for - taking averahes of averages (or percentiles of averages) may not give you what you thought you were getting.
Yes, I realised what you were trying to do with the colour shelf and colouring the lines etc.
I don't have time to elaborate on it now, but if you have a look at the example workbook I posted in the TCRL section for the quantile caculation I posted there, you might get a clue about what I meant by simulating reference lines with calculated fields and dual axis charts. I gave an example there of a scatter chart of detailed values plus percentile lines. In that case the point was to achieve something like reference lines against a continuous axis, but something vaguely along those lines might just be possible.
Post back how you get on and it might just remind me to take a look in a day or two.
Thanks for the additional info. I'll have a look at your sample workbook and see if I can glean some ideas from it about alternate approaches.
Seems like there are many places where one can qualify or constrain the scope of a calculated field: in the definition itself, via 'Compute using', in the reference line definition and possibly elsewhere. I see many interesting and intriguing solutions presented here to the issues confronting users, and hope that one day the world of calculated fields (and in particular table calculations) will all "fall into place" for me. I often find that I try various settings until the desired results are displayed. The challenge, though, is that often it's not clear whether what's being displayed is indeed correct. Hence the curiosity about those two values 49.51 and 43.259 mentioned in the first note.
In situations like this, I prefer to make a separate calc field for each reference line.
The key thing to remember with ref lines is that are an aggregation of a value at the mark level, so they will add another level of aggregation.
You have a variety of methods to get the result you want, attached are just a few.
With your current layout, a dimension on the color shelf, and a continuous dimension on the columns shelf, and just a measure on your rows shelf, "Table (Across)" is the same as an Advanced compute using with both dimensions in the Addressing list (the right-side list box).
Also because you only have two lines, each for a Gender, you could also set the compute using to "Age", telling Tableau to use the other dimensions for partitioning, in this case, just "Gender", you can use a reference line of the Min and Max to get the two ref lines, but I do not know a way to dynamically label then to a gender. This method may be useful in other situations.
average_ages_jm.twbx 62.2 KB
Many thanks for this workbook. I'll have to mull over what's going on in each of the worksheets you've added.
However, given your comment about preferring to have separate calc fields for each reference line it looks like my more-general wish can't be easily achieved. For example, if I have a State dimension for the US states, and want to use a quick filter to select a few of them, it looks like I can't automatically get an avg age reference line for each of the selected states by simply putting State on the Colors shelf: I would have to have a separate calc field for each state that's selected in the filter.
I guess I could set some upper limit to the number of states that a user could select - eg, 6 - and then create 6 parameters and 6 calculated fields for the appropriate 6 avg ages, eg, something like
AvgAge1 = TOTAL(AVG(IF [State]=[Parameter 1 State] THEN [Age] END))
AvgAge2 = TOTAL(AVG(IF [State]=[Parameter 2 State] THEN [Age] END))
However, I expect there would be challenges dealing with the situation where the user wanted fewer than the maximum of 6 states on the chart.
Okay, here you go, this method would draw another vertical line for each Gender.
This method uses custom SQL to duplicate the data, so I can have one set of the data to draw the histogram line, and then two more sets of the data two draw two points in the vertical lines. calculated fields and aggregations with a double dual axis take care of the rest.
Let me know if you need help applying this to your actual data (this was a hasty throw together, so with attention better details could be done).
average_ages_jm_v2.twbx 160.6 KB
okay, so I posted that last one, knowing that it did not need a double dual axis (it was easier for me to think about it quickly as a double dual axis).
Only a single dual axis is needed in order to gain the multiple level of aggregation level of detail. Here it is with a single dual axis.
Next I may try to use table calcs instead of the dual axis to gain multiple levels of aggregation.
Fun stuff, Thanks for a great question!
(I have the feeling that there are other was to accomplish this same view, and am willing to explore them if you have additional constraints that cause this method to no longer work)
average_ages_jm_v2a.twbx 173.8 KB
Here's another take on it.
I couldn't see how to get anything going with the dual-axis approach I was thinking of. This way goes some of the way to what you were after - but still needs a whole swag of individual calculated fields.
Note that I made the calculated fields null unless the relevant color dimension has been chosen, then defined reference lines for all variants. The ones that don't apply to the chosen dimension then automagically disappear.
average_ages_jm_rl.twbx 71.0 KB
Great stuff Joe - I see we've both been having fun with the same problem.
Your way is the approach I was originally thinking of - but I couldn't remember one or two of the tricks needed.
So now I've combined that with my parameter approach so Mark can choose his colour dimension and I think he'll be happy.
average_ages_jm_v2a_rl.twbx 258.2 KB
LOL - how did I know you'd do the same - and beat me to it as usual? ;-)
Note that I swapped your mark definitions around and unpinned the dual Y axis so that you can zoom in and out with impunity.
Many thanks, Joe and Richard, for all of your comments and examples. It's greatly appreciated
I'll have to spend a fair bit of time working through all of this material. I'm sure it will be enlightening.
I can't believe you called this 'fun'!....LOL
I faced a similar issue and your workbook did give a bit of direction. However, the 2 values I would like to plot (and add average as reference lines) are in turn coming from a calculated field. So while creating the calculated fields (like you've done for age segregation) the tableau gives error stating that an already aggregated measure cannot be further aggregated.
I also have to keep the secondary axis free to plot a third set of values which is coming from another data source. I have attached my demo data as well as the tableau workbook. The 'Weighted Values' is what I am trying to plot for Category A and B along with their respective reference lines.
Any help from your good self on the same would be of much help.
(I have started another thread as my reply here did not give me an option to attach files