14 Replies Latest reply on Dec 8, 2014 11:21 PM by geeta.israni.0

    Multiple Reference Lines

    Mark Isaacs

      I have a line chart with employee age on the Columns shelf and the number of staff on the Rows shelf.  A reference line shows the average age, using a calculated field field whose formula is TOTAL(AVG([Age]))  This shows that the average age is 41.90 (see worksheet Avg Age).  The field Avg Age is placed on the Level of Detail shelf.


      I can drag another dimension field such as Gender to the rows shelf (see worksheet Avg Age by Gender - 1) and the reference lines show (correctly) that the average age for females is 39.0 and for males it's 46.9.  What I really need, however, is for the two gender charts to appear as in  worksheet Avg Age by Gender - 2, where Gender is on the Color shelf rather than the row shelf.  I was hoping to have two reference lines appear, one for each gender.  However, only one line appears, with the average age across both genders.  I've tried changing the Computing Using... properties for the Avg Age on the Level of Detail shelf, but all this does is change the value for the single reference line, to values such as 49.51 when the Compute using... is set to Gender and 43.259 when it's set to Age.


      Is there a way to have multiple reference lines appear in the viz in worksheet Avg Age by Gender - 2?  If so, are separate calculated fields needed for each?  I would hope not, as I'd really like to be able to drop other dimensions on the color shelf and have the appropriate reference lines automatically generated.


      Also, can anyone explain what the values 49.51 and 43.259 refer to, when you change the Compute using... to Gender and to Age?

        • 1. Re: Multiple Reference Lines
          Richard Leeke

          The reference line is showing the average of the averages for F and M.


          If you change the partitioning to Compute Using Age and then change the reference line definition to show Max or Min of Average Age you will see that.


          In terms of your real question (can I do what I really want simply and flexibly) I fear the answer is no.  Reference lines work within the Table/Pane/Cell structure - so the only way I can think to achieve this would be to have multiple calculated fields, each returning the value for a dimension value (such as "Male") for which you want a reference line.


          Joe may be able to come up with some tricks to make it slightly less painful - but I doubt that even he can produce exactly what you want here in terms of flexibility for drag and drop.


          ... or at least not with actual reference lines.  Maybe there's a trick with pseudo-reference lines made from dual axis charts and table calcs lurking in here, I'll have a ponder...

          • 2. Re: Multiple Reference Lines
            Mark Isaacs

            Thanks, Richard.   


            I was hoping that the panes would incorporate the Gender dimension so that this would all happen automatically. 


            How did you change the reference line definition to show max or min of average age?  When I try to set the formula to MAX(AVG[Age])) I can an error message advising me that the argument to MAX is already an aggregation and cannot be further aggregated.


            Although it may not be much of a problem to set up two distinct calculated fields - one for M and one for F - for this specific situation,  the more general scenario I had in mind is that I'd like to be able to drag different dimensions to the Color shelf, set up quick filters for them, and not have to create or edit calculated fields each time.  Some of the dimension have dozens of values, with the user selecting a few via a quick filter.  Having to define a separate calculated field for every dimension value would not be practical.


            I was also hoping that the reference lines themselves would automatically be colored, the same as the chart lines.  Perhaps I'm asking for too much!

            • 3. Re: Multiple Reference Lines
              Richard Leeke

              Hi Mark


              You select the aggregate within the definition of the reference line itself.  Reference lines effectively give you an additional layer of aggregation (calculated on the client side by Tableau) on the already aggregated measures returned from the database.  This is actually a nasty trap to watch out for - taking averahes of averages (or percentiles of averages) may not give you what you thought you were getting.


              Yes, I realised what you were trying to do with the colour shelf and colouring the lines etc.


              I don't have time to elaborate on it now, but if you have a look at the example workbook I posted in the TCRL section for the quantile caculation I posted there, you might get a clue about what I meant by simulating reference lines with calculated fields and dual axis charts.  I gave an example there of a scatter chart of detailed values plus percentile lines.  In that case the point was to achieve something like reference lines against a continuous axis, but something vaguely along those lines might just be possible.


              Post back how you get on and it might just remind me to take a look in a day or two.

              • 4. Re: Multiple Reference Lines
                Mark Isaacs

                Hi Richard


                Thanks for the additional info.  I'll have a look at your sample workbook and see if I can glean some ideas from it about alternate approaches.


                Seems like there are many places where one can qualify or constrain the scope of a calculated field: in the definition itself, via 'Compute using', in the reference line definition and possibly elsewhere.  I see many interesting and intriguing solutions presented here to the issues confronting users, and hope that one day the world of calculated fields (and in particular table calculations) will all "fall into place" for me.  I often find that I try various settings until the desired results are displayed.  The challenge, though, is that often it's not clear whether what's being displayed is indeed correct.  Hence the curiosity about those two values 49.51 and 43.259  mentioned in the first note.

                • 5. Re: Multiple Reference Lines
                  Joe Mako

                  In situations like this, I prefer to make a separate calc field for each reference line.


                  The key thing to remember with ref lines is that are an aggregation of a value at the mark level, so they will add another level of aggregation.


                  You have a variety of methods to get the result you want, attached are just a few.


                  With your current layout, a dimension on the color shelf, and a continuous dimension on the columns shelf, and just a measure on your rows shelf, "Table (Across)" is the same as an Advanced compute using with both dimensions in the Addressing list (the right-side list box).


                  Also because you only have two lines, each for a Gender, you could also set the compute using to "Age", telling Tableau to use the other dimensions for partitioning, in this case, just "Gender", you can use a reference line of the Min and Max to get the two ref lines, but I do not know a way to dynamically label then to a gender. This method may be useful in other situations.

                  • 6. Re: Multiple Reference Lines
                    Mark Isaacs

                    Hi Joe


                    Many thanks for this workbook.  I'll have to mull over what's going on in each of the worksheets you've added. 


                    However, given your comment about preferring to have separate calc fields for each reference line it looks like my more-general wish can't be easily achieved.  For example, if I have a State dimension for the US states, and want to use a quick filter to select a few of them, it looks like I can't automatically get an avg age reference line for each of the selected states by simply putting State on the Colors shelf: I would have to have a separate calc field for each state that's selected in the filter. 


                    I guess I could set some upper limit to the number of states that a user could select - eg, 6 - and then create 6 parameters and 6 calculated fields for the appropriate 6 avg ages, eg, something like


                    AvgAge1 = TOTAL(AVG(IF [State]=[Parameter 1 State] THEN [Age] END))

                    AvgAge2 = TOTAL(AVG(IF [State]=[Parameter 2 State] THEN [Age] END))



                    However, I expect there would be challenges dealing with the situation where the user wanted fewer than the maximum of 6 states on the chart.

                    • 7. Re: Multiple Reference Lines
                      Joe Mako

                      Okay, here you go, this method would draw another vertical line for each Gender.


                      This method uses custom SQL to duplicate the data, so I can have one set of the data to draw the histogram line, and then two more sets of the data two draw two points in the vertical lines. calculated fields and aggregations with a double dual axis take care of the rest.


                      Let me know if you need help applying this to your actual data (this was a hasty throw together, so with attention better details could be done).

                      • 8. Re: Multiple Reference Lines
                        Joe Mako

                        okay, so I posted that last one, knowing that it did not need a double dual axis (it was easier for me to think about it quickly as a double dual axis).


                        Only a single dual axis is needed in order to gain the multiple level of aggregation level of detail. Here it is with a single dual axis.


                        Next I may try to use table calcs instead of the dual axis to gain multiple levels of aggregation.


                        Fun stuff, Thanks for a great question!


                        (I have the feeling that there are other was to accomplish this same view, and am willing to explore them if you have additional constraints that cause this method to no longer work)

                        • 9. Re: Multiple Reference Lines
                          Richard Leeke

                          Here's another take on it.


                          I couldn't see how to get anything going with the dual-axis approach I was thinking of.  This way goes some of the way to what you were after - but still needs a whole swag of individual calculated fields.


                          Note that I made the calculated fields null unless the relevant color dimension has been chosen, then defined reference lines for all variants.  The ones that don't apply to the chosen dimension then automagically disappear.

                          • 10. Re: Multiple Reference Lines
                            Joe Mako

                            Great stuff Richard, here is my non-table calc single dual axis version that incorporates your color selection parameter.

                            • 11. Re: Multiple Reference Lines
                              Richard Leeke

                              Great stuff Joe - I see we've both been having fun with the same problem.


                              Your way is the approach I was originally thinking of - but I couldn't remember one or two of the tricks needed.


                              So now I've combined that with my parameter approach so Mark can choose his colour dimension and I think he'll be happy.

                              • 12. Re: Multiple Reference Lines
                                Richard Leeke

                                LOL - how did I know you'd do the same - and beat me to it as usual?  ;-)


                                Note that I swapped your mark definitions around and unpinned the dual Y axis so that you can zoom in and out with impunity.

                                • 13. Re: Multiple Reference Lines
                                  Mark Isaacs

                                  Many thanks, Joe and Richard, for all of your comments and examples.  It's greatly appreciated


                                  I'll have to spend a fair bit of time working through all of this material.  I'm sure it will be enlightening.


                                  I can't believe you called this 'fun'!....LOL

                                  • 14. Re: Multiple Reference Lines

                                    Joe Mako

                                    Hi Joe,


                                    I faced a similar issue and your workbook did give a bit of direction. However, the 2 values I would like to plot (and add average as reference lines) are in turn coming from a calculated field. So while creating the calculated fields (like you've done for age segregation) the tableau gives error stating that an already aggregated measure cannot be further aggregated.


                                    I also have to keep the secondary axis free to plot a third set of values which is coming from another data source. I have attached my demo data as well as the tableau workbook. The 'Weighted Values' is what I am trying to plot for Category A and B along with their respective reference lines.


                                    Any help from your good self on the same would be of much help.

                                    (I have started another thread as my reply here did not give me an option to attach files