14 Replies Latest reply on Dec 6, 2017 5:37 AM by Stuart Bartmess

    Creating a Bin dimension from an aggregated calculated field

    Andrew Watson

      What I'm trying to do is create a set of bins based on a calculation. As a watered down example I have 3 fields in my data:

       

      1. Month

      2. ListingId

      3. EnquiryCount

       

      First I sum the EnquiryCount over the time period for each listing id - which gives me the total enquiry count for each ListingId. I want to use this value for the bins.

       

      It's simple to create a calculated field summing the EnquiryCount (EnqCountCalc in the attachement) then create another calculated field creating the bins (EnqCountBins).

       

      From this I'm able to count the number of listings in each EnquiryCount bin as long as I have the ListingId on the level of detail shelf.

       

      What I want to be able to do, and where I'm running to problems, is create a histogram showing for each bin the % of total enquiries over the time period which happened in the month.

       

      The sheet in the attachment HistogramNoBins shows what I'm trying to achieve but I'd like this with the bins along the X axis. My attempt, which doesn't work, is in the sheet %EnqsPerMonthPerBin. The total should be 100% for all bins but with the ListingId in the level of detail this doesn't work.

       

      I tried to use a Window_sum table calc, computing using the ListingId (WindowSum in the attachment) - but this also doesn't work as it needs the ListingId in the level of detail.

       

      Is this even possible using Tableau? I think I can do this using custom SQL and summing the EnquiryCount before bringing the data into Tableau but that's quite time consuming to process the data beforehand (in reality the dataset is large) and doing it in Tableau seems much neater.

       

      The attached Excel is just some dummy data to feed the workbook, it's actually a SQL feed that will be going into this - and it has to be an extract for performance reasons as the data takes over 30 mins to pull from SQL.

        • 1. Re: Creating a Bin dimension from an aggregated calculated field
          Richard Leeke

          Haven't read your question properly yet, but something vaguely similar came up the other day which might help, in this thread:

           

          http://www.tableausoftware.com/support/forum/topic/aggregating-aggregate-histograms

           

          Take a look at that - from a quick skim I think it could be that the "ignore in table calculations" option is all you are missing.

           

          If that doesn't help I'll try to have a better look later...

          • 2. Re: Creating a Bin dimension from an aggregated calculated field
            Andrew Watson

            Thanks for the quick response. In this case that doesn't help although was interesting to read for my continuing Tableau education :-)

            • 3. Re: Creating a Bin dimension from an aggregated calculated field
              Richard Leeke

              Well if I understood what you want correctly I think the "ignore in table calculations" setting gets you some of the way there.

               

              I think what I've attached does what you want - so have a look through the table calculations and see if you can make sense of it.  I'm not going to explain at the moment - I'm watching the Wimbledon semi-finals - so I've just been picking away at this between games.  ;-)

               

              I've added three sheets showing how I built it up.  If you click on the bars in my version of ListingCountPerBin you will see the bars are just single marks, whereas in your version they are stacking all the individual marks for each ListingId.

               

              If you actually wanted the individual marks to show the percentage of the overall total, rather than percent of total within each bin then I'm going to have to think harder.  Partitioning by an aggregate like that doesn't allow the full functionality of partitioning by a dimension (for reasons I don't pretend to understand) - so you can't include the bins calculation in the Compute Using, to get the overall size.

              • 4. Re: Creating a Bin dimension from an aggregated calculated field
                Andrew Watson

                I think maybe I wasn't clear with what I'm trying to do. In the example I have a number of listing ids and the number of enquiries they have received in month 1,2 and 3. The sum of these enquiries for each listing id defines which group they belong to - i.e. 1 - 10, 11 - 20, etc. Once I know which bin each listing belongs to I want to see for each group what % of total enquiries came in month 1, month 2 and month 3.

                 

                By pre-calculating the groups this is quite straightforward but that's taking the bin calculation out of Tableau. I don't know if it's possible to do this in Tableau (I assume it could be using a RAWSQL statement but these aren't possible using an extract).

                 

                The attached file shows how the result for the example above would appear with the bins pre defined in the csv. Note instead of the bin 40+ it's now called 50 although it's actually all those who had 40+ enquiries over the time period. Tableau nulled out the 40+ group initially (I guess it was read as a numeric data type so nulled out text) so I renamed it 50 - hopefully this isn't a cause for confusion.

                • 5. Re: Creating a Bin dimension from an aggregated calculated field
                  Richard Leeke

                  No, you were quite clear about what you wanted - looks like I just got the answer wrong (and obviously didn't check my results!).  ;-)

                   

                  Will have another look.

                   

                  BTW, the reason for the 40+ bin name being nulled is just that the MS JET engine treated the column as numeric and then decided that 40+ is not a number.  As you only want the in as a categorical dimension, you can tell JET that you want to treat it as a string.  Various ways to do that - the easiest is just to wrap the field in double quotes.

                  • 6. Re: Creating a Bin dimension from an aggregated calculated field
                    Richard Leeke

                    Ahhh, OK, so I wasn't clear what you wanted after all.  I remember when I first read it I thought you wanted the bins defined based on the overall total enquiries over the three months, but then that wasn't what your bin calculation was doing so I just assumed you wanted it to work the way you have it in [EnqCountBin].

                     

                    That works fine as long as you don't have month on the sheet, but as soon as you introduce month the bins are calculated per month.

                     

                    So you need a table calc for the bin calculation, too.  I have a sneaking suspicion that may prevent the result from being used to partition the remaining calculations, though.

                     

                    I'll try to have another look later and see if I can come up with anything.

                    • 7. Re: Creating a Bin dimension from an aggregated calculated field
                      Richard Leeke

                      As I thought, using a table calculation allows you to calculate the bins as you want them, but you can't then partition by the resulting aggregate.

                       

                      I managed to find a way around it, which is to duplicate the data connection and create a relationship to that omitting month.  This allows you to do calculations at a different level of aggregation without using table calculations.

                       

                      I included a sheet at the end showing that the original bins calculation is wrong when month is on the view, and that both the table calculation and the blending approach give the correct answer.

                       

                      I calculated the bins using the data blend at the overall level and then partitioned the result at the month level for the percent of total table calculation.

                       

                      Note that I had to use a custom table calculation for the percent of total - a quick table calculation didn't allow partitioning by the aggregate calculation.

                       

                      And this time I checked the answer (against yours).

                       

                       

                      Hopefully that makes some sort of sense - it's very hard to explain this clearly.  Let me know if you can't work out what's happening here.

                      • 8. Re: Creating a Bin dimension from an aggregated calculated field
                        Andrew Watson

                        Thanks for doing this. I didn't think it was possible but didn't think of the self join (data blending). It actually makes sense as that's how I have done this in custom SQL attempting this task from another angle - self join to pre-calculate the groups before loading into Tableau.

                         

                        I will attempt to write up how this works as I've wondered how to do this a number of times in the past and always had to give up on the Tableau solution as I could work out a SQL solution far quicker than Tableau. Once I get this written up I'll send it your way to check I'm fully understanding if that's ok as this is quite complicated to get the head around.

                         

                        It's also going to be interesting to know whether it's quicker to do this using custom SQL or the data blending technique. I'll share the answers once I've had chance to test it.

                         

                        Once again thanks for the solution, very good work as this had me completely stumped and you've introduced me to the self join technique which I'm sure will be useful for future tasks.

                        • 9. Re: Creating a Bin dimension from an aggregated calculated field
                          Richard Leeke

                          Yes, do write that up, I'm happy to cast an eye over that.

                           

                          I've only used data blending like that a few times - so I tend to forget it's an option.  Like you, I tend to do this with custom SQL a lot of the time - but there are cases where that isn't an option - especially with extracts.

                           

                          I wrote up some other examples of using blending like this to get different levels of aggregation in this post in a series on Clearly and Simply - which you might find helps you get your head around it.

                          • 10. Re: Creating a Bin dimension from an aggregated calculated field
                            Andrew Watson

                            Finally managed to get this written up and published in 2 parts on my blog at this link: http://bireports.wordpress.com/2011/08/09/tableau-create-bins-from-measure-calculation/. Feel free to comment or advise if I have written something incorrect. Thanks for your help with this.

                            • 11. Re: Creating a Bin dimension from an aggregated calculated field
                              Andrew Watson

                              Finally managed to get this written up and published in 2 parts on my blog at this link: http://bireports.wordpress.com/2011/08/09/tableau-create-bins-from-measure-calculation/, apologies it has taken so long. Feel free to comment or advise if I have written something incorrect. Thanks for your help with this.

                              • 12. Re: Creating a Bin dimension from an aggregated calculated field
                                guest contributor

                                I found the self join technique used by Richard to be very helpful with my problem.  However, I am wondering if it's possible to filter the results on the self joined table (copy) before binning - I would like to bin only ListingIDs with month 3. Using a subset example of the data,

                                 

                                Month    ListingId    EnquiryCount

                                3    126003        36

                                2    126003        55

                                2    126008        1

                                3    126008        4

                                2    126014        1

                                2    126021        1

                                3    126041        2

                                2    126041        6

                                2    126045        3

                                3    126045        7

                                3    126046        1

                                2    126046        2

                                1    126047        1

                                2    126047        2

                                 

                                 

                                I get the following crosstab,

                                 

                                        Month    Month    Month

                                ListingId                      1              2              3

                                126003            55.00    36.00

                                126008            1.00    4.00

                                126014            1.00   

                                126021            1.00   

                                126041            6.00    2.00

                                126045            3.00    7.00

                                126046            2.00    1.00

                                126047        1.00    2.00   

                                 

                                 

                                I would like to bin only these ListingIds with month 3

                                 

                                        Month    Month    Month

                                ListingId                      1              2              3

                                126003            55.00    36.00

                                126008            1.00    4.00

                                126041            6.00    2.00

                                126045            3.00    7.00

                                126046            2.00    1.00

                                 

                                 

                                I tried to use the Month ofthe joined copy table to filter the data but did not get what I wanted because the discrete value of the month is * whenever there is more than one unique month in a row (the ATTR function!).

                                 

                                 

                                                Month    Month    Month

                                ListingId    Month (copy)                      1              2              3

                                126003        *            55.00    36.00

                                126008        *            1.00    4.00

                                126014        2            1.00   

                                126021        2            1.00   

                                126041        *            6.00    2.00

                                126045        *            3.00    7.00

                                126046        *            2.00    1.00

                                126047        *        1.00    2.00   

                                 

                                I was able to successfully filter this in Access database but would like to know is there is a pure Tableau solution. 

                                 

                                Please help.  Thanks.

                                • 13. Re: Creating a Bin dimension from an aggregated calculated field
                                  Sean Otto

                                  But you still have to put something in "Level of Detail" to make it work.  The question I have, is can you make a bin on a calculated field just like you can on an uncalculated field, I don't see why the two would be treated different, the process is still the same, one field is just calculated and one field is raw.  This would allow you to use the additional leverage of a binned field in Tableau.

                                  • 14. Re: Creating a Bin dimension from an aggregated calculated field
                                    Stuart Bartmess

                                    I've encountered this problem as well, and found a workaround.

                                     

                                    Let's say we have a binning calculation named "bins" as follows:

                                     

                                    If x<1 then "bin 1"

                                    Elseif x>=1 then "bin 2"

                                    END

                                     

                                    If x is an aggregation (e.g. sum of sales), then this new "bins" field is cast as a measure, not a dimension, so it can only be displayed as an aggregation, combining the data into a single bin. This is not what anyone wants. If we can force Tableau to treat this "bins" field as a dimension, all our troubles will be solved.

                                     

                                    One workaround is to encase this function in a fixed LOD calculation as follows:

                                     

                                    {FIXED Region:

                                    If x<1 then "bin 1"

                                    Elseif x>=1 then "bin 2"

                                    END}

                                     

                                    I'm not entirely clear why, but this function is now treated as a dimension, and the bins are discrete, so you're good to go!

                                     

                                    If x is an LOD calculation, this workaround fails; if anyone knows a way to do this with LODs, I'd be much abliged!