1 2 Previous Next 16 Replies Latest reply on Jan 7, 2015 5:43 AM by Simon Runc

    Customer Segmentation calculation(s)

    Susan Oner

      Hi Tableau forum,


      I'd like to create a dimension in Tableau assigning a customer type as defined by below:


      New = customers who've made a purchase in the trailing 12 months and never before

      Retained = customers who've purchased in trailing 12 months and preceding 12 months before that

      Lapsed = customers who've purchased in the 12 months preceding the trailing 12 months, but not in trailing 12 months

      Reactivated = customers who've purchased in trailing 12 months, then NOT in the preceding 12 months, but yes prior to that


      I'd like this calculated for each point in time - that is, not just today looking back.  The end use would be to sum the counts for each customer type and see what the % breakdown looks like over time.


      As an example from the dummy data set, customer type would look like this, if we looked at it from the end of each year:


      Customer2010Cust Type 20102011Cust Type 20112012Cust Type 20122013Cust Type 2013
      Aaron Davies Bruce$2,390New
      Aaron Day$710NewLapsed
      Aaron Dillon$18New$79RetainedLapsed$161Reactivated
      Aaron Fuller Davidson$208NewLapsed
      Aaron Riggs$3,598New$122RetainedLapsed$40Reactivated
      Aaron Shaffer$895New$40Retained$30RetainedLapsed
      Adam Barton$1,282New$17Retained$1,297Retained
      Adam G Sawyer$1,639New$254Retained$27Retained$1,178Retained


      I'm attaching an example workbook but to be honest there isn't much there besides the Superstore data.  All the IF/THENs and DATEADDs and various calcs I tried didn't work.


      Any help would be really appreciated!


      Thank you,


        • 1. Re: Customer Segmentation calculation(s)
          Diego medrano

          Hey Susan,


          Were you able to make any progress on this? Thanks for attaching the packaged workbook. Contacting support@tableausoftware.com may also help you to find a resolution.





          • 2. Re: Customer Segmentation calculation(s)
            Susan Oner

            Hi Diego,


            Not much progress, no.  There are a few issues: 1- the complexity of the

            calculation(s), 2 - the additional complexity of fiscal year (with a 4-5-4

            retail calendar) that's not a 'date' variable and finally, 3-



            I'm still struggling to understand if best practice is to pre-calculate

            outside of Tableau or do the calculation within it.




            • 3. Re: Customer Segmentation calculation(s)
              Zach Pipkin

              Susan if your data is structured more columnar, instead of horizontally, this can be done with table calculations. However, the fiscal date makes it very difficult. If you had an example data set I could show you how to do this, but not accounting for fiscal year.


              End of the line though, the number of table calcs needed will be a big hit to performance. If there is a way to do it in the back-end, I would encourage you pursue that route.

              • 4. Re: Customer Segmentation calculation(s)
                Susan Oner

                So Zach - let's say the end goal here is to get counts of customers in each segment - will this still be a big performance hit?  And should this still be done with table calcs?  Each customer's classification isn't important, just at the aggregate level.


                Ideally customer segments would be a stacked bar by year, i.e. in 2014 we had 35% New, 35% Lapsed, 10% Reactivated, 20% Retained customers; in 2013...etc etc.


                Thanks again-


                • 5. Re: Customer Segmentation calculation(s)
                  Zach Pipkin

                  I don't have a data set too similar to this. If you could provide a sample, I could try to mock up an example.

                  • 6. Re: Customer Segmentation calculation(s)
                    Susan Oner

                    Zach - Sorry, I meant to add that my data is structured very similarly to

                    the Superstore data in that it's down to the order level of detail (that's

                    why I'd used this as example data in the first post).  Here is some

                    attached dummy data - one row for each transaction, with several

                    dimensions (the real data has many more).  Customer segmentation is to be

                    based off of Brand ID.  This only has about 2600 lines, real table is much





                    Thanks again,


                    • 7. Re: Customer Segmentation calculation(s)
                      Simon Runc

                      hi All,


                      ...I'm a bit late to the party! but the problem of customer segmentation within Tableau is one I've wrestled with for a while. The problem being that you need to keep the customer level in the view to allow the aggregated calculations to run. I'm a bit short of time now, but Bethany Lyons Let's Talk About Set's Baby Webinar helped me out hugely. It's an hour long, but I must have got that hour back 100 times over!!




                      Look specifically at the Cohort Analysis bit, the formula structured [MAX( IIF([Category] = [Category Parameter], 1, 0 ) ) = 0] and combining of sets.


                      Hope this helps, it really helped me.

                      • 8. Re: Customer Segmentation calculation(s)
                        Susan Oner



                        Thank you so very much for your help with this, I really appreciate it.

                        The pass-through sequel might be the way to go, I will try this and see

                        what happens with performance.  The data source is HP Vertica tables and

                        they are very very large.  Perhaps one thing that would help is to limit

                        the look-back period for segmentation.  Another solution might be a

                        back-end append, like you suggest with Excel.  I agree with Simon that the

                        challenge is to have the customer level detail in the view, especially

                        when we are dealing with millions of data points.


                        Thanks again for your help.  These forums are always so helpful-



                        • 9. Re: Customer Segmentation calculation(s)
                          Susan Oner



                          Thank you for the response, I will definitely watch this video.  It's

                          really timely because I had another question about sets, hoping this will

                          give some insight.  Basically trying to use a flag to define a set of

                          customers without filtering for other dimensions.  Odd thing is it works

                          great in Superstore data otherwise I would have posted to the forum.


                          I see from your bio that you have experience in retail analytics... very

                          interesting.  Just out of curiosity what have you found that Tableau

                          provides that other BI tools do not?




                          • 10. Re: Customer Segmentation calculation(s)
                            Simon Runc

                            hi Susan,


                            Yes the use of Sets is very handy, although I have had a quick play with your problem using sets...and sets doesn't seem the be the answer here! You can do it but you'll need 3 sets for every year!...so I'd stick with kettan solution. To do this in Tableau directly it might be worth voting up the following idea http://community.tableau.com/ideas/2177 (don't want to turn Tableau into Excel! but this would be very useful)


                            btw you can still keep Customer Level in the detail, and produce good Vizes, when running aggregated calculations, but there are limitations. I've attached an example of a question I was asked a while back. I have used the SUM[Invoice Amount]) to create the Groups, and by bringing Customer Name (in the attached example from SuperStore Sales) into the detail shelf the calculation will perform at this level. If you click in the Bar Charts, you'll see that they are actually made up from stacking each customer. This is fine for Counts and Sums, but breaks down if you want to visualize a ratio calculation (say Avg Basket Spend) as with the requirement to have the Customer in the ID shelf, as any Visual will stack the individual Av Basket Spend, and so not show you the average of that group. You can get round this with Table Calcs, but it gets a quite complicated/convoluted!


                            My current way of handling this, is to use Tableau to create the segments as in the attached, use the Export data to create a table of customerID, Customer Group, and then join this back in (for me Via SQL) to use as a proper dimension. I could create the Bin/Customer Segment rule in the database itself, but I like to use Tableau to visualize where the best 'segment breaks' should be (I'm thinking RFM Customer segment models here).


                            Below is an image (unfortunately the data it too sensitive (and around 20M rows!) to share the workbook), but you can see that I use Parameters to change the 'breaks' between groups, so that I get 'sensible' proportions in each of my final RFM groups (...I split the Recency into 3 sections, where the bottom and top [and thus the middle group] are controlled by the parameters, and do the same for Frequency and Value, there is then a final RFM segment calculation that puts people into their final group, say IF Recency = High AND Frequency = High AND Spend = High then 'Gold Customer'...etc.) . As I change the parameters then numbers in each group changes, so is a really nice way to ensure sensible group sizes...Once happy I then, usually, get our amazing DevOps team to hard code this into the database (appreciate this is a bit of a luxury!) to use as proper dimensions. The Viz below is, again, individuals stacked, so the aggregate calculations work.


                            Customer Segment Investigation.JPG


                            ...Now for the actual question you asked!! I've not used a huge number of BI systems prior to Tableau (mainly a relatively unheard of one called 'Omniscope'), and then the ones that Statistics software provide (and my old friend Excel!).The main thing for me with Tableau is the flexibility and speed (speed of making Visualisations), this means I can play/investigate the data in lots of different ways before finding the optimum solution. The connections to many datasources (although the lack of direct connection to Google Sheets seems an oversight!), the speedy handling of large(ish) datasets (seems to be fine with upto 70-80 Million rows), and integration with R are also big draws for me.


                            By flexibility I find it offers the best of both worlds, you can (using the Show Me icons)  use it like a traditional BI tool, where you choose the 'Chart Type'. But if you look what it's doing, it's actually creating the Viz from the components of Marks, Size, Colour...etc. (if you choose a Tree/Tile Chart, and then change the Marks from Squares to Text you'll see what I mean). Once I'd got that you can start creating some really powerful Vizes (Even create your own)...for example on a line chart, try dragging a measure on the size tile, and the thickness of the line reflects the measure...I found that I can get an extra dimension on view which is still intuitive to view. Or for some real 'free form'/'info-graphics' check this one out!




                            So you can go from traditional Vizes, Stephen Few would be proud of (hopefully!), but also full info-graphicy type


                            There are quite a few comparisons on the Web, and in the Community (this one has some good links, to some in depth discussions)


                            Comparison of Tableau vs. Qlikview vs. Spotfire


                            Of course nothings perfect, so a lack of a Front End ETL (would be so good to have Altryx front end), and although Table Calcs give you a lot of flexibility, it is sometimes necessary to 're-shape' data outside Tableau (which can be a problem if you don't have decent SQL skills...or access to them), to make life easier in the long run (as you are having to do here) are slight negatives.


                            Hope this all helps

                            1 of 1 people found this helpful
                            • 11. Re: Customer Segmentation calculation(s)
                              Cristian Vasile



                              1. Omniscope rocks!

                              2. For Google spreadsheet try this driver Google Spreadsheet ODBC Driver - Overview

                                  Google Spreadsheets to Tableau connector - a working driver! Yay!

                              3. "lack of a Front End ETL" check this out third parties tools able to export/import data to/from Tableau Data Engine




                              • 12. Re: Customer Segmentation calculation(s)
                                Chris McClellan

                                Hi Susan,


                                I think you should be summarising the data in the database as much as possible before throwing it into Tableau, especially if you need it at such a summarised level.  Let the database do the crunching, and let Tableau display the end-result.  I've never used Vertica either, but hopefully you can create a view or another table to summarise the data in the way you need it and then let Tableau query that object.

                                • 13. Re: Customer Segmentation calculation(s)
                                  Jim Wahl

                                  While Vertica was built to do this kind of thing, it might not be too bad in Tableau :).


                                  A table calc approach isn't all that difficult. Table calcs are done inside Tableau, which makes for a big internal table, but all you need is the customer ID level of detail for each time period.


                                  Here's the path I started down:

                                  Customer Status =

                                  CASE PREVIOUS_VALUE("")
                                  WHEN "New"         THEN IF ISNULL(SUM(Sales)) THEN "Lapsed" ELSE "Retained" END
                                  WHEN "Retained"    THEN IF ISNULL(SUM(Sales)) THEN "Lapsed" ELSE "Retained" END
                                  WHEN "Reactivated" THEN IF ISNULL(SUM(Sales)) THEN "Lapsed" ELSE "Retained" END
                                  WHEN "Lapsed"      THEN IF ISNULL(SUM(Sales)) THEN "" ELSE "Reactivated" END
                                  WHEN ""            THEN IF ISNULL(SUM(Sales)) THEN "" ELSE "New" END


                                  PREVIOUS_VALUE returns the customer status for the previous year, when addressing is set to Order Date / Year. For the first year / row in the partition, "" is returned.

                                  In a table view this looks like this:



                                  Now you can sort of turn this into a graph using another calculated field to sum the customer IDs (we already have distinct customer IDs, because the view is at the customer ID level of detail).


                                  Count of Customers =

                                  // Normally you could just use COUNT([Customer ID]) or SUM[Number of Records]
                                  // but these won't count null values, which is required for the "Lapsed" count. 
                                  // The workaround is to use the table calc function LOOKUP(1, 0) to trigger a value on 
                                  // every cell including the non-existent Lapsed cells. If the cell exists LOOKUP(1,0)
                                  // will return 1. If it does not exist, it will return NULL and IFNULL returns 1. 
                                  // These values are then summed. 
                                  WINDOW_SUM(IFNULL(LOOKUP(1, 0), 1))




                                  The problem is that the bar chart below is really a stacked bar with one sliver for each customer ID. Tooltips and clicking on the bars are annoying, since you see just that sliver, but you can disable tooltips and use reference lines to put a the total value on each bar.


                                  If you're thinking that WINDOW_SUM() should fix this if the addressing / compute using is set to Customer ID, you'd be right except that Tableau doesn't let you partition a table calc (Count of Customer) using a field that is also a table calc (Customer Status).


                                  While it might be cleaner and faster to take the WINDOW_SUM() out, I've left it in, because I believe there is a workaround that requires some voodoo zen.


                                  In addition to the clicking / tooltip issues, this means that you can't create a stacked bar chart or show the percentage of total.


                                  Again, I agree with the advice to do as much as you can in the database. But sometimes you want to play with something before digging into the SQL. And I get a kick out of finding a use for PREVIOUS_VALUE().




                                  3 of 3 people found this helpful
                                  • 14. Re: Customer Segmentation calculation(s)
                                    Susan Oner

                                    Thank you, Jim.  I will try this - totally makes sense to test it before creating additional tables/bank-end work.


                                    I've used the stacked bar view with individual records in the past...  The only issue I have with this is that if you want to create a true stack by year (i.e. new, retained, react on top of one another then color by segment) there isn't a way to label the segments or % of total.  But I appreciate that this gets almost all of the way there without SQL - hugely helpful.



                                    1 2 Previous Next