7 Replies Latest reply on Apr 19, 2016 3:28 AM by ANTONIO GAZQUEZ

    visualise survey data - Multi-response questions and response combinations

    Amy Dyer

      I’ve looked on various blogs and forums and am really struggling on a way to analyse/visualise multi-response questions in groups? Within my survey dataset I have about 8 “select all that apply” questions and 6 “single response” questions which have been restructured within alteryx and is ready to use in tableau.  A couple of the key 'select all' questions would have a much stronger story if I was able to group some of the responses.  Hopefully below explains what I am getting at a bit better.

       

      e.g. Q1. What is your favourite fruit? (select all that apply)

      I have the data structured so I can see the numbers of people who selected each individual fruit:
      A. Apple
      B. Pear
      C. Lemon
      D. Lime
      E. Kiwi
      F. Banana
      G.Mango

       

      But I also want to look at the number of people who have made specific selections:
      H. Apple and Banana only
      I. Apple, Banana and Lemon only
      J.Kiwi and Lime only

       

      Does anyone have suggestions and recommendations on how i could visualise this? I have only recently had the world of tableau shown to me, and I would previously have done this as pivot tables in excel, but this not only seemed clumsy but I want to use tablau and lateryx more and more and we carry out similar survey's on a regular basis with this type of question and analysis needed.

       

      Thanks

      Amy

        • 1. Re: visualise survey data - Multi-response questions and response combinations
          Jennifer Pattershall-Geide

          Hi Amy,

           

          Perhaps a solution is to create a "sum" variable (either in Tableau with a calculated field or prior to import) that will let you know how many options a person selected? When I have multi-response items, typically they come coded as 1 for selected and 0 for not selected. So, if you create a sum of the fruit questions, you'd know that a 2 means two options were selected, and so on.

           

          From there, I could see possibly using a series of filters or parameters to view the data, though it might be kind of clunky. For instance, a filter for the "sum of fruit" variable, then filters for each type of fruit. So, I could set the filter to 2, and check "apple" as well, and then I'd only be seeing people who selected two fruits, where one of them was apple.

           

          I've not actually done this, so this is more a brainstorm, but I think it's a possible solution. I won't claim that it would be the best one, though!

           

          Good luck!

          Jen

          • 2. Re: visualise survey data - Multi-response questions and response combinations
            Jonathan Drummey

            Hi Amy,

             

            Are you looking to show the combo groups all at once (i.e. all three in the same view) or have the user make dynamic selections as Jennifer had described?

             

            Jonathan

            • 3. Re: visualise survey data - Multi-response questions and response combinations
              Amy Dyer

              Jonathan,

               

              Ideally I would like to show the combo grups together.

               

              My data is not actually based around fruit but healthcare information :-).  We have asked health professionals who use a particular tool when they use it.  So being able to see how many use it during particular stages (or fruits) eg Do you use the toold during any of these stages ? Stage 1,2,3,4,5 or 6 (select all that apply).  Analysing and visualising the number of responses to various combo group "Stage 1,2,3 only" vs "Stage 1 and 3 only" vs. "Stage 1,2 and 4" vs. "All stages" would be really useful.

               

              Thanks

              Amy

              • 4. Re: visualise survey data - Multi-response questions and response combinations
                Jonathan Drummey

                Hi Amy,

                 

                My assumption here given you mentioning Alteryx is that your data is in a "tall" structure with one record per respondent/question/answer, so for the multiple answer questions there will be multiple rows. If your data is in a different structure then a different approach would be required.

                 

                There are at least a couple of ways to tackle this, I chose to do this via multiple measures. In this example I used Superstore where the product Category is a stand-in for the stage/fruit (i..e the answer for a particular question) and Customer ID for the health professional. The goal is to end up with something like this view that counts how many customers purchased how much of what combination:

                 

                 

                1. I built out record-level calculations that return the Category for the given categories of a given cohort and Null for everything else. Here's the Binders, Office Furnishings, Storage & Org calc:

                 

                CASE [Category]

                    WHEN 'Binders and Binder Accessories' THEN [Category]

                    WHEN 'Office Furnishings' THEN [Category]

                    WHEN 'Storage & Organization' THEN [Category]

                END

                 

                And here's a view showing this:

                 

                 

                2. Since in the final view we only want to show 3 rows and not bazillions we've got to do some aggregation. There are two stages of the aggregation - one is to go across the records to determine if a customer has purchases from all N categories in a cohort, and then to count all the customers in each cohort. This is where Tableau's Level of Detail expressions come into play, they are great for situations where we want a calculation at a finer grain than the resulting view.

                 

                Here's the Binders, Office Furn, Storage & Org LOD initial calc:

                 

                {INCLUDE [Customer ID]: COUNTD([Binders, Office Furnishings, Storage & Org])}

                 

                In a view with Customer ID this is going to return the number of Categories in the cohort:

                 

                 

                FYI the reason why I'm using COUNTD() and not COUNT() is because in Superstore a customer may have multiple purchases in the same Category.

                 

                3. Then we can use that formula inside a larger calculation to only return the Customer ID if there have been enough purchases (or answers) to meet the criteria. Here's the Binders, Office Furn, Storage & Org LOD final calc:

                 

                IF {INCLUDE [Customer ID]: COUNTD([Binders, Office Furnishings, Storage & Org])} = 3 THEN

                    [Customer ID]

                END

                 

                Here's that calc in a view:

                 

                 

                4. Then I changed the aggregation of the 3 final calcs to COUNTD and put them in a view all their own:

                 

                 

                The reason to use COUNTD() here is also that the data has a finer grain, it's at the transaction level and I really just want to count the customers.

                 

                To validate that the calcs are accurate we can also use Tableau. I set up  view that filters for just Binders and Binder Accessorites, Office Furnishings, and Storage & Organization, put Customer ID as the dimension in the view, and set up a filter on COUNTD(Category). This only returns 203 marks, and since Customer ID is the dimension in the that means that there are 203 customers meeting these criteria, which is exactly the same as the measure we just built:

                 

                 

                I've attached a Tableau v9.2 packaged workbook. If you have any questions let me know!


                In the above instructions there are 9 calculated fields, though in the final view only 6 are used. Those 6 could be simplified to 3 (one for each cohort) by embedding the record-level calcs into the LOD expressions, or using Sets to build your cohorts and putting the Sets into the LOD expressions instead of the record-level calcs.

                 

                Jonathan

                1 of 1 people found this helpful
                • 5. Re: visualise survey data - Multi-response questions and response combinations
                  Amy Dyer

                  Jonathan,

                   

                  That's really helpful.

                   

                  It may be that I haven't quite followed your directions correctly but am I have found that by using the record-level calculation that return the category I am including respondents who have selected "Binders, Office Furnishings, Storage & Org" as well as other categories.

                   

                  What I'm trying to achieve is to look at the number of cusomters who ONLY have these products and not anything else. e.g.'Binders and Binder Accessories',

                  'Office Furnishings' and 'Storage & Organization' but NOT scissors, paper... etc.

                   

                  Is this a 'simple' calculation change?

                   

                  Thanks

                  Amy

                  • 6. Re: visualise survey data - Multi-response questions and response combinations
                    Jonathan Drummey

                    Hi Amy,

                     

                    I'm unable to tell what's going on since I can't see your calculations. If

                    possible, please post a Tableau packaged workbook (.twbx) with some sample

                    data. Sometimes posting screenshots of your calculations and the Tableau

                    workspace (including all shelves) can work. If you're unable to post any of

                    that here due to confidentiality issues, then we might be able to arrange a

                    screen share.

                     

                    In general, here's what I'd be looking at:

                     

                    1) Do the field values in the CASE statements exactly match the values of

                    the dimension you are using? This include punctuation and any leading or

                    trailing spaces that might be there.

                     

                    2) In the ...LOD final calcs do the numbers of answers/categories in the

                    cohort used in the IF statement exactly match the number of values in the

                    cohort? For example in what I'd posted the first cohort has 3 members and

                    the latter two both have 2 members.

                     

                    3) What is the viz level of detail (vizLOD) as determined by the dimensions

                    in the view and what are the ...LOD final calcs returning in that? What

                    aggregation is applied to the calculations? Understanding the level of

                    detail and aggregations is critical to making sense of what Tableau is

                    doing.

                     

                    4) Are there any filters in place that would be removing expected data

                    prior to aggregation?

                     

                    Hopefully that helps you figure it out, if not let me know!

                     

                    Jonathan

                     

                     

                     

                     

                     

                    On Mon, Apr 18, 2016 at 6:06 AM, Amy Dyer <tableaucommunity@tableau.com>

                    • 7. Re: visualise survey data - Multi-response questions and response combinations
                      ANTONIO GAZQUEZ

                      Hi Amy!

                       

                      I've been thinking about this and I guess what you need is a new variable which will be the result of the combination you previously set.

                       

                      Probably you have your data this way:

                          

                      Captura1.JPG

                       

                      To make this new dynamic variable (first or second combination) you will need n parameters and n calculated fields (as options you have in your multiple question).

                       

                      Parameters would be like:

                       

                      Captura2.PNG

                       

                      And calculated fields:

                       

                      CASE [Choose 1st fruit]

                      WHEN 'Apple' THEN [Q1_FRUIT]

                      WHEN 'Pear' THEN [Q2_FRUIT]

                      WHEN 'Lemon' THEN [Q3_FRUIT]

                      WHEN 'Lime' THEN [Q4_FRUIT]

                      (and so on...)

                      WHEN 'Null' THEN 0

                      END

                       

                      Then you would have a final calculated field which will build the new variable 'combination', the one you will visualize in your worksheet:

                       

                      IF [1stcalculatedfield]=1 AND [2ndcalculatedfield]=1 THEN 1 ELSE 0 END

                       

                      Finally, you can calculate with an average of the final combination variable. The user would have to interact with parameters to calculate the combination:

                       

                      Captura3.PNG

                       

                      Captura5.PNG

                       

                      Drawback here is that if you want just the 1st fruit you will have to select in every compact list the same fruit. I don't know how to solve this...

                       

                      Hope that helps! If you have any question please let me know!