2 Replies Latest reply on Jan 31, 2017 10:20 PM by Jacob Dee

    Managing large number of disaggregations for attributes

    Jacob Dee

      Good morning all,

       

      I am working with a dataset that contains a single continuous measure variable and a single variable for the primary attribute that describes the category of the continuous measure (e.g. height, weight, BMI). However, the dataset then contains a "disaggregation type" variable which describes the kind of disaggregation (e.g. Age, ***, marital status, etc) and then a third variable that a "disaggregation category" variable that actually lists the disaggregation groups (e.g. 0-5, 6-10, 1-15 /  male, female /  single, divorced, married / etc). These are quite numerous.

       

      Is this the best formatting for analyzing these data in Tableau? If not, how would you change? If so, how would you allocate these three variables to the different panes to be most efficient with the analysis?

       

      Apologies if this is a naive question.

       

      Many thanks.

        • 1. Re: Managing large number of disaggregations for attributes
          Jonathan Drummey

          Hi Jacob,

           

          This is an example of a super "tall" format of the data and can work fine in Tableau for many analyses. Where you can run into difficulties are:

           

          1) Number formats for your primary variable. In this case you might create additional calculations like IF [primary attribute] = "BMI" THEN [variable] END, then format that one to be a number with one decimal place. Then another might be IF [primary attribute] = "Height" THEN [variable] END and that is formatted to a number with 0 decimal places.

           

          2) Creating user changeable filters. For example if you have a view with a disaggregation category of age and then want a quick filter to just show the values for the age groups then you'll need to make sure the quick filter has Only Relevant Values selected.

           

          3) If there's no "respondent ID" or "person ID" kind of field in the data then you can skip this. If there is an ID and you want to be setting up filters for users to get at things like males/18-25/single then you'd need to do some work to either "unpivot" the data to be wider format (so there would be a row for each person with columns for the demographic categories) or build out calculations to create new dimensions to use. For example {FIXED [person ID] : MAX(IF [category] = '***' THEN [attribute] END)} would create a new dimension identifying the *** of each respondent, and so on.

           

          Jonathan

          • 2. Re: Managing large number of disaggregations for attributes
            Jacob Dee

            Thanks Very much Johnathan, this is very appreciated. 

            So there is no particular benefit to concatenating these three variables into one ("BMI male 15-20") as opposed to having this information spread across three separate variables? I am struggling to conceptualize how I would distribute these three variables across the panes if I wanted to see the relationship between, for example, "BMI male 15-20" and "blood pressure female 20-25". Or if I wanted to create a calculated variable based on just the disaggregation BMI male 15-20. 

            Again, apologies if this is a naive question. Your assistance is really valued. 

            Thanksjacob