3 Replies Latest reply on Oct 20, 2016 7:55 AM by David Li

    How can I normalize percentages based off number of records?

    Kristie Wirth

      Hi there,

       

      I'm doing a basic graph to show the percent of students who graduated within each major. Something like:

       

      Biology - 50% graduate, 50% don't graduate

      Art - 30% graduate, 70% don't graduate

      etc.

       

      I want to show the majors that have the highest graduation percentages at the top of my graph. However, I want to figure out if there is a way to normalize the percents. What I mean is, some majors are so small they only have 3 students total. So, some majors may have a graduation rate of 66.6%, but really that just means 2 of the 3 students graduated. I want to somehow standardize the percents based off the number of students that make up each percent so that these smaller majors are weighted less. Does anyone know a way to do this in Tableau?

        • 1. Re: How can I normalize percentages based off number of records?
          David Li

          Sure, you can do this a number of different ways. But how would you want to adjust the percentage results, exactly? It seems misleading to say that the sum of graduates and non-graduates for any major should be less than 100%. Instead, perhaps you could add another number indicating how many people graduated?

          • 2. Re: How can I normalize percentages based off number of records?
            Kristie Wirth

            Hi David,

             

            I'm honestly not sure. Are there any best practices around normalizing data like this?

             

            What are the ways that you know about? I'd be open to doing the normalizing on the hard numbers behind the percentages instead if that makes more sense.

            • 3. Re: How can I normalize percentages based off number of records?
              David Li

              Well, I admit I'm not a trained statistician, but I would think that in most cases, you should report the number with the standard calculation and then just add some kind of note saying that the sample size is small. This is especially the case here, where it wouldn't make any sense if the two numbers didn't add up to 100%. I also don't think it would make sense to adjust both values so they have a different midpoint.

               

              For instance, let's think about what numbers you would actually want to show. Let's say that you have a major, Underwater Basket-Weaving, that has 2 graduates and 1 non-graduate. What % graduated number would you show that would accurately represent both the graduation rate and the small sample size? Would you decrease the % graduated? Increase it? Those are really your two options, and personally, I think they're both misleading, so I think it'd be wise to avoid trying to encode two entirely different pieces of information into one number.

               

              Instead, I highly recommend that you consider creating a view that encodes the two pieces of information separately. For instance, maybe something like this:
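
To make the idea of keeping the two pieces of information separate concrete, here is a minimal sketch in Python rather than Tableau. The counts for Biology and Art are made up to match the percentages in the question, the cutoff of 10 students for a "small sample" note is an arbitrary assumption, and `summarize` and `MajorSummary` are hypothetical names used only for illustration.

```python
from collections import namedtuple

# One row per major: the plain percentage and the sample size stay separate.
MajorSummary = namedtuple(
    "MajorSummary", "major graduated total pct_graduated small_sample"
)

# Hypothetical counts, for illustration only: (graduated, did not graduate)
counts = {
    "Biology": (50, 50),
    "Art": (30, 70),
    "Underwater Basket-Weaving": (2, 1),
}

SMALL_SAMPLE_THRESHOLD = 10  # assumed cutoff, not a statistical rule


def summarize(counts, threshold=SMALL_SAMPLE_THRESHOLD):
    """Return per-major rows sorted by unadjusted graduation percentage."""
    rows = []
    for major, (grad, non_grad) in counts.items():
        total = grad + non_grad
        if total == 0:
            continue  # no students, so no percentage to report
        pct = 100.0 * grad / total  # standard calculation, left unadjusted
        rows.append(MajorSummary(major, grad, total, pct, total < threshold))
    rows.sort(key=lambda r: r.pct_graduated, reverse=True)
    return rows


for row in summarize(counts):
    note = " (small sample)" if row.small_sample else ""
    print(f"{row.major}: {row.pct_graduated:.1f}% graduated, n={row.total}{note}")
```

The percentage is never altered; the student count and the small-sample note travel alongside it, so a reader can see that a high rate rests on only three students.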