2 Replies Latest reply on Jun 14, 2013 5:11 AM by Matt Lutton

    Help with Data Blend and Grouping

    Matt Lutton

      Hello all:

       

      I have created a sample workbook to exemplify what I'd like to do.  I have connected to three data sources--one with attendance details, another with reshaped attendance data, and a 3rd Excel sheet.  I am using the Excel sheet to incorporate a number of possible days in a school year (180 in this example).  The attendance details and summary are basically the same fields, but the summary has a class period count that helped me calculated Unexcused Absences without a Table Calculation (thanks Jim Wahl

       

      As shown in the attached workbook, I calculated each student's attendance rate based on an Unexcused Absence count and a set # of possible days.  This is all just for testing purposes, I realize there are other variables that could come into play, but my question is specific to one area:  I have class periods grouped together in the second sheet.  I'd like to do the same type of attendance rate calculation, but at the class period grouping level instead of of student level. 

       

      I'm not sure if blending the data was ideal in this situation, so feel free to give advice in that realm as well.  My main question is:  how should I approach the class period groupings and calculating an attendance rate for those class periods.  The periods are grouped because in one scenario I am looking at, periods 1 & 5 are both period 1, but one is a red day and the other is a blue day (block scheduling).  So, grouping them together made sense.

       

      I am pretty new to Tableau.  If you can explain the numbers represented in the second sheet with the class period groupings, that would help me in my approach.  I want to make sure the numbers I include in this type of sheet are accurate--and anytime I try to add a distinct count of IDs to the view, it greys everything out so I can't check the numbers.  I would like to understand what affect the fields I choose to blend on have on the data--for instance, if ID is not blended on, the days possible stays at 180.  Are the numbers shows in days possible now representative of the # of students in each grouping multiplied by 180?? 

       

      I hope what I'm saying and trying to do here makes sense.  Any advice on what fields I should or should not use for blending, and where these numbers that are displayed come from, would be appreciated!  Also, if there is a better approach for calculating attendance rates for the class period level, I am all ears.  I know I have a lot to learn.

       

      As always, THANK YOU forum members, for helping me learn.

        • 1. Re: Help with Data Blend and Grouping
          Jim Wahl

          Hi Matt,

           

          Sorry I missed your ping on this. Are you still working on this?

           

          I would like to understand what affect the fields I choose to blend on have on the data--for instance, if ID is not blended on, the days possible stays at 180.  Are the numbers shows in days possible now representative of the # of students in each grouping multiplied by 180??

          Blending is Tableau's version of Excel's VLOOKUP, with a twist or two. The main one in your example is that the blend is done at the level of detail specified by the linked fields, which in Tableau 8 you can manually configure regardless of whether the fields are in the view. This allows blending at different levels of aggregation.

           

          When you blend on School Year, Tableau retrieves the 180 DaysPossible for the 2012-2013 school year, for each row in your view. If you weren't filtering on School Year, you'd see 180*the number of school years where the class period was present (presumably all).

           

          When you add the ID link, Tableau does the blend at the ID level of detail, so you get 180*the number of rows---since each row has an ID. Again this is segmented by the other dimensions in the view---Class Period Group in this case. At this point, blending starts to look like a JOIN, since it's reporting the DaysPossible for every row in the primary data source.

           

          Calculating Attendance Rate by Class Period

          I'm not sure this is possible with your current data set.

           

          Attendance rate for students has the formula:

          (1) DaysPossile - (2) number of days student absent

          ----------------------------------------------------------------

          (1) DaysPossible

           

          (1) is the total number of days the student should have been present.

           

          Assuming I understand your calculation for class period attendance rate, the denominator needs to include the total number of students that should have been present during that period.

           

          (1) StudentsPossible - (2) number of students absent

          ----------------------------------------------------

          (1) StudentsPossible

           

          Say there are 5 class periods per day and 10 students. If a student is absent in the first period, that period would have an attendance rate of 90% and the remaining would be 100%. Your data shows the count of students absent by period, but it doesn't have the total number of students?

           

          To calculate this over a time period, you could average the daily rate. Taking the above example over 10 days, if the student was only absent the first period on one day, the 10-day attendance rate for that period would be 99%. (90% + 100%*9) / 10 = 99%.

           

          You could get the same result with

          (1a) StudentsPossible*DaysPossible - (2a) number of students absent over that time period

          ---------------------------------------------------------------------

          (1a)

           

          If you assume that the number of students possible is fixed over the school year, this might easily be added to your secondary data source with days possible. You could even do the multiplication in the secondary data source, and now you just need to lookup one value. And I don't think you need the third data source (absences summarized by class period at the student level) at all.

           

          Anyway, hope this is still of some help to you.

           

          Jim

          • 2. Re: Help with Data Blend and Grouping
            Matt Lutton

            Thanks.  I know what calculations are needed,but didn't know if there was a way to utilize some option I'm unaware of in this example.  I appreciate your response, as it is helpful in helping me understand data blending and what some options are if the data just isn't there.  Thanks again!