6 Replies Latest reply on Jun 9, 2013 9:55 AM by Matt Lutton

    Calculating Full Day vs Half Day

    Matt Lutton

      I am pretty new to Tableau and am trying to learn as much as possible about calculations, in particular, as data is presented to me in a way that will require me to calculate often.

       

      That being said, I've gotten a lot of great help from forum members already, and I really appreciate it.  I've been working with some sample attendance data, and am now trying to calculate a half day versus a full day absence.  I am attaching a packaged workbook.  My attempt at calculating a very simplified value for a half day or full day is shown in the calculation "Absence Value".  I end up with .5 regardless of the number of periods listed.  (I want to get a total day count of absences, based on these values).

       

      I have tried a few formulas, but left it at what you see in the attached workbook.

       

      Again, it is important that I am able to aggregate these values further (show a total # of absences by day count for a given time period).  I will also need to take into account whether the field "Absence" is marked as "Absent" or "Tardy" and am only concerned with instances where "AbsenceType" is marked as "Unexcused"

       

      In another thread (http://community.tableau.com/message/213215), Dimitri Blyumin helped me reach a simplified Day Count of Unexcused Absences using the formula in the calculation "Unexcused Absent Count (fixed)". I am trying to add another layer into this problem and would like advice on how to approach it.  If I can nest this in one calculation for a Day Count of Unexcused Absences, that would be the ideal scenario.  If that isn't possible, I understand.

       

      I hope my problem makes sense.  Again, I am very new to this. I do not post to get others to solve my problems, but rather, to learn from the best. If you can walk me through some logic, and the HOW and WHY things must be done a certain way, I would be forever in your debt.

        • 1. Re: Calculating Full Day vs Half Day
          Jim Wahl

          Hi Matthew,

           

          I don't fully understand your data or your goal, but I believe the 0.5 issue is a result of how Tableau handles aggregations.

           

          When you add an aggregate such as SUM(Sales) or COUNT(Items) to a view, you get the sum of sales over all values. You can try this by double clicking on Sales in the Superstore example data set. But when you add additional dimensions, Tableau segments the data into "panes" for each dimension and the aggregation is calculated per pane. Double click on category to see an example.

           

          In your case, you have several dimensions including ID > Date > Period in the view and, therefore, the COUNTD() function in Absence Value is being calculated on a per period granularity---it will never "see" more than one period.

           

          To aggregate at a different level of granularity, you need table calcs. Either use the WINDOW_xxx() functions, or you can wrap a normal aggregate function, such as COUND(), in a TOTAL() function. Then you can edit the compute using to adjust how the values are combined (or addressed) and how they are divided (or partitioned).

           

          Again, I don't quite understand the value you're trying to calculate and how this will be used in a view, but you have a couple of options.

           

          One is to use a WINDOW_SUM() function around the Unexcused Absent Count (fixed) field. For example

          IF WINDOW_SUM([Unexcused Absent Count (fixed)]) > 2 THEN 1

          ELSEIF WINDOW_SUM([Unexcused Absent Count (fixed)]) > 0 THEN 0.5

          ELSE 0

          END

           

          Once added to the view, you'll need to make sure ID, Date, and Period are all in the right-hand "Addressing" box in the table calc's advanced compute using section. The addressing is hierarchical, so they'll need to be in that order (from top to bottom). Then you'll want to select reset every Date.

           

          Another option, I think, is to extend Dimitri's trick to add period to the string COUNTD(). So you get something like

          COUNTD(

              IIF([Absence] = 'Absent' AND [AbsenceType] = 'Unexcused', STR([ID])+STR([Absence Date])+[ClassPeriodName], Null)

                 ) )

           

          You could wrap this in IF ... > 2 THEN 1 ELSEIF ... > 0 THEN 0.5 ELSE 0 END. Note that in the IF-THEN clause, you want >2 condition first; otherwise if the >0 is first, a value of 3 will be TRUE on this first condition and return 0.5.

           

          Now if you use this formula by itself in a view, it should work fine, but as you start adding dimensions (to any part of the view, including color or shape) then the aggregate calculation will be at the finest level of detail. You avoid this by wrapping this function in TOTAL() and setting the table calc's compute using field.

           

          I hope this helps (and I didn't completely miss your issue). I'd be happy to take this a bit further, but it would be easier for me to get my head around a smaller data set---10 students, 3 dates, 4 periods---and a few use cases / visualization goals.

           

          Jim

          1 of 1 people found this helpful
          • 2. Re: Calculating Full Day vs Half Day
            Matt Lutton

            Thanks, Jim.  I haven't studied this in depth yet, but can I calculate a total of these values?


            The end goal is:  a summation of the values, based on a filtered time period.  So, if the last 5 days are selected, I need to be able to see the total value of absences based on this calculation.  These table calculations returned the results desired, at the student level, but how do I get the total of the values added together?


            I need to be able to adjust the "day count" graph to represent the sum of these values, which I don't know if I can do with a Table Calculation.  See example below:

            AbsenceTrend2.jpg

             

            It may be important to note that I don't actually need to see the class periods in the view, I just had them there to check the calculation.  I really just need a very simplified table of data, similar to this:

            Absence or Tardy Calc.jpg

            Thank you for taking the time to help!

            • 3. Re: Re: Calculating Full Day vs Half Day
              Jim Wahl

              Hi Matthew,

               

              Thanks for the additional details. The basic problem is to calculate for every day and every student (ID) the number of periods the student is absent or tardy. If > 2 periods, then count it as 1 day of if between 1 and 2 periods, 0.5 day. You want to generate

              1. a plot showing sum of absent days for all students over time and
              2. a table report with student ID, count of absent days, and other metrics.

               

              The tasks generally require table calcs, since you need to aggregate at different levels, first calculating the number of periods a student is absent on a particular day and then either 1) summing the result of all students for each day for the graph of 2) summing all days for a student for the table report.

               

              While these table calcs are not difficult, they are a bit tedious and most of the items in the table report will require a table calc, since the class period level of detail is required in the view (but you need your fields to operate at a higher level of aggregation). Depending on your data "density" you may also encounter issues with discontinuity in the line graph. I don't see this in your sample data, but you'll encounter it if you don't have at least one absence per period per day in the data.

               

              Reshaping Data

              I'll show you the table calcs I used below, but you may want to consider reshaping your data and consolidating the periods into a "count of periods" column. Your new data would have one row per ID, Date, Absence and Absence Type. You could either create a view in your database or use Tableau's custom SQL with a count(*) and a group by statement.

               

              Once you do this, I don't think you'll need table calcs. And your life will be easier.

              For example, Unexcused Absences Days becomes

              IF [AbsenceType] = 'Unexcused' AND [Absence] == 'Absent'  THEN

                  IF [count_of_periods] > 2 THEN 1

                  ELSEIF [count_of_periods] > 0 THEN 0.5

                  ELSE 0

                  END

              END

               

              You can use the same calc in either the table or graph view. And fields like Date Last Absence are easy ---

              MAX(IF [Absence]='Absent' THEN [Absence Date] END).

               

              Table Calcs

              The other approach is table calcs. For a field like Unexcused Absences Days, you need to calcs because you need to aggregate at two levels. First you need to count the absence days for each student, based on class periods (see comments in workbook fields for more detail on FIRST() and TOTAL()):

              Unexcused Absence Days (intermediate)

              IF FIRST() == 0 THEN 

                  IF TOTAL(SUM(IIF([Absence] = 'Absent' AND [AbsenceType] = 'Unexcused', 1, 0))) > 2 THEN 1

                  ELSEIF TOTAL(SUM(IIF([Absence] = 'Absent' AND [AbsenceType] = 'Unexcused', 1, 0))) > 0 THEN 0.5

                  ELSE 0

                  END

              END

               

              And then add these days for a each date or ID, depending on the view.

              Unexcused Absence Days

              IF FIRST() == 0 THEN

                  WINDOW_SUM([Unexcused Absence Days (intermediate)])

              END

               

              When adding these to the view, you need to have ID, ClassPeriodName and Absence Date in the view. For the graph, Absence Date is on the columns shelf and ID and ClassPeriodName are on the level of Detail shelf. After adding Unexcused Absence Days, select compute using advanced and you'll see that this is a nested table calc---there's a pull-down menu at the top of the edit table calc dialog box)---and you set the compute using parameters for both Unexcused Absence Days and the inner calc Unexcused Absence Days (intermediate).

               

              Start with the intermediate field. You want Absence Date > ID > ClassPeriodName in the right-hand addressing box in that order. Click OK and select reset every ID. This will run the calculation using all Class Periods for every ID and Access Date.

               

              The compute using for the outer field depends on the view. For the graph, you want the same hierarchy as above, but want to select reset every Access Date. For the table, which is summarized by student ID, you'll want ID on top of the hierarchy, and you'll want to sum for all Access Dates.

               

              Now you'll also need table calcs for fields like Last Absence Date, assuming you want this field on the same view as the Unexcused Absence Days. These will look like WINDOW_MAX(MAX(IF [Absence] == "Absent" THEN [Absence Date] END)).


              In the table, you'll also want to filter to exclude NULL values.

               

              Anyway, hope this helps. Again, I'd probably reshape the data if I were in your shoes. ...

               

              Jim

              • 4. Re: Re: Calculating Full Day vs Half Day
                Matt Lutton

                Thank you for your reply.  I agree, if the data were presented in a way that made more sense, my life would be a lot easier.  We are working with a LotusNotes based system that feeds into our data warehouse... and its messy to say the least.

                 

                We did what you suggested (in terms of reshaping the data) on Friday, and now I have a period count like you described  However, it does not differentiate between tardies and absences.  I will try nesting if statements as you exemplified and see if that works. 

                 

                I will read and re-read your post and see what seems to make the most sense for us.  Thank you, again!

                • 5. Re: Re: Calculating Full Day vs Half Day
                  Jim Wahl

                  No problem. Good luck.

                  1 of 1 people found this helpful
                  • 6. Re: Re: Calculating Full Day vs Half Day
                    Matt Lutton

                    Seriously, you just gave me a very nice "Ah ha" moment.  I am forever in your debt.    THANK YOU, I believe the reshaped data works with your calculations, and the explanation you provided really helped me get my head around this problem.

                     

                    Thank you, thank you, THANK YOU.