1 2 Previous Next 19 Replies Latest reply on Feb 4, 2013 12:22 PM by Joshua Milligan

    need help with table calculation of status history

    mohen leo

      Hi!

       

      I need a little help with a table calculation:

       

      I have a record of tasks and status changes on the tasks, with the dates the status changes were made.

       

      task

      status change date

      status

      A070

      1/4/2012

      inactive

      A040

      1/7/2012

      A

      A060

      1/9/2012

      A

      A060

      1/10/2012

      inactive

      A080

      1/13/2012

      B

      A060

      1/13/2012

      inactive

      A080

      1/14/2012

      inactive

      A040

      1/17/2012

      inactive

      A030

      1/20/2012

      B

      ...

      (and so on)

       

      What I need is an area chart that shows the status history of the active tasks.

       

      With the date on the horizontal axis, I would like an area chart that for each date shows the total number of tasks with status A and total number of tasks with status B at that time, while excluding inactive tasks.

       

      (rough hand-drawn sketch, not matching provided data)

       

      taskhistorysketch.jpg

       

      Tasks can go back and forth between statuses over time, become inactive and active again, and there aren’t status entries on every date for every task. Also, new tasks are added over time that didn’t exist from the start.

       

      For every date, I need to get the most recent status of every task up to that date, exclude inactive tasks, and chart the number of status A tasks and status B tasks.

       

      I can't figure out the table calculation that will do that.

       

      Any help would be appreciated.

       

      A sample data excel file is attached.

       

      Thanks!

        • 1. Re: need help with table calculation of status history
          Mark Holtz

          Hello,

          I am not sure you can achieve what you want to do with your data structured as it is now. From what I understand, you essentially want to be able to count a task as being in a certain status on each day through a period of time.

           

          I am assuming that the data you supplied shows only the points when the status CHANGES from whatever it was to the new status designated for that record. To make the chart you mocked up for task A070, you would want to have a record for each day of its "lifespan." I have done what I'm describing for a a handful of the tasks in your data and then mocked them up in Tableau.

           

          This is the easiest way I know to get at what you're after.  Essentially, you want to graph data that you do not currently have. (the Status for tasks in between their "change" dates). Within Tableau, I don't otherwise know how to create this data that is missing from your original source.

          • 2. Re: need help with table calculation of status history
            mohen leo

            Thanks for your response.

             

            Would there maybe be a way to "fill in the gaps" in a calculated field?

            • 3. Re: need help with table calculation of status history
              Joshua Milligan

              Mark,

               

              Tableau can indeed create the missing data.  Whether it can be done easily is another question. 

               

              The concept is known as domain densification and I am just starting on a journey to understand how it works.  Some of the Zen Masters like Joe Mako and Jonathan Drummey have been pioneering how domain densification (domain padding and domain completion) works in Tableau for some time now.

               

              I took a shot at solving this question and feel like I got close.  I haven't given up yet -- I think it can be done.  I'll post what I have so far (which is a horrible mess, as it contains about 4 or 5 different attempts, poorly named calculations, etc...).  I certainly welcome the guidance of the masters - though they may want to start with a clean slate to show how it can be done!

               

              Joshua

               

              Padded Status.png

              • 4. Re: need help with table calculation of status history
                Shawn Wallwork

                You're both in luck, if you can wait 'til Thursday. Joe, Richard & Jonathan are leading a domain padding discussion. Registration info here: http://community.tableau.com/message/198586#198586

                 

                --Shawn

                • 5. Re: need help with table calculation of status history
                  Joshua Milligan

                  Mohen,

                   

                  Yes, believe the gaps can be filled.  You can see above my attempt, which gets close, but I can't quite get it to the visualization you want.

                   

                  I failed to mention  Shawn Wallwork and Richard Leeke who also have a good grasp of domain densification.  They may have some guidance as well.

                   

                  Joshua

                  • 6. Re: need help with table calculation of status history
                    Joshua Milligan

                    Shawn,

                     

                    That sounds wonderful!  Except, that Thursday is the one day I can't do.  Will the discussion be recorded?

                     

                    Joshua

                    • 7. Re: need help with table calculation of status history
                      Shawn Wallwork

                      Yep, they usually are. You should join the TDT group and you'll get email notices. [FYI: I ain't part of the domain padding crowd, just an interested by-stander.]

                       

                      --Shawn

                      • 8. Re: need help with table calculation of status history
                        Joshua Milligan

                        Shawn,

                         

                        Thanks!  I will definitely have to join and watch.  I assumed you were part of the domain padding crowd as you always seem to have a good grasp of the subject.

                         

                        I did come up with an answer!  (I hope it's right... )

                         

                        Joshua

                         

                        Padded Status.png

                        • 9. Re: need help with table calculation of status history
                          mohen leo

                          That looks correct, thank you!

                           

                          I'm a little intimidated by the solution, though. There's no way I would have come up with that on my own.

                           

                          I'll definitely watch the discussion on Thursday.

                          • 10. Re: need help with table calculation of status history
                            mohen leo

                            I have a follow-up question, though:

                             

                            If statuses could change more than once in a day (and the status change time was recorded with date and time), how could you solve this problem (whether it's in Tableau or even formatting the data before giving it to Tableau)?

                             

                            You wouldn't be able to pad statuses to the level of detail of every minute or even second of every day, right?

                            • 11. Re: need help with table calculation of status history
                              Joshua Milligan

                              Mohen,

                               

                              As soon as I get a chance, I'll clean-up my work and add some explanation.  It is complex, and I've observed quite a few discussions on whether this kind of a solution is practical or not.  I'm still learning the complexities, so maybe some of the experts would like to jump in and critique my work or provide alternatives.

                               

                              Joshua

                              • 12. Re: need help with table calculation of status history
                                Joshua Milligan

                                Mohen,

                                 

                                That's a good point.  I worked the solution at a day level -- though domain padding can work at any level (hour, minute, second).  I'd have to double check whether the first table calculation is set to run along the deepest level of the date field, or just the day.  Changing the date to continuous seems to be okay (and it actually results in a better looking chart), but I'd have to double check at different levels to make sure that didn't impact anything either.

                                 

                                When I clean up the workbook and write-up some explanation, I'll run some test cases and let you know what I find out.  I'm also looking forward to the discussion on Thursday!  (Though I'll have to watch it later)

                                 

                                Joshua

                                • 13. Re: need help with table calculation of status history
                                  mohen leo

                                  Great, thanks!

                                   

                                  I watched the discussion today and while it was really interesting and informative, I still don't know how to solve this particular problem.

                                   

                                  I've attached a slightly revised version of the data source excel file.

                                   

                                  In this one I'm displaying the date and time, and I've manipulated a few entries so that a handful of statuses change twice within a matter of minutes or seconds.

                                  • 14. Re: need help with table calculation of status history
                                    Joe Mako

                                    This takes advantage of undocumented features of Tableau, so this is potentially risky without deep awareness of how Tableau works.

                                     

                                    The logic I used is:

                                    - Complete the Domain for all potential Task Date Time combinations

                                    - The formula for "Fill Status" is:

                                     

                                    IF ISNULL(LOOKUP(AVG(1),0)) THEN

                                      PREVIOUS_VALUE("inactive")

                                    ELSE

                                      MIN([status])

                                    END

                                     

                                    The test <code>ISNULL(LOOKUP(AVG(1),0))</code> checks to see if the mark is a padded one or exists in the query results. If the mark (combination of Task and Date Time value) does exist in query results, then return "Status", otherwise carry forward the previous status returned from this formula, or "inactive" if there is no previous mark. This will address along the Date Time field

                                     

                                    - Then sum 1s for each mark within each Date Time value, only returning a mark for the first Task with the formulas like:

                                     

                                    IF FIRST()==0 THEN

                                    WINDOW_SUM(IF [Fill Status]=="A" THEN 1 END,0,IF FIRST()==0 THEN LAST() ELSE 0 END)

                                    END

                                     

                                    in v8, you would want to write this as:

                                     

                                    IF FIRST()==0 THEN

                                    WINDOW_SUM(IF [Fill Status]=="A" THEN 1 END)

                                    END

                                     

                                    the arguments removed were to help v7 evaluate the computation faster.

                                     

                                    - Finally this takes advantage of the Area Mark Type to create the final result.

                                     

                                    I left some extra sheets so you can see the progression.

                                     

                                    Anyone is welcome to get in contact with me if they would like to discuss the details of the techniques used here.

                                    1 2 Previous Next