13 Replies Latest reply on Oct 7, 2019 12:34 PM by Soumitra Godbole

    How to visualize average concurrent events per day

    Federica Calanca

      Hello everyone,
      I am fairly new to Tableau and the task I've been assigned is causing me some headaches...
      I have a data base of events, each with a given "start_date" and "end_date", in the format < DD/MM/YYYY hh:mm:ss > and I would like to visualize them per day (ideally horizontally, with a bar as long as their duration) and be able to tell which is the average number, each day, of concurrent events.

       

      I really don't know where to start... I read a couple of related topics, but felt they didn't suit my case. Any help/documentation/reference would be more than appreciated.

       

      Thanks!

        • 1. Re: How to visualize average concurrent events per day
          Jim Dehner

          Good morning

          this is what is referred to as the 2 date problem

          the way to address the issue is with a date scaffold - it is just another file with a sequential list of all the dates in the range of your data - you Cartesian join it with the base data and then count the programs that are active on each date -

           

          now the good news - I keep a step by step HOW To instruction model on my public site at Tableau Public

          it will show you haw to do the analysis - not difficult - just several steps

           

          it returns this

           

           

          Jim

          If this posts assists in resolving the question, please mark it helpful or as the 'correct answer' if it resolves the question. This will help other users find the same answer/resolution.  Thank you.

          • 2. Re: How to visualize average concurrent events per day
            Soumitra Godbole

            Hi Federica,

             

                No reason to worry or for Headaches (or Tylenol  ). Here is my simple solution with

            dummy data. I have attached the excel spreadsheet along with the twbx below.

             

            Dashboard.png

             

             

            The following are the steps outlined below

             

            1) Data: -

                There are 2 tabs in the Excel file, one containing the data and the other only the

                dates starting from 23rd Sep (Today) till the 30th of this month (1 week into the future)

             

               Tab 1 - Data

            Tab 1.png

             

            Tab 2 - Dates (all days between the start and end date)

             

            Tab 2.png

             

            The reason why the date field is repeated twice is because we will be doing a join and we need 2 such

            joins i.e. one for the  Start Date and One for the End Date.

             

             

            2) Data Join:-

                This is the heart of the problem and understanding this is important with regards to solving this

                problem. Here we do 3 joins, the first is a simple Cartesian Join (m x n). The 2nd and 3rd joins

                apply the conditions i.e. for each event, the Date has to be greater or equal to the start date

                and also lesser or equal to the end date. Lastly the reason for using Date function on the left

                was to convert the Date-time field on the left to equate it to the Date field on the right.

             

            Data Join.png

             

               Once you have got to this stage the rest is a piece of cake.

             

            3) Calculated Fields:-

             

            Formula 1 - Duration.png

                 Formula 2 - Concurrent Events.png

                These are the only two fields needed

             

             

            4) Creating the Visual :-

             

                Simply place the necessary fields (including the above calculated fields) into the shelves as shown

                below.

             

            Overall.png

             

            Finally , with some creativity and imagination you can arrive at the dashboard shown on

            top. I also added a filter action for a specific day (date) that filters the Gantt chart on the

            right. Double clicking it displays all the days and events. For Eg:- 27th Sep has 5 concurrent

            events and the Average duration would be 1.83. Also the breakup of each Gantt Bar shows

            the duration for each event ID.

             

            Filter Action.png

             

             

             

            Hope this helps and do let me know in case of any question or for more clarity. In case this

            answers your question, then please mark as helpful, correct answer. Thanks !

             

            Sincerely,

             

            Soumitra

            2 of 2 people found this helpful
            • 3. Re: How to visualize average concurrent events per day
              Federica Calanca

              Thank you Soumitra for the clear and detailed explanation!

              1 of 1 people found this helpful
              • 4. Re: How to visualize average concurrent events per day
                Federica Calanca

                Hi Soumitra Godbole,

                 

                Now that I pay closer attention to my requirements, your solution is good, but not exactly what I was looking for.
                Indeed, it counts as "concurrents" all events that took place in that day. But this is not quite true...

                 

                For example, I've added a "Test event" to your data:

                 

                Screenshot 2019-09-25 at 13.48.38.png

                 

                As you can see Event 01 and Event 11 are not concurrent (they do not take place at the same time).  Event 01 and Event 02 do not overlap as well... So the count for that day should be 0

                 

                That's what I am struggling with... Daily granularity is not sufficient. How can I adapt your solution?

                 

                Many thanks,
                Federica

                1 of 1 people found this helpful
                • 5. Re: How to visualize average concurrent events per day
                  Soumitra Godbole

                  Hi Federica,

                   

                      Thanks your you sincere opinion which is greatly appreciated. Believe it or not, this was the first

                  question I had in my mind after looking at my completed solution. What if some events were much

                  shorter than the others.

                   

                      The problem was that was no dummy data provided or an idea of roughly how long does

                  an event last (average, max time, min time) so I had no idea of what data you were looking at.

                  That is why I created  dummy data that typically had a duration in hours.

                   

                       No worries your problem can still be solved with minimum change as soon as you tell me what is

                  your least count (smallest unit) for time. I presume it would be seconds as your date format is

                   

                  dd-mm-yyyy  hh:MM:ss

                   

                  All this will do is add unnecessary rows  (86400 times the number when the unit is day). Luckily if the

                  events are short like in seconds, minutes or maximum an hour, then it should not be that bad.

                   

                      So before I start working towards arriving at the solution for your problem, I need to get an idea of

                  typically how long the events can last (duration). If not simply tell me the Max duration, Min duration and

                  the Avg Duration. Also it would greatly help if you provided a few rows of dummy (or fictitious) data.

                   

                  Sincerely,

                   

                  Soumitra

                  1 of 1 people found this helpful
                  • 6. Re: How to visualize average concurrent events per day
                    Federica Calanca

                    Hi Soumitra,

                     

                    thank you again for your reply and my apologies for not providing sufficient content in my question.

                     

                    So, the duration of the events contained in my DB is typically hours or days (depending on its "category"), but there are also events that last weeks or months or seconds, even though the majority stays in the hours / days range.

                    You are right in assuming the smallest unit is seconds: ideally I would like to have a "minutes granularity" in my viz, but rounding to hours can also be fine if it simplifies things.

                     

                    I'm attaching an enhanced version of your dummy data, that may be more similar to mine.

                    Screenshot 2019-09-25 at 19.31.11.png

                    Whilst your previous Gnatt chart is a great solution that I would like to keep in the Dash I am building, my goal would be to have also a similar chart, that is focused only on a selected day.
                    However, there are some things I am struggling with:

                    • Fixing the X axis to start at 12AM (or 00:00) and end at 11:59PM (or 23:59) for that day
                    • Visualizing ALL active events of that day: when I specify a date, say 23/09/2019, I would like to be able to see not only event 1, 2 and 11, but also event 12. If I filter on the "Start Date", then I am losing all events that are still active on that day, but started in the past; same happens if I add a filter for the "End Date" (I would lose event 2, as well).

                     

                    Hopefully I provided a better context, now.

                     

                    I know this is too ambitious for my current Tableau knowledge, but I really hope to at leat get close to it.

                    • 7. Re: How to visualize average concurrent events per day
                      Soumitra Godbole

                      Grazie Federica ! This should be all I need and I will come up with a solution keeping

                      the granularity as hours which should not be that hard as 1 hr = 1/24 = 0.041667 of a day

                       

                      For Eg:- As per the date time convention followed in Excel and Tableau

                      (no of days since 12-31-1899) Today (25th Sep at 1:00 am) would be written as

                       

                                                                43733.0417

                       

                      and similarly 12:00pm ET (Lunch time )  would be 43733.5000

                       

                      Here 43733 represents the Day No and 0.041667 represents the fraction of a day (1 in 24).

                       

                      In case the hours don't work (like the example you showed earlier for short projects), I will also

                      using minutes (1/1440 th of a day). Ciao !

                       

                      Sincerely,

                       

                      Soumitra

                      • 8. Re: How to visualize average concurrent events per day
                        Soumitra Godbole

                        Hi Federica,

                         

                            Thanks for your patience as the long wait is finally over and here is the final solution taking into

                        account minor details of events (with very short duration) that do not overlap.

                         

                        Dashboard.png

                         

                        It took longer than expected as I had first tried using the hours only but I did not like the solution and

                        finally tried minutes which worked out. Interestingly it looks completely different from the earlier

                        solution but the Avg Concurrence values make a lot more sense too just by looking at the skewness

                        of the curve.

                         

                        Data Join (3 Tabs using 1 = 1 i.e. Cartesian Join m x n )

                        Data Join.png

                         

                        Now we are ready for doing the analysis in Tableau

                         

                        Calculated Fields:

                         

                        Formula 1 - DateTime.png

                        Formula 2 - Date Filter.png

                        Formula 3 - Duration (Days).png

                        Formula 4 - Concurrent Events.png

                            This is the most important formula as it is a Table calculation that actually dices the Datetime

                             into individual minutes and calculated the Count of Events for each minute. Then it averages

                             this for each Day. You can also arrive at the same result using LOD calculations.

                         

                         

                        Creating the visuals:

                        Simply place the fields in the shelves as shown below to get the  2 visuals

                         

                        Visual 1

                        Visual 2.png

                         

                        Visual 2

                        Visual 1.png

                         

                        You can also add the dummy data table as I did and build the dashboard (shown at the top)

                        Let me know if you have any questions or need further explanation or clarity. Hope this correctly

                        answers your question. The twbx file along with the Modified Data spreadsheet are attached below.

                         

                        Best wishes !

                         

                        Sincerely,

                         

                        Soumitra

                        • 9. Re: How to visualize average concurrent events per day
                          Federica Calanca

                          Hi Soumitra,

                           

                          thank you for the effort.

                           

                          I would really like to deep dive in your solution, but unfortunately it only works with the dummy data. My db (that contains years of events - from 2018 to 2021) cannot load those visualizations... It gets stuck on the query for 20+ minutes, then my only option is to force quite the app.

                          However I can tell from the example here, that it is still not close to what I need to produce

                           

                          The main thing is the ability to filter on the dates... First I want to select a given month, or a given year , or a week, or most often a single day in order to see what was happening in that time window (on a Gnatt chart). In this sense your 1st solution was more appropriate, as one could click on a day and see all active events.

                          Once I selected my desired time span I would like to know how many concurrent events I had on average.

                           

                          However, as I mentioned in my previous message, I am not able to do so, because as soon as I put a filter on a Date, I lose all events that started before / will end later, w.r.t that date. (see again my example on events 1,2,11 and 12 on 23/09)

                           

                          So thank you, but it seems the headaches will go on, or maybe I will give up...who knows...

                          1 of 1 people found this helpful
                          • 10. Re: How to visualize average concurrent events per day
                            Soumitra Godbole

                            "Headaches will go on" (sounds like a song from the Titanic ). Please don't despair or lose

                            hope as you just gave me a Minor hint (filtering for a single day) which struck a Major chord.

                             

                            "My mind will go on" (till I arrive at the solution). Filtering for a specific day, using the Continuous

                            date range (green) instead of the Discrete dates (blue) simplifies the problem as it does not mess

                            up the calculation.

                             

                            So please wait and "I'll be back soon with another cartoon"  (I mean solution)

                             

                                                                                                                          - Woody Woodpecker

                            • 11. Re: How to visualize average concurrent events per day
                              Soumitra Godbole

                              Buon Giorno Federica,

                               

                                  "Happy Days are back again". The good news is: the earlier solution was sound and sane

                              (not sure about the creator ) but the way the visuals were displayed made it difficult to

                              understand, confusing and like it was wrong.

                               

                                    To illustrate the fine details of this challenging problem (5 stars * * * * *), I have modified the

                              data to just have 4 events and kept some very short while others very long (with and without

                              overlap). Here is the Solution (No change in the logic but a lot less data for a simple and lengthy

                              explanation).

                               

                              Dashboard.png.

                              I have enclosed the twbx file and excel spreadsheet below. For the explanation, I have split the

                              above dates into 3 days (Sep 23, Sep 24 & Sep 25). Before I go into the Statistics of Arithmetic

                              mean, just by looking at the above Gantt chart and the Resulting Bar Chart (Right) we can

                              summarize the results.

                               

                              Sep 25th:

                              We only have 1 event (ID 12) running for 22hrs

                               

                              Sep 24th:

                              We have a total of 3 events  with a maximum of 3 event overlap

                               

                              3 events Overlap of 20 min(ID's 2,11 & 12),

                              2 events Overlap of 450 min (ID's 2 & 12) and 960 min (ID's 11 & 12)

                              1 event for 50 min (ID 12 towards the end of the day)

                               

                              Sep 23rd:

                              We have a total of 3 events with a maximum of 2 event overlap

                              event (ID 12) running for 22hrs

                               

                              2 events Overlap of 75 min (ID's 1 & 12)  and 390 min (ID's 11 & 12)

                              1 event (ID 12) running for 990 min

                               

                              Doing a Weighted mean for the above days, we see that :

                               

                              For Sep 25th (trivial) we have only 1 Event which means that there were 0 Concurrent events

                              as you need a minimum of 2 events for concurrency

                               

                              For Sep 24th we do the following weighted avg

                              [ (3 x 20) + (2 x 450 + 2 x 960) + (1 x 50) ] / 1440      which gives us an Average of  2.0  events

                               

                              For Sep 23rd we do the following weighted avg

                              [  (2 x 75 + 2 x 390) + (1 x 990) ] / 1440                     which gives us an Average of  1.4  events

                               

                              I have also arrived at a similar result using another method which I will be attaching soon. Hope

                              this makes sense. Awaiting your feedback !


                              Soumitra

                              1 of 1 people found this helpful
                              • 12. Re: How to visualize average concurrent events per day
                                Federica Calanca

                                Hello Soumitra,

                                 

                                thank you for not losing hope

                                 

                                I think we are very close to the perfect solution now.. unfortunately I cannot replicate what you've done on dummy data because of the time it takes to process the request and just adding "Day Start of Event" to the columns for the Gnatt chart requires 1+ hours of processing time and maybe more, because at some point I just closed the window and canceled the operation.
                                I tried to add filters, but same thing happened. Everything I try to do with the data, after the 3 joins with Day, Month, Hour, requires so much time that I am not able to do it at all.

                                 

                                Any idea on how to improve things? I cannot take a subset of the data... I need people to actually be able to set the time window they prefer to visualize, even if it is from 1 year ago.

                                 

                                Thank you,
                                Federica

                                1 of 1 people found this helpful
                                • 13. Re: How to visualize average concurrent events per day
                                  Soumitra Godbole

                                  Hello Federica,

                                   

                                      It was a pleasant surprise after almost a week ago and glad to know that all is well. I also

                                  hope the headaches and worries about solving this problem have gone after looking at the

                                  last posted solution.

                                   

                                  Also, I don't lose hope :

                                   

                                  "for I can't lose what I never had to begin with"

                                   

                                  Sorry to know about your situation as that can be tedious and time intensive. I once remember

                                  this humongous datasource with over 1 Billion records in Teradata. The time it took just to connect

                                  to this source using Tableau ver 7 (long ago) was a little more than the time a ray of light took from

                                  the Sun to reach the Earth and then back to the sun (about 16min 40sec).

                                   

                                  Anyway, regarding your problem the only idea that comes to my mind is before doing any join, first try

                                  and do some statistical analysis to try and understand at what time of the day do these events take place,

                                  then are they regular i.e. happen around the same time on other days.

                                   

                                  Also is there a time when there is no event (say for 3 hours from 2pm to 5pm). This way if we can determine

                                  for eg:- that almost 99% of all the events start and end between  8am to 17pm daily with a few exceptions,

                                  then we can add this as a filter and drastically reduce the number of rows that we get doing the analysis for

                                  day, hr & min (for 1440 min in a day).

                                   

                                  Lastly I hope this makes sense and do let me know if you have an idea of the following (before doing a join) :

                                   

                                  1) No of events daily

                                  2) Min duration of an event

                                  3) Max duration of an event

                                  4) Is the number of new events a constant on a daily basis or varies ?

                                   

                                  Best wishes !

                                  Sincerely,

                                   

                                  Soumitra