7 Replies Latest reply on Jul 26, 2012 12:26 PM by Jonathan Drummey

    Rolling average using number of records

      I have a rather large set of log information. There is an EventTime field, and a few other Dimensions, but very few Measures.

      I am attempting to make a graph that compares a moving average (and std dev) of the number of records against the current average number of records by hour.

       

      Let me word that once more, just to make sure I didn't confuse something:

      I want a graph that has the [# of records] on the vertical axis, and the [hour of day] on the horizontal axis. I want there to be two lines, one is the average number of records for that hour across the whole dataset, and the other is just the count of the number of records for the particular day/hour.

       

      I am having trouble calculating this because I can not seem to get one line to look at all of the records, and the other just look at the current records. Additionally, the number of records measure is simply a 1, so taking the average of it normally will result in just a 1.

       

      Does anyone have any experience making this kind of graph or any suggestions?

      Thanks

        • 1. Re: Rolling average using number of records
          Jonathan Drummey

          Hi Tony,

           

          When you are wanting to have calculations that are at different levels of aggregation on in the same data set, there are a variety of options:

           

          - Custom calculations to embed the necessary filtering/aggregation with the calculations.

          - Table calculations

          - Reference lines

          -  Duplicating your data source and using a blend where the blend is controlled to be at the appropriate level of aggregation

           

          For the situation you describe, I'd either use a table calculation (which would be something like WINDOW_AVG(SUM([Number of Records])) with an appropriate Compute Using) or duplicate the data source. If you post some sample data in a packaged workbook (.twbx), I'm sure something can be worked out.

           

          Cheers,

           

          Jonathan

          1 of 1 people found this helpful
          • 2. Re: Rolling average using number of records

            I unfortunately can not post the workbook as it contains information that can not be shared. I have however attempted to use a table calculation since it did seem to be the best approach to me. Here is what I did:

             

            Add the Hour of an event time to the horizontal axis,

            Add the SUM of the Number of Records measure to the horizontal axis with no table calculation. I add the Day of the event time to the path for this entry which creates vertical bars indicating a range of values hour by hour.

            Now I want to add a duplicate of the SUM of Number of Records but which is only for the last week of values and is averaged to show one line (not the vertical bars like with the other measure)

             

            That wasn't quite what I wanted though so I tried another way:

             

            I was shooting for just two lines, no vertical bars showing a range. Once again, hour in the horizontal axis, and I added the SUM Number of Records to the vertical axis twice. On the second one I added a table calculation that went like this:

             

            IF DATEDIFF('day', [EventTime], NOW()) < 7 THEN

                WINDOW_AVG(SUM([Number of Records within Range]), -7, 0)

            END

             

            I think I am getting close. If I can use this table calculation to restrict to the last week of data, then I should be able to create my average that way. Sadly the calculation is not valid because I "Cannot mix aggregate and non-aggregate comparisons or results in if expressions"

             

            Any further tips?

            Thanks again!

            • 3. Re: Rolling average using number of records
              Jonathan Drummey

              Hi Tony,

               

              I mocked up some data and put together the attached. Most of the time it's much easier to "explain" how to do something in Tableau by working with a workbook with data, which is why I'd asked for something, it saves time for me and helps me get a better answer for you.

               

              In your first request, you'd wanted the average across all data, now in your second one you are looking for the last week of data. Those are two rather different things, and the latter is more complicated because you'd need to calculate the 7 day average and make that available to all the other rows in such a way that Tableau can still draw lines using it. And you'd need to specify whether you wanted it to be the last 7 calendar days vs. last 7 days in the data set vs. days within the last week. Some of those are harder to get than others.

               

              So I just used an average from the whole (very small) data set.

               

              Let me know if this works for you,

               

              Jonathan

              • 4. Re: Rolling average using number of records

                Wow, thank you for putting so much work into this answer. I have attempted to replicate your workbook using my data, however for some reason the WINDOW_AVG(SUM([Number of Records])) is a constant number through out all the days that I have information for. What would cause this? I copied the measures that you created directly.

                 

                EDIT:

                Okay so I have taken a closer look at what I have and what I need, and think I have it a bit more solidified. Here it goes:

                 

                In the sample data you created, imagine there was a dimension that randomly divided all the data into either 'A', 'B', or 'C' categories. Now one of the lines in the graph should show the average number of records by day and by hour for all the data and the other should show it for just 'B' or for just 'C' depending on what is selected. So more generally, I want to compare a line across all dimensions and for a specific dimension in one graph. Is that possible with a table calculation?

                • 5. Re: Rolling average using number of records

                  Good news, I found out what the problem was. I added paging to the day, and then ran the table calculation based on the Day of Event. Everything is working now. Thanks Jonathan!

                  • 6. Re: Rolling average using number of records
                    Jonathan Drummey

                    Hi Tony,

                     

                    Here you go. I created a dimension in the data, then created a parameter called "Choose Dimension Value" and then a calculated field to return the number of records for the chosen dimension value. I used that in place of the orginal number of records in the "graph with dimension picker" chart.

                     

                    Jonathan

                    • 7. Re: Rolling average using number of records
                      Jonathan Drummey

                      Great! Knowing what the Compute Using settings need to be for table calculations is one of the hardest parts of Tableau to master, I'm still learning more almost every day!