13 Replies Latest reply on Nov 27, 2012 10:52 AM by Richard Leeke

    how to compare event data

    Fil Mackay

      Hi, just wondering if there is a way in Tableau to do event peer comparison. Lets say you want to compare each event with other events near it - is that possible to do in Tableau, or do I need to map the data prior to putting in to Tableau?


      For each event (say, by time) you want to compare each event to others within a specified time period. You then want to do aggregate comparisons of that sub-population with the event in question. These problems can take enormous storage depending on the ratios, perhaps Tableau can do these types of queries on-demand?


      Thanks in advance.

        • 1. Re: how to compare event data
          Tracy Rodgers

          Hi Fil,


          This type of comparison can be done in Tableau. The comparisons can mostly likely be visualized in any number of ways. Then, calculations and other aggregations can be added to your measures to find and understand the data better.


          If you're able to post some sample data, I may be able to show you some examples!



          1 of 1 people found this helpful
          • 2. Re: how to compare event data
            Fil Mackay

            Hi Tracy,


            The "windows" in Tableau seem to be only defined by offset - ie. x rows prior to this and r rows after. I am looking for a window that is defined by a filter, eg. rows with a time within 1 sec prior, and 3 sec after - and also filtered on another column value. Can you do this type of window definition?


            As for sample data:




            Lets start with comparing each value with its peers (1sec prior to 3 sec after) - compare each value with the average of all the others. Then look at a visualization of the time of each data point, and the relative comparison (ie. diff between it's value and the average of others in the window).

            • 3. Re: how to compare event data
              Tracy Rodgers

              Hi Fil,


              I'm not sure I completely understand quite yet what the desired view is. However, I have attached a workbook using the data provided. First, the Date Time calculation had to be created so that the Time field is recognized as a date field. Then, Calculation2 was created to look at the seconds and those in orange meet the criteria of the time before it either being within 1 second or the one after it is within 3 seconds. I hope this is at least a start!



              1 of 1 people found this helpful
              • 4. Re: how to compare event data
                Fil Mackay

                Thanks for this. Where I need to take it is to define a window of the 3 seconds - rather than it being a offset window (ie. nearest 2 rows either side). Know if that's possible?


                The window is required so I can do aggregations, and compare these to each data point..

                • 5. Re: how to compare event data
                  Jonathan Drummey

                  The hard part about this is that you want to have one calculation look across rows in the data to determine what's in the bucket, and that requires an aggregation, then aggregate over that. Performing two levels of aggregation in Tableau requires a table calculation for the 2nd level (at least), and though it is possible to make table calculations look across data rows like this, I believe this is one of those cases where the additional complexity & performance issues are likely not worth the effort.


                  The alternative is to do this in SQL, the way I'd approach it would be to do a self-join where within the SQL you'd do the first level of aggregation to get the window average, then the second level of aggregation could be done using Tableau's regular aggregates. This wouldn't result in a large amount of extra storage being required, only a few more columns. In addition, in Tableau v8 pass-through parameters could be used so you wouldn't have to hard-code the window sizes.


                  Richard Leeke does some of this kind of analysis, he might have more thoughts.

                  1 of 1 people found this helpful
                  • 6. Re: how to compare event data
                    Buzz Burhans

                    I had a similar question recently that Joe Mako had a solution for.  In my post I had asked two related questions, the second one was similar to yours. I think Joe's answer to my second question might work for you. The post was here:


                    The description Joe provided was:


                    "Here is an option for looking at the last 2 months:


                    IF LOOKUP(SUM([Milk lb.]),LAST()) > LOOKUP(SUM([Milk lb.]),LAST()-1) THEN


                    ELSEIF LOOKUP(SUM([Milk lb.]),LAST()) < LOOKUP(SUM([Milk lb.]),LAST()-1) THEN



                      "No Change"



                    In order for either route to work properly, you need to filter out the null values prior to table calcs, as done with the filter on SUM([Milk lb.]) with Special -> non-null values. Another route would be additional logic in the formulas to detect and handle nulls, not really desirable."



                    In the thread Joe also sent a workbook demonstrating how he implemented the above.  Note that part of his response was about regression coefficients and is not pertinent to your question, but the part about comparing with previous period, which I had described in my second post to him in the thread, could be adapted to your question I think. My question in the followup was:


                    "An alternative would be to identify the difference between the yield for last month with data and the yield for the month previous to the last month with data.  The difference (a positive or negative amount) could be used as the color indicator."




                    1 of 1 people found this helpful
                    • 7. Re: how to compare event data
                      Fil Mackay

                      Thanks for the reply Buzz. You've highlighted the easy form of comparison, that based based on simple offsets. It seems to come down to the limitations of the LOOKUP() function (as with this example), you must define the lookup in terms of offset.


                      I'm am looking to define the "lookup" in terms of a value filter - ie. the other values within a certain range of the current. That seems to be the distinguishing factor to the other example.


                      Regards, Fil.

                      • 8. Re: how to compare event data
                        Richard Leeke

                        I think Jonathan is right that the most promising looking approach is to do the relative correlation in SQL - either with a custom SQL connection or with RAWSQL pass-through expressions.


                        Depending on your data source (i.e. what database engine you are using) and how your data is structured and indexed this can work well - even with very large data sets.


                        Up to and including version 7 you would have to fix the lookup criteria statically if you use a custom SQL connection, whereas they can be parameterised if you use the RAWSQL approach. Version 8 will allow parameters within the custom SQL connection definition, so that restriction would be removed.


                        Based on what you've described (T-1 to T+3 plus another filter condition), my first reaction would be to try defining a RAWSQL expression. It might look something roughly like I've shown below (though bear in mind that I haven't actually run this so the syntax may be wrong - and anyway RAWSQL has to be written in the appropriate SQL dialect for the underlying database, and datetime manipulation is one area where there is a lot of variation - so this almost certainly won't work without tweaks).


                        Assuming you data is in a table called [YourTable] and has columns [DateTime], [Filter] and [Value], you can define a calculated field to get the average of [Value] over the window you specified and subject to some other filter condition, with something like this.



                        SELECT AVG([t2].[Value])

                        FROM [YourTable] AS [t2]

                        WHERE [t2].[DateTime] BETWEEN DATEADD('second', -1, %1) AND DATEADD('second', 3, %1)

                        AND [t2].[Filter] = %2



                        [FilterParameter] )


                        I've tried to show how you would use a parameter for the filter condition (but may have got the syntax wrong - it will be something like that!). You could also make the 1 second and 3 second range boundaries parameters if you want.


                        It's a long time since I've had to do this, so I'm a bit vague on the detail, but I think I remember that you can create a data extract which includes the values of RAWSQL expressions like this - but bear in mind that if you change the parameter values you will need to refresh the extract. I can't remember if Tableau forces you to do that or not.

                        • 9. Re: how to compare event data
                          Shawn Wallwork

                          Richard your suspicions were correct, you weren't remembering the way RAWSQL and extracts work; specifically, they don't. As soon as you do an extract you move to the Tableau data engine, which doesn't support RAWSQL and the functions disappear. One of the few advantages of Jet.



                          • 10. Re: how to compare event data
                            Richard Leeke

                            Not so fast, Shawn.


                            When the data engine first came along it prevented the use of RAWSQL if you created an extract (I spent several months testing the "technology preview" of this - pre alpha - and this was the one big downside for me).


                            But some time more recently Tableau brought in support for materialising the value of RAWSQL expressions at extract creation time (i.e., use the database engine to evaluate the RAWSQL expression as you build the extract and store the result in the extract). I remember Robert Morton posting a note about it on the forums.


                            The thing I couldn't remember was whether or not Tableau notices if you do something (like changing a parameter value) which will cause the result of the RAWSQL to change, so that you have to regenerate the extract.


                            But thinking about it again, I think I also remember that not all RAWSQL expressions are materialised. Maybe dimensions are and measures aren't (or maybe aggregate functions aren't). If that's the case, this may not work in this case - which would mean that you do have to use the original data source.


                            I'll probably do some experimenting to confirm exactly what does and doesn't get materialised at some point.

                            • 11. Re: how to compare event data
                              Shawn Wallwork

                              That's good news (maybe). I use RAWSQL to get labels formatted correctly when doing parameter-driven vizzes. If I set it all up and then do an extract that will give me the best of both engines. Interested to hear what you figure out.



                              • 12. Re: how to compare event data
                                Shawn Wallwork

                                OK Richard you are right (of course). I tested to see what happens when I take a workbook that is using RAWSQL_STR to fix the label formats, and extracted it. No problems, everything ran as expected. You do get the red 'X' in the edit calc field box,



                                but as long as you don't edit it, it works fine. And if you remove the extract everything works, and you can edit the calc and re-extract.


                                This is definitely going to be Week's Best Thing I Learned!





                                • 13. Re: how to compare event data
                                  Richard Leeke

                                  I just checked up, and as I thought you can create extracts of the non-aggregate RAWSQL expressions as in the example you show here - but not the RAWSQLAGG...() functions. Thinking about it for a moment the reason for this restriction is clear: the result of the aggregate function depends on the sheet layout, which of course is unknown when you create the extract and will change all the time as you work on the viz, so it would make no sense to try to fix the value in the extract.