The way you were defining your calc, the use of FIRST(),0 make it dependent on the sort order of the data, which I don't like to do unless absolutely necessary because it makes the view a bit more fragile.
The way I generally think of these kinds of problems is either to do this is to create a table calc that will return the desired value, and then a second calc that will do an if-then to compare each row to see if it matches. That second calc is then put on the Filter shelf, with the Compute Along set so the partitioning works out. Here's an example:
Basic find maximum: WINDOW_MAX(MAX([TimeStamp]))
Show/Hide filter calc: IF ATTR([TimeStamp]) = [WM M TimeStamp] THEN "Show" ELSE "Hide" END
The one key bit to remember is that the filter calc needs to be on the Level of Detail shelf with the Compute Along appropriately set before you add it as a filter, that way Tableau will have computed data to work with.
See the attached for an example.
Thank you Jonathan!
However, when I use this calculation, my workbook slows down tremendously as the number of machines included increases. Is this because of the internal processing done by Tableau? Or should I be using a different approach?
1 of 1 people found this helpful
Window calculations get hopelessly slow on large data sets (tens of thousands of rows or more) unless you use some trickery to speed them up, so depending on the size of your result set, that could well be the issue.
I described in some detail why they go so slowly and what you can do to speed them up in this posting on Clearly and Simply.
There are a couple of different variants that you sometimes need to use. I haven't looked at Jonathan's workbook, but from his description, I'm pretty sure you need to change that WINDOW_MAX() calculation to say this:
PREVIOUS_VALUE(WINDOW_MAX(MAX([TimeStamp]), 0, IIF(FIRST()==0, LAST(), 0)))
If this is indeed the issue, that should hopefully make a big difference.
I'm really surprised that Tableau haven't done anything about this. By my way of thinking this was probably the biggest flaw in version 6.0 - and I'd need a lot of convincing that it's not fairly easily fixable. I guess it just hasn't reared it's head on the priority list these past 18 months. ;-)
1 of 1 people found this helpful
Richard - One tiny fix on the calc, there needs to be a closing parentheses after [TimeStamp]. My sample data set is tiny, so I didn't get to test it out here, but I've used your calcs a lot and they've tremendously speeded up processing. I also wish Tableau would implement some performance improvements in table calculations.
Tandra - Another option you might look into is creating an additional datasource to get the information for this view using Custom SQL/a new view, etc. For some views it can be a lot easier to have make the underlying database do the heavy lifting than Tableau.
Thanks - now fixed above.
Yes, fixing this particular issue seems such a no-brainer - so I suppose there must be some subtleties that I'm missing or it would have been done by now.
I agree with the database heavy-lifting comment, too.
However, even with the new formula the performance of my workbook doesn't improve much. Any ideas what could be affecting the speed?
Jonathan - I hear you! I am trying to push most of these calculations on my database, but this is an example of a case where we need this functionality for multiple charts/maps.
How many rows returned to Tableau for your view (i.e. how many rows before applying the show/hide filter)?
Are there any other calculations in use on the view, or is it really as simple as the example Jonathan posted?
It's very hard to speculate on what else might be taking the time without seeing it. Are you able to post a sample workbook that shows exactly how the view is structured?
The other thing which sometimes gives a strong clue is looking carefully at the Tableau logs. Open your workbook and display the view with lots of data so it takes a long time and then immediately open the log.txt file the Logs directory under your Tableau repository with a text editor (i.e. without even closing Tableau), go to the very end of the file and then search backwards for "Updating sheet 'sheetname' for View 'viewname". Copy everything from that point to the end and save it as another text file and attach that to this thread I can have a quick look and see if I can see where the time is going.
It goes down from around 2 M rows to 20K rows, with no other calculations on this workbook. Also attached is the log file.
I'd be grateful if you can tell me how to fix this!
Log file.txt.zip 11.2 KB
Your log file is chopped off before the timing details for the sheet you are asking about.
I can see that this entry:
Update sheet 'Average Temp' for view 'Machine distribution on a map'
shows that sheet returning 500,000 rows in about 6 seconds and then spending 26 seconds "Computing Local Calculations" and another 460 seconds on the "Compute Percentages" phase of Table calculations.
The entry for the sheet that sounds like the one you are concerned with is here:
Update sheet 'Last Location by PIN' for view 'Machine distribution on a map'
but that is chopped off after the details of the query against the extract, so doesn't show number of rows or timing.
But having said that - anything doing table calculations against even 500,000 rows, let alone the 2M you mentioned is going to take a very long time - even with that tweak of mine. Without that the WINDOW_MAX() would take hours (days?) on 2M rows.
It may be that you need to consider doing this another way. It may be better to flag the last location for each machine in the original query used to create your extract. That would require some slightly tricky custom SQL. Just how tricky depends on what your data source is - different databases have very different levels of support for that sort of thing.