2 Replies Latest reply on Apr 21, 2018 3:35 PM by Jonathan Drummey

    CASE vs IF Statements in Tableau 10.4.3 - which is more efficient in most instances?

    Charlotte White

      Hi Tableau Gurus,

       

      I am creating a tutorial for our group on how to use CASE and IF Statements in Tableau with our specific data for our Company.  In preparing for showing them how to set up both IF and CASE Statements I stumbled upon a few articles. One article is from Interworks - "CASE Statements vs. IF Statements" written in February 2015 (find it here: https://interworks.com/blog/tmcconnell/2015/02/10/case-statements-vs-if-statements-tableau/ ).  I also found a Tableau Whitepaper written for Tableau version 10.0 on efficiency of workbooks written in June 2016 ( page 50 talks about IF Statements, find it here: https://www.tableau.com/sites/default/files/whitepapers/designing-efficient-workbooks-v10.0.pdf )

       

      The White paper was produced by Tableau and is more recent, but I still would like to know the definitive answer to this question:

       

      Question:

      In instances where both the IF Statement or the CASE Statement can be used in Tableau 10.4  (as CASE Statements can't always be used, for example with boolean algebra) is the IF Statement or the CASE Statement a better statement to use for a basic logical expression?

       

      It appears that this question has been asked before back in 2016, however I definitely understand that with technology always changing that this situation could have been much different back in 2015 or 2016 than it is today.  Here is the post from 2016:  Case Statement Vs If Statement

       

      Any thoughts or feedback you have on this would be greatly appreciated!

       

      Thanks!

      Charlotte

        • 1. Re: CASE vs IF Statements in Tableau 10.4.3 - which is more efficient in most instances?
          Mark Holtz

          Some additional thoughts:

           

          It depends on your developer base too. If you have users used to Excel IF statements, iif() will make them feel at home, albeit the worst of the IF family to actually read.

          If you have programmers/SQL people, they will probably prefer the IF or CASE construct.

           

          As far as I know, CASE in Tableau can only compare equality, and so my personal preference is simply to always go with IF.

           

          I have never encountered a scenario where performance of the IF was so bad that I sought to tune it by investigating the use of CASE instead.

           

          Also note that if your developer team is using an extract, performance is less of an issue because calculated fields are computed ahead of time--unless they rely on parameters. Even so, I've built plenty of views with millions of records and have never encountered a performance issue.

           

          Because it will ALWAYS work for any use case, I prefer IF when within the confines of Tableau, and that's what I tell all analysts/developers I train in Tableau (I've been the primary Tableau trainer in my last 2 organizations, spanning dozens and dozens of developers).

           

          Also tell them to beware "comfort" or old habits. Most Tableau developers are at least somewhat fluent in some form of SQL.

          CASE seems like a good idea in Tableau because in T-SQL (and most other SQL dialects I presume), CASE can be used both ways.

           

          CASE WHEN ... THEN in SQL is syntactically equivalent to IF ... THEN in Tableau.

          CASE ... WHEN ... THEN in SQL is the equivalent of CASE ... WHEN ... THEN in Tableau.

          • 2. Re: CASE vs IF Statements in Tableau 10.4.3 - which is more efficient in most instances?
            Jonathan Drummey

            I don't think there is a definitive answer as to which one is "better".

             

            There used to be some differences between CASE and IF statements for performance, based on changes in Tableau those have almost/entirely went away so that's not a driving factor, here are some things I think about:

             

            Booleans:

            IIF can return booleans, IF and CASE through Tableau v10.2 can't. For example: IIF(1=2, True, False) will work while CASE [Region] WHEN "West" THEN 1=2 END will fail with an error. In that situation if I want to return a boolean from an IF or CASE statement I'll do something like:

             

            1 =  IF [field1] > 2 THEN 1 ELSE 0 END

             

            and that will evaluate to True or False. Tableau v10.3 added support for this.

             

            Nulls:

            IIF has an optional Null return value e.g. IIF(condition, true return value, false return value, Null return value) whereas we have to explicitly evaluate for Null in IF & CASE statements

             

            Conditions:

            IF and IIF can use any boolean condition like IF [field1]>[field2], whereas Tableau's version of CASE can only do CASE [field], i.e. evaluate values of a field (which may be a calculated field). This is different from many SQL implementations where CASE can evaluate a boolean condition.

             

            Personally I prefer using Tableau's CASE statement when I'm evaluating a list of values because for me it's faster/easier to read than an IF statement:

             

            CASE [Region]

                 WHEN "West" THEN 42

                 WHEN "East" THEN 23

                 WHEN "South" THEN 11

                 ELSE 3.14

            END

             

            Otherwise I mostly use IF statements because it's easier to read them when they are nested.

             

            One more point re: Mark's point about calculated fields being computed ahead of time in the extract. This is only true at this time for record-level calculations that don't depend on run-time variables. For example IF [record level datefield] = #1900-01-01# THEN #2018-01-01# ELSE [record level datefield] END would get materialized in the extract, whereas IF [record level datefield] = #1900-01-01# THEN TODAY() ELSE [record level datefield] END would *not* be materialized because TODAY() is a function always evaluated at run-time. Also an aggregate calculation like IF SUM([Sales]) > 180000 THEN "High" ELSE "Low" END won't be materialized in the extract because the context of the aggregation for SUM([Sales]) is not known when the extract is created. e.g. is that SUM(Sales) for the whole data set, SUM(Sales) for the East region for customers who have previously purchased this year, SUM(Sales) for a single customer, or something else??

             

            This leads to a last point...it's really key to teach people about the level of calculation in Tableau and the difference between aggregate and record level to explain things like why IF [Region] = "East" THEN SUM([Sales]) END throws the dreaded "cannot mix aggregate and non-aggregate" error. When I'm training on Tableau I teach people about levels of calculation first before teaching logical calculations so they know how to deal with that error.

             

            Jonathan

             

            [edited 2018-04-21 to note that Tableau v10.3 added support for booleans as output from IF and CASE statements]

            2 of 2 people found this helpful