4 Replies Latest reply on Feb 5, 2013 11:02 AM by Zacharia Mathew

    Tableau generating comparison operators instead of IN or = operators

    Zacharia Mathew

      I use Tableau v 7.0.12 to connect to HP Vertica 6.1.0.

       

      I am increasingly noticing that in certain situation, the query generated by Tableau is using a comparison operator similar to below instead of an IN operator or = operator.

      (("table1"."field1" >= '0093J99999') AND ("table1"."field1" <= '0093K99999')) OR (("table1"."field1" >= '0151K99999') AND ("table1"."field1" <= '0151K99999')).

       

      This is an inefficient way to filter on fields. Any thoughts on how this can be avoided ? Is this a Tableau bug ?

       

      Zacharia Mathew

        • 1. Re: Tableau generating comparison operators instead of IN or = operators
          Robert Morton

          Hi Zacharia,

           

          This is not a bug, the Tableau query compiler attempts to produce the most concise query expression and in some cases it's simpler to express a range of possible values (even on a string dimension!) than to enumerate all values with the IN operator. This heuristic of course does not mean that the queries will be faster, as you have identified. That depends on a variety of factors involving the shape and size of the data, the physical database organization, etc., which is even a challenge for the database itself to get right.

           

          Thanks for submitting this to the Ideas forum as well, that will ensure it gets some additional visibility for our roadmap planning. This may be something we can address in a future release, though I cannot commit to specific features or a timeframe.

           

          -Robert

          • 2. Re: Tableau generating comparison operators instead of IN or = operators
            Zacharia Mathew

            Robert,

            I think your answer that "in some cases it's simpler to express a range of possible values" is misleading. In the issue, I am pointing out the upper and lower limit defined by >= and <= are the same value.

             

            How can the below comparison query

             

            (("table1"."field1" >= '0093J99999') AND ("table1"."field1" <= '0093K99999'))

            OR

            (("table1"."field1" >= '0151K99999') AND ("table1"."field1" <= '0151K99999')).

             

            be any better than

             

            ("table1"."field1" IN ('0151K99999',  '0151K99999')). ?

             

            If tableau generates such SQL statements, the performance will greatly suffer.

             

            Zacharia

             


            • 3. Re: Tableau generating comparison operators instead of IN or = operators
              Robert Morton

              Hi Zacharia,

               

              Thanks for following up. I hadn't noticed that '0151K99999' appeared twice in the second clause, that's an interesting point. Since it is constrained by the range predicate to the single value '0151K99999', we could simple use a single equality comparison. However the first clause covering the range ['0093J99999', '0093K99999'] allows for the possibility that there are numerous values in alphabetic order between those two values; this may be too large a collection of values to succinctly express with an IN clause.

               

              I'll track this internally as a report to our Product Feedback team to consider improvements to query generation for specific databases and use cases such as this one. In the meantime, you could help us understand the other factors at play in your situation by contacting our Support team (support@tableausoftware.com) and informing them of this forum thread and your request for better query generation. They will be able to work with you to collect more information including Tableau logs that may help us better understand the problem.

               

              Thanks,

              Robert

              • 4. Re: Tableau generating comparison operators instead of IN or = operators
                Zacharia Mathew

                Robert,

                 

                Thank you for looking into this. Even if there are 10,000 or 100,000 distinct values of interest, I would think that showing all of them in the IN operator is anyday better than constructing an SQL with 10,000 (x 2) or 100,000 (x 2) comparison operators. 

                 

                Regards

                Zac.