14 Replies Latest reply on Jan 9, 2014 11:45 AM by Patrice Collette

    Just how different is the new Google BigQuery connector?

    stephen black

      According to the blurb for the 8.1 beta, the  Google BigQuery connector is now native.

       

      I had hoped this would resolve some limitations in the original Samba ODBC connector. The new connector looks the same and has fixed some known errors, but some other big limitations appear to remain. For example, for some queries we are still seeing the BigQuery error message that the query should use GROUP EACH BY rather than GROUP BY to avoid running out of resources. Similar BigQuery errors related to the use of large tables in joins (where non-default settings are required to allow large joins) also seem to be present.

       

      Am I missing something or is there some way to change the BQ settings from the connector? Or, are there two separate BQ connectors and I've got the wrong one?

        • 1. Re: Just how different is the new Google BigQuery connector?
          Patrice Collette

          Hi Stephen,

          I am glad to know there will be a new connector for Google BigQuery!

           

          For anyone interested in BigQuery, I invite you to vote for these 2 ideas:

          http://community.tableau.com/ideas/2542

          http://community.tableau.com/ideas/2540

           

          Thank you!

          • 2. Re: Just how different is the new Google BigQuery connector?
            Bora Beran

            Hi Stephen,

            The new connector allows you to write BQL instead of SQL. So you can use everything BQL allows in your Custom SQL for data connection which includes non-SQL options like GROUP EACH BY and JOIN EACH as well as FLATTEN etc.

             

            You will also notice a list of new functions in calculation dialog for parsing hostnames or just using Regex against BQL.

             

            While Tableau now allows you to use JOIN EACH when defining your data connection in 8.1, it does not do it automatically for you. The reason is that EACH operators are much slower compared to their standard counterparts and today Big Query does not provide any metadata that would imply a particular query would fail with JOIN but succeed with JOIN EACH. What Tableau could do is try GROUP BY and automatically switch to GROUP EACH BY when it fails but it would be a hacky solution and you would see two queries to the big query and get billed twice. We are working with Google on how to make this better.

             

            Could you try using JOIN EACH in Custom SQL while joining two large tables and see if it works? If it does, you have the right connector. If you get an error about JOIN EACH not being recognized as a valid SQL command, you probably have the old connector.

            1 of 1 people found this helpful
            • 3. Re: Just how different is the new Google BigQuery connector?
              Patrice Collette

              Hi Bora,

               

              I wonder if the new connector will have an option to do a real count distinct and not the approximation that is done by default in BigQuery?

              • 4. Re: Just how different is the new Google BigQuery connector?
                Robert Morton

                Hi Patrice,

                 

                Thanks for pointing out that limitation of Count Distinct, I had not previously noticed that detail in the documentation. Since the only way to bypass sampling is to provide a numeric upper limit for the possible number of distinct values, could you provide an estimate to how high this may need to be for your use cases?

                 

                In the meantime you can build your own Count Distinct aggregate calculation as follows:

                RAWSQLAGG_REAL("COUNT(DISTINCT %1, %2)", [dimension], [limit])

                 

                For the upper limit field [limit], you could either use a constant value or define [limit] as a parameter control.

                 

                I hope this helps,

                Robert

                1 of 1 people found this helpful
                • 5. Re: Just how different is the new Google BigQuery connector?
                  Patrice Collette

                  Hi Robert,

                   

                  Thank you very much for the reply!  In version 8.0.4, I created a calculated field with RAWSQLAGG_REAL like you proposed but I got the following error:

                  An error occured while communicating with data source "MyGoogleBigQueryDataSource".

                  [Simba][SQLEngine] (31480) syntax error near 'SELECT (COUNT(DISTINCT "MyTable"."myId",<<< ??? >>> 500000)) AS "usr_Calculation_5541007090346415_qk",

                    COUNT(1) AS "X__alias__1"

                  FROM "TKCO"."MyTable" "MyTable"

                  HAVING (COUNT(1) > 0)'.

                   

                  I don't have any errors if I don't put the second parameter (%2), but this will give me the default behavior.

                   

                  As for the estimate as how high the limit should be for our case, I say that for now a limit of 500K will be good for us. However I expect that we need to go up to 1 Million eventually.

                   

                  Thanks again for your help!

                  Patrice

                  • 6. Re: Just how different is the new Google BigQuery connector?
                    Robert Morton

                    Hi Stephen,

                    I'm happy to report that Tableau 8.1 addresses many requests for enhancements to the BigQuery connector. These include:

                     

                    • Full Custom SQL support using BigQuery BQL language syntax.
                    • Joins between more than two tables.
                    • Support the 'EACH' syntax for working with large tables, including automatic error detection & query retry support when it's not possible to know in advance that 'EACH' syntax is required.
                    • Several native BigQuery scalar and aggregate functions are exposed in our calculated field formula language.
                    • You can connect to any public data set, including those not listed in the drop-down (such as 'httparchive') simply by typing in the name of the public data set. Google BigQuery does not yet support enumerating all publicly listed data sets, so we decided to allow you to specify them explicitly if desired.
                    • Other stability and performance improvements.

                     

                    I hope you have a great experience with the new features for BigQuery in Tableau 8.1.

                    -Robert

                    1 of 1 people found this helpful
                    • 7. Re: Just how different is the new Google BigQuery connector?
                      Patrice Collette

                      Hi Robert,

                      I was wondering if workbooks made with version 8.0.X that have a BigQuery connection will still continue to work as usual when upgrading to 8.1 and to the new BigQuery driver (on Tableau Server and Desktop)?

                      Thanks

                      Patrice

                      • 8. Re: Just how different is the new Google BigQuery connector?
                        Robert Morton

                        Hi Patrice,

                        Yes, existing workbooks should work -- with the exception that any Custom SQL or RAWSQL will need to be rewritten to conform to the BigQuery BQL syntax instead of the SQL-92 syntax that we initially required in the v8.0 release.

                        -Robert

                        • 9. Re: Just how different is the new Google BigQuery connector?
                          Patrice Collette

                          I just installed version 8.1 (version 8.1.3) and I don't know how to have have access to the "Full Custom SQL" in order to use BigQuery BQL language syntax.  Anyone can help me on this?

                           

                          I wrote the following in my CustomSQL and it works :

                               SELECT count(distinct "MyTable"."myId") as "DistinctIds"

                               FROM "MyDataset"."MyTable" "MyTable"

                           

                          However if I write this, it doesn't work (added a second parameter to the count distinct) :

                               SELECT count(distinct "MyTable"."myId", 10000) as "DistinctIds"

                               FROM "MyDataset"."MyTable" "MyTable"

                          • 10. Re: Just how different is the new Google BigQuery connector?
                            Robert Morton

                            Hi Patrice,

                             

                            You do not need to take any special action to use Google BigQuery BQL syntax, since it is the default in Tableau 8.1. I'm not sure how your first Custom SQL query could have worked, since it doesn't follow the BQL syntax.

                             

                            It's important to note that BQL deviates far from the SQL standard in many respects. I'll give some examples based on the queries you tried. First, BQL quotes identifiers using '[]', and the double-quote character signifies a string literal. Second, BQL does not allow you to qualify fields using the relation alias when the relation is a single table, so you must only reference [myId] without the relation alias qualifier [MyTable]. Third, you must express qualified identifiers entirely within a single pair of [] brackets, e.g. ... FROM [MyDataset.MyTable].

                             

                            I hope this helps,

                            Robert

                            • 11. Re: Just how different is the new Google BigQuery connector?
                              Patrice Collette

                              Hi Robert,

                              Thanks for the quick response! 

                               

                              I know all the rules that you mentioned about the BQL syntax (I have been using BigQuery for over a year now).

                               

                              However this does not seem to work on my machine when I use the BQL syntax in the Custom SQL part of my BigQuery connection.

                               

                              For example, this doesn't work when I enter it in Tableau (this works in the BigQuery browser on Google website) :

                              SELECT count(myColumn) FROM [MyDataset.MyTable]

                               

                              But this will work:

                              SELECT count(myColumn) FROM "MyDataset"."MyTable"

                               

                              I have installed the new 64 bits.  I have also kept the 32 bit driver that I had previously installed a few months ago, since the machine that I work on runs both the Desktop version (64-bit) and the Server version (64-bit). Tableau web site is mentioning to use the 32 bit drivers for the 64-bit Server. So that is why I keep both drivers.

                               

                              Just to make sure, the BigQuery custom SQL connection has to be created by selecting in Data the Google BigQuery connector and not the Other Database (ODBC) connector, right?

                              • 12. Re: Just how different is the new Google BigQuery connector?
                                Patrice Collette

                                Hi Robert,

                                 

                                Just to let you know that I figured out that the installation of the 8.1.3 desktop version was not made and that I still have version 8.0.6. Sorry for the inconvenience, I will install version 8.1.3 and try again. I will give some news after.

                                 

                                Thanks and sorry again

                                • 13. Re: Just how different is the new Google BigQuery connector?
                                  Robert Morton

                                  Hi Patrice,

                                   

                                  Your existing workbook may have an artifact that is preserving the requirement that you use SQL-standard syntax. Try creating a new workbook with a new connection -- does the behavior change? If so, you may need to modify your existing workbooks to completely remove the following key/value pair: UseNativeQuery=0.

                                   

                                  I hope this helps,

                                  Robert

                                  • 14. Re: Just how different is the new Google BigQuery connector?
                                    Patrice Collette

                                    Cool, it works now!

                                    Thank you so much for your help!!