13 Replies Latest reply on May 26, 2015 1:20 PM by Andres Parra

    VizQL query optimization for MySQL

      Hi,

       

      I am using the MySql database to store data in a Star Schema. I have created a tableau datasource to map the star schema of the database. But I am facing some issues related to the queries generated by VizQL.

       

      I noticed that when the MySql database uses engines that provide foreign key constraints, (Like InnoDB), VizQL is able to generate optimized queries joining only the tables that are needed. However the MyISAM engine, which doesnt provide foreign key constraints results in unoptimized queries joining all possible tables even though the foreign key reference is provided in the datasource.

       

      MyISAM is much faster for Reads as compared to InnoDB, and we would prefer to use that engine instead. So how can we get VizQL to optimize queries for MyISAM? Is there a step i am missing?

       

      Regards,

       

      Jai

       

      Tableau Desktop v. 5.2

      MySQL v.5.1

        • 1. Re: VizQL query optimization for MySQL

          Hi Jai,

           

          I'd try making an extract of your data (you can either extract all or portion of the data). If you've done that already and are still having perf issues, I'd go to: data --> extract --> optimize.

           

          Hope that helps,

          Mike

          • 2. Re: VizQL query optimization for MySQL
            Kyle B

            Jai - some of the query structure tableau uses is TERRIBLE for MySQL (fine for other engines). Subqueries, which for some reason tableau loves to use (it renames everything "TABLEAUSQL") are horribly ill-performing in MySQL.

             

            Mike is right on it - extract it if you can. I'm guessing your table isn't too big since MySQL also sucks a large tables. MySQL is great for many things, but I would say BI is not one of them (I speak from experience working with a approx 500M row [largest table] MySQL OLTP). We did ETL and moved everything to SQL Server for reporting. The only Windows machine in the organization.

            • 3. Re: VizQL query optimization for MySQL
              Dean Glasener

              Has anyone one used InfiniDB, uses a mySQL interface, with Tableau?

              Calpoint has a community edition and an enterprise edition - and InfiniDB is a columnar data store. 

               

              Using Navicat Premium(Data Mgr)- I have InfiniDB on port 3309 - and in Navicat Works assigning a mySQL data source type to InfiniDB.

               

              5 Queries to a 1+ million row 5 column datasource has mySQL @ 37 secs

              5 Queries to a 1+ million row 5 column datasource has InfiniDB @  3.712 secs

               

              5 Queries - CFThreaded to mySQL  9.3 seconds  - run 5 queries in separate Threads using Coldfusion CFTHREAD

              5 Queries - CFThreaded to InfiniDB  1.487 seconds

               

              When there are 3 or 4 worksheets and queries must be refreshed on a dashboard - how does threading work?

              Will InfiniDB assigned a mySQL data type - work in Tableau?  I don't have a Desktop License to test with.

               

              As cpu & thread counts grow - will we be able to see a parallel process advantage as we scale up cpu counts?

              • 4. Re: VizQL query optimization for MySQL
                Robert Morton

                Hi Jai,

                 

                Currently the only way Tableau can optimize table joins as you described is to determine the PK/FK integrity constraints.  This helps ensure that removing unused tables from the join clause is a safe operation that won't affect the number of records returned.

                 

                One way you can address your problem is to duplicate your data connection and tailor each clone to a specific visualization by removing the unneeded tables.  This may not be ideal, but until MySQL evolves to give you both performance and functionality then this will serve as a workaround.

                 

                -Robert

                • 5. Re: VizQL query optimization for MySQL
                  guest contributor

                  Hello,

                   

                  Echoing this thread, I would love to have more control over the queries that Tableau issues in its engine. It seems that a lot of its behavior can be configured, as I saw another thread about TDC and I noticed also that its behavior is very different depending on what DB you are attacking. I would love for there to be a way for advanced users to just have a list of options about what to and what not to do. For example, I really hate the temporary tables that tableau builds. I understand why you do it, but I spend a lot of time optimizing the original data table, and by creating large unoptimized temporary tables, Tableau often actually makes performance worse.

                   

                  I recently have been playing with tableau against Vertica and I notice that against that DB, the SQL it issues gets really convoluted, but it doesnt create temp tables. I would like to be able to get this behavior against mysql relational sources.

                   

                  Additionally,  there are other databases you do not yet support... I would love to expand the options available when creating a ODBC data connection, to include Tableau-based options (or essentially the TDC concept again).

                   

                  I deal with relatively big datasets, and I spend a lot of time optimizing the database-tableau interaction both on the db & tableau side to make sure that it "doesnt get into trouble", some of this work could be resolved by more control

                   

                  Thanks,

                   

                  Mike

                  • 6. Re: VizQL query optimization for MySQL
                    . Tableautester

                    What other databases have you found aren't supported?

                    • 7. Re: VizQL query optimization for MySQL
                      guest contributor

                      Infinidb was of particular interest to me

                      • 8. Re: VizQL query optimization for MySQL
                        Robert Morton

                        Hi Mike,

                         

                        I agree with your sentiment that giving users more control can lead to better performance.  It comes with several challenges: providing a good experience for tuning these parameters; supporting collaboration or distribution of these settings across a team, department or enterprise deployment of Tableau Desktop and Server; and testing our product internally against every possible combination of settings to protect our product's functionality and performance.

                         

                        TDC files offers some degree of control without any of the above considerations.  You must edit a raw XML file, there are few options for collaboration, and we do not officially test or support TDC-based connections.  Despite this, TDCs are appearing useful in more contexts and we hope to produce a knowledgebase article explaining how to use this experimental feature.

                         

                        In the interim, may I suggest a workaround for your temp table concerns in MySQL?  If you revoke privileges for creating temporary tables from the user accounts which you use for your analytical connections from Tableau, the application will detect this lack of support and cease to attempt using temporary tables in your queries.  The revoke operation should look something like:

                        REVOKE CREATE TEMPORARY TABLES FROM `user`


                         

                        Finally, it may help to explain why Tableau creates temporary tables in the first place.  In some cases this is necessary to support the functionality of certain complex visualizations that users can build with Tableau -- for example, a subquery may not support the TOP / ORDER BY clauses needed for advanced filter conditions.  In other cases, we want to avoid issuing a massive query string that expresses the conditions of a large include / exclude filter.  For example, we have some test cases for MySQL which produce a query containing nearly 300,000 characters, which causes the MySQL parser to choke.

                         

                        I hope this discussion helps, and I wish you luck with your performance tuning.

                         

                        -Robert

                        • 9. Re: VizQL query optimization for MySQL
                          Robert Morton

                          Mike, regarding Infinidb - it appears to use the MySQL front-end.  Can you simply use the Tableau MySQL connector, as other folks have been able to do with Infobright?

                          -Robert

                          • 10. Re: VizQL query optimization for MySQL
                            guest contributor

                            Negatory, it throws massive amounts of syntax errors. Theres a lot of things Tableau thinks Infinidb can do that it can't.

                            • 11. Re: VizQL query optimization for MySQL
                              Robert Morton

                              Hi Mike,

                              Have you tried using the ODBC connector for InfiniDB?  If you are still interested in trying to work with this data source then may I contact you via your forum email to discuss the ODBC-specific TDC options?

                              -Robert

                              • 12. Re: VizQL query optimization for MySQL
                                guest contributor

                                Hey Rob, I did try to build my a ODBC connection, I made a little progress there by checking certain options off... but still ran into errors. Sure, email me, thanks!

                                • 13. Re: VizQL query optimization for MySQL
                                  Andres Parra

                                  Hi Robert, I'm using InfiniDB on a DB with 4 GB of daily information. I need optimize the ODBC connection and use of parallel queries, or any other things you can think because the reports are very slow, you could help me with that