11 Replies Latest reply on Jun 1, 2013 6:02 PM by Matt Lutton

    Database Management/Connection Question

    Shawn Wallwork

      Our complete Access 2000 database has 70+ tables with 5-50 fields in each table. We only need 10 of these tables so I created a second database and then linked to the 10 tables we want to look at. Within these 10 tables we only need to work with 5-10 fields for each table. So I did a query for each table that includes only the fields of interest; each query only includes one table.


      At this point I've got the data whittled down to a manageable size; the most records in any of the tables is 65,000, so it's really not a big dataset.


      Question, is it better to:


      1. setup the relationships in the database and run queries on these joins, and then connect to these queries in Tableau, or
      2. connect to all the individual queries and use Tableau's multi-table options to join the data?


      What are the pros and cons of each route? What do most folks do, join in the database or join in Tableau?



      • we will be doing extracts on the connections,
      • the whole thing will be updated once a week
      • we are working in Desktop, NOT Server


      Thanks a lot, will appreciate any guidance.




      PS: No packaged workbook available.

        • 2. Re: Database Management/Connection Question

          In theory, doing joins in Tableau should be better from performance point of view as Tableau is supposed to have a way of constructing efficient SQL to pass on to DBMS, compared to just triggering a query with the same joins inside database and getting the dataset. However, considering that MS Jet driver us used for connecting to Access, I am not sure if this will make any difference.


          The other problem with "joins inside Tableau" might arise if you need some complex joins, like


          order_date BETWEEN some_date AND some_date + 1

          No problem for DBMS, but how you do that inside Tableau?


          I work in Oracle, and my personal preference is to use Tableau joins for simple connections to small tables with simple joins. Anything complex is done inside Oracle, either a view or custom SQL.

          I use custom SQL a lot, mainly because I often need to apply additional filtering and processing to the data due to limitations of Tableau when it comes to inter-record processing, i.e. "for each record check if there is a matching record, using this criteria", etc., and due to performance limitations of DBMS if a table has too much data.

          1 of 1 people found this helpful
          • 3. Re: Database Management/Connection Question
            Shawn Wallwork

            Thank you, Dimitri for your insights.


            [I've watched the video you posted -- totally excellent! Comments tomorrow on both threads.]



            • 4. Re: Database Management/Connection Question
              Jonathan Drummey

              Hi Shawn,


              95% of what I currently do is in Access as well. (Most of the rest is Excel, with a little SQL Server just entering the mix).


              My workflow is that I'll build a query or set of queries in Access and use that as the data source for building extracts. I hardly ever use Custom SQL and instead put the intelligence into the queries, here's how I've come to this:


              - Keeping the logic in as few places as possible. Rather than spreading logic between Access queries/tables, Custom SQL, the data connection, and then Tableau calculated fields, it's just in Tableau calculated fields and Access queries/tables.

              - Custom SQL for JET is painful. There are basically two error messages that JET generates, no matter the actual problem, Access gives me more information and will catch me from mistyping something.

              - Related to what Dimitri said about complex join criteria, I'm often using assorted WHERE clauses (aka criteria in Access), for example to only get inpatient records from a master table that also has emergency department visits, lab visits, rehab visits, etc.

              - Access has a graphical interface for manipulating tables, joins, etc. I have a preference for visual interfaces.

              - Extract vs live connection performance. There are various issues that impact our Access performance that I could go on and on about. I've found extracts to be 1 or more orders of magnitude faster than Access queries. The time lost by having to rebuild an extract here and there to add rows or add a new field is very much compensated for by the time saved inside Tableau.

              - Also, the performance gains from letting Tableau manage a multiple tables live connection are ignored when using an extract. I've also found that when an Access database has a ton of tables & queries, or one or more tables with 500K+ rows, connecting to that database can get really really really slow (like 3-5 minutes just to get the connection dialog up).

              - If you're using the same query structure to generate multiple extracts for multiple workbooks (like one workbook per client), then maintenance wlll be easier if all you have to do is change one query than the connection in every workbook.


              Some notes about working with Access:


              - If you set up the relationships in your master database(s), those will carry through to the child databases to help you build queries.

              - You can't use a parameter in a JET query, Tableau has no way of issuing that parameter. So for example, instead of asking the user for a date range, I'll hard code a large enough range into the WHERE clause, then give the user control of that inside Tableau.

              - UNION queries aren't exposed by JET in the list of available queries. However, if you make the UNION query a subquery, like SELECT * FROM [your union query], it will work fine.


              Hope this helps!



              1 of 1 people found this helpful
              • 5. Re: Database Management/Connection Question
                Shawn Wallwork

                Thanks Jonathan, this was very helpful!



                • 6. Re: Database Management/Connection Question
                  Toby Erkson

                  I've worked with Access, Excel*, Oracle, MS SQL Server, DB2, a little SAP, Cognos versions 4 thru 10, and a smattering of Business Objects here-n-there.  The one thing you want is a single version of the truth.  Without that data integrity becomes questionable...even could become useless.


                  I agree with D & J.  Let your db do the heavy lifting, indexing, and query optimizations.  Set up referential integrity in your database as it will help 'self document' how data flows.  You need to think about who's going to be taking over your project when you're gone (for whatever reason) so things need to be easy to understand as best as possible.




                  * I mention Excel because I've had customers use it as a database.  Egad!

                  1 of 1 people found this helpful
                  • 7. Re: Database Management/Connection Question

                    You might consider converting Access 2000 to SQL Server 2012 Express:

                    Convert Access DB to SQL Server 2012 Express



                    Database System | Performance & Scalability | SQL Server Express Edition

                    Easy to get started. Free to use.

                    Take advantage of the same powerful database engine as the other versions of SQL Server in a version great for redistribution and embedding, free! SQL Server Express includes 10GB of storage per database, easy backup and restore functionality and compatibility with all editions of SQL Server and Windows Azure SQL Database.

                    1 of 1 people found this helpful
                    • 8. Re: Database Management/Connection Question
                      Shawn Wallwork

                      Thanks guys. Johan I'll take a look at SQL Server (for my own knowledge), but this isn't my database, I'm just the viz guy. And like Toby said I'll be gone in a week or two and they'll need to maintain and update the workbooks on their own. So I think it's best to leave them with what they know.


                      This has all been very helpful. Interesting to know the consensus seems to be do as much in the DB as possible before connecting to Tableau.



                      • 9. Re: Database Management/Connection Question

                        I probably read too much into "our"
                        I agree it is better for them to stick to what they have and know.


                        Ps. In 2009 I worked a little bit with the free SQL Server 2008 Express helping someone moving away from Access (if I remember correctly).

                        I liked it perhaps even better than the full power (and expensive) SQL Server, which I use daily. It somehow felt lighter, but maybe the reason was simpler & fewer data    It definitely felt lighter than Access and I am quite sure the developer I helped also felt a relief.

                        1 of 1 people found this helpful
                        • 10. Re: Database Management/Connection Question
                          Robert Morton

                          Thanks to everyone here for your replies to Shawn! Many Tableau employees read the forum discussions, and these kinds of answers provide great product feedback.


                          1 of 1 people found this helpful
                          • 11. Re: Database Management/Connection Question
                            Matt Lutton

                            I agree with doing as much as you can on the back end.