1 2 Previous Next 23 Replies Latest reply on Jan 8, 2013 5:02 AM by BRIAN SULLIVAN

    Trouble with Joins

    BRIAN SULLIVAN

      I am trying to create some variance reporting for budget purposes and have 2 tables: Original Budget (left table) and Current Budget (right table).  Each table has a key and I did a RIGHT join on the KEY field (OB.key = CB.key).  This for the most part worked great, but there are instances where there are keys in the Original Budget table that are not in the Current budget table.  I thought I could then add a LEFT join to get all of the records, but Tableau doesn't allow that. 

       

      I'm not sure what to do about this.  Any Suggestions?  Is there another way?

       

      Thanks.

        • 1. Re: Trouble with Joins
          Tracy Rodgers

          Hi Brian,

           

          Can you not just do a left join from the start? Or depending on the desired outcome, perhaps CustomSQL is a better option:

           

          http://kb.tableausoftware.com/articles/knowledgebase/custom-sql-join

           

          -Tracy

          1 of 1 people found this helpful
          • 2. Re: Trouble with Joins
            Richard Leeke

            If I understand correctly, the reason you can't just use a left join is that you may have missing rows in either table: i.e. some rows exist in both OB and CB, some are in OB only and some are in CB only. Is that right?

             

            If so, what you really need is a SQL FULL JOIN (sometimes referred to as FULL OUTER JOIN) which is like the combination of a LEFT and a RIGH JOIN. But lot's of databases don't support FULL JOIN, and Tableau doesn't support them for multiple table connections, so you'd have to use Custom SQL.

             

            So if you are using a datasource that supports them you can just convert your connection to Custom SQL and replace the RIGHT with FULL. If not, you need to simulate it with a UNION of a LEFT and a RIGHT join.

             

            You need to define the UNION carefully. Depending on your data structure it can be very easy to double-count some of the rows, or remove duplicate rows that you wanted to keep.

             

            You basically want something that looks like this:

             

            SELECT * from OB

               LEFT JOIN CB on OB.key = CB.key

            UNION ALL

            SELECT * from OB

              RIGHT JOIN CB on OB.key = CB.key

            WHERE OB.key IS NULL

             

            The 'WHERE OB.key IS NULL' is to stop you including the rows that exist in both tables twice and you need to use 'UNION ALL' rather than just 'UNION' to avoid losing any duplicates in your original data that you do want to keep.

             

            There's quite a good description here (that article is specifically about MySQL, but most of it applies to other databases).

            1 of 1 people found this helpful
            • 3. Re: Trouble with Joins
              BRIAN SULLIVAN

              Sorry for getting back to both of you so late.  This is all good information, but when I go and do the custom SQL I get an error that says the table cannot be found.  Any solution for that?

               

              The goal is to get all records from both tables.  I can get the desired result manually, but something tells me there is a better way to do this.

               

              Maybe once I get the custom SQL to work it will give me what I need.

               

              Thanks to both for your help.

              • 4. Re: Trouble with Joins
                Robin Kennedy

                Richard Leeke's approach is the right way to go and should work for your needs. What SQL are you using when you get the error?

                 

                Can you also post the SQL from the custom box before you do any edits to it (i.e. the connection works when you just have the initial right join, what is the SQL when you change to custom at that point?)

                • 5. Re: Trouble with Joins
                  BRIAN SULLIVAN

                  Here is the Custom SQL:

                   

                  SELECT * FROM [2012OB]

                  LEFT JOIN [CurrentNov] on [2012OB].[Key] =  [CurrentNov].[Key]

                  UNION ALL

                  SELECT * FROM [CurrentNov]

                  RIGHT JOIN [CurrentNov] ON [2012OB].[Key] =  [CurrentNov].[Key]

                  WHERE [2012OB].[Key] IS NULL;

                   

                  It follows the same syntax as Richard's post suggests as far as I can tell.  It also looks like others are getting the same error.

                   

                  Here's the error I get:

                   

                  Database error 0x80040E14: Syntax error in FROM clause.

                  Unable to connect to the Microsoft Excel file "G:\Tableau\Budget Test.xlsx". Check that you have access privileges for the requested file and that it is not open in another application.

                   

                  The file is not open when I try this.

                  • 6. Re: Trouble with Joins
                    Robin Kennedy

                    The second SELECT statement (after the UNION ALL) should read

                     

                    SELECT * FROM [2012OB]

                    • 7. Re: Trouble with Joins
                      BRIAN SULLIVAN

                      You're right, but it does not solve my error unfortunately.

                      • 8. Re: Trouble with Joins
                        Robin Kennedy

                        Hmm.. just realised you are connecting to a spreadsheet... you will need to put $ signs in your table (sheet) names.

                         

                        i.e. [2012OB$], [CurrentNov$]

                         

                         

                        SELECT * FROM [2012OB$]

                        LEFT JOIN [CurrentNov$] on [2012OB$].[Key] =  [CurrentNov$].[Key]

                        UNION ALL

                        SELECT * FROM [2012OB$]

                        RIGHT JOIN [CurrentNov$] ON [2012OB$].[Key] =  [CurrentNov$].[Key]

                        WHERE [2012OB$].[Key] IS NULL

                         

                        Any luck with that?

                        • 9. Re: Trouble with Joins
                          BRIAN SULLIVAN

                          Getting closer.  Now I'm getting this error:

                           

                          Database error 0x80004005: '' is not a valid name. Make sure that it does not include invalid characters or punctuation and that it is not too long.

                           

                          One interesting observation.  When coding Custom SQL do you need to put a semi-colon at the end like you normally would?  When I put the semi-colon at the end I get the Syntax error in FROM clause.  When I omit it, I get the error above in this post.

                          • 10. Re: Trouble with Joins
                            BRIAN SULLIVAN

                            OK disregard my last post.  The errors were occurring when I clicked "Preview Results."  When I click through the process it works with your coding.

                             

                            Thanks for your help!

                            • 11. Re: Trouble with Joins
                              BRIAN SULLIVAN

                              OK, so I still get the Database error 0x80004005: '' is not a valid name error after it connects.  This happens when I click on View Data.

                               

                              Any thoughts?

                              • 12. Re: Trouble with Joins
                                Robin Kennedy

                                This is a guess, but it could be something to do with the spreadsheet having some data in it with no column names (i.e. in Row 1)

                                 

                                Try deleting (highlight rows, right click, Delete) any columns in the spreadsheet tabs (both 2012OB and CurrentNov) that are not going to be used in your viz. I would recommend doing a mass delete from the column to the right of your last column of data, all the way to the far right of the sheet. E.g. if your data is in columns A to F, the select Column G by clicking on the letter G then press Shift+Ctrl+<right arrow> Then right click and delete. Then save and try reconnecting in Tableau. Might be best to start from a new spreadsheet and Tableau session.

                                 

                                Failing that, edit the custom SQL so that instead of SELECT *, you call the specific column names i.e.

                                 

                                SELECT date, category, name, budget FROM [2012OB$]

                                ...

                                ...

                                etc.

                                1 of 1 people found this helpful
                                • 13. Re: Trouble with Joins
                                  BRIAN SULLIVAN

                                  Mass deletion of the columns worked like a charm.  Thanks for ALL your help!

                                  • 14. Re: Trouble with Joins
                                    BRIAN SULLIVAN

                                    Not to beat a dead horse, but the trouble I'm having now is that when I pull in dimensions (ie account) from one of the tables, I get NULLs.  I understand why but is there any way around this?  I tried creating sets, but that doesn't seem to get me exactly where I want.  Is there a way to combine 2 like dimensions from different tables?

                                    1 2 Previous Next