8 Replies Latest reply on Jul 19, 2013 5:23 PM by Richard Leeke

    Question on joining sheets - figures inflated

    Brad McCormick

      Hi,

       

      After joining two sheets via the Date dimension (one has sales by multiple categories per day, the other has daily targets), both sales and target figures are inflated.

       

      Workbook attached and any advice welcome!

       

      Thanks.

        • 1. Re: Question on joining sheets - figures inflated
          Brad McCormick

          Hi Shawn.

           

          Thanks for clarifying - twbx file attached!

          • 2. Re: Question on joining sheets - figures inflated
            Brad McCormick

            Shawn,

             

            Sure thing - here you go.

             

            In the meanwhile I seem to have made this work by blending sep files - rather than joining sep sheets in one file. I assume joining in this case is preferable however?

             

            Thanks.

            • 3. Re: Question on joining sheets - figures inflated
              Matt Lutton

              Brad McCormick:

               

              Did you get help on this?  I would be interested to hear about the results.  I am finding that there is very little information on joins in Tableau available.  I think most folks who are using the tool understand SQL pretty well, which may explain why so little information is available.  I am having a very hard time understanding what type of join to use in what scenario.  Would be interested to hear what you ended up with.

              • 4. Re: Question on joining sheets - figures inflated
                Brad McCormick

                No, not really. Have been able to make blends work w two sep files but can't seem to find resources on joining w/in the same file. Would love to hear if you have any luck as well!

                • 5. Re: Re: Question on joining sheets - figures inflated
                  Richard Leeke

                  There are a few things going on here, I'll try to explain each of them.

                   

                  Firstly, the crux of the problem is that you have targets expressed at one level of detail (daily targets by channel and store), whereas the sales data is broken down further (by category).So for one day, for one channel and store you need to compare the sum of sales for all categories with the target value. If you want to compare overall totals, you will need to compare the sum of all sales with the sum of all targets. That sounds easy, but as you've found, there are some traps.

                   

                  This is actually a good example of how data blending can be much the easiest approach for comparing figures at different levels of aggregation. By way of illustration - I put an example together to show how to do your overall sales vs target comparison with a join (plus table calcs) and with a blend. For the blend I just dragged a couple of fields onto the view and immediately got the correct answer. With the join I spent 10 minutes figuring out why it was wrong.

                   

                  The first issue with the way that you have the multiple table connection defined in the workbook is that you haven't specified enough join fields. You have only joined on date, which means that for each row in the sales table you have joined to every row of target data for that day - even if the rows were for different channels and stores. That is why your numbers have come out so inflated. Instead of just ending up with one row for every row of sales data you have 10 or more rows in your join.

                   

                  To fix that, you need to add join fields of channel and store to the connection.

                   

                  Exactly the same is true when using data blending, by the way - you need the relationship to include date, channel and store.

                   

                  With a blend, once you have the relationship defined correctly you just need to drag sum(sales) and sum(target) onto the view and Tableau figures it all out for you. (Actually that might be a slight over-simplification - but it does tend to make good default choices.)

                   

                  With a join, you need to be aware of the differing levels of detail and the fact that you have (in this case) multiple copies of your target amount. This is because you have rows at the level of the sales data, but each row includes a field which shows the overall target for all categories for the date/channel/store.

                   

                  No time to explain fully, but in the attached sample workbook I've added individual connections to targets and sales (to demonstrate how to do the blend) and also a connection joining those in the data source. Note that as the data includes targets for which there is no sales data, you need to define an outer join so that you still include the target data even when there are no sales. In this case I made it a right join because of the order I happened to pick the tables, (I wanted to keep rows from the right hand table even when there were no corresponding rows in the left hand table).

                   

                  I've added example sheets showing comparison of sales vs target both at the overall level and broken out by date, channel and store, using both the join and the blend approach. Note that the join approach gets trickier for the overall totals because you need to use table calculations to do the second level of aggregation. I've also included a sheet showing the daily sales data broken out by category as a stacked area chart compared against the overall target for the day/channel/store. I couldn't immediately see any way of doing that with a data blend - because to get the category data I would have needed to make the primary data source sales, which would have introduced two problems. Firstly we would have got multiple copies of the target again (solvable with table calcs) but we also wouldn't have got any target data at all for date/channel/store combinations with no sales. I couldn't see a way around that.

                   

                  Have a pick through the example and see if that all makes sense.

                  • 6. Re: Question on joining sheets - figures inflated
                    Matt Lutton

                    Richard: Thank you for taking the time to explain this.  I often feel very limited by my lack of SQL/join knowledge, but am learning a lot through folks like you in the community.  Really appreciate it.

                    • 7. Re: Question on joining sheets - figures inflated
                      Brad McCormick

                      Richard,

                       

                      Thank you very much for the detailed explanation and example. I really appreciate your taking the time to outline the issue and solutions so clearly.

                       

                      This really helps me a lot!

                      Brad

                      • 8. Re: Question on joining sheets - figures inflated
                        Richard Leeke

                        Glad it helped and thanks for reporting back - it does make a difference to how motivated I feel to answer future questions!