7 Replies Latest reply on Aug 17, 2017 6:50 PM by Isaac Briseno

    Duplicates caused by joins

    Isaac Briseno

      Hello,

       

      I have a spreadsheet that has 2 tabs at different levels of detail

      - Tab 1: Detailed transactions per day

      - Tab 2: Equity balance at the end of the day for each account

       

      I'm trying to make a simple line chart of the sum of the equities of all the accounts, but the join is causing that the days that have transactions duplicate the data (screenshot below)

       

      Is there any way to fix this? I'm new to Tableau so I'm not even sure in which direction to continue, maybe Blending? or using FIXED? or I got the joins wrong? or something else?

       

      Additionally I have another sheet in the attached .twbx, so every time I try to fix my chart then the other one gets the data wrong

       

      Thanks and Regards,

      Isaac

       

      duplicates.png

      summaryGetsAffected.png

        • 1. Re: Duplicates caused by joins
          Chris McClellan

          Do you really need to join the data ?

          And if you do, then SUM() will possibly give the wrong answer (especially in this case where you're joining a detailed dataset to a summarised dataset.

           

          Check your join condition as well, with a full outer you're getting 50 records but with an inner you're getting 20 records.

           

          If you really need to join (I"m not convinced yet ) and you need the data to be right, then usually a LOD calculated field will fix the problem.

          • 2. Re: Duplicates caused by joins
            Isaac Briseno

            Hi Chris,

             

            I tried the inner join and as you mention the count goes from 50 to 20 records

             

            But when using the inner join the chart only shows data for the days that have transactions instead the closing equity balance of each day of April

             

            What is the alternative to the join? is it blending? or Tableau has something else?

             

            Regards,

            Isaac

             

             

            aprilDays.png

            • 3. Re: Duplicates caused by joins
              Chris McClellan

              Isaac Briseno wrote:

               

              But when using the inner join the chart only shows data for the days that have transactions instead the closing equity balance of each day of April

               

              Ah OK, I'm looking at the data a bit more now.  You might want to read this Join Your Data

               

              But your Equity datasource has a record for every day, but the Trade datasource has a record only when there was a trade (obviously).

               

              I'm still guessing a lot about what you want to do with all the data, but you could get closer to what you're after by doing a left join.  That way you get all the equity dates, but still get the trades on the days when they are available.

               

              I'd also suggest trying to do this without joining - that way your equity (basically an "end of day balance"  datasource) is one datasource, and your trade (when a trade is placed and the results of that trade) data is a different datasource.  So no joining, no blending, just 2 completely independent datasources.

               

              BUT, I'm making a lot of assumptions about what you're trying to get done as an end result, so I'd suggest playing with this but it might not do exactly what you want.

              1 of 1 people found this helpful
              • 4. Re: Duplicates caused by joins
                Isaac Briseno

                Thanks a lot Chris!

                 

                I had all my Tableau concepts wrong about Connections, Data Sources, The Excel sheets that get added

                 

                Not joining and having independent data sources as you suggested was the solution

                 

                I didn't know how to do it but doing some research based on your post I found the following article

                 

                http://kb.tableau.com/articles/howto/connecting-multiple-data-sources-without-joining-or-blending

                 

                Regards,

                Isaac

                • 5. Re: Duplicates caused by joins
                  Okechukwu Ossai

                  Hi Isaac,

                   

                  Just to emphasize what Chris said earlier.....

                  If you must join the data sources to get the analysis done, then I'd suggest using an LOD expression to calculate the correct Amount USD.

                   

                  The issue you have is that Equity Tab is at the Account level of granularity while the TradeTrans Tab is at the Symbol level of granularity. So, each Account is repeated for every Symbol record in the TradeTrans Tab for any given date. For example there are 8 repeated records for 26th April; 4 each for USA and Mexico respectively.

                   

                   

                   

                  Using an LOD expression to recalculate Amount USD at the Account level will resolve the issue

                   

                  [Amount USD per Account]

                  {FIXED [Equity Date], [Account]: MAX([Amount USD])}

                  This will sum Amount for both countries and give $107,300 for 26th April . Modify this if you wish to calculate Amount at country level.

                   

                   

                   

                   

                  Hope this helps

                  Ossai

                  • 6. Re: Duplicates caused by joins
                    Okechukwu Ossai

                    I was writing my comments same time as Chris, so didn't realize this has been resolved. Glad you found the solution at last.

                    1 of 1 people found this helpful
                    • 7. Re: Duplicates caused by joins
                      Isaac Briseno

                      Thanks!

                       

                      I'm checking your solution as I think that it will be useful when the requirement forces me to join

                       

                      Fortunately this time separating the data sources worked

                       

                      Regards,

                      Isaac