1 2 Previous Next 18 Replies Latest reply on Feb 26, 2016 7:22 PM by pooja.gandhi

    SUM returns value much larger than expected

    Jay Sridharan

      Hey guys,

       

      I've got a workbook with a table called teamMatches.

      Each teamMatch has a teamNumber (int) and collectionEnded (0 or 1). I am simply trying to SUM collectionEnded, grouped by teamNumber

       

       

      Sample data for teamNumber 81:

      Expected Output:

       

       

      Current Output from Tableau :

       

      I have attached the workbook. I believe this is a LOD problem, but I am not sure how to fix it.

       

      Thank you so much for all the help!

       

        • 1. Re: SUM returns value much larger than expected
          Amanjot Klair

          Hi Jay,

           

          I checked the workbook and based on the data in workbook the numbers are correct. If you do a view data in the workbook there is 1 for CollectionEnded in each of the rows for TeamNumber 81. I am not able to get what you are trying to do here. Attached is the excel workbook with data extracted from your extract which shows there is a value of 1 for each row and there are 96 records for 81 so the sum is correct to be 96 against 81. Can you please check your data once again.

           

          -Amanjot

          1 of 1 people found this helpful
          • 2. Re: SUM returns value much larger than expected
            Mark Fraser

            Hi Jay

             

            I have to agree with Amanjot, there seems to be a difference between your screenshots, and the data you provided.

             

            If the data was related to the SQL query below - we shouldn't have other teams?! Only teamnumber 81, that is not the case, you shown that yourself in your screenshots.

            I suspect you have inadvertently cross joined the data sets (with each other) and are duplicating the data,

            Sample data for teamNumber 81:

             

             

            if we look at this record in Tableau

            We are missing MatchID 30

            And for some reason you have 60 rows for matchid 35 and 36 rows for match id 38  (hence the 96 total)

            You were expecting, a single row for each.

             

            So either the data set is different, or you have accidentally duplicated data while importing/ joining.

             

            Hope that helps/ makes sense!

             

            Cheers

            Mark

            1 of 1 people found this helpful
            • 3. Re: SUM returns value much larger than expected
              Jay Sridharan

              Thank you guys for the help so far. I do not believe the cartesian product is the problem here. I am joining many tables together, and Tableau gets a large cartesian product to compute with. For example, if I do

               

              Team Number / CNT(Match Number)

              I get numbers very similar to the SUM(collectionEnded) - because for many of the matches, collectionEnded = 1. However, if I do COUNTD(matchNumber), I get the following:

               

               

              These look like better numbers. Basically, I need the SUM distinct of collectionEnded.

               

              That being said, Mark has pointed out a good point. Match 30 is missing, but it shouldn't be. I suspect the reason for that is the joining, but I'm not sure why it is acting that way. I believe it is somewhere in this section of the relationships that it is being lost. Let me explain.

               

              There are a series of matches, each have a matchNumber and compID. Their primary key is matchID.

               

              Through matchID, matches makes a one to many relationship with teamMatches, which has a teamNumber, collectionEnded, and other random data. The primary key here is teamMatchID

               

              Each teamMatch forms a one to many relationship with 4 tables : matchBreaches, matchClimbs, matchFeeds, and matchShoots through it's teamMatchID.

               

              For the purposes of this document, I don't really care if there is any data in any of those 4 tables, I still want to see the record from teamMatches, hence all the left joins. For match 30, there is only data in matchClimbs - the other 3 have no records, and it gets rid of the record from teamMatches.

               

              So, for the first step, what do I need to change in this diagram to make it show all teamMatches?

               

              Thank you all for the help! Hopefully we can get this sorted out quickly.

              • 4. Re: SUM returns value much larger than expected
                Mark Fraser

                Hi Jay

                 

                This is good.

                I want to focus on a small part to start -

                There are a series of matches, each have a matchNumber and compID. Their primary key is matchID.

                Should the PK, not be both? Or is matchnumber unique?

                I worry about this...

                Through matchID, matches makes a one to many relationship with teamMatches, which has a teamNumber, collectionEnded, and other random data. The primary key here is teamMatchID

                is matchnumber in teamMatches as a foreign key? and is teamNumber in matches?

                How do you link teamMatches to Matches?

                 

                I'll stop there, or its going to get confusing!

                1 of 1 people found this helpful
                • 5. Re: SUM returns value much larger than expected
                  Jay Sridharan

                  Alright. MatchID and MatchNumber are two different things. Ill explain from the begenning:

                   

                  matches has 3 columns, id (sorry i had mentioned this as matchID before), matchNumber, and compID

                  • matchNumber and compID are in a unique index.
                  • id is auto increment

                   

                  Example Data :

                   

                   

                  matches.id is a foreign key to teamMatches.matchID

                   

                  Example Data for teamMatches :

                   

                   

                  I hope that makes sense..

                  • 6. Re: SUM returns value much larger than expected
                    Mark Fraser

                    Sorry Jay, 'real' work overtook my afternoon.

                    I will try and have a further look tomorrow...

                     

                    Seems like we are on the right track, attached something i find very helpful.

                     

                    Cheers

                    Mark

                    1 of 1 people found this helpful
                    • 7. Re: SUM returns value much larger than expected
                      Jay Sridharan

                      Haha,  that's not a problem Mark,  thank you very much for the help.

                      (please don't forget though!)

                      • 8. Re: SUM returns value much larger than expected
                        Jay Sridharan

                        Hey Mark,

                         

                        Thanks for that image.

                        I figured that if Tableau was cutting that record out of the result set, the same must happen if I were to run an SQL query, so I joined the tables together the same way and found that the result set is complete, which makes this problem a bit more puzzling.

                         

                        The query:

                         

                        SELECT COUNT(DISTINCT matchNumber)

                        FROM teammatches

                          JOIN  matches ON teammatches.matchID = matches.id
                          LEFT JOIN matchbreaches ON teammatches.id = matchbreaches.teamMatchID
                          LEFT JOIN matchclimbs ON teammatches.id = matchclimbs.teamMatchID
                          LEFT JOIN matchfeeds ON teammatches.id = matchfeeds.teamMatchID
                          LEFT JOIN matchshoots ON teammatches.id = matchshoots.teamMatchID
                        WHERE teamNumber = 81;

                         

                        The result:

                         

                        This must be something tableau specific. Let me know if you know what is going on.

                         

                        Thanks for the support.

                        Jay

                        • 9. Re: SUM returns value much larger than expected
                          Mark Fraser

                          HI Jay

                           

                          The SQL certainly looks OK.

                          It must be something to do with the join logic in Tableau.

                           

                          My advice is to take it back... replicate the teammatches & matches join in SQL, then replicate in Tableau < this is the key join.

                          At this stage, do not do the other joins. You may also only want to take team 81, less data means easier to track/ analyze.

                          Check if Tableau is showing as expected vs SQL. Assuming so, add 1 table, stop, check and use that methodology to proceed.

                          Make sure the first join is an inner join, vs the others which are outer.

                           

                          I think you need a way of identifying the mis-join and by starting small and building up, i hope it helps to diagnose....

                          If you could pass some sample data (in Excel) I'll certainly take a look!

                           

                          Cheers

                          Mark

                          1 of 1 people found this helpful
                          • 10. Re: SUM returns value much larger than expected
                            Jay Sridharan

                            Mark,

                             

                            That sounds good.  I'll try it when I get the chance, and I'll send you

                            some data as well.  But you said to make sure the other joins are outer.

                            In the Tableau interface,  I believe it only has inner, left and right,  no

                            outer joins.  Am I misunderstood?

                            • 11. Re: SUM returns value much larger than expected
                              Mark Fraser

                              I'll send you some data as well. 

                              that would be helpful, just something small... enough to replicate.

                              But you said to make sure the other joins are outer.

                              In the Tableau interface,  I believe it only has inner, left and right,  no

                              outer joins.  Am I misunderstood?

                              Sorry, my fault.

                              teammatches & matches = inner join

                              everything else to teammatches = left join

                               

                              To test, try the attachment - add both sheets and then swap between inner and left and see what impact it has... answers below

                              Inner join Left join

                              1 of 1 people found this helpful
                              • 12. Re: SUM returns value much larger than expected
                                Jay Sridharan

                                Hey Mark,

                                 

                                I've attached the data for teams 81 and 93. Sorry for the delay. I will start to check the joins now, and I'll let you know if I find anything.

                                 

                                Really appreciate your continued support.

                                 

                                Thanks,

                                Jay

                                • 13. Re: SUM returns value much larger than expected
                                  pooja.gandhi

                                  Hi Jay!

                                   

                                  Sorry to jump in, but I would think that would just be CNTD(Match Number) grouped by team?

                                   

                                   

                                  If you actually drop sum(collections ended) it is grouping by team the number of rows as a result of all the joins you are performing:

                                   

                                  Also it seems like collections ended is always 1 not (0 or 1). Collections started is 0.

                                   

                                   

                                  So if you take collections ended and sum it, you get this:

                                   

                                   

                                  Which is not really grouped by match id. 318 is actual number rows for the 2 teams resulting from the joins. I haven't caught up reading the entire thread, this is just what I noticed instantly! Hope that helps.

                                   

                                  Pooja.

                                  • 14. Re: SUM returns value much larger than expected
                                    Jay Sridharan

                                    Hi Pooja,

                                     

                                    I apologize, the data I attached in the excel sheet was just an excerpt, and coincidentally, they were all 1. But in the original dataset, I have over 50 teams, and there are 1's and 0's for collectionEnded, which is why I need to SUM(collectionEnded) grouped by team. You can try changing one of the 1's to a 0, and the graph wouldn't change, but I need it to.

                                     

                                    Thanks,

                                    Jay

                                    1 2 Previous Next