5 Replies Latest reply on Jul 6, 2017 5:56 AM by Jim Dehner

    LOD Issue on Blended Primary and Secondary Data sources

    Antony raj

      Hi Friends

       

      I am facing an issue with Data Blending. I have a primary Source which contains all the Account Number. In a Secondary Source I have a Recharge Count. I blend these two Data Sources using Account Number Column. When I create a View with Account Number and Recharge Count , I get correct Values. But, When I want to see data by Nationality ID (Passport, PR, WP, etc) , higher values are given. noticed that this because of Duplication of Records. In the Secondary data Sources data is stored at Account Number Level, whereas in the first one we have stored data at a different level(Phone Number). One Account Number might be having more than one phone Numbers, with different Nationality Ids. So Recharges values are duplicated. 

       

      First Image shows AccountNumber with recharge Count.

      Main.PNG

       

      When I see data by Nationality ID,  I get below which is wrong.

      Nationality.PNG

      I added account Number to the same View and noticed that , values are getting duplicated .

      All fields in the

      Duplicated.PNG

      So thought of writing LOD Expressions as {FIXED [IntgratedAccountID] : Sum (Transaction COunt(#)) }  . But getting an error saying below.

       

      Please Note that, in the LOD view, I should Use a Dimension from Primary Source. But Measures in LOD is from Secondary Source.  Could anybody help me to to fix this issue?

        • 1. Re: LOD Issue on Blended Primary and Secondary Data sources
          Jim Dehner

          Good morning

          There are a couple of things here - yes when you blended your data with a one to many relationship some of your data get duplicated - one way to approach this is to take the average (AVG) versus sum when aggregating

          The other issue is that in an LOD expression all the elements must come from the same data source - that is just a limitation - if you use the average as noted above you can avoid the need for an LOD expression

          Let me know if this helped

          Jim

          • 2. Re: LOD Issue on Blended Primary and Secondary Data sources
            Antony raj

            Thank you Jim.

            When I use Avg(transaction_Count), it is giving the desired.

            But when I see the Grand total, it gives  an average value which is wrong.

            It should give me value of 253709 as given in the below screen shot. Should we use Windows Functions? I tried, but no luck.

            Main.PNG

             

            Thanks

            Antony

            • 3. Re: LOD Issue on Blended Primary and Secondary Data sources
              Jim Dehner

              I would need a copy of your workbook to know - my first choice would be a window_something function --- probably

              Window_sum(avg(and your measure goes here))   but I would need to play with the data to make it work

               

              Jim

              • 4. Re: LOD Issue on Blended Primary and Secondary Data sources
                Antony raj

                Hi Jim

                 

                Thank you so much. I have replicated this issue with dummy data and shared as a twbx. I will share additional information If needed.

                 

                Thanks

                Antony

                • 5. Re: LOD Issue on Blended Primary and Secondary Data sources
                  Jim Dehner

                  Hi

                  I looked at your data and there are a couple of issues to think about

                   

                  you have 2 data sources that are joined together on the integration account id - one of your data sources assigns the integration account id to 2 different nationalities - as a result the data in your second source is duplicated - once for each nationality - is that what you really want?

                   

                  The second consideration is the average calculated field - since you are working with 2 sources the Avg that you used is taking data from only a single source and getting an incorrect total

                   

                  the equation should tie the transaction count in the secondary source to the number of records in the first (could be different based on your answer to the above question)

                       sum([Trans Count] )/SUM([Primary].[Number of Records])

                   

                  That will yield this

                   

                  But the real question that needs to be resolved is how the data sources should be joined

                   

                  Hope that helps

                  Jim