1 Reply Latest reply on Jan 31, 2014 7:03 AM by Robert Hoekstra

    Data Blending and Aggregation

    Robert Hoekstra

      I have two data sources... one is an Oracle database of participants and the other is an Excel spreadsheet of cases.

      In the oracle database, there is a case# which corresponds to a case number in the case spreadsheet.  Note that that case number can be listed multiple times in both databases.


      I need to be able to calculate the number of days between a date in the Oracle database (start date), and a date stored in the case spreadsheet (issuance date).  This should not only provide the average time delay, but allow me to plot the distribution of cases based on that delay (both a measure and a dimension) without the case number on the sheet.  All rows with that case number in the spreadsheet have the same issuance date.


      The problem is that when I do data blending, the issuance date comes in as an aggregation, meaning that subsequent calculations need to be based on attributes and need the case number on the sheet making the reports I need impossible.  The simplest version would be an X-axis with the number of days between start and issuance and the Y-axis being a count of records with that number of days between them.


      I have created a dummy excel table to illustrate the problem.  I apologize ahead of time that the size and confidentiality of the databases prevents me from posting the actual data.


      There is another complication, typically I have solved the above dilemma by using a rawsql calculation to grab the date rather than doing a join or data blending.  Unfortunately, I need to use an extract for the oracle database (for performance with a little over 4 million records)... which removes the ability to use rawsql calculations.


      If you have any thoughts, please let me know.