1 Reply Latest reply on Apr 5, 2016 9:31 AM by Stephen Rizzo

    Binning with bin limits defined in a separate SQL table

    Stephen Rizzo

      Hi everyone,

       

      I am having an issue with binning I was hoping someone could help me with. I am working with survey data, and I would like to "bin" average scores on specific questions into categories defined in another SQL table. I have attached a workbook with some sample data to illustrate.

       

      My use case does impose several limitations:

       

      • Pre-aggregating response-level data is not an option - I will need access to respondent-level demographics for slicers.
      • Doing a cross-join between the bins and the data at the response-level is not an option due to the volume of data I am dealing with. I need to be able to use this approach with hundreds of thousands of responses and a hundred bins.

       

      Is this possible somehow, maybe using blending / scaffolding / densification?

        • 1. Re: Binning with bin limits defined in a separate SQL table
          Stephen Rizzo

          For future reference if anyone comes across this question, I did find a workaround that we can use. Our solution required duplicating row level data, but thankfully for our use case having a single duplicate for each row still provided acceptable performance. The key ideas were to use custom SQL to duplicate the data set, use "Show Missing Values" on a calculated date field to create the correct number of rows, self-blend to replicate the desired measures across all rows, then create the necessary inequality filters.

           

          The attached workbook doesn't quite capture the exact process - custom SQL wasn't needed when working with sample data in Excel - but hopefully it can help someone struggling with similar issues.

           

          If anyone can find a cleaner solution to this problem, I would love to hear it!