5 Replies Latest reply on Mar 28, 2018 6:05 AM by Stats Iliterate

    How to structure an aggregated dataset (company level ) in order to get the same filtering outcomes as when using a more specific (employee level) dataset?

    Stats Iliterate

      Hi everyone,

       

      When I work on a dataset that has individual level data in rows, I can use pretty much any variable that has been built in excel as a column to work as a filter for the dataset. For example:

       

      - Say I have 25 rows of data, one for each person. Column 1 can be "Company", Column 2 can be "Gender", Column 3 can be "Age" and Column 4 can be "Monthly Salary". If I do not have missing data, I have 25 cells in Company, 25 in Gender, 25 in Age and 25 in Working Salary. Consequently, I will be able to filter my outcome variable (Working Salary) according to these three variables (Company, Gender, Age), and maybe even a mix of them (e.g., Working Salary for people in Company A and B, Female, Young Age);

       

      However, if I can only show data that has been aggregated at the company level, I do not know how to setup those filters to get the same outcomes (or even if it is possible). So, for instance, I will have 5 rows instead of 25 , and each row will be for Companies A to E. Now I can add a column with the number of female workers at each company and another with the number of male workers at each company. I can also add two more columns, one for Age Below 30 workers and one for Age Above 30 workers. But how do I get the same combinations across filters for this aggregated (company-level) dataset?

       

      It bears mentioning that while I do have the individual level dataset, I can only provide company level data (confidentiality issues, among others).

       

      I am hoping this is possible and that someone will be able to help me.

       

      I am attaching two workbooks, both with the same info, one aggregated and the ones not. My hope would be to be able to filter the data on the aggregate dataset in a way similar to that performed in the individual level dataset

       

      Best regards and thank you for your time,

       

      Stats