We (Spokane Public Schools) are starting to use Tableau for more statistical work, such as program evaluations. As a specific example, we are trying to determine the effectiveness of a certain block schedule by comparing it against a typical 60-minute class schedule.
In my experiment I am essentially comparing value added for two similar groups of students who go through the system.
Control Group: [Student Set A] who use a typical bell schedule
Experimental Group: [Student Set B] who use a 90 minute block schedule
Independent Variable: 90 min block schedule
What is really important (and I think the hardest part) is that the [Set A] students and the [Set B] students need to be as similar as possible (before exposure to our test) for us to accurately test our hypothesis.
So this is what I've done to try and statistically find the closest comparison set of students...
I found characteristics of my experiment group before exposure to our 90 min block:
% below standard: .75
AVG special ed FTE: .08
% FR lunch: .62
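To make it concrete how a profile like the one above can be computed, here is a minimal Python sketch (not the R or Tableau calcs I actually used). The student records, the field names (block_90, below_standard, sped_fte, fr_lunch) and all values are made up for illustration.

```python
from statistics import mean

# Hypothetical student records; field names and values are illustrative only.
students = [
    {"id": 1, "block_90": True,  "below_standard": 1, "sped_fte": 0.0,  "fr_lunch": 1},
    {"id": 2, "block_90": True,  "below_standard": 1, "sped_fte": 0.32, "fr_lunch": 1},
    {"id": 3, "block_90": True,  "below_standard": 1, "sped_fte": 0.0,  "fr_lunch": 0},
    {"id": 4, "block_90": True,  "below_standard": 0, "sped_fte": 0.0,  "fr_lunch": 1},
    {"id": 5, "block_90": False, "below_standard": 0, "sped_fte": 0.0,  "fr_lunch": 0},
    {"id": 6, "block_90": False, "below_standard": 1, "sped_fte": 0.5,  "fr_lunch": 1},
]

def group_profile(rows):
    """Average each characteristic over a group of students.

    The mean of a 0/1 flag is the share of students with that flag set.
    """
    return {
        "pct_below_standard": mean(r["below_standard"] for r in rows),
        "avg_sped_fte": mean(r["sped_fte"] for r in rows),
        "pct_fr_lunch": mean(r["fr_lunch"] for r in rows),
    }

# Profile of the experimental group (students in the 90 min block).
block_profile = group_profile([s for s in students if s["block_90"]])
print(block_profile)
```

The same function can be applied to any candidate comparison group, so the two profiles can be compared side by side.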
I then needed to find a set of students who did not participate in the 90 min block, who came from the same school, AND who had similar characteristics.
So this brings me to my question...
I am not sure if there is a "best way" to find a comparison group, but this is what I did...
I used R to run a kmeans cluster analysis with the characteristics mentioned above on all students that did not participate in 90 min block and that came from the same school.
It took quite a few runs of the analysis (changing the number of clusters, etc.) to find a cluster of students whose % below standard, AVG sped FTE, and % FR lunch matched (or came very close to) those of the experimental group. But the result seemed pretty accurate, and it was WAY, WAY faster than picking students by hand, which is what we used to do to compare these groups.
Once I found my matching cluster, I had my [Student Set A] to compare to [Student Set B].
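The search described above can be sketched roughly as follows. This is a minimal Python illustration, not the R code I ran: it uses a plain Lloyd-style k-means written by hand (so it will not reproduce R's kmeans output exactly), and the pool of non-participating students is made up. Each student is a tuple of (below standard flag, sped FTE, FR lunch flag), and the target is the experimental group's profile.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd-style k-means; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        # Update step: move each non-empty cluster's centroid to its mean.
        for c in range(k):
            members = [p for p, l in zip(points, new_labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids

def closest_cluster(labels, centroids, target):
    """Index of the non-empty cluster whose centroid is nearest the target."""
    return min(set(labels), key=lambda c: sq_dist(centroids[c], target))

# Target profile: % below standard, AVG sped FTE, % FR lunch.
target = (0.75, 0.08, 0.62)

# Hypothetical non-participants from the same school (values are made up).
pool = [
    (1, 0.0, 1), (1, 0.0, 1), (1, 0.4, 1), (1, 0.0, 0),
    (0, 0.0, 1), (1, 0.0, 1), (0, 0.0, 0), (0, 0.0, 0),
    (1, 0.2, 0), (0, 0.0, 0), (1, 0.0, 1), (0, 0.0, 1),
]

# Try several cluster counts and keep the cluster closest to the target.
best = None
for k in range(2, 5):
    labels, centroids = kmeans(pool, k)
    c = closest_cluster(labels, centroids, target)
    d = sq_dist(centroids[c], target)
    if best is None or d < best[0]:
        best = (d, k, [p for p, l in zip(pool, labels) if l == c])

best_dist, best_k, comparison_set = best
print(best_k, len(comparison_set), best_dist)
```

Because a cluster's centroid is just the mean of its members, the centroid of each cluster is directly comparable to the experimental group's profile, which is why distance to the target works as the selection rule here.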
Is there a better way of finding a comparable set? Maybe a process other than cluster analysis, where I can define the characteristics that I'm looking for?
I can put together a workbook if any of you want to see the R calcs and sheets used to find the clusters.