
1. Re: Comparison of Means between Unequal Groups
luisa.bez Jul 17, 2018 1:26 PM (in response to Danish Zaidi) 
2. Re: Comparison of Means between Unequal Groups
Nitish Pamarty Jul 17, 2018 1:27 PM (in response to Danish Zaidi)
Hey Danish,
Did you consider metrics like:
Sales per dealer, profit per dealer
Profit-over-sales ratio for entire groups
and similar metrics where you normalize before you compare?
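To make the normalization idea concrete, here is a minimal sketch in Python. The dealer records, the group labels "A" and "B", and the helper name `group_metrics` are all made up for illustration; they are not from the original data set.

```python
# Hypothetical per-dealer records: (group, sales, profit). All values are invented.
dealers = [
    ("A", 120.0, 12.0),
    ("A", 80.0, 10.0),
    ("A", 100.0, 8.0),
    ("B", 300.0, 24.0),
    ("B", 500.0, 40.0),
]


def group_metrics(records):
    """Return, per group, the dealer count and size-normalized metrics."""
    by_group = {}
    for group, sales, profit in records:
        by_group.setdefault(group, []).append((sales, profit))

    result = {}
    for group, rows in by_group.items():
        total_sales = sum(s for s, _ in rows)
        total_profit = sum(p for _, p in rows)
        result[group] = {
            "dealers": len(rows),
            # Dividing by the number of dealers removes the group-size effect.
            "sales_per_dealer": total_sales / len(rows),
            "profit_per_dealer": total_profit / len(rows),
            # A ratio of group totals is also independent of group size.
            "profit_over_sales": total_profit / total_sales,
        }
    return result


metrics = group_metrics(dealers)
for group, values in sorted(metrics.items()):
    print(group, values)
```

Because every metric is a per-dealer average or a ratio of totals, the two groups can be compared even though they contain different numbers of dealers.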

3. Re: Comparison of Means between Unequal Groups
Danish Zaidi Jul 17, 2018 1:34 PM (in response to luisa.bez)
Thanks Luisa. That seems to help in normalizing the data.

4. Re: Comparison of Means between Unequal Groups
Danish Zaidi Jul 17, 2018 1:35 PM (in response to Nitish Pamarty)
Thanks Nitish. I did not think of that. That is a good suggestion though.

5. Re: Comparison of Means between Unequal Groups
Aaron Sheldon Jul 17, 2018 9:03 PM (in response to Danish Zaidi)
With a sample size that large you have more than enough statistical power to use a nonparametric test like the Mann-Whitney U test for stochastic dominance in a single ranking dimension, such as sales. Odds are good, though, that there will be a statistically significant difference simply due to the large sample size. The harder question is whether or not it is a meaningful difference, which is driven by two considerations. First, why is there a difference? Answering that depends on whether you have included the correct explanatory latent variables in your data. Second, is the difference large enough to have an operational impact? With a sample size that large, even a less powerful nonparametric test will be very sensitive to small differences in the distributions.
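As a rough illustration of the point about significance versus meaningfulness, the sketch below computes the Mann-Whitney U statistic together with an effect size, the probability that a randomly chosen value from one group exceeds a randomly chosen value from the other. The function name `mann_whitney_u` and the sample values are assumptions made for this example. The p-value uses the plain normal approximation with no tie or continuity correction, so treat it as approximate; a statistics library would be the better choice for real work.

```python
import math


def mann_whitney_u(x, y):
    """Return (U for x, probability of superiority of x over y, two-sided p-value).

    The probability of superiority is P(X > Y) + 0.5 * P(X = Y).
    The p-value is a normal approximation without tie or continuity correction.
    """
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    # Assign midranks so that tied values share the average of their ranks.
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = midrank
        i = j + 1

    n_x, n_y = len(x), len(y)
    rank_sum_x = sum(r for r, (_, label) in zip(ranks, combined) if label == 0)
    u_x = rank_sum_x - n_x * (n_x + 1) / 2

    # Effect size: does not grow with sample size, unlike the test statistic.
    prob_superiority = u_x / (n_x * n_y)

    mean_u = n_x * n_y / 2
    sd_u = math.sqrt(n_x * n_y * (n_x + n_y + 1) / 12)
    z = (u_x - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u_x, prob_superiority, p_value


# Invented sales figures for two dealer groups of unequal size.
group_a = [12.0, 15.0, 9.0, 20.0]
group_b = [22.0, 18.0, 25.0, 30.0, 17.0]
u, prob, p = mann_whitney_u(group_a, group_b)
print(f"U = {u}, P(A > B) = {prob:.2f}, approximate p = {p:.3f}")
```

With very large samples the p-value will often be tiny even when the probability of superiority is close to 0.5, which is why the effect size is the more useful number for judging operational impact.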
The really powerful question is to ask: accounting for all the factors that the dealers cannot control, what is the difference in sales? In the life sciences this is referred to as risk adjustment, although propensity scoring might work as well. Answering this question will not only tell you which dealers are adding the most value due to their behaviour, but also which dealers have large unrealised potential for generating sales, if placed in a better market.
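A very simplified sketch of that adjustment idea: fit expected sales from a factor the dealer cannot control, then look at how far each dealer lands above or below the expectation. This assumes a single hypothetical covariate, `market_size`, and a straight-line fit; real risk adjustment would use many covariates and a more careful model. All names and numbers here are invented.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of ys = intercept + slope * xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def adjusted_performance(market_size, sales):
    """Return observed minus expected sales for each dealer.

    Positive values suggest the dealer outperforms what the market alone
    would predict; negative values suggest underperformance.
    """
    intercept, slope = fit_line(market_size, sales)
    return [y - (intercept + slope * x) for x, y in zip(market_size, sales)]


# Invented data: one row per dealer.
market_size = [10.0, 20.0, 30.0, 40.0]
sales = [105.0, 190.0, 320.0, 395.0]

for size, observed, residual in zip(
    market_size, sales, adjusted_performance(market_size, sales)
):
    print(f"market={size:5.1f} sales={observed:6.1f} adjusted={residual:+7.2f}")
```

A dealer with modest raw sales but a large positive adjusted value is the kind of dealer who might have unrealised potential in a better market.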
For any financial data I would advise against any tests, like the t-test, that rely on assumptions of normality. Most financial processes tend to originate from geometric stochastic processes, which result in variables whose distributions have power-law tails.