So how do you determine which Cities are in the comparison group? If I change to LA, say, how would you determine which are in it's comparison group?
It's completely arbitrary and follows no set of rules. The comparison group is set on a separate table [see second tab of Excel file], and is subject to change at any time (i.e. today we might want to compare NYC to LA, Dallas, and Chicago; but tomorrow we might change the comparisons table so that NYC compares to LA, Dallas, and Seattle).
Sorry Adam, by bad...I completely missed that tab!
So this is a tricky one, and the final solution will (likely) require some duplication of the data. I have (I think) a working version here. Let me try and explain what I've done!
First I UNIONed the 2 tabs together...
This also generates a "Sheet" column which tells me which Tab each row came from
I then joined the City Metric back onto the Compare to City, as I needed to get the Populations for the comparison.
Next we need to create some formulas to pick up the right stuff!
IF [Sheet] = 'City Metrics' THEN
IF [City] = [Selected City] THEN 'Selected City' END
ELSEIF [Sheet] = 'Comparatives' AND [Base City] = [Selected City] THEN 'Comparison City'
ELSE 'Rest of Cities'
This classifies the Cities as Selected (using the original city column) and then uses the [base city] to pick up the right comparisons
We can then use this to pick up the right population column (we now have 2...one from the Metrics and the other where we joined the metrics to the Comparison Cities column)
[Population (to Use)]
IF [City Selected/Comp] = 'Selected City' THEN [Population]
ELSE [Population (City Metrics1)]
And finally, just so we get the sorting correct
[City (to Use) Order]
IF [City (to Use)] = [Selected City] THEN 100000000 ELSE [Population (to Use)] END
I'm assuming no city will have more than 100M people!
We can then filter out the NULLs and 'Rest of Cities' from the [City Selected/Comp] and we're good to go.
with a slightly different data structure, this could be made a bit easier (basically having a Base City row in the Comparison Data) like so
We'd still get duplicate rows, but we can just do a single join, and also the formulas are bit simpler. I've left the solution as per the provided data, in case you don't have that freedom.
Hope that helps, and makes some sense!
CityPopulations_v10.5- SR.twbx 49.0 KB
This is fantastic, thank you so much!
I have to admit I was unfamiliar with the Union feature (and I'm still a little confused about it since it doesn't have the constraints a SQL union would) but I see what is happening and will be able to use a form of this scaled out to my real data set. If I'm understanding correctly, then for every measure I might have, I'll need to create a calculated field to pick up the value from the proper sheet. But that's a small price to pay to make this work well.
Yes it's a pretty ugly solution...but it gets the job done which is the main thing. Yes using the data in it's current form, we do need all these calculations in order to determine which City column we need and which population.
Just to show you how a bit of data-modelling at the start can make life easier in the long run....Attached is a version where I've added in the extra rows of data (as per my previous post)
As you can see we need a lot fewer rows, we need a lot few calculations and we only have a single population column. We are still duplicating some rows (where a city is in 2+ compare groups), so the SUM of population is no longer "additive" over all the data (eg. Phoenix is in 2 compare groups so it has 2 rows, and so the SUM of population for Phoenix is double it's actual population), but as we only ever look at Phoenix filtered down to single Base City it's not an issue.
CityPopulations_v10.5- SR.twbx 73.4 KB