I started down this path before I understood that an Iron Viz entry should have been one visualization with one About page.  This is an entirely different route, but still a fun exercise.  Enjoy!

 

 

Method:

 

Data Management

The simplest way to get to large lists of data in Wikipedia is to use Google's Fusion Tables Search.  I started there and browsed through the Wikipedia results to find common themes.  Lists by country and by U.S. state appeared to be the most common.  I reviewed the top ~400 search results for interesting state-related data tables: https://research.google.com/tables?hl=en&q=site+en.wikipedia.org+%22List+of+U.S.+states%22. I selected about 28 of them as a base data set to use for my own viz and to share with my colleagues at Boulder Insight.

 

Table            URL
Population       http://en.wikipedia.org/wiki/List_of_U.S._states_and_territories_by_population
GDP              http://en.wikipedia.org/wiki/List_of_U.S._states_by_GDP
Area             http://en.wikipedia.org/wiki/List_of_U.S._states_and_territories_by_area
Life_Expectancy  http://en.wikipedia.org/wiki/List_of_U.S._states_by_life_expectancy
Income           http://en.wikipedia.org/wiki/List_of_U.S._states_by_income
Unemployment     http://en.wikipedia.org/wiki/List_of_U.S._states_by_unemployment_rate
Elevation        http://en.wikipedia.org/wiki/List_of_U.S._states_by_elevation
Incarceration    http://en.wikipedia.org/wiki/List_of_U.S._states_by_incarceration_rate
Vehicles         http://en.wikipedia.org/wiki/List_of_U.S._states_by_vehicles_per_capita
Billionaires     http://en.wikipedia.org/wiki/List_of_U.S._states_by_the_number_of_billionaires
Density          http://en.wikipedia.org/wiki/List_of_U.S._states_by_population_density
Emissions        http://en.wikipedia.org/wiki/List_of_U.S._states_by_carbon_dioxide_emissions
Education        http://en.wikipedia.org/wiki/List_of_U.S._states_by_educational_attainment
Pop_Growth       http://en.wikipedia.org/wiki/List_of_U.S._states_by_population_growth_rate
Economic_Growth  http://en.wikipedia.org/wiki/List_of_U.S._states_by_economic_growth_rate
Abbreviations    http://en.wikipedia.org/wiki/List_of_U.S._state_abbreviations
Demographics     http://en.wikipedia.org/wiki/Demographics_of_the_United_States
Fertility        http://en.wikipedia.org/wiki/List_of_U.S._states_and_territories_by_fertility_rate
Obesity          http://en.wikipedia.org/wiki/Obesity_in_the_United_States
Etymology        http://en.wikipedia.org/wiki/List_of_state_name_etymologies_in_the_United_States
Time             http://en.wikipedia.org/wiki/List_of_time_offsets_by_U.S._state
Quarters         http://en.wikipedia.org/wiki/50_State_Quarters
Cannabis         http://en.wikipedia.org/wiki/Legality_of_cannabis_by_U.S._jurisdiction
Divorce          http://en.wikipedia.org/wiki/Divorce_in_the_United_States
Taxation         http://en.wikipedia.org/wiki/Federal_taxation_and_spending_by_state
Minimum_Wage     http://en.wikipedia.org/wiki/Minimum_wage_in_the_United_States
Mottos           http://en.wikipedia.org/wiki/List_of_U.S._state_and_territory_mottos
Temperatures     http://en.wikipedia.org/wiki/U.S._state_temperature_extremes

 

Each of these tables was imported into a sheet in a single Google Sheets workbook.  The following data transformations were applied while the data was still in the spreadsheet:

 

Normalized column headers

- Edited out Wikipedia markup artefacts

- Condensed two-row headers into one-row headers

- Prefixed with table name where needed
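The header cleanup above was done by hand in the spreadsheet.  As a rough sketch only, the same three steps might look like this in Python.  The function names, the footnote pattern, and the sample headers are all assumptions made for illustration; none of them come from the actual workbook.

```python
import re

# Assumed pattern for Wikipedia footnote markers such as "[1]" or "[note 2]".
FOOTNOTE = re.compile(r"\[[^\]]*\]")


def clean(cell):
    """Strip footnote markers and collapse runs of whitespace."""
    return " ".join(FOOTNOTE.sub("", cell).split())


def normalize_headers(top_row, bottom_row, table_name, key="State"):
    """Merge a two-row header into one row and prefix it with the table name."""
    headers = []
    group = ""
    for top, bottom in zip(top_row, bottom_row):
        top, bottom = clean(top), clean(bottom)
        # A blank top cell means the previous group header spans this column.
        group = top or group
        name = f"{group} {bottom}".strip() if bottom and bottom != group else group
        # Leave the join key unprefixed so every sheet shares the same column.
        headers.append(name if name == key else f"{table_name}_{name}")
    return headers


# Made-up two-row header, only to show the shape of the result.
print(normalize_headers(
    ["State", "Population[1]", "", "Area"],
    ["", "2010", "2013 est.", "sq mi[a]"],
    "Demo",
))
```

Leaving "State" unprefixed is a design choice in this sketch so that the key column keeps the same name on every sheet.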

 

Data Specific Transforms

1) Replaced all encoded double-quote characters with a plain " (double quote)

2) Replaced all encoded single-quote characters with a plain ' (single quote)

3) On [Cannabis], the unnamed column b corresponded to the legend color coding.  Changed the column name to "Legality" and replaced the letter code with the language from the Map Legend:

a = Jurisdiction with legalized Cannabis

b = Jurisdiction with both medical and decriminalization laws

c = Jurisdiction with legal medical Cannabis

d = Jurisdiction with decriminalized cannabis possession laws

e = Jurisdiction with total cannabis prohibition

4) Changed the primary key from all other descriptions (e.g., "State/Territory", "State or District", etc.) to just "State"

5) Normalized State names throughout ... Hawaii, District of Columbia

6) Collapsed duplicate state records on Etymology into single records

7) On [Minimum Wage], moved text to Notes field.  Converted currency to numeric.

8) On [Temperature], edited column names, removed asterisks from dates, retained only Fahrenheit temperatures and first location of temperature recording.
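These edits were made directly in the spreadsheet.  Purely as an illustration, steps 3, 4 and 7 could be sketched in Python as below.  The legend text is copied from step 3; the function names, the Notes handling, and the sample rows are assumptions rather than a record of what was actually done.

```python
# Legend text copied from step 3 above; keys are the single-letter codes.
LEGALITY = {
    "a": "Jurisdiction with legalized Cannabis",
    "b": "Jurisdiction with both medical and decriminalization laws",
    "c": "Jurisdiction with legal medical Cannabis",
    "d": "Jurisdiction with decriminalized cannabis possession laws",
    "e": "Jurisdiction with total cannabis prohibition",
}

# Header variants named in step 4; the real sheets may contain others.
KEY_VARIANTS = {"State/Territory", "State or District"}


def rename_key(row):
    """Rename whichever key-column variant is present to plain 'State'."""
    return {("State" if k in KEY_VARIANTS else k): v for k, v in row.items()}


def currency_to_number(text):
    """Convert a string such as '$7.25' to a float, or None if it is not an amount."""
    cleaned = text.replace("$", "").replace(",", "").strip()
    try:
        return float(cleaned)
    except ValueError:
        return None


def split_wage(cell):
    """Return (numeric wage, notes); non-numeric text goes to the notes slot."""
    amount = currency_to_number(cell)
    return (amount, "") if amount is not None else (None, cell.strip())


# Placeholder rows, not real data.
row = rename_key({"State/Territory": "Example State", "Legality": "e"})
row["Legality"] = LEGALITY[row["Legality"]]
print(row)
print(split_wage("$7.25"))
print(split_wage("no state minimum"))
```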

 

The workbook was then saved as XLSX for easy joining in Tableau.  The Population table was used as the primary table, with every other table LEFT JOINed to it on "State".
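The joins themselves were set up in Tableau.  For readers who want to see what a left join on "State" does to the rows, here is a minimal plain-Python sketch.  The function name and the sample values are hypothetical, and it assumes each sheet holds at most one row per state.

```python
def left_join(primary, other, key="State"):
    """Keep every primary row; add the other table's columns, or None if unmatched."""
    lookup = {row[key]: row for row in other}
    other_cols = [c for c in (other[0] if other else {}) if c != key]
    joined = []
    for row in primary:
        match = lookup.get(row[key], {})
        merged = dict(row)
        for col in other_cols:
            merged[col] = match.get(col)
        joined.append(merged)
    return joined


# Placeholder values, not real figures.
population = [
    {"State": "State A", "Population": 100},
    {"State": "State B", "Population": 200},
]
gdp = [{"State": "State A", "GDP": 5}]

for record in left_join(population, gdp):
    print(record)
```

Every row of the primary table survives, so a territory that is missing from a secondary sheet still appears, just with empty values for that sheet's columns.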

 

Analysis

With a data set including over 50 Dimensions and 125 Measures across 28 topics, what could be gleaned from the relationships?  The first task was to determine whether there were any convenient linear relationships.  I wanted an easy way to choose a Measure and then sequentially run through every other Measure looking for obvious connections.  The term "Spurious Correlation" came to mind and, naturally, using Wikipedia I learned that I really meant "Spurious Relationship".

 

I created two views that allowed me to do just this.  The first plots any selected dimension against any selected measure.  The second does the same thing for two measures.

 

https://public.tableausoftware.com/static/images/Sp/Spurious_Relationship_Creator/Dimensions_vs_Measures/1.png
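I did this scan by eye in Tableau, stepping through the measures one at a time.  A non-Tableau approximation of the same idea is to rank every other measure by its Pearson correlation with the chosen one.  The sketch below is an assumption about how that could be done, not what the workbook does; the function names and the sample rows are made up.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # undefined for a constant column; reported as 0.0 here
    return cov / sqrt(var_x * var_y)


def rank_against(rows, target, measures):
    """Rank every other measure by |r| against the target, strongest first."""
    results = []
    for measure in measures:
        if measure == target:
            continue
        # Use only rows where both values are present.
        pairs = [(r[target], r[measure]) for r in rows
                 if r.get(target) is not None and r.get(measure) is not None]
        if len(pairs) < 3:
            continue
        xs, ys = zip(*pairs)
        results.append((measure, pearson(xs, ys)))
    return sorted(results, key=lambda item: abs(item[1]), reverse=True)


# Placeholder rows standing in for one record per state.
rows = [
    {"A": 1, "B": 2, "C": 5},
    {"A": 2, "B": 4, "C": 3},
    {"A": 3, "B": 6, "C": 4},
    {"A": 4, "B": 8, "C": 1},
]
for name, r in rank_against(rows, "A", ["A", "B", "C"]):
    print(name, round(r, 3))
```

A high |r| here only flags a pair worth plotting; it says nothing about cause, which is the whole point of calling these relationships spurious.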

 

Running through the possibilities produced a few interesting relationships.  The initial relationships were found with the "Spurious Relationship Creator" and then recreated in fixed worksheets.  These are included on separate worksheets in the Viz linked above.  At an aggregate State level, it's really a stretch to say that any of these inferences are true, but it's still interesting. Here's what I learned ...

 

Billionaires

Here, I was genuinely curious whether the presence of billionaires in a state had an impact on that state's prosperity.  It doesn't appear that it does.

 

Impact of Income on Education/Health

This one seemed more intuitive.  The more people in a state with Bachelor's and Advanced degrees, the higher you'd expect income to be.  The impact that poverty has on obesity is also intuitive ... cheap food <> healthy food.  Life Expectancy also followed a similar pattern ... be poor and die sooner.

 

Hobbies (Just for Fun)

This is sort of the miscellaneous drawer of the workbook: several things that didn't stand up as their own viz were thrown together.  Fun facts:

  1. There really are "donor states" ... states that pay more in taxes than they receive in federal spending.  But does this include pork barrel spending?
  2. Island territories look like fun places.
  3. Live Long and Prosper ... dude.
  4. Maybe we could learn from each other.

 

Thanks for reading,

William