About the Sample Data Sets

Version 2

    This article describes the data sets provided with Vizable and suggests ways to explore them.

     

    Movies

    Who doesn’t love movies? This robust data set from OpusData has a ton of box office information about popular movies from 1939 to 2015. OpusData populates the data for the movie website The Numbers.

     

    Want to see how the underlying data is structured? Check out this subset of rows.

     

    Category fields

    • Genre: Single genre per movie, as assigned by OpusData based on relevance, including: Adventure, Action, Comedy, Drama, Thriller/Suspense, Romantic Comedy, Horror, Musical, Black Comedy, Western, Documentary, Concert/Performance
    • Production Company: Primary production company
    • Director: First credited director
    • Movie Title: Movie title
    • Top Actor: First lead actor (if no lead was designated, the top-paid actor or the first one listed)
    • MPAA Rating: Official Motion Picture Association of America ratings: G, PG, PG-13, R, NC-17
    • Creative Type: Sub-genre: Contemporary Fiction, Dramatization, Factual, Fantasy, Historical Fiction, Kids Fiction, Multiple Creative Types, Science Fiction, Super Hero
    • Source: Whether the movie was based on an original screenplay, a book, real-world events, etc.
    • Theater Release: Date of the theatrical release in the USA
    • Release Weekday: Day of the week of the theatrical release
    • Video Release: Date of the release to VHS, DVD, or Blu-ray
    • Production Method: How the movie was produced: Live Action, Animation/Live Action, Digital Animation, Hand Animation, Stop-Motion Animation, Multiple Production Methods, Rotoscoping

     

    Number fields

    • Global Box Office ($): Total box office gross in U.S. dollars
    • Adjusted Global Box Office ($): Total box office gross in U.S. dollars, adjusted for inflation to 2015 dollars
    • U.S. Box Office ($): Total domestic box office gross in U.S. dollars
    • Adjusted U.S. Box Office ($): Total domestic box office gross in U.S. dollars, adjusted for inflation to 2015 dollars
    • Int'l Box Office ($): Total international box office gross in U.S. dollars
    • Adjusted Int'l Box Office ($): Total international box office gross in U.S. dollars, adjusted for inflation to 2015 dollars
    • Budget ($): Movie budget in U.S. dollars
    • Adjusted Budget ($): Budget adjusted for inflation to 2015 dollars
    • Profit ($): Calculated field for total profit in U.S. dollars: Global Box Office - Budget
    • Adjusted Profit ($): Total profit in U.S. dollars, adjusted for inflation to 2015 dollars
    • Profit Margin (%): Calculated field: (Global Box Office - Budget) / Global Box Office
    • MovieLens Rating: Average rating from https://movielens.org/, a site that crowdsources ratings for movies on a scale of 1-5 stars
    • Movie Length (min): Movie length in minutes
    • Opening Weekend Theaters: Number of theaters showing the movie on its opening weekend
    • Opening Weekend ($): Total box office gross in U.S. dollars for the theatrical release opening weekend
    • Maximum Theaters: Most theaters the movie was shown in at one time
    • U.S. DVD + Blu-ray Sales ($): Combined U.S. sales for DVD and Blu-ray in U.S. dollars
    • U.S. Box Office (%): Percentage of the global box office gross that came from U.S. markets
    • Int'l Box Office (%): Percentage of the global box office gross that came from international markets
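
    The calculated fields above follow directly from the box office and budget fields. The Python sketch below recomputes them for a single movie. The function name and the sample figures are hypothetical, and it assumes that the percentage fields are expressed on a 0-100 scale and that the international gross equals the global gross minus the U.S. gross.

```python
def derived_fields(global_box_office, us_box_office, budget):
    """Recompute the calculated fields for one movie (amounts in U.S. dollars)."""
    # Profit ($): Global Box Office - Budget
    profit = global_box_office - budget

    if global_box_office > 0:
        # Profit Margin (%): (Global Box Office - Budget) / Global Box Office
        profit_margin = 100 * profit / global_box_office
        # U.S. Box Office (%): share of the global gross from U.S. markets
        us_share = 100 * us_box_office / global_box_office
        # Assumption: whatever is not U.S. gross counts as international
        intl_share = 100 - us_share
    else:
        # The ratios are undefined when there is no global gross
        profit_margin = us_share = intl_share = None

    return {
        "Profit ($)": profit,
        "Profit Margin (%)": profit_margin,
        "U.S. Box Office (%)": us_share,
        "Int'l Box Office (%)": intl_share,
    }


# Made-up figures for illustration only, not values from the data set.
example = derived_fields(
    global_box_office=200_000_000,
    us_box_office=80_000_000,
    budget=50_000_000,
)
print(example)
```

    Note that a movie can show a negative Profit Margin (%) when its budget exceeds its global gross.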

     

    Suggested explorations:

    • Do prolific actors or directors tend to stick to a few genres or shake things up?
    • What months or days of the week seem to do well for theatrical releases? Does that vary by genre or movie rating?
    • How does your favorite production company compare to others?
    • Are there movies that have done well internationally only to perform poorly in the U.S.? What about the reverse?
    • The Harry Potter books got longer as the series went on. Did the Harry Potter movies?

     

     

    Weather

    We all talk about the weather, but just how data-driven are those conversations? With this data set, you can explore daily weather for 25 large international cities from 2012 to 2015. The data set comes from World Weather Online, an organization that provides high-quality global weather data.

     

    A quick look at the structure of the underlying data can really enhance your explorations.

     

    Category fields

    • City: 25 cities around the world
    • State/Region: The state or region where the city is located
    • Country: The country where the city is located
    • Description: A written description of the daily weather: Sunny, partly cloudy, etc.
    • Date: Day the weather was recorded

     

    Number fields

    • Precipitation: Daily precipitation in millimeters
    • Wind Speed: Daily average wind speed in miles per hour
    • Temperature: Average daily temperature in degrees Fahrenheit
    • Humidity: Average daily humidity as a percentage
    • Pressure: Average daily pressure in millimeters of mercury
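
    The number fields above mix metric and U.S. customary units, so converting values can help when comparing them with other weather sources. The Python helpers below are a minimal sketch using standard conversion factors. The function names are hypothetical and are not part of Vizable or the data set.

```python
MM_PER_INCH = 25.4          # exact by definition
KM_PER_MILE = 1.609344      # exact by definition
HPA_PER_MMHG = 1.33322387   # 1 mmHg is about 133.322 Pa


def mm_to_inches(mm):
    """Precipitation: millimeters to inches."""
    return mm / MM_PER_INCH


def mph_to_kmh(mph):
    """Wind Speed: miles per hour to kilometers per hour."""
    return mph * KM_PER_MILE


def fahrenheit_to_celsius(deg_f):
    """Temperature: degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32) * 5 / 9


def mmhg_to_hpa(mmhg):
    """Pressure: millimeters of mercury to hectopascals (millibars)."""
    return mmhg * HPA_PER_MMHG


# Standard sea-level pressure, 760 mmHg, is roughly 1013 hPa.
print(round(mmhg_to_hpa(760)))
```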

     

    Suggested explorations:

    • What sorts of seasonal trends do you see in the data? Is there any indication of which cities are in the southern hemisphere?
    • How does pressure seem to relate to other weather trends?
    • Is Seattle as rainy as people say?

     

     

    Restaurant

    This point-of-sale register data from a sandwich shop shows the buying trends over a 10-week period. Here's a sampling of the source data to help you understand its structure.

     

    Category fields

    • Date: Date of the transaction, all the way down to the hour
    • Menu Item: The specific item purchased
    • Menu Group: Larger categories that menu items fall into
    • Weekday: Day of the week on which the transaction occurred

     

    Number fields

    • Sales amount ($): How much an item costs
    • Quantity: How many of a given item were sold

     

    Suggested explorations:

    • What is the most popular drink?
    • How early do people start ordering the $7 Daily Special?
    • What was the percent change in average sales across the month of March?

     

     

    Titanic

    On April 15, 1912, RMS Titanic sank in the North Atlantic Ocean after striking an iceberg. Although she sank on her maiden voyage, the Titanic has remained a subject of fascination in popular culture for the past century. This data set looks at the fate of all 2,201 people on board, passengers and crew alike, but its structure is pretty straightforward, as this example shows.

     

    Category fields

    • Class: Indicates if individuals traveled in first, second, or third class accommodations or were crew
    • Age: Indicates if individuals were adult or child
    • Gender: Indicates if individuals were male or female
    • Survived: Indicates if individuals survived or died

     

    Number fields

    • Number of Records: Count of individuals
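
    Because Number of Records is the only number field, every question about this data set comes down to filtering on the category fields and summing that count. The Python sketch below illustrates the idea. The rows, their counts, and the stored values such as "Yes" and "Second" are made up for illustration and are not taken from the data set, and the helper name is hypothetical.

```python
# Made-up rows for illustration only; the counts are NOT real Titanic figures.
rows = [
    {"Class": "Second", "Age": "Adult", "Gender": "Female", "Survived": "Yes", "Number of Records": 3},
    {"Class": "Second", "Age": "Child", "Gender": "Female", "Survived": "Yes", "Number of Records": 1},
    {"Class": "Second", "Age": "Adult", "Gender": "Female", "Survived": "No", "Number of Records": 2},
    {"Class": "Crew", "Age": "Adult", "Gender": "Male", "Survived": "No", "Number of Records": 5},
    {"Class": "Crew", "Age": "Adult", "Gender": "Female", "Survived": "Yes", "Number of Records": 1},
]


def count_records(rows, filters):
    """Sum Number of Records over the rows that match every filter."""
    return sum(
        row["Number of Records"]
        for row in rows
        if all(row[field] == value for field, value in filters.items())
    )


# Females traveling second class who survived (in the made-up rows).
second_class_female_survivors = count_records(
    rows, {"Class": "Second", "Gender": "Female", "Survived": "Yes"}
)
# Everyone listed as crew (in the made-up rows).
crew_total = count_records(rows, {"Class": "Crew"})
print(second_class_female_survivors, crew_total)
```

    Summing works the same way whether each row stands for one individual or for a group of individuals who share the same category values.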

     

    Suggested explorations:

    • How many crew members were on the Titanic?
    • How many females traveling second class survived?
    • Bonus: In the Movies data set, what was the budget for the movie Titanic?

     

    Note: This data set does not have a date field, so no exploration is possible in Time World.

     
