Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2024.09.18 edition

Landslides, long-run economic growth, alcohol consumption, art words, and Messi’s moves.

Landslides. The US Geological Survey has released a new map of landslide susceptibility, indicating the specific areas of the country (at 90-meter resolution) that are at greatest risk of slides. The map and county-level metrics are also available as structured data. To calculate the susceptibilities, Benjamin B. Mirus et al. combined data from the agency’s 3D Elevation Program and a national landslide inventory that they updated. The latter provides the location (as a single point or more detailed boundary), timing, number of fatalities, and confidence level for 610,000+ landslides (or evidence of them) in the US since the early 1900s. The researchers also consulted data from state landslide inventories published by Idaho, Maine, North Dakota, and West Virginia.

Long-run economic growth. The Maddison Project Database, based on the work of Angus Maddison (1926-2010), “provides information on comparative economic growth and income levels over the very long run.” Its latest release includes historical per-capita GDP estimates for 169 countries, in many cases spanning several centuries. In all, the database contains 21,000+ such estimates and another 17,000+ population estimates, drawn from hundreds of sources. Previously: The Penn World Table (DIP 2016.08.17) — “income, output, input and productivity” estimates now “covering 183 countries between 1950 and 2019” — and the Long-Term Productivity Database (DIP 2020.04.08).

Alcohol consumption. The National Institute on Alcohol Abuse and Alcoholism’s latest consumption surveillance report, published earlier this year, uses sales and shipment data to measure annual alcohol intake by beverage type (beer, wine, spirits) and state. The report and corresponding data file estimate the likely total and per-capita volumes (of the beverages and of their ethanol content) consumed each year from the 1970s through 2022. Related: Additional surveillance reports and the CDC’s list of surveys gathering data on alcohol use. [h/t Millie Giles]
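
As a rough illustration of the arithmetic behind a per-capita ethanol figure, here is a minimal sketch. It is not the report's method: the ethanol fractions, the sales volumes, the population, and the function names are all hypothetical placeholders chosen for the example, and the real data file's column layout and conversion factors may differ.

```python
# Hypothetical ethanol fractions by beverage type (illustrative assumptions,
# not values taken from the report).
ETHANOL_FRACTION = {"beer": 0.045, "wine": 0.129, "spirits": 0.411}


def ethanol_volumes(beverage_volumes):
    """Convert beverage volumes to ethanol volumes, per beverage type."""
    return {
        beverage: volume * ETHANOL_FRACTION[beverage]
        for beverage, volume in beverage_volumes.items()
    }


def per_capita(total_volume, population):
    """Divide a total volume by the population it is spread across."""
    return total_volume / population


# Made-up figures for a single state and year, in gallons.
sales = {"beer": 100_000_000, "wine": 20_000_000, "spirits": 10_000_000}
population = 5_000_000  # hypothetical drinking-age population

ethanol = ethanol_volumes(sales)
total = sum(ethanol.values())

print(f"Total ethanol: {total:,.0f} gallons")
print(f"Per capita: {per_capita(total, population):.2f} gallons")
```

Swapping in the actual volumes and population counts from the data file, and whatever conversion factors the report documents, would be the next step for anyone reproducing its numbers.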

Art words. The Getty Vocabularies, published by the Getty Research Institute, “contain structured terminology for art, architecture, decorative arts, archival materials, visual surrogates, art conservation, and bibliographic materials.” They provide definitions, relationships, translations, and disambiguations for a broad range of terms and entities. Their Art & Architecture Thesaurus, for example, describes 57,000+ generic concepts (e.g., lithography), while others focus on artist names, cultural objects, and geographies. The records are available in several ways, including bulk downloads. [h/t Lynn Cherny]

Messi’s moves. StatsBomb, a soccer/football-data company, publishes a subset of its detailed, in-play data for free. Among the offerings: Every touch, pass, dribble, and shot from Lionel Messi’s 17 seasons playing for Barcelona in La Liga. Related: Carlos Menezes’s tool for visualizing StatsBomb event data files. Read more: Net Gains: Inside the Beautiful Game’s Analytics Revolution, by Ryan O’Hanlon. [h/t Giuseppe Sollazzo]