Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2024.12.11 edition

Quits and layoffs, food safety alerts, crop rotations, Serbian political party funds, and a long-running ultramarathon.

Quits and layoffs. Minneapolis Fed–affiliated economists Kathrin Ellieroth and Amanda Michaud have constructed a new dataset on monthly quits and layoffs. Using Current Population Survey (CPS) microdata going back to 1978, the dataset estimates the proportions of employees who, after quitting or being laid off, transition to unemployment versus exiting the labor market. In a recent article, Ellieroth and Michaud note that “CPS data offer a perspective not seen in the most-often-used series on quits and layoffs, the Job Openings and Labor Turnover Survey (JOLTS),” featured in DIP 2022.09.21. “Whereas the JOLTS tracks what happens to a job, the CPS tracks what happens to people.” Analyzing it, they found “that increases in unemployment are typically not due to increases in layoffs; rather, they happen because laid-off workers are less likely to quickly find a new job, more likely to stay in the labor force, and thus more likely to join a growing pool of unemployed people hunting for work.” [h/t Alex Albright]

Food safety alerts. Data journalist Adrian Nesta is building an automated pipeline to collect and standardize data on food safety recalls and alerts from two US federal agencies: the FDA and the USDA. For each alert, the standardized dataset indicates the notice’s title, ID, URL, and time posted, as well as the product description, company name, brand name, recall type, recall reason, impacted states, risk level, and more.

Crop rotations. The Department of Agriculture’s Crop Sequence Boundaries initiative algorithmically analyzes satellite imagery to create “estimates of field boundaries, crop acreage, and crop rotations across the contiguous United States.” The results are available via an interactive map and downloads for eight-year time frames. The underlying code is open-source and can be used to generate datasets for custom time frames. Previously: The USDA’s CropScape tool and Cropland Data Layer (DIP 2019.03.06). [h/t Forest Gregg]

Serbian political party funds. The Center for Investigative Journalism of Serbia’s Party Funds database “tracks all reported incomes and expenses of 40 political parties and citizens’ groups in Serbia over the past nine years.” The records, based on financial disclosure reports, can be browsed online, searched, and downloaded. They indicate revenues, overhead costs, ad spending, salary expenditures, and more. The data specifies each line item’s year, amount, purpose, and other context-dependent details. [h/t Teodora Ćurčić]

A long-running ultramarathon. The Comrades Marathon, first run in 1921, is considered “the oldest and largest ultramarathon in the world.” The route stretches 80+ kilometers between Durban and Pietermaritzburg, flipping annually between “up” and “down” directions. In 2019, Kyle Stratton scraped the official website to construct a dataset of all 445,000+ finishers (year, name, country, club, category, finishing time, medal received) through that year. Related: The Association of Road Racing Statisticians’ lists of longest-running marathons and ultramarathons, last updated in 2017. As seen in: Antony Unwin’s Getting (more out of) Graphics.