Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2018.11.07 edition

Court decisions, foreign gifts to U.S. universities, European protests, self-driving car ethics, and Star Trek.

Court decisions. The Caselaw Access Project aims “to make all published U.S. court decisions freely available to the public online, in a consistent format, digitized from the collection of the Harvard Law Library.” Currently, the project provides an API for fetching data on more than 6 million cases published between 1658 and 2018, though public access is limited to downloading 500 cases per day. You can also download bulk data for all cases in Illinois and Arkansas, but getting bulk data for other states currently requires a research agreement. [Update, 2024.04.10: In March 2024, the project announced that the full text of all 7 million state and federal cases it has collected, spanning 36 million pages, is now freely available in bulk. Read more from former director Adam Ziegler on the project’s past, present, and future.] [h/t Caitlin Ostroff]
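
To put the original 500-cases-per-day limit in perspective, here is a back-of-the-envelope sketch in Python. It assumes exactly 6 million cases (the text says "more than") and a fixed cap of 500 per day; the function name `days_to_fetch` is purely illustrative and is not part of the project's API.

```python
import math

CASES_TOTAL = 6_000_000  # lower bound from the blurb ("more than 6 million")
DAILY_LIMIT = 500        # public per-day download cap mentioned in the blurb


def days_to_fetch(total_cases: int, per_day: int) -> int:
    """Whole days needed to fetch total_cases at per_day cases per day."""
    return math.ceil(total_cases / per_day)


days = days_to_fetch(CASES_TOTAL, DAILY_LIMIT)
years = days / 365.25
print(f"{days:,} days, or about {years:.1f} years")
# prints "12,000 days, or about 32.9 years"
```

At that rate, the per-day API was practical for sampling or for looking up specific cases, not for collecting the whole corpus, which is why bulk access mattered.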

Foreign gifts to U.S. universities. The Department of Education requires U.S. universities to report all major gifts from (and contracts with) foreign entities. The agency’s database of these gifts and contracts currently covers 2012 to mid-2018, and includes 18,000+ entries from more than 150 schools. Related: In the wake of Jamal Khashoggi’s murder, the AP’s Collin Binkley and Chad Day used the data to examine colleges’ financial ties to Saudi Arabia. [h/t Meghan Hoyer]

European protests, 1980 to 1995. A team led by University of Kansas professor Ron Francisco has collected and codified data on protests, strikes, and other “coercive acts” in dozens of European countries during the late 20th century. There’s a row for each day of each protest, and each row specifies the issue at stake, the organizers, their target, the type of action, and the location — as well as the number of protesters, arrests, injuries, and deaths. [h/t Alexandre Léchenet]
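
Because the data has one row per protest per day, a multi-day protest spans several rows, and most analyses will want to roll those up first. A minimal sketch of that roll-up follows. The field names (`protest_id`, `protesters`, `arrests`) and the sample values are invented for illustration; the real files may use different columns and may not include a protest identifier at all.

```python
from collections import defaultdict

# Made-up sample rows, one per protest per day.
# Field names and values are hypothetical, not taken from the dataset.
rows = [
    {"protest_id": "A", "day": 1, "protesters": 1000, "arrests": 5},
    {"protest_id": "A", "day": 2, "protesters": 1500, "arrests": 2},
    {"protest_id": "B", "day": 1, "protesters": 200, "arrests": 0},
]


def summarize(rows):
    """Collapse protest-day rows into one summary per protest."""
    totals = defaultdict(lambda: {"days": 0, "peak_protesters": 0, "arrests": 0})
    for row in rows:
        t = totals[row["protest_id"]]
        t["days"] += 1
        # Use the peak, not the sum: the same people may return each day.
        t["peak_protesters"] = max(t["peak_protesters"], row["protesters"])
        t["arrests"] += row["arrests"]
    return dict(totals)


summary = summarize(rows)
print(summary["A"])
# prints {'days': 2, 'peak_protesters': 1500, 'arrests': 7}
```

Taking the peak daily headcount instead of summing it is a deliberate choice here, since adding up daily crowd sizes would count returning protesters more than once. Arrests are summed on the assumption that each row reports that day's arrests only.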

To swerve or not to swerve. A recent study revealed the results of “the Moral Machine, an online experimental platform designed to explore the moral dilemmas faced by autonomous vehicles.” The experiment asked participants to decide whether a self-driving car — faced with two deadly options — should stay on course (killing one group of pedestrians) or swerve (killing another). The project “gathered 40 million decisions in ten languages from millions of people in 233 countries and territories,” and a dataset containing every decision is available to download. Read more: “Should a self-driving car kill the baby or the grandma? Depends on where you’re from.” [h/t Walt Hickey]

All things Star Trek. STAPI bills itself as “the first public Star Trek API.” It provides access to structured data not only about the fictional universe (e.g., 6,364 characters, 1,215 spacecraft, and 155 conflicts) but also its intersection with reality (e.g., 5,302 performers, 731 television episodes, 76 soundtracks). [h/t Cezary Kluczyński]