Data Is Plural — 2019.08.21 edition

Oil and gas, historical terrorist groups, TV news words, UK general elections, and confidence.

Oil and gas. The Joint Organisations Data Initiative (JODI) coordinates the collection, standardization, and publication of oil and gas data from around the world; the 100+ countries that participate represent the vast majority of global production. The oil data goes back to 2002; the gas data goes back to 2009. Both datasets are updated monthly and track a range of subproducts (e.g., crude oil, diesel, jet fuel) and flows (e.g., imports, exports, production) for each country. Previously: Global oil and gas infrastructure (DIP 2018.06.06) and state-owned oil companies (DIP 2019.05.01).

Historical terrorist groups. Joshua Tschantret, a political science Ph.D. candidate at the University of Iowa, has compiled a dataset of 260+ terrorist groups formed between 1860 and 1969. For the purposes of the dataset, “terrorist groups are operationally defined as politically-motivated non-state actors using bombings or assassinations,” Tschantret writes in an introductory article (PDF). About one-third of the groups in the dataset operated in the US, Russia, or China; the rest are spread across dozens of other countries. Related: Additional documentation (PDF). Good to know: On Twitter, Tschantret explains why the Black Panthers are included. [h/t Carla Martinez Machain]

A decade of TV news words. The TV-NGRAM project pulls 14 TV stations’ data from the Television News Archive and calculates how often each word (and two-word combination) was said during each 30-minute window. Most of the stations’ counts go back 9 or 10 years, and all are updated daily.
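
For readers who want to try a similar tally on their own transcripts, here is a minimal Python sketch of counting words and two-word combinations per 30-minute window. It is not the TV-NGRAM project's code: the `(timestamp, text)` input shape, the function names `window_start` and `count_ngrams`, and the lowercase whitespace tokenization are all assumptions made for illustration, and two-word combinations are only counted within a single caption line.

```python
from collections import Counter
from datetime import datetime


def window_start(ts: datetime) -> datetime:
    """Round a timestamp down to the start of its 30-minute window."""
    return ts.replace(minute=(ts.minute // 30) * 30, second=0, microsecond=0)


def count_ngrams(captions):
    """Count words and two-word combinations per 30-minute window.

    `captions` is an iterable of (timestamp, text) pairs. This input shape
    is a hypothetical one chosen for the example, not the project's format.
    """
    counts = {}
    for ts, text in captions:
        # Naive tokenization (an assumption): lowercase, split on whitespace.
        words = text.lower().split()
        bucket = counts.setdefault(
            window_start(ts), {"1gram": Counter(), "2gram": Counter()}
        )
        bucket["1gram"].update(words)
        # Pair each word with the one that follows it in the same caption.
        bucket["2gram"].update(zip(words, words[1:]))
    return counts
```

Calling `count_ngrams` on a list of timestamped caption lines returns a dictionary keyed by window start time, with one `Counter` for single words and one for two-word combinations.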

A century of UK general elections. On Monday, the British government published a dataset of voting results, by party and parliamentary constituency, for every UK general election since 1918 — merging modern data with a handful of historical sources.

Confidence. The Confidence Database is aggregating data from behavioral studies that have asked participants how confident they were in their own assessments. As of its launch earlier this month, the database contains 145 datasets, 8,700 participants, and 4 million individual observations. [h/t Audrey Mazancieux + Doby Rahnev]