Data Is Plural

... is a weekly newsletter of useful/curious datasets.

2019.03.20 edition

African urbanization, the Book of the States, metro-area segregation, internet scans, and rooftop water tanks.

Africapolis. “Produced by the OECD Sahel and West Africa Club, Africapolis.org is the only comprehensive and standardised geospatial database on cities and urbanisation dynamics in Africa. Combining demographic sources, satellite and aerial imagery and other cartographic sources, it is designed to enable comparative and long-term analyses of urban dynamics - covering 7,500 agglomerations in 50 countries.” You can download the data — which includes historical populations, urbanization metrics, and geospatial outlines — and also explore it online. [h/t Rafael Prieto Curiel]

The Book of the States. The Council of State Governments’ annual Book of the States compiles 50-state reference tables on a range of topics, including elections, finances, courts, and more. It has been published since 1935, and the tables for the past decade-plus are available as spreadsheets. Now you know: The chief justice of the California Supreme Court makes $256,059 per year, the highest compensation for any state judge and nearly double that of New Mexico’s top judge, according to 2018’s Table 5.4. [h/t Cezary Podkul]

Metro-area segregation. “[W]hy are so many cities and metropolitan areas still split along racial lines? And what is the role of local government in reinforcing those divides? To answer those questions, Governing conducted a six-month investigation of black-white segregation in the small cities of downstate Illinois.” As part of the investigation, the magazine calculated (and published) school and residential segregation metrics for hundreds of U.S. metropolitan areas, based on the latest Department of Education and Census Bureau data. Related: “The Most Diverse Cities Are Often The Most Segregated” (FiveThirtyEight, 2015). [h/t Mike Maciag]

Internet scans. Security firm Rapid7’s Project Sonar “conducts internet-wide surveys across more than 70 different services and protocols to gain insights into global exposure to common vulnerabilities.” Much of the data (on DNS responses, SSL certificates, and more) can be bulk-downloaded through the company’s open data portal without an account; historical data and the most-current data are available with a free account. Related: “Project Sonar: An Underrated Source of Internet-wide Data” (Patrik Hudak). Also: Rapid7’s guide to using their open data API with R. [h/t Sharon Machlis]

Rooftop water tanks. New York City requires the owners of buildings with rooftop water tanks to get the vessels inspected annually for things like sediment, bacteria, and dead bugs. The city publishes a dataset of the owner-report results, based on 15,000 inspections, mostly from 2015–17. Unfortunately: “A review of city records indicates that most building owners still do not inspect and clean their tanks” … and the “city can’t even say with certainty how many there are or where they are located” … and in “almost every case the [bacteriological] tests are conducted only after the tanks have been disinfected.” [h/t Zack Quaintance]