-
Dose of Data - Jan 01, 2026
- 6.2% - DC’s unemployment rate in September 2025 (seasonally adjusted)
- 69 - year-on-year reduction in homicides in Baltimore in 2025
- $480M - the amount in health funding the US has committed to the Ivory Coast, covering issue areas including HIV, malaria, maternal and child health, and health security, matched by an expected $292M from the Ivory Coast by 2030
- 1st - the birthday recently celebrated by Tarmeem, India’s first CRISPR gene-edited sheep
-
Beauty in Drainage Coefficient Values

-
The Primacy of Payback Periods for Energy Efficiency
A WaPo article from yesterday details state-level efforts, often spearheaded by developers, to roll back or slow down building code reforms that would improve energy efficiency (EE) in new housing. A key dividing line between developers and EE advocates is the upfront premium for EE improvements like better insulation, triple-glazed windows, and electrical systems that would facilitate EV charger upgrades. While developers quoted in the article claim $20K (perhaps a 10% premium on home value), EE advocates argue it’s closer to $6K.
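To see why the size of the premium matters so much, here is a minimal sketch of a simple (undiscounted) payback calculation. The two premiums are the figures quoted above; the annual energy savings value is a purely hypothetical placeholder chosen for illustration, not a number from the article.

```python
def simple_payback_years(upfront_premium: float, annual_savings: float) -> float:
    """Years needed for cumulative savings to equal the upfront premium.

    Ignores discounting, energy price changes, and financing costs.
    """
    if annual_savings <= 0:
        raise ValueError("annual_savings must be positive")
    return upfront_premium / annual_savings


# ASSUMPTION: hypothetical annual utility savings, for illustration only.
ASSUMED_ANNUAL_SAVINGS = 800.0

developer_payback = simple_payback_years(20_000, ASSUMED_ANNUAL_SAVINGS)
advocate_payback = simple_payback_years(6_000, ASSUMED_ANNUAL_SAVINGS)

print(f"Developer estimate: {developer_payback:.1f} years")
print(f"Advocate estimate:  {advocate_payback:.1f} years")
```

Under that assumed savings figure, the same upgrade package pays back in either 25 or 7.5 years depending on whose premium you believe, which is the gap the two sides are arguing over.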
-
Book Review: John Doerr and Ryan Panchadsaram. *Speed & Scale: A Global Action Plan for Solving Our Climate Crisis Now* (Penguin Business 2021)
If you asked for a single book to provide a comprehensive blueprint for how we might achieve net zero by 2050 and minimize the chances of facing devastating climate change, this might have been that book. Doerr and Panchadsaram, both of Kleiner Perkins, start Speed & Scale with a sector-by-sector plan to cancel out ~60 Gt/year of CO2-eq emissions: electrify transportation, decarbonize the grid, “fix food”, protect nature, and remove carbon from the atmosphere (and the oceans), with the last of these attached to a lofty 10 Gt target. The targets and timetables adopt an Objectives and Key Results (OKR) framework, which the authors emphasize in the introduction as critical for success. Standard “you can only manage what you measure” messaging.
-
Bounding Boxes for all US Counties
A post from several years back contained the bounding box coordinates of all US states and has been one of the more viewed pages on this site. Unfortunately, if your area of interest (AOI) is below the state level, these bounding boxes may only get you part of the way to your destination. Why waste time expanding a geographic search to areas beyond your narrow AOI?
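For readers unfamiliar with the idea, a bounding box is just the minimum and maximum longitude and latitude of a shape's vertices. The sketch below shows the calculation on a made-up polygon ring; the coordinates are hypothetical and do not correspond to any real county, and this is not necessarily how the boxes in the post were produced.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]  # (longitude, latitude) in decimal degrees
BBox = Tuple[float, float, float, float]  # (min_lon, min_lat, max_lon, max_lat)


def bounding_box(points: Iterable[Point]) -> BBox:
    """Return the axis-aligned bounding box of a set of (lon, lat) points."""
    lons, lats = zip(*points)
    return (min(lons), min(lats), max(lons), max(lats))


def bbox_contains(bbox: BBox, point: Point) -> bool:
    """Cheap pre-filter: is the point inside the box (edges included)?"""
    min_lon, min_lat, max_lon, max_lat = bbox
    lon, lat = point
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat


# Hypothetical polygon ring, for illustration only.
ring = [(-77.12, 38.79), (-76.91, 38.89), (-77.04, 38.99), (-77.12, 38.93)]

bbox = bounding_box(ring)
print(bbox)
print(bbox_contains(bbox, (-77.0, 38.9)))
```

Note that a plain min/max like this breaks for shapes that cross the antimeridian (some Aleutian Islands boundaries do), so those cases need special handling.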
-
Tip for Installing Orfeo Toolbox Plugin for QGIS on MacOS

-
Cleaning Berkeley Earth's BEST Gridded Daily Temperature Data
You may have recently seen air quality maps produced by the Berkeley Earth group, especially in the wake of the horrific Camp Fire whose death toll now exceeds 80. For example, here’s their real-time visualization of PM2.5 concentrations.
-
Bounding Boxes for All US States

-
A Better ZIP5-County Crosswalk
I use a healthcare expenditure dataset with observations geographically coded at the 5-digit ZIP code level, but I’d also like to know which county an observation ‘belongs’ to. Maybe I want to cluster standard errors by county, or control for county-specific trends. You’d imagine this would be straightforward, but I haven’t yet found a government crosswalk that covers all the ZIP5s that appear in my data. What follows is the best solution I’m aware of to ensure that I match as many ZIP5s as possible. While this only increases the number of ZIP5-county matches by about 110 over what HUD offers, it’s an improvement of more than 6,000 over the Census crosswalk.
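One generic way to squeeze more matches out of several incomplete crosswalks is to stack them in priority order, letting a lower-priority source fill only the gaps left by a higher-priority one. The sketch below illustrates that idea and is not necessarily the method used in the post. The function name, the priority order, and every ZIP5 and county code in it are made-up placeholders.

```python
from typing import Dict, List


def combine_crosswalks(sources: List[Dict[str, str]]) -> Dict[str, str]:
    """Merge ZIP5 -> county mappings, earlier sources taking priority.

    A later source only contributes ZIP5s that no earlier source covers.
    """
    combined: Dict[str, str] = {}
    for source in sources:
        for zip5, county in source.items():
            combined.setdefault(zip5, county)
    return combined


# Placeholder data, for illustration only (not real ZIP or FIPS codes).
primary = {"00001": "99001", "00002": "99003"}
secondary = {"00002": "99005", "00003": "99007"}

crosswalk = combine_crosswalks([primary, secondary])

# ZIP5s observed in the (hypothetical) analysis dataset.
observed = ["00001", "00002", "00003", "00004"]
unmatched = [z for z in observed if z not in crosswalk]

print(crosswalk)
print("Unmatched ZIP5s:", unmatched)
```

Keeping ZIP5s as strings preserves leading zeros, and listing the unmatched codes shows exactly what each additional source would need to cover.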
-
Large Stata Datasets and False Errors about 'Duplicates'
Variable storage types matter more when working with larger datasets and with variables that have more digits. I’m reminded of this because of an error message Stata threw while I was trying to perform a long reshape, claiming duplicate entries of the ID variable. That was obviously not the case, since the_nid was uniquely created, and the value of each visibly corresponded to its row index.
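One plausible mechanism for a false "duplicates" error like this, assuming the ID ended up stored as a 4-byte float, is that single-precision floats cannot represent every integer above 16,777,216 (2^24), so distinct row indices collapse onto the same stored value. The Python sketch below only illustrates that rounding behavior; it is not Stata code, and the helper name is hypothetical.

```python
import struct


def to_float32(value: int) -> float:
    """Round-trip an integer through a 4-byte IEEE 754 float."""
    return struct.unpack("f", struct.pack("f", float(value)))[0]


# Row indices straddling 2**24, the last point where float32 is exact
# for every consecutive integer.
ids = list(range(16_777_210, 16_777_230))
stored = [to_float32(i) for i in ids]

n_unique_before = len(set(ids))
n_unique_after = len(set(stored))

print(f"Unique IDs before storing as float32: {n_unique_before}")
print(f"Unique IDs after storing as float32:  {n_unique_after}")
print(to_float32(16_777_217) == to_float32(16_777_216))
```

If this is what happened, storing the ID in a wider type (in Stata, `long` or `double`) avoids the collisions.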