Towards Closing Gender Data Gaps

In May, the Bill & Melinda Gates Foundation announced a three-year, $80 million investment towards closing the gender data gap, but only today did I come across this great video on the same initiative. A portion of the funds will be directed towards improved data collection, particularly on the time-use patterns of women and girls and on household …

Stata-Latex esttab Regression Table Output Streamlining

Researchers spend an excessive amount of time getting up to speed with a field's chosen tools and methods. It is excessive because there is often a consensus on best practices, and yet those practices are not made common knowledge. I think the CS and statistics communities have this right in pushing for open data, transparency, and reproducibility …

What We’re Watching – Muralidharan and Basu

The latest Ideas for India (I4I) interview installment features Karthik Muralidharan (UCSD) speaking with Kaushik Basu (World Bank/Cornell) on "big and small ideas in development economics." Fans of the I4I forum, which brings in preeminent Indian academics and practitioners to discuss policy, should also check out the August 2015 Arvind Subramanian conversation on "Charting a Course for …

Stata: Reghdfe and factor interactions

If you don't know about the reghdfe command in Stata, you are likely missing out, especially if you run 'high dimensional fixed effects' models -- i.e., your model includes 3+ dimensions of FE, perhaps 2 in time and 1 in space-time. I've been encountering a situation that raises this unhelpful error message: (null assertion) Empty …

Tutorial: FuzzyWuzzy String Matching in Python – Improving Merge Accuracy Across Data Products and Naming Conventions

If you work with manually entered string data or data coming from multiple providers, you may find that you cannot (a) merge the data or (b) produce correct summary statistics. Regarding (a), take the example in the picture of Indian district names exported from two data sources -- we'd have a …
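To make the merge problem concrete, here is a minimal sketch of fuzzy name matching. It uses the standard library's difflib as a stand-in so it runs without installing anything; it is not the FuzzyWuzzy API. The district spellings, the function names, and the 80-point threshold are all assumptions chosen for illustration, not values taken from the tutorial.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a 0-100 similarity score, ignoring case and surrounding whitespace."""
    return 100 * SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def best_match(name, candidates, threshold=80.0):
    """Return (candidate, score) for the closest candidate.

    If the best score falls below the (assumed) threshold, return (None, score)
    so that weak matches are flagged for manual review instead of merged.
    """
    scored = [(candidate, similarity(name, candidate)) for candidate in candidates]
    match, score = max(scored, key=lambda pair: pair[1])
    return (match, score) if score >= threshold else (None, score)


# Illustrative spellings of the same districts as two sources might report them.
source_a = ["Ahmadabad", "Kancheepuram", "Barddhaman", "Leh"]
source_b = ["Ahmedabad", "Kanchipuram", "Bardhaman", "Pune"]

# Crosswalk from source A names to their closest source B names (None = no match).
crosswalk = {name: best_match(name, source_b)[0] for name in source_a}

for name, match in crosswalk.items():
    print(f"{name!r} -> {match!r}")
```

The crosswalk can then serve as the merge key between the two sources. Any name mapped to None is one to inspect by hand, since lowering the threshold trades missed matches for false ones.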

Merging Innumerable Tables into LaTeX? (Mac OS/X)

Sometimes you simply have to run models that test dozens of different hypotheses and are therefore left with a lot of output to work through. For example, if I'm interested in how crops respond to extreme heat, there's a bevy of specifications to work through and, more importantly, numerous crops to test. I use the …

On Working with India’s NSS Data in an OS/X Environment

I've shifted gears after spending the last few months with district-level data and have begun getting my hands dirty with the National Sample Survey data that the Indian government regularly compiles. If you're familiar with the flat text/ASCII approach to digging out your data, you understand the unwieldiness of this format in which variables are created from …
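For readers who have not met flat text files before, here is a rough sketch of how a fixed-width record can be sliced into variables. The layout, variable names, and sample lines are entirely hypothetical and do not correspond to any actual NSS round; a real layout would come from the documentation shipped with the data.

```python
# Hypothetical layout: (variable name, first column, last column, converter).
# Columns are 1-indexed and inclusive, as layout documents usually state them.
LAYOUT = [
    ("state", 1, 2, str),      # codes stay strings so leading zeros survive
    ("district", 3, 4, str),
    ("hh_size", 5, 6, int),
    ("mpce", 7, 12, int),
]


def parse_record(line, layout=LAYOUT):
    """Slice one fixed-width line into a dict of named, typed variables."""
    record = {}
    for name, first, last, convert in layout:
        raw = line[first - 1:last].strip()  # shift to Python's 0-indexed slicing
        record[name] = convert(raw)
    return record


# Two made-up records that follow the hypothetical layout above.
sample_lines = ["270905001850", "330412002410"]
records = [parse_record(line) for line in sample_lines]

for record in records:
    print(record)
```

Keeping the layout in a single table makes it easy to check against the documentation and to swap in a different round's layout without touching the parsing logic.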