• Converting .TXT/.GRD Climate Data Files to netCDF Format

    Climate data is packaged and distributed in too many file formats. Under ideal circumstances, you could easily convert data from formats you’re not familiar with (and don’t have scripts to handle) to those that you are. This is why analogous tools like Stat/Transfer, built for the statistical databases often used by social scientists, are so helpful. If a stranger on the street gives you SPSS data, you can convert it on the fly to something Stata-readable. That said, the value of software like Stat/Transfer diminishes as more stat packages gain comprehensive built-in conversion tools, like R’s readstata13 and read_stata in pandas. Getting similar functionality with climate data requires a bit more lift.

    Read on →

  • Visualizations Gone Wild

    Who said art had to be intentional?

    Read on →

  • Recent Trends in Women's Employment in Rural India

    That female labor force participation (FLFP) is U-shaped in per capita income is one significant stylized fact at the intersection of development and labor economics. At low levels of development, subsistence requirements render women’s work a necessity for household survival. At higher incomes, the nature of employment changes: manufacturing jobs, which tend to be staffed by men, grow, and this growth may be accompanied by a contraction of agricultural labor. These shifts, together with social norms like purdah that discourage women’s participation in paid work or their association with non-household men, squeeze FLFP. Further along the industrialization growth path, these manufacturing jobs pave the way for service sector employment, in which women may have a comparative advantage. High-wage jobs also induce higher FLFP through an opportunity cost channel: staying home could mean a lot of forgone household income.

    Read on →

  • San Francisco-Stanford Commuting

    I’ve now been commuting from San Francisco to Stanford for nearly four weeks and thought it might be helpful to pen some observations for folks who are considering living in SF but working in Palo Alto or at Stanford (at each of the postdoc activities I’ve attended, we’ve been reminded there are more than 2,300 of us campus-wide at a time, so I think there’s an audience for this). Prior to moving here, I tried to research the viability of the commute and how it would work with biking, but found the level of detail lacking in posts on Quora and elsewhere. In sum, it’s entirely doable with a bike, and I haven’t gotten too exhausted by it yet.

    Read on →

  • Effortlessly Merging 1,000s of Raw Data Files with Stata

    I frequently have to consolidate 100s or 1,000s of raw data files into Stata, often stored across numerous and potentially unknown folders and subfolders, and have developed a workflow that I think is useful. With this approach I don’t need to know file names or paths, and can instead assign search parameters that determine which files get tagged for processing. I’ve used it when constructing a financial flows panel database for India from numerous state-level bank deposits spreadsheets, as well as when manipulating GCM output that was chopped up and spit out into thousands of files by a Python/ArcPy workflow. I frequently encounter these setups, and if you do too, then this tutorial is probably relevant. All the files needed to work through the following example are on GitHub.

    Read on →

  • Towards Closing Gender Data Gaps

    In May, the Bill & Melinda Gates Foundation announced a three-year, $80 million investment towards closing the gender data gap, but I only today came across this great video on the same initiative. A portion of the funds will be directed towards improved data collection, particularly on the time use patterns of women and girls and on household asset ownership. Better data means better information for policymakers conducting program evaluations (e.g., how did that recent cash transfer program differentially impact women’s and men’s employment levels?) and will give researchers greater insight into the long-run implications of unpaid work (e.g., how does working for a household business affect children’s final educational attainment?).

    Read on →

  • Stata-Latex esttab Regression Table Output Streamlining

    Researchers spend an excessive amount of time getting up to speed with a field’s chosen tools and methods, excessive because there is often a consensus on best practice and yet those best practices are not made common knowledge. I think the CS and statistics communities have this right in pushing for open data, transparency, and reproducibility in a way that economics, for example, has been late to the game on. As a result, early-stage PhD students can emulate existing work and save the hours otherwise wasted tinkering with multicolumns in LaTeX or some user-unfriendly Stata syntax. I have personally benefited greatly from the likes of Jorg Weber and UCLA IDRE, among the numerous Stack Overflow posts on publishing regression output, and have finally developed a satisficing Stata-LaTeX esttab workflow which doesn’t require an excessive amount of post-processing in order to be usable. Eyal Frank deserves a hat-tip for helping inspire this process.

    Read on →

  • Stata: Union of Macros

    Read on →

  • What We're Watching: Muralidharan and Basu

    The latest Ideas for India (I4I) interview installment features Karthik Muralidharan (UCSD) speaking with Kaushik Basu (World Bank/Cornell) on “big and small ideas in development economics.”

    Read on →

  • Stata: Reghdfe and factor interactions

    If you don’t know about the reghdfe command in Stata, you are likely missing out, especially if you run ‘high-dimensional fixed effects’ models (i.e., your model includes 3+ dimensions of FE, perhaps 2 in time and 1 in space-time). I’ve been encountering a situation which raises this unhelpful error message:

    Read on →