• What We're Watching: Muralidharan and Basu

    The latest Ideas for India (I4I) interview installment features Karthik Muralidharan (UCSD) speaking with Kaushik Basu (World Bank/Cornell) on “big and small ideas in development economics.”

    Read on →

  • Stata: Reghdfe and factor interactions

    If you don’t know about the reghdfe command in Stata, you are likely missing out, especially if you run ‘high dimensional fixed effects’ models, i.e., models with 3+ dimensions of FE, perhaps 2 in time and 1 in space-time. I’ve been encountering a situation that raises this unhelpful error message:

    Read on →

  • Tutorial: FuzzyWuzzy String Matching in Python -- Improving Merge Accuracy Across Data Products and Naming Conventions

    [![District Naming Preview](https://pathindependence.files.wordpress.com/2015/10/screen-shot-2015-10-31-at-1-41-49-pm.png?w=660&resize=660%2C322)](https://pathindependence.files.wordpress.com/2015/10/screen-shot-2015-10-31-at-1-41-49-pm.png)
    Example of Two Datasets with Comparable Variables

    Read on →

  • Merging Innumerable Tables into LaTeX? (Mac OS/X)

    Sometimes you simply have to run models that test dozens of different hypotheses and are therefore left with a lot of output to work through. For example, if I’m interested in how crops respond to extreme heat, there’s a bevy of specifications to work through, but more importantly, numerous crops to test. I use the standard esttab machinery in Stata to output .tex tables, which can easily be pulled into a master TeX file that generates reports attractive enough to scour and share with colleagues. What I’ll describe is generic enough to accommodate a range of inputs that would be included in your TeX file, but here I’m specifically interested in having page breaks followed by individual tables whose file names do not adhere to a consistent convention (e.g., table1.tex, table2.tex…).

    Read on →
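
    As a rough illustration of the idea in the teaser above (not the method from the full post), here is a minimal Python sketch that collects every .tex table in a folder, whatever it is named, and writes a master file with a page break before each one. The function name `build_master_tex`, the folder layout, and the choice of the `article` class and `booktabs` package are all assumptions made for the example.

```python
from pathlib import Path


def build_master_tex(table_dir, master_path):
    """Write a master .tex file that inputs every .tex table in table_dir.

    Hypothetical helper: the name, arguments, and preamble are assumptions
    for illustration, not something taken from the original post.
    """
    table_dir = Path(table_dir)
    master_path = Path(master_path)

    # Sort by file name so the report order is reproducible, and skip the
    # master file itself in case it lives in the same folder.
    tables = sorted(
        p for p in table_dir.glob("*.tex")
        if p.resolve() != master_path.resolve()
    )

    lines = [
        r"\documentclass{article}",
        r"\usepackage{booktabs}",  # assumption: only needed for booktabs-style tables
        r"\begin{document}",
    ]
    for table in tables:
        lines.append(r"\clearpage")  # page break before each table
        lines.append(r"\input{" + table.as_posix() + "}")
    lines.append(r"\end{document}")

    master_path.write_text("\n".join(lines) + "\n")
    return [t.name for t in tables]
```

    Calling `build_master_tex("tables", "tables/master.tex")` would then produce a file you can compile as usual. Globbing for files, rather than counting up from table1.tex, is what lets the file names be arbitrary.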

  • NCO 1968 Occupation Codes

    A new project I’ve spent much of the summer on examines labor allocation decisions in India. The dataset we’re using refers to NCO68 occupation codes, without providing code labels. I’ve scraped what seems to be a reliable PDF with comprehensive 3-digit NCO codes and converted it to a spreadsheet for easy use in Stata or any other statistical analysis program. You can find it in the Resources section and it includes 3-digit codes, occupation title, and 1-digit primary codes as separate columns.

  • On Working with India's NSS Data in an OS/X Environment

    Read on →

  • Screen Sharing and Yosemite

    The Yosemite OS upgrade has been available since mid-October. Since it often takes weeks/months for developers to resolve bugs introduced by a new OS version, I held off on leaving Mavericks for fear of some vital Homebrew or Python functionality becoming a casualty of the shift and then being needlessly out of luck right before a presentation/deadline. I finally upgraded last week and the install process consumed some 9+ hours. That would be fine if the countdown clock said as much, but most of the time the install page declared the process would either be over in 20 minutes or 3 minutes (nothing ever in between, seemingly violating the intermediate value theorem).

    Read on →

  • CHIRPS v. 2.0 Released Today

    Today the UCSB team released a major update to the CHIRPS precip product that people may be familiar with from Sub-Saharan Africa (SSA) hazard (drought/flood) warning applications. While v. 1.8 featured global data at pentadal resolution, this update expands the daily resolution dataset from SSA to global availability. While you’re there, check out their Early Warning Explorer (EWX), which offers code snippets you can embed to produce summary stat plots and precip visualizations. Unfortunately WP prohibits JavaScript embeds, so all I can offer is a snapshot example (above) of their output.

  • Calculating degree days with NumPy

    Read on →

  • VNC Computer Setup

    After chronic frustration with an aging MacBook Pro whose ability to process the climate datasets central to my workday grew increasingly questionable, I decided change was necessary. A friend of mine using a Microsoft Surface through VNC to his home laptop was having good results, and so I investigated similar options on an Apple platform. While a headless Mac Mini would be a good contender, even the new batch of processors unveiled in October had limited processor potential and no after-market RAM expandability. Since I was very happy with my dual-monitor setup at home, I was less interested in shelling out on an iMac. Another option, the old Mac Pro 8-core towers, seemed attractive with their vast army of ports, drive bays, and parts swappability, but you wouldn’t be getting the latest i7 processors and the 980W power supply is just nausea-inducing. Lastly, the new Mac Pros are designed for video professionals and are simply pointless for a workflow consisting primarily of statistical processing.

    Read on →