• Tutorial: FuzzyWuzzy String Matching in Python -- Improving Merge Accuracy Across Data Products and Naming Conventions

    [![District Naming Preview](https://pathindependence.files.wordpress.com/2015/10/screen-shot-2015-10-31-at-1-41-49-pm.png?w=660&resize=660%2C322)](https://pathindependence.files.wordpress.com/2015/10/screen-shot-2015-10-31-at-1-41-49-pm.png)
    Example of Two Datasets with Comparable Variables

    Read on →

  • Merging Innumerable Tables into LaTeX? (Mac OS/X)

    Sometimes you simply have to run models that test dozens of different hypotheses and therefore are left with a lot of output to work through. For example, if I’m interested in how crops respond to extreme heat, there’s a bevy of specifications to work through, but more importantly, numerous crops to test. I use the standard esttab machinery in Stata to output .tex tables which can easily be inputted into a master TeX file which generates attractive enough reports to scour and share with colleagues. What I’ll describe is generic enough to accommodate a range of inputs that would be included in your TeX file, but here I’m specifically interested in having page breaks followed by individual tables, which do not adhere to a consistent file-naming convention (e.g., table1.tex, table2.tex…).
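    As a rough illustration of the idea above (not the method from the full post), here is a minimal Python sketch that collects every .tex file in a folder, whatever it is called, and writes a master file that issues a page break before each table. The function name `build_master_tex`, the default file name `master.tex`, and the bare `article` preamble are all assumptions made for this example; real esttab output will likely need extra packages (e.g. booktabs) in the preamble, and file names containing spaces would need additional handling.

```python
from pathlib import Path


def build_master_tex(table_dir, master_name="master.tex"):
    """Write a master TeX file that inputs each .tex table on its own page.

    Hypothetical helper for illustration; returns the path of the master file.
    """
    table_dir = Path(table_dir)

    # Pick up every .tex file regardless of naming convention,
    # skipping the master file itself so reruns stay stable.
    tables = sorted(p for p in table_dir.glob("*.tex") if p.name != master_name)

    lines = [r"\documentclass{article}", r"\begin{document}"]
    for path in tables:
        lines.append(r"\clearpage")                 # page break before each table
        lines.append(r"\input{" + path.name + "}")  # path relative to the master file
    lines.append(r"\end{document}")

    master_path = table_dir / master_name
    master_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return master_path
```

    Calling `build_master_tex("tables")` on a folder of esttab output would leave a `master.tex` alongside the tables, ready to compile from that folder.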

    Read on →

  • NCO 1968 Occupation Codes

    A new project I’ve spent much of the summer on examines labor allocation decisions in India. The dataset we’re using refers to NCO68 occupation codes, without providing code labels. I’ve scraped what seems to be a reliable PDF with comprehensive 3-digit NCO codes and converted it to a spreadsheet for easy use in Stata or any other statistical analysis program. You can find it in the Resources section and it includes 3-digit codes, occupation title, and 1-digit primary codes as separate columns.

  • On Working with India's NSS Data in an OS/X Environment


    Read on →

  • Screen Sharing and Yosemite

    The Yosemite OS upgrade has been available since mid-October. Since it often takes weeks/months for developers to resolve bugs introduced by a new OS version, I held off on leaving Mavericks for fear of some vital Homebrew or Python functionality becoming a casualty of the shift and then being needlessly out of luck right before a presentation/deadline. I finally upgraded last week and the install process consumed some 9+ hours. That would be fine if the countdown clock said as much, but most of the time the install page declared the process would either be over in 20 minutes or 3 minutes (nothing ever in between, seemingly violating the intermediate value theorem).

    Read on →

  • CHIRPS v. 2.0 Released Today

    Today the UCSB team released a major update to the CHIRPS precip product that people may be familiar with from Sub-Saharan Africa (SSA) hazard (drought/flood) warning applications. While v. 1.8 featured global data at pentadal resolution, this update expands the daily resolution dataset from SSA to global availability. While you’re there, check out their Early Warning Explorer (EWX), code snippets you can embed to produce summary stat plots and precip visualizations. Unfortunately WP prohibits JavaScript embeds, so all I can offer is a snapshot example (above) of their output.

  • Calculating degree days with NumPy


    Read on →

  • VNC Computer Setup

    After chronic frustration with an aging MacBook Pro whose ability to process the climate datasets central to my workday grew increasingly questionable, I decided change was necessary. A friend of mine using a Microsoft Surface through VNC to his home laptop was having good results, and so I investigated similar options on an Apple platform. While a headless Mac Mini would be a good contender, even the new batch of processors unveiled in October had limited processing potential and no after-market RAM expandability. Since I was very happy with my dual-monitor setup at home, I was less interested in shelling out on an iMac. Another option, the old Mac Pro 8-core towers, seemed attractive with its vast array of ports, drive bays, and parts swappability, but you wouldn’t be getting the latest i7 processors, and the 980W power supply is just nausea-inducing. Lastly, the new Mac Pros are designed for video professionals and are simply pointless for a workflow consisting primarily of statistical processing.

    Read on →

  • New PCF World Forum Newsletter

    The good folks over at PCF World Forum have just released their first newsletter, which is free of charge and can be downloaded here. Be forewarned: this newsletter is a teaser, and if you like what you read, you’ll want to notify THEMA1 of your intent to subscribe. This inaugural issue shares the latest news in product carbon footprinting initiatives, spearheaded by organizations like the Carbon Disclosure Project, the WRI/WBCSD GHG Protocol, and the German collaborative PCF Pilot Project. In the pipeline are some events to be aware of, including the second PCF World Forum in September and a workshop on “Consistency, International Legislation, and Certification” to be hosted in Germany on July 1st.

    Read on →

  • Carbon Footprint Reduction Services

    I was reading an old issue of The Onion and came across this article which simply had to be reprinted here. Sometimes we just have to laugh along to the carbon footprint jokes – they’re often more truthful than we care to admit. And by that, I refer to the new coal-power drinking straw factory.

    Read on →