Research Remix

September 15, 2008

Pedersen: software results shouldn’t be a matter of faith

Filed under: publication summary, by Heather Piwowar @ 10:26 am

Great article by Ted Pedersen in the Sept 2008 issue of Computational Linguistics (“Empiricism is not a matter of faith”) about the importance of sharing research software, not just the results of running the software.

Sadly, the article isn’t freely available online (see below). Here’s a link to the first page, a few quotes, and the article outline.

  • “While his work achieved publication, it must gnaw at his scientific conscience that he can’t reproduce his own results.”
  • “We publish page after page of experimental results where apparently small differences determine the perceived value of the work.  In this climate, convenient reproduction of the results establishes a vital connection between authors and readers.”
  • “We do this routinely, to the point where we seem to have given up on the idea of being able to reproduce results.”
  • “often unintentional fallout from how we manage projects and set priorities”
  • “Imagine meeting with a new project member and being able to say: ‘Go download this software, read the documentation, install it, run the script that reproduces our ACL experiments, and then we can start talking tomorrow about how you are going to extend that work…'”
  • “Finally, although this viewpoint may seem quaint or naive, a great deal of our research is funded by public tax dollars, by people who make ten dollars an hour waiting tables […] Although most taxpayers won’t have much interest in reading our papers and running our code, they ought to have the opportunity.  And who knows […]”
  • Concludes by suggesting that we either decide to approach things with a focus on bigger ideas, or instead insist that “highly detailed empirical studies must be reproducible to be credible, and that it is unreasonable to expect that reproducibility be possible based on the description provided in a publication.”

Article overview:

1.  The Sad Tale of the Zigglebottom Tagger

2.  The Paradox of Faith-Based Empiricism

3.  A Heretic’s Guide to Reproducibility

3.1  Release Early, Release Often

3.2  Measure your Career in Downloads and Users

3.3  Ensure Project Survivability by Releasing Software

3.4  Make the World A Better Place

4.  What Should Computational Linguistics Do?

[Make the article freely available online ASAP :)]

[[ETA:  An author-archive of the full-text is available here: ]]
