Research Remix

March 21, 2008

Eating my own dogfood

Filed under: sharingdata — Tags: — Heather Piwowar @ 8:51 am

I guess eating dogfood really refers to companies that use their own software, rather than researchers who apply their research topics to their own research. “Practice what I preach” is more accurate, but less fun. And more, well, preachy.

ANYWAY, the point is, as I’m doing all of this research into data sharing behaviour, I’m making a point of sharing my own data. I’m not sure that anyone will ever want to use it for anything, but who knows? Maybe. From an editorial in Nature Neuroscience [doi:10.1038/nn0807-931]:

Does anyone want your data? That’s hard to predict, but the easier it becomes to request data and to receive credit for sharing it, the more likely people are to ask. After all, no one ever knocked on your door asking to buy those figurines collecting dust in your cabinet before you listed them on eBay. Your data, too, may simply be awaiting an effective matchmaker.

Sharing my own data also lets me experience what it feels like to share data. It isn’t the same, I know, as sharing data from a multi-year, career-making, blood, sweat, and tears project, but it is something.

Sharing data is indeed hard. Specifically:

  • time consuming
  • decision-intensive (where to put it? what to share? what format to share it in?)
  • scary (what if someone finds a mistake?)
  • embarrassing (the data isn’t nearly as X as I wish I had the time to make it)

I also get to experience some of the first-hand benefits:

  • it forces additional organization
  • it helps me find my own data again later, from any computer!
  • it makes me feel proud to have made my science transparent (albeit after the fact, rather than as open notebook science)

I’m a firm believer in continual improvement. That means that I’ve shared my data now, in the best way that I have time for, rather than waiting until I can share it the way that I’d ideally like to. There are lots of things I’d like to improve:

  • Put it somewhere central and permanent (not clear where, for the esoteric dataset types that I have, but there are some neat possibilities)
  • Put it in a semantic format (!!!)
  • Document it better
  • Tag it so people can find it
  • ….

I’ll keep exploring and implementing these things as I get a chance.
If you want to put your data up but have hesitations about it, I say do it to the best of your ability right now given your current constraints. It isn’t perfect? I know, but perfect is the enemy of good enough.

Related:

  • Ditto for statistical scripts, but that’s another post.
  • Blog as data: bbgm used Dapper as a way to Semantify [the bbgm] site. Sounds fun, I’d like to try when I get a minute.
  • Have you heard this joke? “Before you criticize someone, you should walk a mile in their shoes. That way, when you criticize them, you’re a mile away and you have their shoes.” I love that one :)

1 Comment

  1. It’s really interesting hearing your thoughts as you go through the data sharing process (and write papers on it!). Despite the issues you mention, it does make it seem a little less daunting to start, especially when you say, “just do it!” Maybe there should be a “data sharing 101” or “data sharing for dummies” guide… covering the basic steps towards good data sharing in the life sciences (or even just informatics-y stuff).

    Thanks for the joke, too! good stuff.

    Comment by shwu — March 21, 2008 @ 11:44 am

