Research Remix

July 18, 2007

Shared data? Open data?

Filed under: opendata, sharingdata — Heather Piwowar @ 9:49 am

A quick question. My research is on data re-use, and I struggle with what to call the source datasets. I’d like to call them “open data,” but they aren’t, necessarily: sometimes they are not free, and usually they are not open in a licensing sense. I’ve been calling them “shared data,” which seems OK but isn’t mainstream, so it doesn’t help link the work to others who may be interested in the same ideas. “Publicly available data”? Even more unwieldy.

I’m on the lookout for a better phrase. Let me know if you have any suggestions.

5 Comments

  1. Around here, second-hand shops often call their wares “pre-loved”. I think “pre-loved data” has a certain — something… (sorry, don’t have any sensible suggestions).

    Comment by Bill — July 18, 2007 @ 11:50 am

  2. Ha, that’s great!

    Now if I were naming a gene, I might really go there… PLD1, first discovered in association with a willingness to love datasets and then let them go. Ooops, or not… already taken. UniGene Hs.382865
    :)

    Comment by Heather Piwowar — July 18, 2007 @ 12:22 pm

  3. […] (Research Remix) asks a key question: Shared data? Open data? 15:49 18/07/2007, Heather Piwowar, opendata, sharingdata, Research […]

    Pingback by Unilever Centre for Molecular Informatics, Cambridge - petermr’s blog » Blog Archive » Shared data? Open data? — July 19, 2007 @ 2:35 am

  4. I’m not sure which part of your work you are discussing – one was “data shared upon request”.

    The other is your PLoS ONE paper – I thought those datasets were free to use with attribution.

    Comment by Jean-Claude Bradley — July 19, 2007 @ 8:50 am

  5. Great question, and I realize only now that I was probably not clear enough in my PLoS ONE paper. The datasets that I found were available to me on the internet through my University of Pittsburgh library privileges, which include subscriptions to many journals. Upon reflection, I can’t think of any datasets that actually required these library privileges… most were posted on lab websites that were indeed free for anyone to visit. Nonetheless, there may have been a few (<5) whose data was available only on a toll-access publisher website, so that a subscription was required.
    I’ll have to go look and see whether there were any that were not actually Free.

    I called these “shared” and “publically-accessible”, trying to communicate that they had been published and made available to anyone (with a subscription???), without the need to request them individually from the authors. I can now see how my wording might also have implied “free.”

    Hrm. I suppose there are three different concepts, then.
    Free, as in no subscription required
    Open, as in you can reuse the data (according to some licence)
    Published, as in you do not need to request the data because it is already out there (though perhaps not in a journal or official database)

    That does make it more clear for me to think about. In my PLoS ONE paper, I only meant Published, though I do think that most (if not all) of them were free, and some degree of open (though rarely as open as a CC license). For future work, I may mean a different combination. Regardless, I’ll try to be more explicit.

    Is it misleading to call something “published” if it isn’t in a journal? For example, could “published data” include a table posted on a blog?

    Thank you for initiating the train of thought.

    Comment by Heather Piwowar — July 19, 2007 @ 9:25 am

