Research Remix

March 23, 2012

Can’t I just say “data available for educational and research use”?

Filed under: Uncategorized — Heather Piwowar @ 11:20 am

I’ve been having an email conversation with someone who is starting up a small new discipline-specific data repository.  They hadn’t considered data licenses.  I gave them an overview and the CC0 spiel (why CCZero? see here and here).

A few days later they — quite reasonably! — followed up with me, essentially saying “that all sounds complicated.  Can’t I just say ‘this data is available for academic, research, and non-profit use’?  I am not sure how the commercial access would fly with a lot of the SUBJECTAREA folks.”

Here’s my response.  It isn’t the most carefully crafted and researched response in the world, but it is more useful to the world in my blog than just one Sent box and one Inbox.  Add comments if there are ways you’d improve it?  (I had a paragraph talking about how CC-BY (NC) might not even be appropriate for data, because data isn’t usually copyrightable… but that muddied the water more than helped.  Better clear intent through standardized terms than free text sentences!)

Good question.  The data will have terms of use no matter what — it’s just up to you what they are and how explicitly and clearly you state them.  A “license” (or waiver) simply says you are doing something explicit about copyright…. useful for those who want to know how they may use your content.

Explicit is better: it means people don’t have to guess.  Clear is also better.  To be as clear as possible, it helps to use language that someone else has already figured out (with lawyers etc.) rather than what appears to be a simple sentence but may actually contain a lot of ambiguity.  (What do you mean by educational?  Nonprofit?  What if it is a commercial educational use?  Etc.)

If you want to prevent commercial use (or rather, require separate conversations for each commercial use), you could use CC-BY-NC.  More and more people are becoming familiar with Creative Commons licenses (from open access publications, Flickr photos, Wikipedia, etc.).  Creative Commons has figured out the legal language and has a nice description page you can link to that makes the terms really clear and explicit.  I’d strongly recommend this rather than crafting your own sentence.

Re commercial restrictions in general:  Many academics are hesitant about allowing unrestricted commercial use for the data they collect.  I think it is a discussion worth having, however, and a key value that your data repository will bring.  It isn’t truly “Open” data if it can’t be used commercially (as per all Open Access consensus statements).  See this discussion (about the literature, but substitute “data mining” for “text mining” and it is the same case):  http://lists.okfn.org/pipermail/open-science/2012-March/001466.html

Especially when the data was collected with taxpayer money, a strong case can be made that the data should support economic growth… commercial use is a key part of that.

The link above points to a great recent post on the importance of commercial text-mining permissions, by John Wilbanks on the OKFN open science mailing list.

6 Comments

  1. Totally agree on the need to be upfront about copyright. We were in a similar position two years ago when developing the ISPS Data Archive (also a small specialized, social science, data repository; isps.research.yale.edu/data). Ended up with CC-BY-NC-ND because that’s what our researchers were most comfortable with. As we expand, we are thinking of revisiting this issue.

    Comment by Limor Peer — March 24, 2012 @ 2:54 pm

  2. Thank you for a thoughtful contribution. Some good points here – in particular that data is currently not covered by copyright in most jurisdictions (EU excepted), and it might be for the best if things stay this way.

    Re: It isn’t truly “Open” data if it can’t be used commercially (as per all Open Access consensus statements).

    Comment: if you consider the Harvard and MIT faculty permissions OA mandates to be consensus statements (I do), then note that both include the phrase “but not for a profit”. This reflects my own perspective on open access as well, written into the consensus statements that I myself have written. Much as I love the Budapest Open Access Initiative, I am not convinced that a small group of people meeting for a couple of days to hammer out a consensus statement is likely to come up with a canon that the rest of the world is obligated to follow as if it were a core religious belief.

    Re: Especially when the data was collected with taxpayer money, a strong case can be made that the data should support economic growth…

    Comment: my perspective is that this statement reflects a particular political perspective, one that some of us will agree with, and others not. As an example of how this can be problematic, consider the Canadian Census. Recently, we saw a change from a mandatory long form census to a voluntary National Household Survey. To me, this reflects a shift from a primarily social purpose (understanding and meeting societal needs, who we are, our history) to a primarily economic focus. Once you make that shift, the kinds of questions you ask may well change, e.g. from tell us about your family, your home, where you came from, what your needs are, to what do you buy or want to buy. Note that the original version of the long form Census was very useful to business.

    A lot of data collected by governments is, and I would argue should be, for the public good. An argument can be made, though, that if the government leaves development of applications to the private sector, we may end up with applications that benefit only some of us. Businesses don’t tend to focus on the needs of poor people because this isn’t the best way to achieve profits.

    Regarding clarity coming from CC licenses, to some extent this is true; however, even with these licenses there may be less clarity than we think. For example, in recent discussions on scholcomm and the openscience list, it has come to light that some people believe that using CC-BY means that someone else cannot sell a paywalled version, while others (myself included) think that CC-BY does permit paywalls.

    Nice as it might be to have a standard to direct everyone to use right now, these matters may not be fully settled for some time to come. Consider that copyright has been around since the Statute of Anne in 1710, yet copyright is still hotly contested, and this is likely to continue for some time to come.

    There is a lot to consider, which is why debate and discussion are healthy. Until this recent discussion came up, for example, I had no idea that people were recommending CC-BY while having a very different understanding than I do of what it does and doesn’t permit. Now that this has come up, it gives us an opportunity to ask to have this clarified.

    Comment by hgmorrison (@hgmorrison) — March 24, 2012 @ 9:58 pm

  3. Another thought on “economic growth”. Given that we live in an ecosphere with real physical limits – isn’t the idea of endless growth problematic? Many of us would argue that it is. We need to think about living in balance / harmony with our environment, not endless growth. We need to spend some time giving this some deep thought – what this means, and how to achieve it. This is just one of many reasons why we need humanities and social sciences, not just science. We need balance here, too.

    Comment by hgmorrison (@hgmorrison) — March 24, 2012 @ 10:08 pm

  4. Thanks for taking the time to comment.

    I (and others) strongly believe that fully Open includes the right to commercial use. We do basic research not only to know more, but to *do* more. Cell phones, which have clearly benefitted the poor around the world, are the result of decades of all sorts of research… some of it by companies but much of the groundwork by academics. If similar groundwork were to be made *even more available* for use of all kinds then I am positive we would see many more world-helping innovations than we have now.

    Others see this issue differently. I don’t know how to make the case more convincing at this point… I’m going to agree to disagree with those who think that commercial restrictions are necessary or desirable.

    Comment by Heather Piwowar — March 24, 2012 @ 10:14 pm

  5. “Much as I love the Budapest Open Access Initiative, I am not convinced that a small group of people meeting for a couple of days to hammer out a consensus statement is likely to come up with a canon that the rest of the world is obligated to follow as if it were a core religious belief.”

    No. But I do think the Budapest meeting defined very clearly what the then-new term “open access” meant, and it’s been a real shame from everyone’s perspective that discussion has been impeded by progressive muddying of that term. It would have been better for us all if, when people wanted to start publishing with non-commercial clauses, they’d come up with a new and distinct term rather than co-opting the existing “open access”.

    “For example, in recent discussions on scholcomm and the openscience list, it has come to light that some people believe that using CC-BY means that someone else cannot sell a paywalled version.”

    Really? Who thinks that? On what basis? That is plainly and simply incorrect.

    “Given that we live in an ecosphere with real physical limits – isn’t the idea of endless growth problematic? Many of us would argue that it is. We need to think about living in balance / harmony with our environment, not endless growth.”

    Sorry, Heather, but this is a nonsensical argument. It is quite true that we should limit economic growth that is based on exploitation of physical resources, since those are finite: coal reserves, rainforest area, etc. But we are talking here about growth based on pure knowledge, and there is no intrinsic limit to that. We’re not going to run out of knowledge if we allow commercial organisations to use it as well as non-profits! If anything, allowing commercial companies more access to research and data will better enable them to discover eco-friendly alternatives to current processes, e.g. to make advances in solar power.

    Comment by Mike Taylor — March 26, 2012 @ 4:16 am

  6. […] It’s a missed opportunity.  Heather Piwowar said this rather well in a recent comment: “We do basic research not only to know more, but to do more”.  Non-commercial […]

    Pingback by Following up one Biology Open journal’s not-quite-open CC-BY-NC-SA licence « Sauropod Vertebra Picture of the Week #AcademicSpring — March 27, 2012 @ 1:03 am

