Research Remix

May 5, 2011

Data citation elephants

Filed under: Uncategorized — Heather Piwowar @ 10:38 am

I recently attended the fantastic JISC workshop on Managing Research Data (#jiscmrd).  Kudos to Simon Hodson for throwing such an interesting meeting, and all the people there for great conversation.

Simon asked me to present on Data Citation Challenges.  My slides are up on SlideShare.  Below is a brief summary.

7 data citation challenges, illustrated with data (includes elephants)

photo from gin_able on flickr

Here, in this group of people who are familiar with data citation, let’s take a few minutes to think and talk about some of the elephants in the room of data citation.

We’re out there advocating for best practices every day, trying to make a difference in all of the places that have the lowest-hanging fruit.

There are a few issues we are avoiding for now.  That’s fine; I think we should be avoiding them for now, mostly.  But we do need to remember that the elephant issues are here, and that we will have to figure out how to deal with them eventually.  (Some vanguards are of course working on them even now.  Go, vanguards, go!)

I’ll highlight a few of these issues for which my collaborators and I have recent supporting data.  (See the slide deck for graphs and details.)

  1. Few policies request or require data citation
  2. There is no consistent practice for data attribution
  3. It requires a huge manual effort to track data citations
  4. There is a lack of tool support for tracking data citations
  5. Our best practices do not scale to mega-reuse
  6. Adoption of best practices will erode incentives in the short term
  7. Data citations only matter if they are valued

This list looks daunting, but we’ll get there.  Meetings like this will help!


  1. […] solution is not perfect, but it is a pretty good recommendation in most […]

    Pingback by Links from the data collection article: Inline or in the bibliography? « Research Remix — May 5, 2011 @ 11:08 am

  2. […] Data Citation Elephants […]

    Pingback by resources on Data Citation Principles « Research Remix — May 17, 2011 @ 4:49 am

  3. […] What other hurdles are in the way?  We need policies to get authors to include data citations in reference lists (feedback wanted on this draft policy).  But how feasible is this, really?  What about cases when authors use many, many datasets, should all of those really go in the article references section?  Can they? How do we handle physical limitations of reference lists for journals still printed on paper?  How can we get around the fact that citations in supplementary information are invisible?  I’ve gone on the record calling these issues elephants in the room of data citation. […]

    Pingback by Citations in Supplementary Material can be indexed! « Research Remix — August 17, 2011 @ 1:53 pm
