Simon asked me to present on Data Citation Challenges. My slides are up on SlideShare. Below is a brief summary.
7 data citation challenges, illustrated with data (includes elephants)
Here, in this group of people who are familiar with data citation, let’s take a few minutes to think and talk about some of the elephants in the room of data citation.
We’re out there advocating for best practices every day, trying to make a difference in all of the places that have the lowest-hanging fruit.
There are a few issues we are avoiding for now. That’s fine; I think we should be avoiding them for now, mostly. But we do need to remember that the elephant issues are here, and that we will have to figure out how to deal with them eventually. (Some vanguards are of course working on them even now. Go, vanguards, go!)
I’ll highlight a few of these issues for which my collaborators and I have recent supporting data. (See the slide deck for graphs and details.)
- Few policies request or require data citation
- There is no consistent practice for data attribution
- It requires a huge manual effort to track data citations
- There is a lack of tool support for tracking data citations
- Our best practices do not scale to mega-reuse
- Adoption of best practices will erode incentives in the short term
- Data citations only matter if they are valued