I’ve been thinking a lot recently about a scholarly Open that hasn’t gotten much attention yet: Open impact tracking.
Impact tracking: Usage data from publishers + websites + apps on the objects they host. Downloads and views, but also bookmarks, discussions, posts… indications that people have interacted with the objects in some way.
We all know that companies value this information when the digital objects are pointers to consumer products: who is talking about the product? How many people are talking about it? What are they saying? What does it mean?
Now imagine that the digital objects are scholarly products. Papers, preprints, datasets, slidedecks, software. Don’t we still want to know who is interested? How many people are interested? What they think, what they are doing with it, whether it is making a difference in their own related work?
Yup, as scholars and people who fund and reward scholars, we certainly do want to know those things.
We want to know the numbers, and we want to know the context of the numbers. Not so we can overinterpret them as the be-all and end-all of an assessment scheme, but as insight into dimensions of impact that are totally hidden when we focus on pre-publication metrics (particularly the totally-inappropriate-for-article-level-assessment Journal Impact Factor) or even just the single dimension of citation tracking.
PLoS has led the way: since 2009 PLoS has been collecting and displaying Article-Level Metrics for its articles. Jason Priem and others have articulated the promise of altmetrics and begun digging into what these metrics mean.
Over the last few months I’ve been having a great time hacking on an app that reveals open altmetrics stats (and their context) for diverse research products. total-Impact started in a 24-hour hackathon at the Beyond Impact workshop funded by the Open Society Foundations. Since then a few of us have been unable to put it down. I’ll talk about it a bit more in a future blog post [added link, also see here], but you are welcome to read more and play around with the alpha release now!
The time is clearly right for this sort of app… several similar ones are emerging now too.
In this post I want to highlight one thing about this space:
The source data for scholarly research impact metrics should be Open. Open facilitates mashups. Open enables unexpected use, from unexpected places. Open lets the little players in and brings the innovation. Open permits transparency to detect problems.
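To make the mashup point concrete, here is a minimal sketch of what combining per-DOI metrics from two open sources could look like. The provider names, metric names, and DOI below are hypothetical stand-ins, not real API responses:

```python
from collections import defaultdict

def merge_metrics(*sources):
    """Combine per-DOI metric dicts from several providers into one view.

    Each source is a (provider_name, {doi: {metric: count}}) pair.
    """
    merged = defaultdict(dict)
    for provider, data in sources:
        for doi, metrics in data.items():
            for name, count in metrics.items():
                # Namespace each metric by its provider so sources never collide.
                merged[doi][f"{provider}:{name}"] = count
    return dict(merged)

# Hypothetical responses from two open APIs:
plos = ("plos", {"10.1371/example": {"views": 1200, "downloads": 300}})
mendeley = ("mendeley", {"10.1371/example": {"readers": 45}})
```

The merged record for that DOI would carry all three counts side by side, which is exactly the kind of cross-source view a closed dataset makes impossible.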
total-Impact got going in large part because PLoS and Mendeley have APIs which make their impact data freely and openly available. Some publishers and websites do the same (or at least display their usage data on webpages and permit scraping) — but most don’t. Why?
- It costs money, a rep from a Very Big Publisher told me last week. Yup. But these days not that much money. This isn’t the beginning of Citation Counting when it was all manual and the only choice was to charge money. This is routine web stuff. Consider it one of your publishing costs, as PLoS does.
- It can be gamed, we don’t know what it means, it might send the wrong message. Ok, yes. But we are using it right now anyway, with all of those “Highly accessed” badges and monthly emails to authors. The difference? The data isn’t openly available for analysis and critique and deep understanding and improvement. I say: open up your data, say what it means and what its limitations are, and work toward standards.
- Privacy. For sure, don’t do things that would make your service users mad. But that leaves a lot of room for sharing some useful data. Aggregate download stats, maybe some breakdowns by date or geography or return visitors. Drill-down to reviews or publicly-available details. Here are a few of the sources doing it… you can do it too.
So what kinds of impact data should be opened?
- open usage stats. Views and downloads of scholarly research products over time, number of bookmarkers, and so on. This means publishers and institutional repositories and data hosts and blogging platforms and value-add services all opening up their numbers.
- open full text queries. This doesn’t require OA: Google Scholar allows full text queries into research articles. Unfortunately Google Scholar doesn’t allow its results to be used in an automated fashion. Publisher websites could allow this, ideally through an API. PubMed Central is a leader here, with eUtils (though its 3-queries-per-second limit rules out a lot of useful applications).
- open reference lists. You know how abstracts are “open”… or at least free? If reference lists were also in front of the paywall and available for aggregation, we could have a lot more players in the citation aggregation space, and more agile innovation than Web of Science + Scopus + Google Scholar alone can provide. Again, PubMed Central is a leader here in making citation information Open through its eUtils API.
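Rate limits like eUtils’ 3 queries per second are easy to respect client-side, so they’re no excuse not to build on these APIs. A minimal sketch of a throttle (the class and numbers are my own, not part of any NCBI library):

```python
import time

class Throttle:
    """Enforce a minimum spacing between calls (e.g. 3 per second)."""

    def __init__(self, max_per_sec):
        self.min_interval = 1.0 / max_per_sec
        self.last_call = 0.0  # monotonic timestamp of the previous call

    def wait(self):
        # Sleep just long enough to stay under the rate limit.
        elapsed = time.monotonic() - self.last_call
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self.last_call = time.monotonic()

# Usage: create Throttle(3) once, then call .wait() before each eUtils request.
```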
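And on the reference-list side, a cited-by lookup against eUtils is just a URL. A sketch of building an ELink query — the linkname shown is my best recollection of the cited-in link name, so verify it against the eUtils documentation before relying on it:

```python
from urllib.parse import urlencode

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

def citedby_url(pmid):
    """Build an ELink URL asking which PubMed records cite this PMID.

    'pubmed_pubmed_citedin' is the link name I believe ELink uses for
    cited-by lookups; check the eUtils docs before depending on it.
    """
    params = urlencode({
        "dbfrom": "pubmed",
        "linkname": "pubmed_pubmed_citedin",
        "id": str(pmid),
    })
    return f"{EUTILS_BASE}/elink.fcgi?{params}"
```

Fetching and parsing the XML response is left out here; the point is that the citation graph is one well-documented HTTP call away when the data is Open.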