August 13, 2010

Supplementary materials is a stopgap for data archiving

The Journal of Neuroscience has issued a new policy on supplementary materials:

Beginning November 1, 2010, The Journal of Neuroscience will no longer allow authors to include supplemental material when they submit new manuscripts and will no longer host supplemental material on its web site for those articles

I think this will benefit the reporting of methods and exploratory analyses. I am thrilled that citations will no longer be lost in supplementary materials, assuming the additional citations make it into the main references list rather than being omitted.

But what about data?

A journal’s supplementary material section is not a great place for data. Limitations include:

  • not good for data formatting and reporting standards
  • not good for discoverability
  • not good for truly permanent storage
  • not good for machine retrievability
  • not good for journals sticking to core competencies
  • not good for journal planning, efficiency
  • not good for free access (in subscription journals)
  • not good for open access (or at least conveying openness clearly)
  • not good for lots of other things that I don’t know about and publishers don’t know about but repository professionals do know about

Most people would agree that well-designed, well-supported data repositories are the best place for data. The problem is, such repositories are few and far between. All is well and good if an experiment is in a discipline or produces a datatype for which a best-practice repository exists: the data should go there. All may be good if the authors are in an institution with an institutional repository that is well-equipped to handle scientific data, though these are uncommon. Otherwise, where can investigators put their datasets?

Supplementary information is not a perfect home, or even a very good one, but it is better than hosting data on a lab website or offering it by email on demand. It is a useful stopgap while more discipline-based repositories and institutional repositories rise to fill the need.

By removing this stopgap, in my opinion (and with the important caveat that I know very little about the journal or its discipline), The Journal of Neuroscience has sent three messages with its new policy:

1. They don’t consider archiving data to be their responsibility

This was already clear from their lackluster policy on data archiving:

Policy Concerning Availability of Materials
It is understood that by publishing a paper in The Journal of Neuroscience the author(s) agree to make freely available to colleagues in academic research any clones of cells, nucleic acids, antibodies, etc. that were used in the research reported and that are not available from commercial suppliers.

Policy on DNA Sequences
[...] By the time a paper is sent to press, sequences must be deposited in a database generally accessible to the neuroscience community; the sequence accession number should be provided. Exceptions to this policy may be considered on an individual basis.

That’s it. Compare this to the comprehensive policies of other journals, particularly their statements of motivation. For example, in Science:

After publication, all data necessary to understand, assess, and extend the conclusions of the manuscript must be available to any reader of Science.

And in Stem Cells (similar in Cell):

Stem Cells supports the efforts of the National Academy of Sciences (NAS) to encourage the open sharing of publication-related data. Stem Cells adheres to the beliefs that authors should include in their publications the data, algorithms, or other information that is central or integral to the publication, or make it freely and readily accessible; use public repositories for data whenever possible; and make patented material available under a license for research use.

The Journal of Neuroscience has said that it wants to “maintain its leading position.” For what it is worth, evidence suggests that the highest impact journals have the strongest data sharing policies.

2. They don’t consider archiving data important

Based on the policy and the wording of its announcement, I was left with the impression that the Journal doesn’t consider data archiving important. In particular, stating that “supplemental material is inherently inessential” and “We should remember that neuroscience thrived for generations without any online supplemental material” belittles data sharing, given that much data is currently shared in supplementary materials for lack of a better place to put it.

The policy has left investigators with fewer better-than-nothing places to share data. I hope the next journal that is tempted to eliminate supplementary material will consider these alternative approaches to address its problems while supporting data archiving:

  • Fix rather than eliminate supplemental material policies: clearly specify that supplemental information is not peer-reviewed, specify that it is only for data (for example), remind reviewers and authors that it is not for defensive material, etc.

    One example is the thoughtful response by Cell to its problems with supplemental material, a solution of defining what should and shouldn’t be included:

    “One of the first issues we confronted in thinking about structuring supplemental material was one of setting limits. Limits of course have both positives and negatives. On the plus side, it seems in the best interest of everyone in the scientific community that the concept of a ‘‘publishable story’’ be at least roughly defined. [...] strict overall length limits struck us as somewhat arbitrary, and we instead focused on a more conceptual organization.”

  • Or, if you do indeed want to eliminate supplementary materials, recommend and in fact require that links to supplementary information elsewhere point either to established repositories or to resources archived through one of the many mechanisms for URL permanence.
  • Or, engage with Dryad or another discipline-based repository to find a win-win solution
  • And please commit to participating with the community to find solutions, rather than vaguely suggesting, “It is conceivable that removing supplemental material from articles might motivate more scientific communities to create repositories for specific types of structured data, which are vastly superior to supplemental material as a mechanism for disseminating data.”

3. Change is needed

I completely agree with them here. Change is needed. I also applaud the Journal for taking a bold step, even if I disagree with its particulars. I think it will motivate, inspire, and induce change. Bring on the market disruption… although it is a real shame if we lose a bunch of (expensive) (irreplaceable) data (forever) in the process.

A follow-up post with references on supplementary material.

  1. Very interesting reading on an important topic–thanks for the edifying disquisition.

    Comment by Hope Leman — August 13, 2010 @ 11:33 am

  2. I love your comments, Hope. I had to look disquisition up in the dictionary :)

    Comment by Heather Piwowar — August 13, 2010 @ 11:47 am

  6. Supplemental data is not adequately reviewed and often represents a collection of poorly executed and documented experiments. Yet authors can claim publication of such ‘results’ in reputable journals. This blog makes it somehow sound as if supplemental ‘data’ were the major data in a publication and without it, the experiments would neither be reproducible nor valid. I could not disagree more!

    Comment by Felix — September 1, 2010 @ 12:08 pm

    • While some articles do have valuable raw or processed data in the supplemental material section, many authors just use this section as a solution for page limits and dump the less important experiment results into this section. It should be clearly defined as to what should be included in supplemental material and what shouldn’t. This new policy is returning some papers back to the offline version without supplemental material. For some papers that do have valuable supplemental data, the authors will need to find a home for the data.

      Comment by Yun — September 15, 2010 @ 9:11 am

