Inspired by current blogosphere discussions, I’ve pulled together a list of articles and studies related to journal supplementary material. The bibliography is at the bottom, and the living collection is at Mendeley. Let me know in the comments if you have other favs.
Because a mere bibliography is only so useful without rolling up one’s sleeves, here are a few rough highlights.
Note I haven’t found many studies that investigate what is actually *in* supplementary materials, how often they are read and used, what they are used for, and other important and interesting questions.
“First, I despise the name. Supplementary implies something extra. … it sends exactly the wrong message about our priorities. What typically gets put into S&M? The details of the experimental methods and often, especially for papers in genomics, tables and figures containing at least some of the primary data” (Petsko)
“Other journals have almost completely moved the Materials and Methods section from the main text to online supplements. These journals are conveying the message, however inadvertent, that the sine qua non of the scientific method, the Materials and Methods, is the least important part of a scientific publication” (Shriner)
“While the size of articles has grown gradually over the past decade, the supplemental material associated with a typical Journal article appears to be growing exponentially and is rapidly approaching the size of an article. The sheer volume of supplemental material is adversely affecting peer review.” (Maunsell)
“Like it or not, ranking of scientific achievement by citation-based methods is an important part of the scientific system, and journals should make all their citations accessible to those who need accurate numbers. The solution to this problem seems quite simple: the citations in the supplement have to be incorporated into the reference section of the main text by the authors.” (Seeber)
“Supplemental data can seldom be discovered except by manual examination of individual articles. A paywall often limits access. Publishers put few resources into maintaining supplemental data and may even fail to migrate data when journals change hands.” (Vision)
prices for supplementary info
“Amongst three of the journals we interviewed, these author charges for supplementary data files ranged from $100 to $300+.” (Beagrie)
cost of a discipline-based repository
“Estimates of the combined online and print publication costs of a single scientific article range from $2000 to $10,000 (King 2007). On the basis of projections for Dryad, the marginal cost of data publication would be only a small fraction (< 2 percent) of this sum, provided that the repository has sufficient volume (on the order of 10⁴ new submissions annually).” (Vision)
"With low to moderate curation effort, initial projections of potential costs for Dryad lead to ballpark estimates of $200,000 or $320,000, respectively, assuming receipt of data from 5,000 or 10,000 papers per annum." (Beagrie)
"Given the budget estimates for volumes of 5,000 and 10,000 papers per year, Dryad’s per paper expenses were estimated to be $40 and $32, respectively." (Beagrie)
"For Dryad the value proposition is as follows: [..]
For publishers, Dryad frees journals from the responsibility and costs of publishing and maintaining supplemental data in perpetuity, and allows publishers to increase the benefits of their journals to the societies and the scientists they support;" (Beagrie)
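To make the per-paper figures concrete, here is a quick back-of-the-envelope sketch in Python. It only re-derives numbers from the quotes above: the two Dryad budget projections (Beagrie) and the $2,000 to $10,000 per-article publication cost range (Vision, citing King 2007). The function and variable names are mine, for illustration only.

```python
# Annual budget projections quoted from Beagrie: papers per year -> cost in USD.
dryad_budgets = {5_000: 200_000, 10_000: 320_000}

# Per-article publication cost range quoted by Vision (citing King 2007), in USD.
article_costs = (2_000, 10_000)


def per_paper_cost(annual_budget, papers_per_year):
    """Repository cost per deposited paper, in USD."""
    return annual_budget / papers_per_year


def share_of_article_cost(paper_cost, article_cost):
    """Data publication cost as a percentage of one article's publication cost."""
    return 100 * paper_cost / article_cost


for papers, budget in dryad_budgets.items():
    cost = per_paper_cost(budget, papers)
    shares = [share_of_article_cost(cost, c) for c in article_costs]
    print(f"{papers} papers/year: ${cost:.0f} per paper, "
          f"{shares[1]:.2f}% to {shares[0]:.2f}% of an article's cost")
```

At 10,000 papers per year, $32 per paper works out to 1.6% of a $2,000 article and 0.32% of a $10,000 one, which squares with the “< 2 percent” estimate.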
Permanence of supplementary materials and alternative data and web archives
migration to current formats
“Unfortunately, .doc is particularly ill-suited for archival and online-publishing purposes. Whether a particular .doc file can be opened and printed successfully depends on the exact version of Microsoft Word installed, the version of the operating system installed, the printer installed, and the fonts installed. Furthermore, the details of the .doc format are secret and change from version to version. As a result, some of Nature’s readers will have problems opening and printing supplementary material. Moreover, we should expect that many of these documents will fail to open properly 10 to 20 years from now.” (Wilke)
supplemental information within journals
“For Method 1 we found that since 2001, only 71 – 92% of supplementary data were still accessible via the links provided, with 93% of these inaccessible links occurring where supplementary data was not stored with the publishing journal. Of the manuscripts evaluated in Method 2, we found that only 83% of these links were available approximately a year after publication, with 55% of these inaccessible links were at locations outside the journal of publication” (Anderson)
supplemental information upon request
“One in four e-mail addresses becoming invalid within one year of publication is an alarming rate of decay as it has an impact on the ability of scientists to communicate and exchange material.” (Wren)
supplemental information by url
“The most common reason for citing a URL was to provide additional information about a topic (54.1%) or to link to additional data or analyses (37.7%)” (Wren Johnson)
“Most authors (55.2%) agreed that the unavailable URL content was important to the publication, but few controlled URL availability personally (5%) or with the help of others (employees, colleagues, and friends) (6.7%).” (Wren Johnson)
“Most authors (32 [51.6%] of 62) did not know why the URL they cited was unavailable. However, consistent with previous findings, about 11% of URLs were misspelled in the final publication. Three (4.5%) indicated that the URLs became unavailable because of a lack of funding or support.” (Wren Johnson)
“Thirty percent of expired pages referenced in three of the highest impact-rated scientific journals in the United States ended in ‘.edu’ (Dellavalle et al., 2003).”
“Here, we see that websites published at .edu addresses are the least stable. One possible explanation for this is that corresponding authors tend to be lab mentors, whereas creators of websites would likely be students and/or post-doctoral fellows, who would be more likely to leave.” (Wren 2008)
“A study of the reasons behind URL decay suggested that it is often outside the control of the original website creators (Wren et al., 2006b), suggesting that the best place for intervention would be at the time of publication.” (Wren 2008)
“only 5% of URLs cited more than twice have decayed versus 20% of URLs cited once or twice. The most common types of lost content were computer programs (43%), followed by scholarly content (38%) and databases (19%).” (Wren 2008)
“Plant Physiology expands on this theme: Links to web sites other than a permanent public repository are not an acceptable alternative because they are not permanent archives.” (Piwowar, ELPUB)
“However, the average lifespan of a Web site is far from sufficient to ensure reliable long-term availability. Because of the inconstant nature of URLs, neither publishers nor authors are able to guarantee the long-term accuracy or availability of digital information referenced in dermatology journals. Effective solutions will likely require a collaborative effort on the part of researchers, authors, and journal editors.” (Wren Johnson)
“Many high-impact journals do not provide instructions for Internet citation formats (44%), nor do they provide recommendations to archive cited digital information (99%)” (Schilling)
“The basic changes would require simply scanning for URLs in a publication, automatically checking them for availability, creating a snapshot of URL content at the time of publication, and permitting authors to update URLs on the journal website should they change.” (Wren 2008)
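The first two steps Wren describes (scan for URLs, check availability) are easy to sketch. This is a minimal illustration in Python using only the standard library, not how any journal actually does it; the function names, the regular expression, and the rule that an HTTP status below 400 counts as available are all my assumptions. Snapshotting and author-side updates are left out.

```python
import re
import urllib.error
import urllib.request

# Assumed pattern: http(s) URLs up to the next whitespace, angle bracket,
# closing parenthesis, or closing square bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>)\]]+")


def find_urls(text):
    """Return the distinct URLs in text, in order of first appearance."""
    urls = []
    for match in URL_PATTERN.findall(text):
        url = match.rstrip(".,;:")  # drop trailing sentence punctuation
        if url not in urls:
            urls.append(url)
    return urls


def is_available(url, timeout=10):
    """Hypothetical check: True if a HEAD request gets a status below 400."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, ValueError, OSError):
        # Covers HTTP errors, DNS failures, timeouts, and malformed URLs.
        return False


def check_publication(text, checker=is_available):
    """Map each URL found in text to whether the checker says it is reachable."""
    return {url: checker(url) for url in find_urls(text)}
```

Passing the checker in as a parameter keeps the scanning step testable without touching the network, for example `check_publication(manuscript_text, checker=lambda url: True)`, where `manuscript_text` is a placeholder for the article text.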
“Methods of preservation such as PURLs (Schafer et al., 2001) and WebCite (Eysenbach, 2006) have been developed but are apparently not in widespread use.” (Wren 2008) See also Table 2 in (Wren Johnson).
(arg I wish WordPress.com allowed Mendeley embedding!)
Anderson, Nicholas R, Peter Tarczy-Hornoch, and Roger E Bumgarner. 2006. On the persistence of supplementary resources in biomedical publications. BMC Bioinformatics 7: 260. doi:10.1186/1471-2105-7-260. http://www.biomedcentral.com/1471-2105/7/260.
Anon. Dryad Sustainability Plan: Interview survey findings.
Ball, Catherine A., Gavin Sherlock, Helen Parkinson, Philippe Rocca-Sera, Catherine Brooksbank, Helen C. Causton, Duccio Cavalieri, et al. 2002. Submission of Microarray Data to Public Repositories. PLoS Biology 18, no. 22: 1409. http://www.doaj.org/doaj?func=abstract&id=107145.
Beagrie, Neil, Lorraine Eakin-Richards, and Todd Vision. 2009. Business models and cost estimation: Dryad repository case study, no. 1.
Brown, C. 2007. The role of Web-based information in the scholarly communication of chemists: Citation and content analyses of American Chemical Society Journals. Journal of the American Society for Information Science and Technology 58, no. 13.
Cozzarelli, NR. 2004. UPSIDE: Uniform principle for sharing integral data and materials expeditiously. Proc Natl Acad Sci U S A 101, no. 11: 3721-2. http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=15026576 .
Dellavalle, Robert P, Eric J Hester, Lauren F Heilig, Amanda L Drake, Jeff W Kuntzman, Marla Graber, and Lisa M Schilling. 2003. Going, going, gone: Lost Internet references. Science 302, no. 5646: 787-788. doi:10.1126/science.1088234. http://www.sciencemag.org/cgi/reprint/sci;302/5646/787.pdf.
Ducut, Erick, Fang Liu, and Paul Fontelo. 2008. An update on Uniform Resource Locator (URL) decay in MEDLINE abstracts and measures for its mitigation. BMC medical informatics and decision making 8: 23. doi:10.1186/1472-6947-8-23.
Evangelou, E, T Trikalinos, and J Ioannidis. 2005. Unavailability of online supplementary scientific information from articles published in major journals. FASEB J 19, no. 14: 1943-1944. http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16319137 .
Eysenbach, Gunther. 2006. Going, going, still there: using the WebCite service to permanently archive cited Web pages. AMIA Symposium 7, no. 5: 919. http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1839311&tool=pmcentrez&rendertype=abstract.
Marcus, Emilie. 2009. Taming supplemental material. Immunity 31, no. 5: 691. doi:10.1016/j.immuni.2009.10.005. http://www.ncbi.nlm.nih.gov/pubmed/19932065.
Maunsell, John. 2010. Announcement Regarding Supplemental Material. The Journal of Neuroscience 30, no. 32: 10599.
McCarthy, John. 2009. Supplementary online material: potential and precautions. Augmentative and alternative communication (Baltimore, Md. : 1985) 25, no. 1: 4-6. doi:10.1080/07434610902744041. http://www.ncbi.nlm.nih.gov/pubmed/19280419.
Murray-Rust, P, J Mitchell, and H Rzepa. 2005. Communication and re-use of chemical information in bioscience. BMC Bioinformatics 6. http://dx.doi.org/10.1186%2F1471-2105-6-180.
Piwowar, Heather, and Wendy Chapman. 2008. A review of journal policies for sharing research data. In ELPUB. doi:10.1038/npre.2008.1701.1.
Shriner, Daniel. 2008. Putting Materials and Methods in Their Place. Science 322, no. December: 1463-1466.
Santos, C, J Blake, and D States. 2005. Supplementary data need to be kept in public repositories. Nature 438, no. 7069: 738. http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=PubMed&dopt=Citation&list_uids=16340990 .
Schilling, Lisa M, Desiree P Kelly, Amanda L Drake, Lauren F Heilig, Eric J Hester, and Robert P Dellavalle. 2004. Digital information archiving policies in high-impact medical and scientific periodicals. JAMA : the journal of the American Medical Association 292, no. 22: 2724-6. doi:10.1001/jama.292.22.2724. http://www.ncbi.nlm.nih.gov/pubmed/15585731.
Seeber, F. 2008. Citations in supplementary information are invisible. Nature 451, no. 7181: 887. http://www.nature.com/nature/journal/v451/n7181/full/451887d.html.
Vision, Todd J. 2010. Open Data and the Social Contract of Scientific Publishing. BioScience 60, no. 5: 330-331. doi:10.1525/bio.2010.60.5.2. http://caliber.ucpress.net/doi/abs/10.1525/bio.2010.60.5.2.
Wilke, Claus. 2004. Supplementary materials need the right format. Nature 430, no. 6997: 291.
Petsko, Gregory A. 2006. Let’s get our priorities straight. Genome Biology 7, no. 1: 101. doi:10.1186/gb-2006-7-1-101.
Wren, Jonathan D, Joe E Grissom, and Tyrrell Conway. 2006. E-mail decay rates among corresponding authors in MEDLINE. The ability to communicate with and request materials from authors is being eroded by the expiration of e-mail addresses. EMBO Rep 7, no. 2: 122-127. doi:10.1038/sj.embor.7400631.
Wren, Jonathan D, Kathryn R Johnson, David M Crockett, Lauren F Heilig, Lisa M Schilling, and Robert P Dellavalle. 2006. Uniform resource locator decay in dermatology journals: author attitudes and preservation practices. Archives of dermatology 142, no. 9: 1147-52. doi:10.1001/archderm.142.9.1147. http://www.ncbi.nlm.nih.gov/pubmed/16983002.
Wren, Jonathan D. 2008. URL decay in MEDLINE–a 4-year follow-up study. Bioinformatics 24, no. 11: 1381-1385. doi:10.1093/bioinformatics/btn127.