In my recently published analysis, datasets from human subject studies and cancer investigations were found least likely to be available through public gene-expression microarray databases.
From the discussion section of the paper:
It is disheartening to discover that human and cancer studies have particularly low rates of data sharing. These data are surely some of the most valuable for reuse, to confirm, refute, inform and advance bench-to-bedside translational research [52]. Further studies are required to understand the interplay of an investigator’s motivation, opportunity, and ability to share their raw datasets [53], [54]. In the meantime, we can make some guesses: As is appropriate, concerns about privacy of human subjects’ data undoubtedly affect a researcher’s willingness and ability (perceived or actual) to share raw study data. I do not presume to recommend a proper balance between privacy and the societal benefit of data sharing, but I will emphasize that researchers should assess the degree of re-identification risk on a study-by-study basis [55], evaluate the risks and benefits across the wide range of stakeholder interests [56], and consider an ethical framework to make these difficult decisions [57]. Learning how to make these decisions well is difficult: it is vital that we educate and mentor both new and experienced researchers in best practices. Given the low risk of re-identification through gene expression microarray data (illustrated by its inclusion in the Open-Access Data Tier at http://target.cancer.gov/dataportal/access/policy.asp), data-sharing rates could also be low for reasons other than privacy. Cancer researchers may perceive their field as particularly competitive, or cancer studies may have relatively strong links to industry – two attributes previously associated with data withholding [58], [59].

52. Vickers AJ (2008 January 22) Cancer Data? Sorry, Can't Have It. The New York Times.
53. Siemsen E, Roth A, Balasubramanian S (2008) How motivation, opportunity, and ability drive knowledge sharing: The constraining-factor model. Journal of Operations Management 26: 426–445.
54. Tucker J (2009) Motivating Subjects: Data Sharing in Cancer Research [PhD dissertation]. Virginia Polytechnic Institute and State University.
55. Malin B, Karp D, Scheuermann RH (2010) Technical and policy approaches to balancing patient privacy and data sharing in clinical and translational research. J Investig Med 58: 11–18.
56. Foster M, Sharp R (2007) Share and share alike: deciding how to distribute the scientific and social benefits of genomic data. Nat Rev Genet 8: 633–639.
57. Navarro R (2008) An ethical framework for sharing patient data without consent. Inform Prim Care 16: 257–262.
58. Blumenthal D, Campbell E, Anderson M, Causino N, Louis K (1997) Withholding research results in academic life science. Evidence from a national survey of faculty. JAMA 277: 1224–1228.
59. Vogeli C, Yucel R, Bendavid E, Jones L, Anderson M, et al. (2006) Data withholding and the next generation of scientists: results of a national survey. Acad Med 81: 128–136.
Because the topic is so important and under-discussed, I’d like to supplement this with a few additional resources on the issue of patient consent for research data archiving. I’m far from an expert in this area, but I suggest these articles (Mendeley group) as recommended reading for people interested in this issue, and imho essential reading for clinical researchers. The selection includes diverse perspectives.
- Delamothe, T. (1996). Whose data are they anyway? BMJ (Clinical research ed.), 312(7041), 1241-2. Retrieved from http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2351079&tool=pmcentrez&rendertype=abstract
- Anderson, J. R., & Schonfeld, T. L. (2006). Patient consent in the era of de-identified research databases. Journal of clinical oncology : official journal of the American Society of Clinical Oncology, 24(4), 720-1. Retrieved from http://jco.ascopubs.org/content/24/4/720.full
- Foster, M. W., & Sharp, R. R. (2007). Share and share alike: deciding how to distribute the scientific and social benefits of genomic data. Nature reviews. Genetics, 8(8), 633-9. Retrieved from http://dx.doi.org/10.1038/nrg2124
- Caulfield, T., McGuire, A. L., Cho, M., Buchanan, J. A., Burgess, M. M., Danilczyk, U., Diaz, C. M., et al. (2008). Research ethics recommendations for whole-genome research: consensus statement. PLoS biology, 6(3), e73. Public Library of Science. Retrieved from http://dx.plos.org/10.1371/journal.pbio.0060073
- Kaye, J., Heeney, C., Hawkins, N., Vries, J. de, & Boddington, P. (2009). Data sharing in genomics – re-shaping scientific practice. Nature reviews. Genetics, 10(5), 331-5. Retrieved from http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2672783&tool=pmcentrez&rendertype=abstract
- Haga, S. B., & OʼDaniel, J. (2011). Public Perspectives Regarding Data-Sharing Practices in Genomics Research. Public health genomics. Retrieved from http://content.karger.com/ProdukteDB/produkte.asp?Aktion=ShowAbstract&ArtikelNr=324705&ProduktNr=224224
- Lemke, A. A., Wolf, W. A., Hebert-Beirne, J., & Smith, M. E. (2010). Public and biobank participant attitudes toward genetic research participation and data sharing. Public health genomics, 13(6), 368-77. Retrieved from http://content.karger.com/ProdukteDB/produkte.asp?Aktion=ShowAbstract&ArtikelNr=276767&Ausgabe=254515&ProduktNr=224224