Public sharing of research datasets: a pilot study of associations

Title: Public sharing of research datasets: a pilot study of associations
Publication Type: Journal Article
Year of Publication: 2010
Authors: Piwowar, H. A., & Chapman, W. W.
Journal: Journal of Informetrics
Volume: 4
Issue: 2
Pagination: 148-156
Date Published: 2010/04
ISBN Number: 1751-1577
Abstract

The public sharing of primary research datasets potentially benefits the research community but is not yet common practice. In this pilot study, we analyzed whether data sharing frequency was associated with funder and publisher requirements, journal impact factor, or investigator experience and impact. Across 397 recent biomedical microarray studies, we found investigators were more likely to publicly share their raw dataset when their study was published in a high-impact journal and when the first or last authors had high levels of career experience and impact. We estimate the USA’s National Institutes of Health (NIH) data sharing policy applied to 19% of the studies in our cohort; being subject to the NIH data sharing plan requirement was not found to correlate with increased data sharing behavior in multivariate logistic regression analysis. Studies published in journals that required a database submission accession number as a condition of publication were more likely to share their data, but this trend was not statistically significant. These early results will inform our ongoing larger analysis, and hopefully contribute to the development of more effective data sharing initiatives.

URL: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3039489/
Short Title: J Informetr

Gap Area Study Type:

High-level Gap Areas:

Purpose: 
Investigated whether data sharing frequency was associated with funder and publisher requirements, journal impact factor, or investigator experience and impact.
Method: 
Used a previously created set of 397 articles in 20 journals describing studies that used gene expression microarray data; identified which studies had made their raw datasets available; used multivariate logistic regression to evaluate the association between a study's authorship, grant, and journal attributes and the public availability of its microarray data.
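
For readers unfamiliar with the method, the following is a minimal sketch of multivariate logistic regression on a binary "data shared" outcome. It is not the authors' code and uses no data from the study: the records are synthetic, the two predictors (`impact`, `policy`), the effect sizes in `TRUE_W`, and the helper `fit_logistic` are all hypothetical, and the fit uses plain gradient ascent rather than whatever statistical package the study relied on.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.5, n_iter=1000):
    """Fit logistic regression by batch gradient ascent on the mean
    log-likelihood. An intercept is added as the first coefficient."""
    n, p = len(X), len(X[0])
    w = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            row = [1.0] + list(xi)
            err = yi - sigmoid(sum(wj * xj for wj, xj in zip(w, row)))
            for j, xj in enumerate(row):
                grad[j] += err * xj
        w = [wj + lr * gj / n for wj, gj in zip(w, grad)]
    return w


# Synthetic cohort (assumption): sharing depends on a standardized
# journal-impact score but not on a funder-policy flag.
random.seed(0)
TRUE_W = [-0.5, 1.0, 0.0]  # hypothetical intercept, impact, policy effects
X, y = [], []
for _ in range(400):
    impact = random.gauss(0.0, 1.0)
    policy = 1.0 if random.random() < 0.2 else 0.0
    p_share = sigmoid(TRUE_W[0] + TRUE_W[1] * impact + TRUE_W[2] * policy)
    X.append((impact, policy))
    y.append(1 if random.random() < p_share else 0)

coefs = fit_logistic(X, y)
# exp(coefficient) is the odds ratio for a one-unit increase in a predictor,
# holding the other predictor fixed.
odds_ratios = [math.exp(c) for c in coefs[1:]]
for name, c, ratio in zip(("impact", "policy"), coefs[1:], odds_ratios):
    print(f"{name}: coef={c:.2f}, odds ratio={ratio:.2f}")
```

An odds ratio above 1 indicates that higher values of the predictor go with higher odds of sharing, adjusted for the other predictor. The sketch does not compute standard errors or p-values, which a real analysis would need in order to judge statistical significance.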