Peer review and publication are the means by which scientists make their findings known, gain credibility, and advance their careers. For this reason, publishers and reviewers wield considerable power in the field. Their rules, practices, and culture dictate which papers are published and can affect the impact of publications. Several recent articles have examined the role of publishers and peer reviewers in supporting biomedical data-based research, particularly with regard to data and software sharing.
The benefits of data sharing include increased reproducibility and scientific discovery. Yet sharing of the data used and produced in biomedical research is still not practiced widely enough, although it has been called for by the White House Office of Science and Technology Policy, the UK Research Council, and many others. The culture of peer review and the perceived risk of being “scooped” are among the issues that have prevented the widespread adoption of biomedical data sharing.
The absence of “true” findings, or findings of value, in much of the current published biomedical research was addressed by John P.A. Ioannidis in a recent essay in PLoS Medicine. This issue, he argues, is largely due to the culture of publication and peer review. As a means of battling this lack of reproducibility in published research, Ioannidis calls for the “sharing of data, protocols, materials, and software.” Such sharing would help make findings more reproducible and, therefore, more likely to be true. He also raises the problem of competition among researchers for publications and citations, which is a disincentive to the sharing of data. Current practice does not include data producers as authors on papers published by subsequent data users, so producers have little incentive to make the effort to share data that they might themselves use in future publications. Thus, a culture of data hoarding persists.
In another recent piece for PLoS Biology, Jennifer Lin and Carly Strasser directly address the role of publishers in encouraging researchers to share their data. They call for a “social contract” among stakeholders in the research ecosystem based on the principle that “data should be preserved, discoverable, measured, and integrated into evaluation processes, and data sharing is a fundamental practice.”
The article then lays out eight recommendations for publishers to increase access to data. One of these focuses on providing formal channels for sharing data, such as Nature Scientific Data and GigaScience, journals that are specifically for datasets. At least one journal has already taken steps toward adhering to this recommendation: On November 19, Nature published an editorial detailing its strengthened policy with regard to publication of data. The piece indicates that, in an effort to promote reproducibility, Nature and the Nature journals are rolling out a new policy under which editors encourage those submitting papers to submit complementary articles to Scientific Data, a peer-reviewed journal dedicated to publishing reusable datasets. The need for channels for data sharing is a major reason why the NIH Big Data to Knowledge (BD2K) initiative recently funded the BioCADDIE project to serve as a coordination consortium focused on developing pilots for a Data Discovery Index.
Another recommendation focuses on citation of all data both produced and used in a publication. Data citations could go a long way toward changing the culture surrounding data sharing, as data producers would still receive credit for their data when other researchers use it. Data citation ties in with another recommendation, which calls on publishers to incentivize the sharing of data. Incentives might include positive reinforcement, such as promoting articles with higher-than-average data reuse.
Data sharing is an essential driver of the biomedical research enterprise. Sharing of software and code is equally important. Extending the proposed recommendations to code sharing would further benefit the scientific research community and increase the number of reproducible and “true” results. Nature took a step toward enacting this cultural change by declaring stricter guidelines for code sharing for research published in the journal and other Nature journals. The new policy, described in a recent editorial, mandates that all papers include a statement describing the availability of, and any restrictions on access to, code that is “central to reaching a paper’s conclusions.” The editorial also indicates that the journal will be meeting with the community to create guidelines for best practices, and possibly to codify the rules. This move on code sharing, together with the recommendations on data sharing made in the other pieces, is an excellent step toward creating a modern biomedical research enterprise that is open and accessible.