Last week over 430 data enthusiasts from around the world gathered for the 10th Plenary Meeting of the Research Data Alliance (RDA) in Montreal, Canada. The theme of this year’s meeting was “Better Data, Better Decisions.” NIH had an active presence, with representatives from the National Library of Medicine, National Institute of Allergy and Infectious Diseases, and the NIH Office of Science Policy participating in a wide variety of working groups and interest groups.
Created as a community-driven organization in 2013, RDA is a partnership of the European Commission, the United States government's National Science Foundation and National Institute of Standards and Technology, and the Australian government’s Department of Innovation. Its goal is to build the social and technical infrastructure needed for open sharing of data.
For NIH and the biomedical community, RDA represents an opportunity to consider issues of data science and open science in a broader framework, not bound by a single discipline or frame of reference. Important new perspectives on metadata, standards, data management, provenance, and other topics emerged as those from different scientific disciplines and agencies shared their thoughts and experiences. Accessing knowledge from adjacent fields is critical to promoting innovation in health and biomedicine. Specifically, through the National Library of Medicine’s experiences in promoting open innovation (including Challenge competitions and hackathons), we have learned that some of the most compelling solutions come from solvers who apply techniques that work in fields related to the health sciences.
Many thought-provoking topics and developments were addressed at the meeting. Among the more notable sessions were the following:
- Role of Artificial Intelligence (AI) and Machine Learning in Transforming All Sectors of the Economy. Plenary speaker Dr. Yoshua Bengio, Professor of Computer Science and Operations Research at the University of Montreal, spoke about the emerging role of computers in learning from data and the impact these technologies are having on image recognition, translation, and the creation of new art forms. He demonstrated how one can feed a machine a sentence and it will develop an image, or create a new color based on the text provided. These developments herald great things for health care, particularly regarding ways in which AI can augment decision-making and analysis. Just imagine how these technologies could transform patient education, giving providers the ability to show patients visually what would happen to their bodies if they followed a certain course of action.
Dr. Bengio cautioned, however, that optimal computer-based learning is still a combination of human and machine. Those things machines do well (e.g., process lots of data simultaneously in a continuous manner) still need to be supplemented by those things humans do well (e.g., abstract thinking, and inferring meaning and nuance from objects). And, of course, before AI can be fully optimized, the issues of privacy and ethics need to be considered, including the responsible use of AI in health care, so that humans modulate machine action where appropriate.
- Emergence of Policies to Support Data Management and Sharing Across the Globe. This plenary session explored data policy trends around the world, highlighting how whole new vistas for open research are emerging as research funders promote policies to advance sharing and managing research data. A more focused interest group session followed, at which representatives from diverse nations, including Greece, Brazil, Finland, and Japan, shared how they are working to change data management practices and incentivize data sharing in their communities.
Some of the common themes raised included how to maximize what we learn from the data management plans being produced by the research community, promote training within these communities to optimize the development and review of data management plans, and track whether these data sharing policies are advancing the guiding principles of making data findable, accessible, interoperable, and reusable (FAIR). Clearly, this is an area where further work is needed, and we are hopeful that openly sharing trends and best practices can advance the policy agenda of governments worldwide.
- Health Data as an Emerging Interest Area for the Research Data Alliance. While health care applications have historically represented only a fraction of the data interests of RDA, topics related to health data appear to be gaining traction. One newly formed health interest group focuses on exploring tools for mapping the health data domain. One promising tool presented, RepeATFramework, pulls together essential elements to successfully report and reproduce scientific methods within biomedical research sciences. Sharing this framework at the meeting, along with the associated pre-print, mind map, documentation, and code, showed how the RDA community can be used to help refine and vet an emerging idea. The project and associated documentation are available on the Open Science Framework and GitHub for public viewing and comment.
Another health-specific session focused on the emergence of blockchain technologies and their role in supporting the trusted exchange of healthcare information. One of the interesting characteristics of blockchain technology mentioned was the possibility of enacting smart contracts. Smart contracts—executable pieces of code stored on the blockchain—bind people and transactions to specific actions and outcomes. They require no further direct human involvement once they’ve been added to the distributed ledger, which is what makes these contracts "smart" or self-enacting. Smart contracts can empower citizens by allowing them to specify under what predefined conditions, and to what extent, their personal health data can be shared.
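To make the idea of predefined sharing conditions concrete, here is a minimal sketch in plain Python. It is not blockchain code and does not reflect any system presented at the meeting: the class names, field names, and the in-memory log standing in for a distributed ledger are all hypothetical. It only illustrates how rules set in advance by a data owner could decide, without further human involvement, which parts of a record a requester receives.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class ConsentRule:
    """One sharing condition set by the data owner (hypothetical fields)."""
    purpose: str               # the use the owner agrees to, e.g. "research"
    allowed_fields: frozenset  # which parts of the record may be shared
    expires: date              # last day on which the rule applies


@dataclass
class ConsentContract:
    """Toy stand-in for a smart contract holding an owner's rules."""
    owner: str
    rules: list = field(default_factory=list)
    log: list = field(default_factory=list)  # stand-in for the ledger

    def request(self, requester, purpose, fields, today):
        """Return the requested fields that the owner's rules permit."""
        granted = set()
        for rule in self.rules:
            if rule.purpose == purpose and today <= rule.expires:
                granted |= set(fields) & rule.allowed_fields
        # Record every request, granted or not, so access can be audited.
        self.log.append((requester, purpose, tuple(sorted(granted))))
        return sorted(granted)


# Illustrative use: the owner allows two fields to be shared for research.
contract = ConsentContract(
    owner="patient-001",
    rules=[ConsentRule("research", frozenset({"age", "diagnosis"}), date(2030, 1, 1))],
)
research_fields = contract.request("lab-A", "research", ["age", "name"], date(2025, 6, 1))
marketing_fields = contract.request("ad-co", "marketing", ["age"], date(2025, 6, 1))
```

In this sketch the research request receives only the field the owner approved, and the marketing request receives nothing because no rule covers that purpose. A real smart contract would run on a blockchain platform and would need to address identity, encryption, and revocation, none of which are modeled here.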
Slides and supplementary materials for these and other sessions from the plenary meeting are available online, as are the proceedings and products of RDA’s working and interest groups. Readers of this blog are encouraged to check out the work of RDA and consider where its deliberations, recommendations, and outputs could help advance the fields of data science and open science.
With organizations around the world rolling out policies related to data management and sharing, and countries building infrastructures to support the growing amount of data produced by their research communities (e.g., the European Open Science Cloud and Australian Science Clouds), participation in transnational forums will be critical to the success of NIH’s burgeoning data science and open science efforts.
NIH will be well positioned to maximize the promise of health data to advance discovery in biomedicine, including through advances in machine learning and AI, if our systems are developed in a way that maximizes interoperability with the systems and structures emerging across the globe. RDA is one example of the many transnational groups in which NIH will continue to engage as we seek to expand our footprint in data science and open science.