Notice Number:
NOT-OD-24-063
Key Dates
Release Date: February 20, 2024
Response Date: May 7, 2024 (extended from the original date of April 20, 2024, per issuance of NOT-OD-24-106)
Related Announcements
NOT-LM-21-005
Issued by
National Institutes of Health (NIH)
Purpose
The purpose of this Request for Information (RFI) is to solicit public input on 1) a set of minimum core common data elements (CDEs) that would be used across all NIH-funded or NIH-conducted clinical studies/trials and community-based research involving human participants; 2) additional CDEs for social determinants of health (SDoH) and clinical domains including autoimmune diseases and immune-mediated diseases; and 3) technologies, tools, and policies that could facilitate the use of NIH CDEs. NIH CDEs are defined as CDEs “recommended” or “required” by an NIH body, and/or found in the NIH CDE Repository. Responses to this RFI will be used to inform NIH’s continuing guidance on CDE use and to assist in planning adequate resources for CDE implementation.
Background
CDEs are a type of data standard used for collection, comparable analysis, and exchange of data in biomedical research settings. CDEs are standardized, precisely defined questions paired with a set of specific allowable responses, used systematically across different sites, studies, or clinical trials to ensure consistent data collection (https://cde.nlm.nih.gov/home). They provide a common “language” for systematic and consistent capture of research data and routinely collected real-world data. CDEs can range from single data elements such as height and weight, to a bundle of questions that evaluate concepts such as depression and quality of life. A glossary of terms relevant to CDEs can be found on the RFI response website (https://datascience.nih.gov/cde-rfi) to provide further background on CDE use within NIH ecosystems.
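As an illustration of the definition above (a precisely defined question paired with a set of specific allowable responses), the following is a minimal Python sketch. The class name, field names, and the example element are hypothetical and are not taken from the NIH CDE Repository; the sketch only shows how pairing a question with a closed response set lets every site check its data the same way.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CommonDataElement:
    """Hypothetical in-memory model of a CDE: one question, one closed response set."""
    name: str
    question: str
    permissible_values: Tuple[str, ...]

    def validate(self, response: str) -> bool:
        # A response is acceptable only if it is one of the allowable values.
        return response in self.permissible_values


# Illustrative element only; the wording and values are assumptions for this sketch.
smoking_status = CommonDataElement(
    name="smoking_status",
    question="What is the participant's current smoking status?",
    permissible_values=("Never", "Former", "Current", "Unknown"),
)

# Free-text variants are rejected, which is what keeps collection consistent across sites.
accepted = smoking_status.validate("Former")
rejected = smoking_status.validate("ex-smoker")
```

Because the response set is closed, two studies that use the same element produce values that can be compared directly, without first reconciling free-text answers.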
Data consistency is a key contributor to data interoperability, which is one of the FAIR (Findable, Accessible, Interoperable, Reusable) data principles guiding scientific data management and stewardship. Biomedical data are often collected in different ways for various study purposes, using different data models, which presents significant challenges for collaborative research, meta-analysis, and management/sharing of data. Use of CDEs makes health data “speak the same language” and become interoperable, both structurally and semantically. Since CDEs can be linked across common data models (CDMs) and standard vocabularies/terminologies used in healthcare, such as SNOMED CT, LOINC, RxNorm, and UMLS (among others catalogued in public repositories such as the National Library of Medicine’s Value Set Authority Center), they provide a means to align clinical research studies with real-world data from electronic health records, healthcare coverage claims, patient-generated data streams, and patient-reported outcomes. CDEs can be expressed in machine-computable formats (as defined in the Glossary) to enable mapping, transforming, and combining of existing data and, in turn, to create big data resources by readily integrating data across disparate sources. Implementation of CDEs has the potential to accelerate knowledge discovery by harnessing the power of innovative data methods such as machine learning and artificial intelligence.
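The mapping, transforming, and combining of existing data described above can be sketched in a few lines of Python. Everything in this sketch is an assumption made for illustration: the two sites, their local codes, the shared value set, and the function name are hypothetical and do not correspond to any NIH CDE or standard terminology.

```python
# Hypothetical local codings used by two study sites for the same concept.
SITE_A_TO_CDE = {"N": "Never", "F": "Former", "C": "Current"}
SITE_B_TO_CDE = {"0": "Never", "1": "Former", "2": "Current", "9": "Unknown"}


def harmonize(records, field, mapping, default="Unknown"):
    """Return copies of the records with `field` recoded to the shared value set.

    Local codes that are missing from the mapping fall back to `default`
    so that no unmapped local code leaks into the combined dataset.
    """
    return [{**record, field: mapping.get(record[field], default)} for record in records]


site_a = [{"id": "A1", "smoking_status": "N"}, {"id": "A2", "smoking_status": "C"}]
site_b = [{"id": "B1", "smoking_status": "1"}, {"id": "B2", "smoking_status": "7"}]

# After recoding, records from both sites share one vocabulary and can be pooled.
combined = (
    harmonize(site_a, "smoking_status", SITE_A_TO_CDE)
    + harmonize(site_b, "smoking_status", SITE_B_TO_CDE)
)
```

In practice the mapping tables would come from a curated source rather than being written by hand, and unmapped codes would normally be flagged for review instead of being silently recoded; the fallback here is a simplification.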
Resources established by NIH cross-cutting initiatives such as the Rapid Acceleration of Diagnostics (RADx) COVID-19 initiative, together with the NIH CDE Repository, have recently raised general awareness and facilitated use of CDEs in NIH intramural and extramural research communities. The successful adoption of CDEs in NIH institutes’ programs has accelerated the pace of new scientific breakthroughs. These resources also highlight the need to standardize a minimum core set of CDEs across NIH Institutes and Centers.
The NIH Scientific Data Council (SDC), an internal NIH committee made up of senior NIH Institute and Center (IC) leaders and data scientists, has established a governance process to designate CDEs that meet criteria (such as human and machine readability and semantically clear definitions of the variable, the measure prompt, and the response) as “NIH-endorsed” and to publish them in the NIH CDE Repository. However, no minimum core set of CDEs has been established for use across all clinical studies/trials supported or conducted by ICs.
Beyond NIH, a consortium of mental health research funders and journals has launched the Common Measures in Mental Health Science Initiative to identify common measures for mental health conditions that funders and journals can require all researchers to collect, in addition to any other measures they require for their specific study. Similarly, in oncology, mCODE™ (Minimal Common Oncology Data Elements) assembles a core set of structured data elements that allows oncology electronic health record (EHR) data to be exchanged between health systems and enables comparative effectiveness analysis (CEA) of cancer treatments. The NCI is participating in mCODE in an attempt to harmonize cancer CDEs in EHRs and cancer research. However, without an effort to standardize a minimum core set of CDEs for use across the NIH, these and other important data initiatives miss the opportunity for data to be more easily integrated and analyzed.
The 21st Century Cures Act highlights “the need for a core set of common data elements and associated value sets.” Development of a core set of CDEs will greatly enhance data interoperability. The NIH SDC recently directed a new CDE working group to provide recommendations on a consistent set of minimum core CDEs that could be used across NIH clinical research/trials. The minimum core CDEs would not preclude the use of additional CDEs that are specific to individual clinical studies/trials. Social determinants of health (SDoH) core CDEs have been identified as priorities because of increased awareness that social, economic, and environmental factors influence health equity. This RFI seeks feedback on the development and implementation of CDEs, including a set of minimum core CDEs, across NIH programs.
Despite these efforts and the progress made, wide adoption of CDEs across various clinical domains is not without challenges. For example, the presence of numerous duplicative CDE sources in some clinical domains costs researchers extra time and effort in selecting the appropriate CDEs for use, especially when they are looking to integrate responses with real-world data. Technologies and tools are needed to map CDEs, to transform data, and to align CDEs with controlled vocabularies, terminologies, and existing data management systems. Through this RFI, NIH also seeks to understand these challenges and opportunities in order to inform appropriate NIH guidance and mechanisms that lower the barriers to CDE use and improve the ability to aggregate and integrate CDE-based data.
Note: Direct use of any Personally Identifiable Information or Protected Health Information will be restricted to those interacting with participants (though aggregate-level measures may be derived for use in study datasets). Participants must consent to the use of their data before the data can be used for a study.