Integrating Multi-Dimensional Data to Promote Data-Driven Research in Oral Health and Oral Health Disparities (NIDCR/DER)

Project Points of Contact: Lu Wang, PhD (Chief, Translational Genomics Research Branch, Division of Extramural Research; Chair, NIDCR FaceBase and dbGaP Data Access Committee) and Noffisat Oki, PhD (Program Director)

Goals and Objectives: The goal is to expedite research that improves understanding of oral health and oral health disparities, informing ways to improve oral health for all individuals. This project tackles two major challenges in health care and biomedical research: 1) a general lack of integrated dental and medical records resulting from current healthcare practices, and 2) heterogeneity of available research and clinical data. The project will focus on developing a framework and platform for mining, harmonizing, integrating, and analyzing research and clinical data. Integrated datasets will power both hypothesis-driven and hypothesis-generating research, leading to insights about oral health and disparities in oral health. The project will also demonstrate the value of data science approaches in the exploration and analysis of integrated health records. Methods and AI/ML-ready datasets generated from the project will be useful for biomedical research, health disparities research, and future methods development.

Significance: The DATA Scholar will pilot a project aimed at developing a framework and platform for unifying existing and future electronic dental records (EDR) and electronic medical records (EMR), as well as basic, translational, and clinical research data, for research and improved patient care. The project will also spearhead the development of an implementation use case showing the broader biomedical community how these integrated datasets can be used to fill knowledge gaps about how oral health affects overall health (and vice versa), to provide insights into oral health disparities, and to support the development of intervention strategies.

Description: NIDCR is committed to strengthening efforts to integrate and analyze EDR, EMR, and other data types including “deep phenotypes”, which reflect individuals’ health, behavior, diet, and social status.

Data set(s) involved: Data include dental and medical images (CBCT, CT, MRI, ultrasound, and microscopy); biometric data; audio and video recordings; genomic, transcriptomic, metabolic, and other omic data types; physical measurements; questionnaires; and EDR and EMR. Data will be obtained from a variety of sources, including hospitals, universities, databases, patient data repositories, and social media. Examples include data provided by the Oregon Oral Health Surveillance System, the Wisconsin Oral Health Program, CDC-Oral Health, All of Us, and FaceBase. These data should be representative of multiple populations and age groups to support understanding of oral health disparities and help identify prevention and intervention strategies for oral diseases and conditions, including dental caries.

Anticipated outcomes of the project: 

  • A cloud platform, resulting from a pilot phase, for integration, analysis, and visualization of data, accessible for testing and use by NIH beta testers and users
  • An expansion of the platform during a follow-up phase to include additional datasets and to let users import their own data, either to integrate their own disparate datasets or to combine them with datasets already on or accessible to the platform
  • Publications and presentations describing the platform and outcomes from the use case(s)

Required skills of the DATA Scholar: NIDCR seeks a data scientist to mine and integrate large datasets of diverse types and formats. Cloud computing is expected to be performed through NIH STRIDES.

The scholar will work on:

  • Developing strategies for harmonizing and integrating datasets from different resources, including developing tools for interoperability across various repositories
  • Developing a framework for assessing data quality and completeness and for defining minimum evaluation metrics/standards across the different types of data
  • Developing a platform for analysis and visualization of these data that can be used to gather analytical insights and demonstrate the value of data integration in furthering research that tackles challenges in oral health disparities and equity
  • Providing status updates and disseminating project findings to key stakeholders and the broader research community through publications, presentations, and demonstration sessions of the developed tool(s)
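
As a rough illustration of the first two work items, the Python sketch below maps records from two sources onto a shared schema, merges them by patient, and reports per-field completeness. Everything in it is an assumption made for illustration: the field names, the `EDR_MAP` and `EMR_MAP` mappings, and the idea that dental and medical records share a patient identifier are hypothetical and are not taken from the project description. Real record linkage and harmonization would be considerably more involved.

```python
from typing import Dict, Iterable, List, Optional

# Hypothetical common schema; these field names are illustrative only.
COMMON_FIELDS = ["patient_id", "age_years", "caries_count"]

# Hypothetical mappings from each source's column names to the common schema.
EDR_MAP = {"pid": "patient_id", "age": "age_years", "decayed_teeth": "caries_count"}
EMR_MAP = {"mrn": "patient_id", "age_yrs": "age_years"}

Record = Dict[str, Optional[object]]


def harmonize(record: dict, mapping: Dict[str, str]) -> Record:
    """Rename source fields to the common schema; absent or empty values become None."""
    out: Record = {field: None for field in COMMON_FIELDS}
    for source_name, common_name in mapping.items():
        value = record.get(source_name)
        if value is not None and value != "":
            out[common_name] = value
    return out


def integrate(records: Iterable[Record]) -> Dict[str, Record]:
    """Merge harmonized records by patient_id; the first non-missing value wins."""
    merged: Dict[str, Record] = {}
    for rec in records:
        pid = rec["patient_id"]
        if pid is None:
            continue  # cannot link a record that has no identifier
        target = merged.setdefault(str(pid), {f: None for f in COMMON_FIELDS})
        for field in COMMON_FIELDS:
            if target[field] is None:
                target[field] = rec[field]
    return merged


def completeness(records: Iterable[Record]) -> Dict[str, float]:
    """Fraction of records with a non-missing value, per common field."""
    rows: List[Record] = list(records)
    if not rows:
        return {f: 0.0 for f in COMMON_FIELDS}
    return {f: sum(r[f] is not None for r in rows) / len(rows) for f in COMMON_FIELDS}


if __name__ == "__main__":
    edr_rows = [{"pid": "A1", "age": 34, "decayed_teeth": 2},
                {"pid": "B2", "age": "", "decayed_teeth": 0}]
    emr_rows = [{"mrn": "B2", "age_yrs": 51}, {"mrn": "C3", "age_yrs": 8}]
    harmonized = ([harmonize(r, EDR_MAP) for r in edr_rows]
                  + [harmonize(r, EMR_MAP) for r in emr_rows])
    merged = integrate(harmonized)
    print(completeness(merged.values()))
```

A per-field completeness fraction like this is only one possible minimum metric; which metrics are appropriate for images, omic data, or questionnaires would be for the project to define.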

Expected/preferred length of DATA Scholar appointment: 1 to 2 years.

Expected/preferred time effort commitment of the DATA Scholar: Part time or full time

Remote work preference: 100% remote allowable

ICO support: To be determined.

Additional activities: The Scholar will be supervised by Dr. Noffisat Oki, Program Director of NIDCR’s extramural data science program, and will receive advice from the Project Consultant, Dr. Jennifer Webster-Cyriaque, Deputy Director of the Institute. The Scholar will also interact with the Institute’s Director, Dr. Rena D’Souza, other executive and senior leaders of the Institute, and stakeholders outside of NIDCR who are key to the success of the project.

Career or professional development opportunities: The NIDCR Strategic Plan 2021-2026 prioritizes applying data science to enhance dental, oral, and craniofacial science and oral health. This presents opportunities for a Scholar to pioneer innovative, futuristic data science efforts and explore career opportunities in the federal government.

To apply to this or other DATA Scholar positions, please see instructions here: datascience.nih.gov/data-scholars.

This page last reviewed on March 29, 2023