Project Point of Contact: Dr. Deborah Linares, Program Official / Dr. Rina Das, Division Director
ICO and Division: National Institute on Minority Health and Health Disparities, Division of Integrative Biological and Behavioral Sciences
Goals and Objectives: The mission of the National Institute on Minority Health and Health Disparities (NIMHD) is to lead scientific research to improve minority health and reduce health disparities while focusing on all aspects of health and health care for racial and ethnic minority populations in the U.S., examining the full continuum of health disparity causes as well as the interrelation of these causes. NIMHD’s Extramural Research Program (https://www.nimhd.nih.gov/programs/extramural/research-areas/) supports research examining human biological and behavioral mechanisms and pathways influencing resilience and susceptibility to adverse health conditions that disproportionately affect populations who experience health disparities in the U.S., including racial and ethnic minority groups (Black or African American, Hispanic or Latino, American Indian and Alaska Native, Asian American, Native Hawaiian, and Pacific Islander populations), people with less privileged socioeconomic status, sexual and gender minority persons, and rural populations. The goal for the project is to move the field of health disparities science forward at NIMHD, using data science to further examine key gaps highlighted during NIMHD’s scientific visioning process. One objective is the development of a health disparities scan tool using a data science approach that can be used to mine extant research literature and large-scale data sets to determine key gaps and new areas for research on common pathways and mechanisms leading to disease(s) and health advantages among populations who experience health disparities. This will help us better understand key interactions at individual, interpersonal, community, and system levels between biological, behavioral, and contextual predictors of resilience and/or disease vulnerability to address health disparities.
The second objective for the scholar will be to serve as a technical advisor to help guide and inform the community on the use of artificial intelligence (AI) and data science techniques, such as multi-omics, natural language processing, deep learning, and/or other machine learning approaches, in health disparities research to help determine new lines of inquiry. Project deliverables would include a health disparities scan tool, a sustainability plan for use of the tool, instructions to integrate with other data such as electronic health records or environmental health data, webinars highlighting and instructing others how to use the tool, and a possible publication or report.
Significance: In 2015, NIMHD underwent a rigorous scientific visioning process to determine key gaps in the field, culminating in a special issue in the American Journal of Public Health (https://www.nimhd.nih.gov/about/overview/science-visioning/). One area that emerged from the visioning process was that more information is needed on the etiology of health disparities, specifically on the pathways and mechanisms by which stress and the environment are transmitted through biological and behavioral processes, leading to disease. The DATA scholar will help develop a data science tool that can be used to determine key gaps and new areas for research on common pathways and mechanisms leading to disease(s) among populations experiencing health disparities. Most studies in the field have an underpowered minority population sample; therefore, this tool has the potential to be integrated with existing data science tools at NIMHD, such as HD Pulse or ScHARe (NIMHD’s new cloud data repository), to help better understand key interactions at individual and interpersonal levels between biological, behavioral, and contextual predictors of resilience and/or disease vulnerability to address health disparities. The tool will allow researchers to highlight critical areas in health disparities research and help the institute determine new areas for etiology-based or intervention research initiatives. The scholar will also help strengthen NIMHD’s understanding and application of AI and data science techniques in a key field that is growing rapidly.
Description: NIMHD is looking for a DATA Scholar to lead an exciting, transformative new project to move the field of health disparities science forward. Using data science, the scholar will develop a health disparities research scan tool to help highlight key research gaps from NIMHD’s scientific visioning process (https://www.nimhd.nih.gov/about/overview/science-visioning/). The health disparities scan tool will be compared to alternative literature and funded grant review tools (PubMed, NIH RePORTER, or iSearch) and may be linked to large-scale health survey data. The DATA scholar will also serve as a technical advisor to help guide and inform the community on the use of AI and data science techniques in health disparities research at NIMHD, such as multi-omics, deep learning, and/or other machine learning approaches, to help determine new lines of inquiry. The scholar will face the challenge of merging different data types, such as tabular and categorical data with missing values, potentially using convolutional neural networks or XGBoost.
The DATA Scholar should have expertise in the following areas:
- Cloud computing, machine learning (ML), natural language processing (NLP), and artificial intelligence (AI) predictive modeling
- Programming in commonly used languages in ML/NLP, such as R or Python
- Experience developing data science tools or applications for public use
- Experience with harmonizing different types of data, especially tabular, and working with SQL or relational databases in a cloud computing environment
- Experience as a technical liaison among multiple stakeholders, especially U.S. Government (USG)
Data set(s) involved: PubMed, NIH RePORTER, the NIMHD ScHARe ecosystem of federated data sets (such as the National Health and Nutrition Examination Survey, National Health Interview Survey, American Community Survey, etc.), hosted data sets such as BRFSS, project data sets from NIMHD, NINR, and OMH, and NIH data sets (All of Us), AnVIL, BioData Catalyst, and N3C as needed.
Anticipated outcomes of the project:
Project deliverables include an open-source health disparities scan tool, a sustainability plan for use of the tool, webinars highlighting and instructing others how to use the tool, and a possible publication/report. The Scholar will also work with the NIMHD team to develop and maintain the data science tool, integrate the tool’s maintenance and updates into existing workflows, and develop instructions to integrate with other data such as electronic health records or environmental health data. Additional outcomes may include creating a dirty data set for health disparity predictions and bias identification, to test the developed model for potential biases in its predictions for various populations experiencing health disparities.
Required skills of the DATA Scholar:
- Background in data science and/or analytics (such as causal inference), with experience in data management and programming
- Working with SQL or relational databases in a cloud computing environment
- Machine learning/artificial intelligence and predictive modeling
- Expertise in developing data science tools or applications for public use; industry experience preferred
- Experience working as a technical liaison among multiple stakeholders from various backgrounds, especially USG
- Project management
- Experience and/or interest in health disparities among NIMHD’s target populations: racial and ethnic minority groups (Black or African American, Hispanic or Latino, American Indian and Alaska Native, Asian American, Native Hawaiian, and Pacific Islander populations), people with less privileged socioeconomic status, sexual and gender minority persons, and rural populations
- Experience working with biological and/or behavioral sciences data
- Ph.D. or equivalent preferred
- Linking and creating large data sets, including federally funded data sets (such as the All of Us study, National Health and Nutrition Examination Survey, National Health Interview Survey, American Community Survey, etc.), and conducting statistical analyses
- Strong oral and written communication skills
Expected/preferred length of DATA Scholar appointment: 2 years.
Expected/preferred time effort commitment of the DATA Scholar: Full time (100%)
Remote work preference: 100% remote allowable
ICO support: Access to STRIDES resources, ScHARe population science databases and funded project data, R/Posit, SAS and other relevant software as needed, and data science office staff at NIMHD. The workspace to conduct secure analyses and tool development can be hosted on ScHARe.
Additional activities: The scholar will attend planned meetings, webinars, and workshops. The scholar will provide webinars to the health disparities research community and advise NIMHD extramural staff on data science-related efforts.
Career or professional development opportunities: The scholar will be the project leader within DIBBS and will report directly to Dr. Rina Das. The scholar will interact closely with other experts within DIBBS and across NIMHD. There will be opportunities for the scholar to interact with senior NIMHD leadership, ScHARe staff, and staff across NIH, and to help advance data science-related initiatives at NIMHD. Professional development opportunities will include the ability to learn more about health disparities research, training courses in areas of interest, ODSS-related trainings, and networking with others working in the scholar’s areas of interest.
To apply to this or other DATA Scholar positions, please see instructions here: datascience.nih.gov/data-scholars.