Institute or Center: National Institute of Environmental Health Sciences (NIEHS)
Project: Advancing Interoperability for Environmental Health
Skills sought:
- Expertise with artificial intelligence/machine learning, statistical modeling and analysis, graph theory, and semantic engineering methods
- Development of novel software tools for data manipulation in Python or R
- Ability to manipulate large and complex datasets and to query and use data systems (relational databases, NoSQL, and graph databases)
About the position: The NIEHS is seeking an expert data scientist who will develop novel methods for integrating diverse data sources to identify and understand how environmental exposures impact human health.
The DATA Scholar’s work will focus on high-impact use cases in toxicology and environmental epidemiology. Within the context of each use case, the Scholar will:
- identify and develop cutting-edge data science approaches to integrate relevant data streams.
- apply these methods to combine multiple data sources into a single, cohesive dataset that makes it possible to answer novel research questions and produce interpretable and testable results.
- engage data scientists and biomedical researchers, both within the NIH community and more broadly, to inform methods development, refine solutions, and communicate results.
- consult on:
  - how to incorporate data integration methods into platforms for managing, querying, and analyzing environmental health data resources.
  - how to develop integrated informatics platform(s) that can ingest and integrate data from independent sources.
By the end of the 1-2 year term, the Scholar will have produced novel methods that can be widely adapted within and beyond the environmental health and broader NIH communities, including accompanying software/code, software documentation, tutorials, manuscripts, and presentations at data science or other scientific forums; and applied these methods to use cases in toxicology and environmental epidemiology, demonstrating the ability to answer key questions.
About the work: Environmental health data are among the most diverse, complex, and challenging in the biomedical realm, encompassing the study of all levels of biological organization at all life stages, from preconception through old age, and often including non-biomedical exposure information (e.g., pollution estimates, details of consumable products). The challenges and opportunities with such complex and often high-dimensional data are significant, ranging from varying data formats and syntax to more complex issues such as imputing across gaps in data, knowledge, and models. The Scholar will have the unique opportunity to advance methods for fundamental challenges in data science, including integrating diverse data streams on complex systems using emerging methods from different data science subfields.
Datasets involved: toxicology and environmental epidemiology datasets from
Why this project matters: Developing and implementing new methods to address environmental health questions has the potential to advance research broadly for NIEHS, NIH, and the environmental health community worldwide. The Scholar’s work is expected to be transformative in overcoming barriers to data interoperability and integration, ultimately leading to breakthroughs in the ability to quickly identify, understand, prevent, and respond to health threats from existing and new exposures.
Work Location: Research Triangle Park, NC
Work environment: The Scholar will work with an interdisciplinary project team at the NIEHS campus in North Carolina and will engage with academic and industry professionals within the Research Triangle Park area. The team, led by the NIEHS Acting Deputy Director and the Director of the NIEHS Office of Data Science, includes both subject matter and technical experts representing several NIH research programs. The Scholar will have access to a wealth of experts in biomedical research, clinical research, and toxicology, as well as scientific computing, data science, and bioinformatics.
To apply to this or other DATA Scholar positions, please see instructions here: datascience.nih.gov/data-scholars