Interoperability of NIH Cloud-Based Platforms for Genomics Research

Institute or Center: National Human Genome Research Institute (NHGRI)

Project: Interoperability of NIH Cloud-Based Platforms for Genomics Research

Skills sought:

  • Familiarity with the management and analysis of sensitive (i.e., human genome) data
  • Expertise in cloud storage and computing, ideally in Google and Amazon Web Services
  • Familiarity with GA4GH standards
  • Development and use of APIs and other programming user interfaces
  • Understanding of the security and privacy concerns associated with human controlled-access* data
  • Excellent organizational skills and ability to be a team player

About the position: NHGRI seeks a data or computer scientist to provide expertise to address the technical interoperability challenges of siloed, cloud-based platforms that host and make broadly available controlled-access* human genomic and phenotypic (e.g., disease status) data.

The Scholar will coordinate with NHGRI staff to identify interoperability projects across several NIH data platforms and contribute to the projects’ specifications. The Scholar will also collaborate with the developers of the NHGRI Genomic Data Science Analysis, Visualization and Informatics Lab-space (AnVIL) platform to implement the specifications, and will test new functionality across the platforms to provide immediate, technically informed feedback to NHGRI and other NIH Institutes involved in the interoperability projects.

Datasets involved:

  • The AnVIL platform provides access to a wide variety of datasets; for examples, see the AnVIL datasets page. The Scholar will work with some of the following:
    • human genomic and phenotypic (e.g., disease status) datasets
    • possibly other health data types (e.g., epigenomics, clinical, proteomics, imaging)
    • most of the data will be controlled-access* data

About the work: Multiple NIH institutes and offices, all with different disease focuses and mission areas, are establishing cloud-based data storage and analysis platforms to facilitate sharing, broad dissemination, and computing on data generated and collected by the institutes’ funded programs and initiatives. The DATA Scholar will oversee and contribute to the implementation of collaborative projects among those platforms to improve user accessibility and the interoperability of these resources, and ultimately to increase the impact of those efforts.

Why this project matters: The technical solutions developed under this program will be extended to and adopted by other cloud-based data and analysis resources that are being established throughout the NIH, as well as at other U.S. and international agencies, with the goal of creating a global, federated biomedical data ecosystem.

Work location: Bethesda, MD

Work environment: The DATA Scholar will be a member of the NHGRI Division of Genome Sciences and the NHGRI AnVIL team, participating in AnVIL activities and working groups that include NHGRI staff and external investigators. The Scholar will also be a member of the NIH-wide Governance Group that oversees the interoperability projects across NIH cloud-based platforms.

*Controlled-access refers to data or samples collected under informed consent that indicates the data are appropriate for sharing only through NIH-designated, restricted-access data repositories.

To apply to this or other DATA Scholar positions, please see instructions here: datascience.nih.gov/data-scholars

This page last reviewed on March 23, 2023