Introducing the 2025-2030 NIH Strategic Plan for Data Science: What Researchers Need to Know

Wednesday, June 4, 2025

By: Dr. Susan Gregurick, Associate Director of Data Science, NIH

Exciting news from NIH! The final 2025-2030 Strategic Plan for Data Science has just been released, charting the course for how biomedical data will transform health research over the next five years. As former NIH Director Dr. Monica Bertagnolli notes in her opening letter, "The NIH Biomedical Data Ecosystem will bring increasingly effective data and tools that enable the broadest research community possible to contribute to our mission to bring better health to all people."

Discoveries and innovations in health would be impossible without the many dedicated researchers and scientists who collaborate and partner with or work for NIH. If that is you, thank you for your contributions to our mission! Let's dive into the five key goals that will shape your research data landscape:

Goal 1: Improve Data Management and Sharing Capabilities

Remember the NIH Data Management and Sharing Policy that went into effect in January 2023? NIH is doubling down on supporting and effectively implementing the policy. This goal focuses on three critical objectives: supporting the biomedical community in managing, sharing, and sustaining data; enhancing FAIR (Findable, Accessible, Interoperable, and Reusable) data principles and harmonization; and strengthening the NIH data repository ecosystem.

Expect new tools for preparing and annotating data, improved metadata quality standards, and a data steward program to guide sharing practices. NIH will also work with Tribal communities to develop appropriate data governance frameworks that respect Indigenous data sovereignty through CARE (Collective benefit, Authority to control, Responsibility, and Ethics) principles. For researchers working with sensitive data, streamlined processes for controlled data access are under development.

Goal 2: Enhance Human-Derived Data for Research

Clinical and real-world data offer incredible opportunities but are notoriously tricky to work with. This goal tackles improving access to clinical data sources, adopting health IT standards like Fast Healthcare Interoperability Resources (FHIR®) and the Trusted Exchange Framework and Common Agreement (TEFCA™), enhancing environmental and lifestyle data integration (the "exposome"), and providing cross-disciplinary training.

You will see new suggested methods for collecting informed consent when combining data from multiple sources, home health care device data standards, and federated frameworks allowing sensitive data use in clinical research. The plan specifically mentions developing governance frameworks for data linkages and real-world pilots integrating environmental factors with clinical common data elements (CDEs)—particularly valuable for understanding certain determinants of health.

Goal 3: Advance Software, Computational Methods, and Artificial Intelligence

Biomedical research generates massive amounts of data, and NIH wants to ensure you have cutting-edge tools to analyze it all. This goal balances investments across software development, computational methods, and AI applications. You'll see enhanced support for community-developed software tools with better visualization capabilities and established sustainability metrics following FAIR principles. You'll also see callouts to programs like NCI's Information Technology for Cancer Research (ITCR), which have funded support for tools across their entire lifecycle, helping ensure that the software you rely on doesn't disappear when grant funding ends!

Beyond traditional analysis approaches, the plan explores exciting computational frontiers like digital twin modeling, privacy-preserving computing, and integrating theory-based modeling with data-driven insights. For those interested in AI applications, the AIM-AHEAD program will continue building networks to democratize computational capabilities across institutions nationwide. Whether you're a computational expert or just beginning to incorporate advanced analytical methods into your research, NIH is working to provide accessible and sustainable tools that meet the growing complexity of biomedical data challenges.

Goal 4: Support a Federated Biomedical Research Data Infrastructure

Are you tired of data silos? NIH is working toward a federated data ecosystem where researchers can more easily connect disparate datasets across platforms like NHLBI's BioData Catalyst®, NCI's Cancer Research Data Commons (CRDC), the All of Us Research Program, and the NIH database of Genotypes and Phenotypes (dbGaP) through the NIH Cloud Platform Interoperability (NCPI) program. This approach maintains institutional control of data while standardizing access processes and interfaces.

The implementation will focus on creating a robust connected data resource ecosystem with improved interoperability, developing new search and discovery capabilities through enhanced metadata standards, and exploring new computing paradigms. The Researcher Auth Service (RAS) initiative will expand single sign-on capabilities across NIH data resources, streamlining your access to data while maintaining privacy and security standards.

Goal 5: Strengthen the Data Science Community

Data science skills are increasingly essential in all areas of biomedical research. This goal addresses expanding data science expertise at every level—from pre-college students to established investigators. It includes increasing training opportunities, expanding the data science workforce, enhancing collaboration within NIH's Intramural Research Program, and building capacity for every researcher who works with or for NIH.

Look for expanded cross-disciplinary training programs, new mentorship initiatives, and greater integration of data science into existing research training. The successful DATA Scholars program will continue growing NIH's internal data science capacity, while partnerships with programs like the Native American Research Centers for Health (NARCH) will help democratize data science expertise across institutions nationwide.

In conclusion…

This strategic plan builds on significant progress made since the first Data Science Strategic Plan, with a renewed focus on partnership, capacity-building, and responsible innovation. As the research landscape evolves with unprecedented speed, NIH is working to ensure these powerful data tools and technologies benefit all Americans through more comprehensive scientific discoveries.

Want to learn more? Check out the full plan at the NIH Office of Data Science Strategy website!

This page last reviewed on June 5, 2025