Biomedical Data Repositories and Knowledgebases

About Biomedical Data Repositories and Knowledgebases

To better support a modern data resource ecosystem, NIH makes a distinction between data repositories and knowledgebases. While both are important for advancing biomedical research, data repositories and knowledgebases can have unique functions, metrics for success, and sustainability needs.

Sustaining a healthy and productive data resource ecosystem means that each component:

  • Delivers scientific impact to the communities that it serves
  • Employs and promotes good data management practices and provides efficient operation for quality and services
  • Engages with the user community and continuously addresses their needs
  • Supports a process for data life-cycle analysis
  • Engages in exploration of the current landscape of biomedical data repository metrics to help NIH better understand how datasets and repositories are used
  • Provides long-term preservation and trustworthy governance

Both data repositories and knowledgebases contribute to the NIH data resource ecosystem.

Data Repositories

  • Biomedical data repositories accept the submission of relevant data from the research community to store, organize, validate, archive, preserve, and distribute data in compliance with the FAIR Data Principles.
  • Curation focuses on quality assurance and quality control.
  • Example: core data might include genome, transcriptome, and protein sequences, or imaging or spectroscopic data

Knowledgebases

  • Biomedical knowledgebases extract, accumulate, organize, annotate, and link the growing body of information that is related to, and relies on, core datasets.
  • Significant levels of human curation are traditionally required.
  • Example: information about expression patterns, splicing variants, localization, protein-protein interaction, and pathway networks related to an organism or set of organisms; publication information

View Trans-NIH BioMedical Informatics Coordinating Committee (BMIC) Data Sharing Resources.

Metrics and Lifecycle

Data repositories and knowledgebases exist on a spectrum of ability and readiness to adopt the desirable characteristics aligned with FAIR and TRUST principles. Given the critical nature of research data resources, repositories, and datasets, metrics to evaluate the usage, utility, and impact of a given repository are essential. To that end, NIH conducted a survey and organized a workshop to better understand both existing and desired lifecycle metrics. NIH then issued a report presenting the findings, which describe the metrics currently used within the biomedical repository community and can inform future NIH efforts to develop this space and to understand patterns of use across datasets and repositories.

Open Funding Opportunities

  • (Open) Enhancement and Management of Established Biomedical Data Repositories and Knowledgebases (PAR-23-237), August 31, 2023
  • (Open) Early-stage Biomedical Data Repositories and Knowledgebases (PAR-23-236), August 31, 2023
  • FAQs for PAR-23-237 and PAR-23-236

Closed Funding Opportunities

  • (Closed) Support for existing data repositories to align with FAIR and TRUST principles and evaluate usage, utility, and impact (NOT-OD-23-044) FAQs, January 5, 2023
  • (Closed) Support for existing data repositories to align with FAIR and TRUST principles and evaluate usage, utility, and impact (NOT-OD-22-069), January 31, 2022
  • (Closed) Administrative Supplements Available to Strengthen NIH-Funded Biomedical Data Repositories (NOT-OD-21-089), April 6, 2021
  • (Closed) NIH released two funding opportunities to support biomedical data repositories and knowledgebases, January 17, 2020

Funded Awards

Award Recipients
Grant Number | Award IC | Principal Investigator | Project Title | Repository Weblink

Grant Number

1 R24 GM144308-01

Award IC

NIGMS

Principal Investigator

Anita Elzbieta Bandrowski

Project Title

From RRID to Resource Watch: A Knowledgebase of Biomedical Research Resources

Repository Weblink

https://scicrunch.org/resources (link is external)

Grant Number

2 U24 HG007822-08

Award IC

NIA, NIAID, NCI, NIDDK, NIGMS, NHGRI, NHLBI, Office of the Director

Principal Investigator

Alex Bateman

Project Title

UniProt: A Protein Sequence and Function Resource for Biomedical Science

Repository Weblink

https://www.uniprot.org/ (link is external)

Grant Number

1 U24 NS122732-01

Award IC

NINDS

Principal Investigator

Adam R. Ferguson

Project Title

Pan-Neurotrauma Data Commons

Repository Weblink

https://odc-sci.org/ (link is external)

https://odc-tbi.org/ (link is external)

Grant Number

1 R24 GM144232-01

Award IC

NIGMS

Principal Investigator

Michael K. Gilson

Project Title

BindingDB: An Open Knowledgebase of Protein-Small Molecule Interactions

Repository Weblink

https://www.bindingdb.org/rwd/bind/index.jsp (link is external)

Grant Number

1 U24 GM142435-01

Award IC

NIGMS

Principal Investigator

Marc S. Halfon

Project Title

REDfly: The regulatory sequence resource for Drosophila and other insects

Repository Weblink

http://redfly.ccr.buffalo.edu/ (link is external)

Grant Number

1 U24 HG012556-01

Award IC

NHGRI, NHLBI, NIMHD, NINDS, Office of the Director

Principal Investigator

Carol Marie Hamilton

Project Title

Establishing the PhenX Toolkit as a Biomedical Knowledgebase

Repository Weblink

https://www.phenxtoolkit.org/index.php (link is external)

Grant Number

1 U24 AI171008-01

Award IC

NIAID

Principal Investigator

Yongqun He 

Project Title

VIOLIN 2.0: Vaccine Information and Ontology LInked kNowledgebase

Repository Weblink

https://violinet.org/ (link is external)

Grant Number

1 U24 AI162625-01

Award IC

NIAID

Principal Investigator

Elliot J. Lefkowitz

Project Title

Virus Taxonomy: A Community Knowledgebase Supporting Virus Research

Repository Weblink

https://scholars.uab.edu/display/grant-000529746 (link is external)

Grant Number

1 U24 ES033155-01

Award IC

NIEHS, NINDS

Principal Investigator

Carolyn J. Mattingly

Project Title

Comparative Toxicogenomics Database (CTD)

Repository Weblink

http://ctdbase.org/ (link is external)

Grant Number

1 U24 GM143402-01

Award IC

NIGMS

Principal Investigator

Mark A. Musen

Project Title

BioPortal: An Expansive Knowledgebase of Biomedical Entities and Relations

Repository Weblink

https://bioportal.bioontology.org/ (link is external)

Grant Number

1 U24 HG012542-01

Award IC

NCI, NHGRI

Principal Investigator

Helen Elizabeth Parkinson

Project Title

Strengthening community knowledge bases for genetic association studies and polygenic scores, the GWAS and PGS Catalogs

Repository Weblink

https://www.pgscatalog.org/ (link is external)

https://www.ebi.ac.uk/gwas/ (link is external)

Grant Number

1 U24 HG012557-01

Award IC

NHGRI

Principal Investigator

Lynn Marie Schriml

Project Title

The Human Disease Ontology: An integrated, mechanistic knowledge resource for biomedical research.

Repository Weblink

https://disease-ontology.org/ (link is external)

Grant Number

1 U24 HG012198-01

Award IC

NHGRI

Principal Investigator

Lincoln D. Stein

Project Title

Reactome: An Open Knowledgebase of Human Pathways.

Repository Weblink

https://reactome.org/ (link is external)

Grant Number

1 U24 HG012212-01

Award IC

NIGMS, NHGRI

Principal Investigator

Paul D. Thomas

Project Title

Gene Ontology Consortium and Knowledgebase

Repository Weblink

http://geneontology.org/ (link is external)

Grant Number

1 R24 GM146616-01

Award IC

NIGMS

Principal Investigator

Michael Tiemeyer 

Project Title

GlyGen growth and evolution into a central resource for glycans and glycoconjugates

Repository Weblink

https://www.glygen.org/ (link is external)

Grant Number

1 U24 ES035214-01

Award IC

NIEHS, Office of the Director

Principal Investigator

Alexander Tropsha 

Project Title

Supporting Biomedical Discovery with the ROBOKOP Graph Knowledgebase.

Repository Weblink

https://robokop.renci.org/ (link is external)

Grant Number

1 U24 CA265879-01

Award IC

NCI

Principal Investigator

Jeremy Lyle Warner

Project Title

Enhancing the HemOnc Knowledgebase of Chemotherapy Drugs and Regimens

Repository Weblink

https://hemonc.org/wiki/Main_Page (link is external)

Grant Number

1 U24 AA029959-01

Award IC

NIAAA

Principal Investigator

Samuel S. Wu

Project Title

Southern HIV and Alcohol Research Consortium Biomedical Data Repository

Repository Weblink

Website for Consortium:

https://sharc-research.org/ (link is external)

NOT-OD-22-069 Award Recipients
Principal Investigator | Institution | Project Title | Repository Weblink

Principal Investigator

Alex Bateman

Institution

European Molecular Biology Laboratory

Project Title

UniProt building community metrics for FAIR and TRUSTworthy resources

Repository Weblink

https://www.uniprot.org/ (link is external)

Principal Investigator

Keyvan Farahani

Institution

National Cancer Institute

Project Title

UDash - a Usage Dashboard for the Imaging Data Commons

Repository Weblink

https://portal.imaging.datacommons.cancer.gov/

Principal Investigator

Kerry Goetz

Institution

National Eye Institute

Project Title

NEI BRICS - Harnessing the Power of Data in Vision Research

Repository Weblink

https://neidatacommons.nei.nih.gov/

Principal Investigator

Melissa Haendel

Institution

University Of Colorado Denver

Project Title

The Monarch Initiative: Linking diseases to model organism resources

Repository Weblink

https://monarchinitiative.org/ (link is external)

Principal Investigator

Christian Haselgrove

Institution

Univ Of Massachusetts Med Sch Worcester

Project Title

Neuroimaging Informatics Tools and Resources Collaboratory: Outreach, Infrastructure and Maintenance

Repository Weblink

https://www.nitrc.org/ (link is external)

Principal Investigator

Ian Korf

Institution

University Of California At Davis

Project Title

Informatics, Coordination and Service Center for the Mutant Mouse Resource and Research Centers

Repository Weblink

https://www.mmrrc.org/ (link is external)

Principal Investigator

Mathew McAuliffe

Institution

Department of Defense (DoD) and National Institute of Neurological Disorders and Stroke (NINDS)

Project Title

Modernizing the Federal Interagency Traumatic Brain Injury Research (FITBIR) repository

Repository Weblink

https://fitbir.nih.gov/

Principal Investigator

Susan Teitelbaum

Institution

Icahn School Of Medicine At Mount Sinai

Project Title

Human Health Exposure Analysis Resource (HHEAR) Data Center

Repository Weblink

https://hheardatacenter.mssm.edu/ (link is external)

Principal Investigator

Antonella Zanobetti

Institution

Harvard School of Public Health

Project Title

National Cohort Studies of Alzheimer's Disease, Related Dementias and Air Pollution

Repository Weblink

This page last reviewed on November 28, 2023