
Biomedical Data Repositories and Knowledgebases
About Biomedical Data Repositories and Knowledgebases
To better support a modern data resource ecosystem, NIH makes a distinction between data repositories and knowledgebases. While both are important for advancing biomedical research, data repositories and knowledgebases can have unique functions, metrics for success, and sustainability needs.
Sustaining a healthy and productive data resource ecosystem means that each component:
- Delivers scientific impact to the communities that it serves
- Employs and promotes good data management practices and provides efficient operation for quality and services
- Engages with the user community and continuously addresses their needs
- Supports a process for data life-cycle analysis
- Engages in exploration of the current landscape of biomedical data repository metrics to assist NIH in better understanding how datasets and repositories are used
- Provides long-term preservation and trustworthy governance
Both data repositories and knowledgebases contribute to the NIH data resource ecosystem
Data Repositories
- Biomedical data repositories accept the submission of relevant data from the research community to store, organize, validate, archive, preserve, and distribute data in compliance with the FAIR Data Principles.
- Curation focuses on quality assurance and quality control.
- Example: core data might include genome, transcriptome, and protein sequences or imaging or spectroscopic data
Knowledgebases
- Biomedical knowledgebases extract, accumulate, organize, annotate, and link the growing body of information that is related to, and relies on, core datasets.
- Significant levels of human curation are traditionally required.
- Example: information about expression patterns, splicing variants, localization, protein-protein interaction, and pathway networks related to an organism or set of organisms; publication information
View Trans-NIH BioMedical Informatics Coordinating Committee (BMIC) Data Sharing Resources.
Metrics and Lifecycle
Data repositories and knowledgebases exist on a spectrum of ability and readiness to adopt the desirable characteristics aligned with the FAIR and TRUST principles. Because research data resources, repositories, and datasets are critical, metrics that evaluate the usage, utility, and impact of a given repository are essential. To that end, NIH conducted a survey and organized a workshop to better understand both existing and desired lifecycle metrics. NIH then issued a report presenting the findings on the metrics currently used within the biomedical repository community. These findings can inform future NIH efforts to help develop this space and to understand patterns of use across datasets and repositories.
Funding Opportunities
- (Open) Biomedical Data Repository (PAR-20-089) January 17, 2020
- (Open) Biomedical Knowledgebase (PAR-20-097) January 17, 2020
- FAQs for PAR-20-089 and PAR-20-097
- (Closed) Support for existing data repositories to align with FAIR and TRUST principles and evaluate usage, utility, and impact (NOT-OD-23-044) FAQs January 5, 2023
- (Closed) Support for existing data repositories to align with FAIR and TRUST principles and evaluate usage, utility, and impact (NOT-OD-21-089) April 6, 2021
- (Closed) Support for existing data repositories to align with FAIR and TRUST principles and evaluate usage, utility, and impact (NOT-OD-22-069) January 31, 2022
Funded Awards
Grant Number | Award IC | Principal Investigator | Project Title | Repository Weblink |
---|---|---|---|---|
1 R24 GM144308-01 | NIGMS | Anita Elzbieta Bandrowski | From RRID to Resource Watch: A Knowledgebase of Biomedical Research Resources | https://scicrunch.org/resources |
2 U24 HG007822-08 | NIA, NIAID, NCI, DK, NIGMS, NHGRI, NHLBI, Office of the Director | Alex Bateman | UniProt: A Protein Sequence and Function Resource for Biomedical Science | https://www.uniprot.org/ |
1 U24 NS122732-01 | NINDS | Adam R. Ferguson | | https://odc-tbi.org/ |
1 R24 GM144232-01 | NIGMS | Michael K. Gilson | BindingDB: An Open Knowledgebase of Protein-Small Molecule Interactions | |
1 U24 GM142435-01 | NIGMS | Marc S. Halfon | REDfly: The regulatory sequence resource for Drosophila and other insects | |
1 U24 HG012556-01 | NHGRI, NHLBI, NIMHD, NINDS, Office of the Director | Carol Marie Hamilton | Establishing the PhenX Toolkit as a Biomedical Knowledgebase | |
1 U24 AI171008-01 | NIAID | Yongqun He | VIOLIN 2.0: Vaccine Information and Ontology LInked kNowledgebase | |
1 U24 AI162625-01 | NIAID | Elliot J. Lefkowitz | Virus Taxonomy: A Community Knowledgebase Supporting Virus Research | |
1 U24 ES033155-01 | NIEHS, NINDS | Carolyn J. Mattingly | | |
1 U24 GM143402-01 | NIGMS | Mark A. Musen | BioPortal: An Expansive Knowledgebase of Biomedical Entities and Relations | |
1 U24 HG012542-01 | NCI, NHGRI | Helen Elizabeth Parkinson | | |
1 U24 HG012557-01 | NHGRI | Lynn Marie Schriml | The Human Disease Ontology: An integrated, mechanistic knowledge resource for biomedical research. | |
1 U24 HG012198-01 | NHGRI | Lincoln D. Stein | | |
1 U24 HG012212-01 | NIGMS, NHGRI | Paul D. Thomas | | |
1 R24 GM146616-01 | NIGMS | Michael Tiemeyer | GlyGen growth and evolution into a central resource for glycans and glycoconjugates | |
1 U24 ES035214-01 | NIEHS, Office of the Director | Alexander Tropsha | Supporting Biomedical Discovery with the ROBOKOP Graph Knowledgebase. | |
1 U24 CA265879-01 | NCI | Jeremy Lyle Warner | Enhancing the HemOnc Knowledgebase of Chemotherapy Drugs and Regimens | |
1 U24 AA029959-01 | NIAAA | Samuel S. Wu | Southern HIV and Alcohol Research Consortium Biomedical Data Repository | Website for Consortium: |

Principal Investigator | Institution | Project Title | Repository Weblink |
---|---|---|---|
Alex Bateman | European Molecular Biology Laboratory | UniProt building community metrics for FAIR and TRUSTworthy resources | |
Keyvan Farahani | National Cancer Institute | UDash - a Usage Dashboard for the Imaging Data Commons | |
Kerry Goetz | National Eye Institute | | |
Melissa Haendel | University Of Colorado Denver | The Monarch Initiative: Linking diseases to model organism resources | |
Christian Haselgrove | Univ Of Massachusetts Med Sch Worcester | Neuroimaging Informatics Tools and Resources Collaboratory: Outreach, Infrastructure and Maintenance | |
Ian Korf | University Of California At Davis | Informatics, Coordination and Service Center for the Mutant Mouse Resource and Research Centers | |
Mathew McAuliffe | Department of Defense (DoD) and National Institute of Neurological Disorders and Stroke (NINDS) | Modernizing the Federal Interagency Traumatic Brain Injury Research (FITBIR) repository | |
Susan Teitelbaum | Icahn School Of Medicine At Mount Sinai | | |
Antonella Zanobetti | Harvard School of Public Health | National Cohort Studies of Alzheimers Disease, Related Dementias and Air Pollution | |

Principal Investigator | Institution | Project Title | Repository Weblink |
---|---|---|---|
Eric Ravussin, Ph.D. | Louisiana State University | Improving FAIR-ness and TRUST-worthiness of the Pennington/Louisiana NORC Biorepository | |
Chris Rorden, Ph.D. | University of South Carolina | | |
Molly A. Bogue, Ph.D. | The Jackson Laboratory, Bar Harbor, Maine, USA | Mouse Phenome Database: Making it More FAIR-compliant and TRUST-worthy | |
Nadine Martin, Ph.D., CCC-SLP | Temple University | | |
Vikash Gilja, Ph.D. | University of California, San Diego | Data Repository for “CRCNS: Avian Model for Neural Activity Driven Speech Prostheses” | |
Joshua Orvis, M.S. | Institute for Genome Sciences, University of Maryland School of Medicine | | |
Brian MacWhinney, Ph.D. | Carnegie Mellon University | Improve the Compliance of the CHILDES Project Database with the FAIR and TRUST Principles | |
Carl Kesselman, Ph.D., M.Eng. | University of Southern California | FaceBase Data Hub: Enhancing TRUST-worthiness of the FaceBase Research Data Hub | |
Linda Brzustowicz, M.D., FAPA | Rutgers University | Enhancing Alignment of the NRGR with FAIR and TRUST Principles | |
Dalane Kitzman, M.D. | Wake Forest University School of Medicine | Enhancing an Integrated Data Bank for Aging Studies | |
Carol Bult, Ph.D. | The Alliance of Genome Resources and the Mouse Genome Database | Aligning the Alliance of Genome Resources with FAIR and TRUST principles | |
Jeffrey Grethe, Ph.D. | University of California, San Francisco | Open Data Commons for Traumatic Brain Injury (ODC-TBI) | |
Darrell Hurt, Ph.D. | National Institute of Allergy and Infectious Diseases, National Institutes of Health | Increased Interconnectivity for Database of Antimicrobial Activity and Structure of Peptides (DBAASP) | |
Meghan McCarthy, Ph.D. | National Institute of Allergy and Infectious Diseases, National Institutes of Health | Making 3D Data “FAIR” with NIH 3D | |
Quan Chen, Ph.D. | National Institute of Allergy and Infectious Diseases, National Institutes of Health | Project: ‘FHIR’ing up ImmPort: Improving Interoperability of ImmPort Data | |
Rebecca Rodriguez, Ph.D. | National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health | Applying FAIR and TRUST Principles for Enhanced Resource Sharing and Sustainable and Reliable Repository Operations | |
Jennifer Fostel, Ph.D. | National Institute of Environmental Health Sciences, National Institutes of Health | Ensuring FAIR and TRUST for High-dimensional Environmental Study Data | |
This page last reviewed on May 30, 2023