
BD2K Centers

Introduction

The BD2K Centers program has established eleven Centers of Excellence for Big Data Computing and two Centers that are collaborative projects with the NIH Common Fund's Library of Integrated Network-Based Cellular Signatures (LINCS) program: the Data Coordination and Integration Center for BD2K-LINCS (BD2K-LINCS DCIC) and the Broad Institute LINCS Center for Transcriptomics (LINCScloud). The Centers are located across the United States. They are large-scale projects that aim to develop new approaches, methods, software tools, and related resources, and they provide training to advance Big Data science in the context of biomedical research. The thirteen BD2K Centers function with the other BD2K grantees as a consortium and collaborate with one another to advance biomedical data science research.

LINCS Program Collaborative Projects

Data Coordination and Integration Center for BD2K-LINCS (BD2K-LINCS DCIC)

Icahn School of Medicine at Mount Sinai
Principal Investigators: Avi Ma’ayan (Contact PI), Mario Medvedovic, Stephan C. Schurer
Grant Number: U54 HL127624

Program Officer: Albert Lee
Science Officers: Ajay Pillai (Lead SO), Pothur Srinivas

The LINCS project will produce a large amount of biological information about the responses of cells and tissues to disruption by drugs and other molecules. Investigators plan to develop novel analysis tools and methods to garner new insights from models of biological systems linking complex diseases with drugs and pathways that the drugs target in various cells and tissues.

Broad Institute LINCS Center for Transcriptomics (LINCScloud)

Broad Institute
Principal Investigators: Todd Golub (Contact PI), Aravind Subramanian
Grant Number: U54 HL127366

Program Officer: Ajay Pillai
Science Officers: Daniel Shaughnessey (Lead SO)

The LINCS project is expected to have a significant impact across the biomedical research community. It has the potential to yield new approaches to genome functional annotation, to provide a path toward the elucidation of mechanism-of-action of small-molecule compounds, and to facilitate the discovery of drugs with unanticipated therapeutic effects on disease biology. By including a significant outreach and training component, it will make researchers from many biomedical institutions conversant with applying the resources generated to these important problems in health science.

Centers of Excellence for Big Data Computing

Big Data for Discovery Science (BDDS)

The University of Southern California
Principal Investigators: Arthur W. Toga (Contact PI)
Grant Number: U54 EB020406

Program Officer: Vinay Pai
Science Officers: Stacia Friedman-Hill (Lead SO), Christina Liu, Keyvan Farahani, Patrick Bellgowan, Ashley Xia

Researchers at the Big Data for Discovery Science Center will focus on proteomics, genomics, and images of cells and brains collected from patients and subjects across the globe. They will enable detection of patterns, trends, and relationships among these data with user-focused data management, sophisticated computational methodologies, and leading-edge software tools for the efficient large-scale analysis of biomedical data. Interactive visualization tools created at this Center will stimulate fresh insights and encourage the development of modern treatments and new cures for disease.

Center for Big Data in Translational Genomics

The University of California Santa Cruz
Principal Investigators: David H. Haussler (Contact PI), David Patterson, Laura J. van 't Veer
Grant Number: U54 HG007990 

Program Officer: Valentina DiFrancesco
Science Officers: Jerry Li (Lead SO), Weiniu Gan, Dawei Lin, Jane Ye, Heidi Sofia, Lisa Brooks

The Center for Big Data in Translational Genomics is a multinational collaboration between academia and industry that will create data models and analysis tools to analyze massive datasets of genomic information. Such tools can be used for analysis of the genomes and the gene expression data from thousands of individuals to uncover the contribution of gene variants to disease with an initial focus on cancer. This knowledge will be instrumental in the development of precision diagnostic and treatment methods.

Center for Causal Modeling and Discovery of Biomedical Knowledge from Big Data (CCD)

The University of Pittsburgh at Pittsburgh
Principal Investigators: Gregory F. Cooper (Contact PI), Ivet Bahar, Jeremy M. Berg
Grant Number: U54 HG008540

Program Officer: Valentina DiFrancesco
Science Officers: Weiniu Gan (Lead SO), Ajay Pillai, Donna Krasnewich, Valerie Florance, Michelle Heacock

The Center for Causal Modeling and Discovery of Biomedical Knowledge from Big Data will develop user-friendly tools and resources that use Bayesian statistics to generate causal models from large and complex datasets. Initial tool and method development efforts will focus on three biomedical problems that involve large amounts of data: cell signals that drive cancer development, the molecular basis of lung disease susceptibility and severity, and functional connections in the human brain.

Center for Expanded Data Annotation and Retrieval (CEDAR)

Stanford University
Principal Investigators: Mark A. Musen (Contact PI)
Grant Number: U54 AI117925

Program Officer: Maria Giovanni
Science Officers: Allen Dearry (Lead SO), Valerie Florance, Quan Chen, Ashley Xia, Punam Mathur

The ability to locate, analyze, and integrate Big Data depends on the metadata that describe the content of data sets. The Center for Expanded Data Annotation and Retrieval (CEDAR) will facilitate automated annotation of data with high quality metadata by generating community-based metadata standards and a metadata repository for training learning algorithms to develop metadata templates. These templates will initially be evaluated, validated, and adapted with the NIAID ImmPort multi-assay data repository and other data repositories.

Center for Mobility Data Integration to Insight (The Mobilize Center)

Stanford University
Principal Investigators: Scott L. Delp (Contact PI)
Grant Number: U54 EB020405

Program Officer: Grace Peng
Science Officers: Theresa Cruz (Lead SO), Daofen Chen

The Mobilize Center is poised to provide access to mobility data for over ten million people. The Center will develop and disseminate a range of novel data science tools, including new modeling and analysis methods to predict and improve the outcomes of surgeries in children with cerebral palsy and gait pathology; novel approaches to optimize mobility in individuals with osteoarthritis, running injuries, and other movement impairments; and new methods that motivate overweight and obese individuals to exercise more and in ways that promote joint health.

Center for Predictive Computational Phenotyping (CPCP)

The University of Wisconsin – Madison
Principal Investigators: Mark W. Craven (Contact PI)
Grant Number: U54 AI117924

Program Officer: Maria Giovanni
Science Officers: Gina Wei (Lead SO), Valerie Florance, Becky Boyles, Quan Chen, Ashley Xia, Punam Mathur

The Center for Predictive Computational Phenotyping aims to accelerate the impact of predictive modeling on clinical practice. The Center will focus on issues related to computational phenotyping and will produce disease prediction models from machine learning and statistical methods. These models will integrate data from electronic health records, images, molecular profiles, and other datasets to predict patient risks for breast cancer, heart attacks, and severe blood clots.

Center of Excellence for Mobile Sensor Data-to-Knowledge (MD2K)

The University of Memphis
Principal Investigators: Santosh Kumar (Contact PI)
Grant Number: U54 EB020404

Program Officer: Richard Conroy
Science Officers: Wendy Nilsen (Lead SO), Bill Riley, Kate Stoney, Mary Rodgers, Elaine Collier

Researchers at the Center of Excellence for Mobile Sensor Data-to-Knowledge will develop innovative tools to make it easier to gather, analyze, and interpret data from mobile sensors. These tools will reduce the burden of complex chronic disorders on health and healthcare by enabling detection and prediction of person-specific disease risk factors ahead of the onset of adverse clinical events. The Center will study two specific problems as test cases: 1) reducing hospital readmissions for patients with congestive heart failure, and 2) preventing relapse in those who have quit smoking.

ENIGMA Center for Worldwide Medicine, Imaging, and Genomics

The University of Southern California
Principal Investigators: Paul M. Thompson (Contact PI)
Grant Number: U54 EB020403

Program Officer: Vinay Pai
Science Officers: Patrick Bellgowan (Lead SO), Yantian Zhang, Harold Gordon, Ed Ramos

The ENIGMA Center for Worldwide Medicine, Imaging, and Genomics will incorporate the scientific acumen of more than 300 scientists worldwide, and their biomedical datasets, in a global effort to combat human brain diseases. This Center will develop computational methods for integration, clustering, and learning from complex biodata types. This Center’s projects will help identify factors that resist or promote brain disease or that assist in diagnosis and prognosis, and will help identify new mechanisms and drug targets for mental health care.

Heart BD2K, a Community Effort to Translate Protein Data to Knowledge: An Integrated Platform

The University of California Los Angeles
Principal Investigators: Peipei Ping (Contact PI), Merry Lindsey, Andrew Su, Karol Watson
Grant Number: U54 GM114833

Program Officer: Susan Gregurick
Science Officers: Pothur Srinivas (Lead SO), Weiniu Gan, Sal Sechi

The UCLA Center of Excellence for Big Data will undertake the project Heart BD2K, a Community Effort to Translate Protein Data to Knowledge: An Integrated Platform. The project seeks to fundamentally alter biomedical research culture so that technological modeling innovations, such as crowdsourcing, can be fully applied to biomedical Big Data analysis. The goal of this Center is to democratize data research to include non-computational scientists and individuals, and to apply innovative, global, community-driven data integration and modeling methods to challenges in the study of protein structure, function, and networks, with a focus on cardiovascular research.

KnowEng, a Scalable Knowledge Engine for Large-Scale Genomic Data

University of Illinois Urbana-Champaign
Principal Investigators: Jiawei Han (Contact PI), Saurabh Sinha, Jun S. Song, Richard M. Weinshilboum
Grant Number: U54 GM114838

Program Officer: Susan Gregurick
Science Officers: Valentina DiFrancesco (Lead SO), Nicole Moore, Heidi Sofia, Dawei Lin

The KnowEng Center will build a computational Knowledge Engine that uses data mining and machine learning techniques to obtain and combine gene function and gene interaction information from disparate genomic data sources. This integrated genomic environment will enable scientists and medical practitioners to add their own datasets to the engine and explore models generated from the incorporation of their data within the existing knowledge-base.

Patient-Centered Information Commons: Standardized Unification of Research Elements (PIC-SURE)

Harvard University Medical School
Principal Investigators: Isaac S. Kohane (Contact PI)
Grant Number: U54 HG007963

Program Officer: Valentina DiFrancesco
Science Officers: Valerie Florance (Lead SO), Gina Wei, Valentina DiFrancesco, Carolyn Williams, Lyn Hardy, Elaine Collier

Investigators at the Patient-Centered Information Commons Center will develop systems to combine genetic, environmental, imaging, behavioral, and clinical data on individual patients from multiple sources into integrated sets. Computing across thousands of such individuals will enable more accurate classification of individual disease or disease risk and facilitate greater precision in patient disease prevention and treatment strategies.
