

BD2K News



Big Data to Knowledge Multi-Council Working Group - January 2017

January 9, 2017

Notice is hereby given of a meeting of the Big Data to Knowledge (BD2K) Multi-Council Working Group.

Name of Working Group:  Big Data to Knowledge Multi-Council Working Group

Date:  January 9, 2017 - Canceled

Place:  Teleconference
This portion of the meeting is open to the public and is being held by teleconference.  This is a listen ONLY meeting.  Please submit any questions or comments via email to the contact person listed below.

Join WebEx Meeting
Meeting number: 627 298 875
Meeting password: 1234
Dial-in: 1-877-668-4493
Open Session:  11:00am - 12:00pm ET

Discussion will review current Big Data to Knowledge (BD2K) activities and newly proposed BD2K initiatives.

  • Roll Call and Introduction
  • Update from the Associate Director for Data Science
  • BD2K All Hands Meeting and Open Data Science Symposium Recap

Closed Session:  12:30pm - 3:00pm ET

Agenda:  Discussion will focus on review of proposed FY17 Funding Plans for BD2K Funding Opportunity Announcements and Administrative Supplements.

Event Contact: 
Individuals who plan to attend and need special assistance, such as sign language interpretation or other reasonable accommodations, should notify Tonya Scott by phone at 301-402-9817.

Federal Register Meeting Announcement:
National Institutes of Health, Office of the Director - Notice of Meeting



Public Voting Determines Three Finalists for the Open Science Prize

January 9, 2017

Public voting for the Open Science Prize is now closed. Thank you to everyone who voted. The three prototypes that scored highest, and will therefore go forward to the next stage of review, are:

MyGene2: Accelerating Gene Discovery with Radically Open Data Sharing


Real-Time Evolutionary Tracking for Pathogen Surveillance and Epidemiological Investigation

We will now collect expert reviews of these three prototypes. We anticipate announcing the Grand Prize winner in early March 2017.

For additional information, contact:



Need Cloud for Your Research? Calling All NIH Extramural Investigators

December 9, 2016

The NIH Big Data to Knowledge (BD2K) initiative has partnered with the CMS Alliance to Modernize Healthcare (CAMH), operated by MITRE, to launch and test a new funding paradigm that will provide NIH extramural researchers with access to cloud computing and storage capabilities. This funding model, called the Commons Credits Pilot, will provide extramural biomedical investigators with active NIH grants access to cloud-based environments to network, securely store, and share their work in the form of digital objects.

The first cycle for applications is open now through January 16, 2017. 

Successful pilot applicants will receive dollar-denominated “credits” to obtain cloud-based computing and storage resources through an online market environment. Currently, the Commons Credits Pilot environment offers a variety of conformant cloud providers, including IBM, Seven Bridges, and resellers of Google and Amazon. This list will grow as more vendors become available. Investigators will have the flexibility to select their preferred cloud provider from the list and provide feedback to NIH on their experiences. The Commons Credits Pilot is not a grants program; it has shorter application requirements and review times, ensuring that the credits are dispensed rapidly to keep pace with novel research.

An active NIH extramural grant is required for participation in the Commons Credits Pilot.  Successful applications will likely complement the current grant(s) to enable novel research that may not have been accomplished or funded through other outlets.  NIH expects that requests will not typically exceed $50,000 in dollar-denominated credits.

To date, the NIH Commons Credits Pilot has been shared with researchers at various research institutes and conferences, including the BD2K All-Hands Meeting held November 29-30, 2016. NIH encourages active NIH grant holders to take advantage of this new funding mechanism and we hope that you’ll also share this opportunity with your respective institutes.

Interested researchers should register and apply now at:

The Commons Credits Pilot team has created a short instructional video describing the application process within the portal to facilitate participation.

To stay connected on the latest news regarding the NIH Commons Credits Pilot:

Please share this announcement with your extramural research communities. For additional information, email the Commons Credits Pilot Team at:



Public Voting for the Open Science Prize is LIVE!

December 1, 2016


Help shape new directions in biomedical research by VOTING HERE.

Voting will be open December 1, 2016 through January 6, 2017 at 11:59pm PST.

In the spirit of Open Science, we invite you to help decide which of the prototypes competing for the Open Science Prize will be considered for the final grand prize. You will be asked to review 6 prototypes developed by the finalist teams and cast your vote for the most novel and impactful ones. The 3 prototypes receiving the highest number of public votes will advance to a final round of review by a panel of science experts and judges. A single, grand prize winner of $230,000 will be announced in March 2017.

In this competition, the teams were challenged to use open, publicly accessible data to improve human health. Each team produced prototypes that demonstrate how the power of Open Data can be harnessed to address a wide array of human health concerns through crowdsourcing or the development of innovative platforms on which to conduct computational modeling. Each team includes at least one U.S. and one international member with the goal of forging new collaborations with health and technology innovators from across the world, benefiting the global research community and the public in the process.


We invite you to watch the video demonstrations and test drive the prototypes before voting. An archive of the NIH Open Data Science Symposium webcast is also available if you would like to watch the onstage prototype demonstrations or any other presentations from the Big Data to Knowledge (BD2K) All Hands Meeting (November 29-30) or Open Data Science Symposium (December 1).

The winning prototype will be selected by the National Institutes of Health and the Wellcome Trust and publicly announced in March 2017. For additional information, email:

The Open Science Prize is a collaboration between the National Institutes of Health (Bethesda, MD, USA) and the Wellcome Trust (London, UK), with additional funding provided by the Howard Hughes Medical Institute (Chevy Chase, MD, USA). This opportunity is being funded in part by the NIH Big Data to Knowledge (BD2K) Initiative.

We appreciate your help with getting the word out to your stakeholder communities about this worldwide public voting opportunity. Thank you for voting and helping to support the Open Science Prize.




bioCADDIE DataMed Version 1.5 Now Live

November 23, 2016

DataMed Beta Version 1.5

The bioCADDIE development team announces the release of DataMed Version 1.5, a Data Discovery Index (DDI) prototype with enhancements and important code corrections!

Thanks to user feedback, the DDI prototype has many new usability enhancements and code corrections.

New features introduced:

  • Increased coverage to twice the number of biomedical data repositories
  • Total number of datasets doubled
  • Repositories mapped to DATS 2.1 metadata model
  • Sorting on publication date of the dataset
  • Visualization of results via timeline
  • Usability enhancements based on user feedback and user interviews

User-reported issues resolved:

  • Search capabilities expanded to include search by dataset IDs, PMIDs
  • Compatibility with Google Chrome fixed
  • Generate collections from search results
  • Ability to view results in different formats
  • Links to related datasets
  • And many more features

DataMed is a work in progress and the bioCADDIE development team welcomes your feedback HERE.

Get involved in the bioCADDIE project and DataMed user studies!

For more details, contact:



Exponential Medicine 4-Day Program in San Diego

October 8, 2016

Exponential Medicine (October 8-11, 2016) is an intensive, cross-disciplinary 4-day program that brings together world-class faculty, innovators, and organizations from across the biomedical and technology spectrum (from mobile health and 3D printing to A.I., robotics, synthetic biology, and beyond) to explore and leverage the convergence of fast-moving technologies in the reinvention and future of health and medicine. The program will focus on how computing, through robotics, big data, and artificial intelligence, will cause disruptive change in medicine.

For more information, visit

This program is sponsored by Singularity University. In computing, singularity “is a hypothetical event in which artificial general intelligence would be capable of recursive self-improvement and is the point beyond which events may become unpredictable or even unfathomable to human intelligence.”



Extracting Insights from Healthcare Data with Deep Learning

September 22, 2016

The Office of the Associate Director for Data Science (ADDS) announces a training opportunity in Deep Learning. This day-long, in-depth workshop is the second session of a two-part series. Part 1 (September 8) is an hour-long overview of deep learning followed by a half hour for questions. Part 2 (September 22) is a day-long, in-depth workshop. Attending or watching the overview first is highly recommended. Registration is required for the September 22 workshop, but not for the September 8 lecture.

Title: Extracting Insights from Healthcare Data with Deep Learning

Date: September 22, 9:00am - 5:00pm ET

Location: Building 31, 6C Room 10, NIH Bethesda campus

This event is a hands-on workshop and will not be videocast.

Open to NIH only; registration required.

Register for this course:

Information on all upcoming courses: 

Abstract: Recent years have seen a dramatic increase in the amount of healthcare-related data being collected. Now we need powerful analysis tools to extract insights and understanding from these mountains of data. A new approach called Deep Learning, based on neural networks inspired by the brain, is proving to be very effective in a wide range of research, diagnostic, and clinical applications. Join us to learn how advanced deep learning techniques are being applied to these rich data sets to help solve problems such as diagnosing diabetic retinopathy, calculating ventricular ejection fraction, and predicting survival in a pediatric ICU. We will cover a variety of frameworks, tools, and languages including: DIGITS, Caffe, MXNet, Keras, MATLAB, R, Python, and more. The hands-on exercises will help you get started with applying deep learning to your own work.

For additional information, contact Sonynka Ngosso, 301-402-9816.



Computational Biology: Past, Present, and Future PLOS Symposium

September 16, 2016

Time: 9:30am - 4:00pm ET

Place: Porter Neuroscience Research Center, Bldg 35A, Rm 620, on the NIH Bethesda campus.

For those unable to attend, the event will be webcast here:

Register here

Agenda: The Symposium will be chaired by PLOS Editors-in-Chief Ruth Nussinov and Jason Papin and the Journal’s Founding Editor-in-Chief, Dr. Phil Bourne, Associate Director for Data Science, NIH. Keynote addresses will include: David J. Lipman, NCBI; Jennifer Lippincott-Schwartz, Eunice Kennedy Shriver National Institute of Child Health and Human Development; and Bert Vogelstein, Johns Hopkins School of Medicine. The agenda also includes two discussion panels featuring PLOS Computational Biology editors from a range of fields. The morning panel will discuss the “Biggest Challenges and Greatest Opportunities in Computational Biology over the Next 10 Years”. The afternoon panel will discuss “How Computational Biology Will Affect Human Health”. Closing remarks will be given by Dr. Michael Gottesman, Deputy Director of Intramural Research, NIH. The Symposium is open to all NIH/HHS staff and the wider community. Please share this invitation with your scientific communities. For additional information, contact:

Read the blog post about the Symposium in PLOS Biologue.



International Data Week 2016

September 11, 2016

International Data Week (IDW), September 11-17, 2016 in Denver, CO, will bring together data scientists, researchers, industry leaders, entrepreneurs, policy makers, and data stewards to explore how best to exploit the data revolution to improve our knowledge and benefit society through data-driven research and innovation. The theme of this landmark event is "From Big Data to Open Data: Mobilizing the Data Revolution." 

IDW comprises three complementary events: 

SciDataCon 2016 - Advancing the Frontiers of Data in Research:  Seeks to advance the frontiers of data in all areas of research. This means addressing a range of fundamental and urgent issues around the Data Revolution and the recent data-driven transformation of research and the responses to these issues in the conduct of research.

International Data Forum:  This event will be the centrepiece of the International Data Week, bringing together international researchers, industrialists, policy makers, and educators to discuss the major opportunities and challenges of the data revolution, from Big Data to Open Data.

8th RDA Plenary Meeting: offers attendees a unique opportunity to network and collaborate with colleagues and peers in various disciplines, and make concrete progress in technical and social areas on topics related to research data sharing and exchange.

Find out what the ADDS Office and BD2K grantees are doing during International Data Week 2016 here.



The BD2K Guide to the Fundamentals of Data Science Lecture Series

September 9, 2016

Big Data to Knowledge (BD2K) is pleased to announce The BD2K Guide to the Fundamentals of Data Science, a series of online lectures given by experts from across the country covering a diverse range of topics in data science. This course is an introductory overview that assumes no prior knowledge or understanding of data science. This lecture series is a joint effort between the BD2K Training Coordinating Center (BD2KTCC), the BD2K Centers Coordination Center (BD2KCCC), and the NIH Office of the Associate Director for Data Science. For up-to-date information about the series and to view archived presentations, go to:

The series begins September 9, and will run every Friday from 12:00pm – 1:00pm ET.

All sessions will be streamed live and archived for on demand public viewing.

Call in information: 

phone: +1 (914) 614-3221; Access Code: 736-335-403

For additional event information, contact Crystal Stewart.


  • 09/09/16 - INTRODUCTION to Big Data and the Data Lifecycle (Mark Musen, Stanford)
  • 09/16/16 - SECTION 1: DATA MANAGEMENT OVERVIEW (Bill Hersh, Oregon Health Sciences)
  • 09/23/16 - Finding and Accessing Datasets, Indexing, and Identifiers (Lucila Ohno-Machado, UCSD)
  • 09/30/16 - Data Curation and Version Control (Pascale Gaudet, Swiss Institute of Bioinformatics)
  • 10/07/16 - Ontologies (Michel Dumontier, Stanford)
  • 10/14/16 - Provenance (Zachary Ives, Penn)
  • 10/21/16 - Metadata Standards (Susanna-Assunta Sansone, Oxford)
  • 10/28/16 - SECTION 2: DATA REPRESENTATION OVERVIEW  (Anita Bandrowski, UCSD)
  • 11/04/16 - Databases and Data Warehouses, Data: Structures, Types, Integrations (Chaitan Baru, NSF)
  • 11/11/16 - NO LECTURE: VETERANS DAY
  • 11/18/16 - Social Networking Data (TBD)
  • 12/02/16 - Data Wrangling, Normalization, Preprocessing (Joseph Picone, Temple)
  • 12/09/16 - Exploratory Data Analysis (Brian Caffo, Johns Hopkins)
  • 12/16/16 - Natural Language Processing (Noemie Elhadad, Columbia)


  • 01/13/17 - Workflows/Pipelines
  • 01/20/17 - Programming and Software Engineering, API, Optimization
  • 01/27/17 - Cloud, Parallel, Distributed Computing, and HPC
  • 02/03/17 - Commons: Lessons Learned, Current State
  • 02/17/17 - Smoothing, Unsupervised Learning/Clustering/Density Estimation
  • 02/24/17 - Supervised Learning/Prediction/ML, Dimensionality Reduction
  • 03/03/17 - Algorithms, Optimization
  • 03/10/17 - Multiple Testing, False Discovery Rate
  • 03/17/17 - Data Issues: Bias, Confounding, and Missing Data
  • 03/24/17 - Causal Inference
  • 03/31/17 - Data Visualization Tools and Communication
  • 04/07/17 - Modeling Synthesis


  • 04/14/17 - Open Science
  • 04/21/17 - Data Sharing (including social obstacles)
  • 04/28/17 - Ethical Issues
  • 05/05/17 - Extra Considerations/Limitations for Clinical Data
  • 05/12/17 - Reproducibility
  • 05/19/17 - SUMMARY and NIH CONTEXT 

