
Data Science News

Data Science Community News



NIH Workshop: Harnessing Artificial Intelligence and Machine Learning to Advance Biomedical Research

June 28, 2018

Monday, July 23, 2018, 8:45 a.m. – 4:00 p.m. ET | The John Edward Porter Neuroscience Research Center, Room 620 | NIH Main Campus, Bethesda, MD

Artificial intelligence (AI) and machine learning (ML) are advancing rapidly and are in use across industries, including biomedical research and healthcare delivery. For this full-day public workshop, NIH is bringing together leaders in innovation and science to explore the opportunities for AI and ML to accelerate medical advances from biomedical research. Workshop participants will hear from leading industry experts and scientists who are employing AI/ML in biomedical research settings. Speakers will cover a range of issues, including the promise of integrating AI technology into healthcare, how it is being used in biomedical research, and its potential for enhancing clinical care and scientific discovery. Craig Mundie, who served on the President’s Council of Advisors on Science and Technology (PCAST) and was formerly Microsoft’s Chief Research Strategy Officer, will deliver the keynote address.




NIH releases strategic plan for data science

June 4, 2018

Storing, managing, standardizing and publishing the vast amounts of data produced by biomedical research is a critical mission for the National Institutes of Health. In support of this effort, NIH today released its first Strategic Plan for Data Science that provides a roadmap for modernizing the NIH-funded biomedical data science ecosystem. Over the course of the next year, NIH will begin implementing its strategy, with some elements of the plan already underway. NIH will continue to seek community input during the implementation phase.



Chief Data Strategist and Director, Office of Data Science

May 10, 2018

Chief Data Strategist and Director, Office of Data Science Job Ad.



NIH Seeking Input on Draft Strategic Plan for Data Science

March 5, 2018

To capitalize on the opportunities presented by advances in data science, the NIH is developing a Strategic Plan for Data Science. NIH published a Request for Information (RFI) to seek input on a draft strategic plan from stakeholders including members of the scientific community, academic institutions, the private sector, health professionals, professional societies, advocacy groups, patient communities, as well as other interested members of the public. The draft plan identifies overarching goals, strategic objectives, and implementation tactics for promoting the modernization of the NIH-funded biomedical data science ecosystem. Responses to the RFI are due by April 2, 2018.



NIH awards to test ways to store, access, share, and compute on biomedical data in the cloud

November 6, 2017

NIH Data Commons Pilot Phase to seek best practices for developing and managing a data commons

Twelve awards totaling $9 million in Fiscal Year 2017 will launch a National Institutes of Health Data Commons Pilot Phase. A data commons is a shared virtual space where scientists can work with the digital objects of biomedical research, such as data and analytical tools. The NIH Data Commons will be implemented in a four-year pilot phase to explore the feasibility and best practices for making digital objects available through collaborative platforms. This will be done on public clouds, which are virtual spaces where service providers make resources, such as applications and storage, available over the internet. The goal of the NIH Data Commons Pilot Phase is to accelerate biomedical discoveries by making biomedical research data Findable, Accessible, Interoperable, and Reusable (FAIR) for more researchers.

“Harvesting the wealth of information in biomedical data will advance our understanding of human health and disease,” said NIH Director Francis S. Collins, M.D., Ph.D. “However, poor data accessibility is a major barrier to translating data into understanding. The NIH Data Commons Pilot Phase is an important effort to remove that barrier.”

“The NIH Data Commons Pilot Phase will create new opportunities for research not feasible before,” said NIH Data Commons Pilot Phase Program Manager, Vivien Bonazzi, Ph.D. “Making biomedical data sets accessible and connected at an unprecedented scale will lead to creative new ways to combine, analyze, and ask questions of the data to generate new knowledge.”

The recipients of the 12 awards will form the nucleus of an NIH Data Commons Pilot Phase Consortium in which researchers will start developing the key capabilities needed to make an NIH Data Commons a reality. These key capabilities, which were identified by NIH, collectively represent the principles, policies, processes, and architectures of a data commons for biomedical research data. Key capabilities include making data transparent and interoperable, safeguarding patient data, and getting community buy-in for data standards.

Three NIH-funded data sets will serve as test cases for the NIH Data Commons Pilot Phase. The test cases include data sets from the Genotype-Tissue Expression and the Trans-Omics for Precision Medicine initiatives, as well as the Alliance of Genome Resources, a consortium of Model Organism Databases established in late 2016. These data sets were chosen based on their value to users in the biomedical research community, the diversity of the data they contain, and their coverage of both basic and clinical research. While just three data sets will be used at the outset of the project, it is envisioned that the NIH Data Commons efforts will expand to include other data resources once the pilot phase has achieved its primary objectives.

NIH has acquired support from a Federally Funded Research and Development Center, the MITRE Corporation, to assist in establishing new, sustainable NIH infrastructure for data science (people, processes, technologies). The MITRE Corporation will provide a broad range of support services for the NIH Data Commons Pilot Phase, including innovative approaches to assure cost-effective cloud-based computing and storage for scientific data; analyses related to usage, cost, and comparative business models; and other considerations to assure long-term viability of NIH data science efforts.

The trans-NIH Data Commons Pilot Phase receives funding from multiple NIH Institutes and Centers and is managed by the NIH Common Fund within the NIH Office of the Director. The Common Fund; the National Heart, Lung, and Blood Institute; and the National Human Genome Research Institute are the lead NIH entities involved in management of the NIH Data Commons Pilot Phase.

About the NIH Common Fund: The NIH Common Fund encourages collaboration and supports a series of exceptionally high-impact, trans-NIH programs. Common Fund programs are managed by the Office of Strategic Coordination in the Division of Program Coordination, Planning, and Strategic Initiatives in the NIH Office of the Director in partnership with the NIH Institutes, Centers, and Offices. More information is available at the Common Fund website:

About the National Heart, Lung, and Blood Institute (NHLBI): Part of the National Institutes of Health, the National Heart, Lung, and Blood Institute (NHLBI) plans, conducts, and supports research related to the causes, prevention, diagnosis, and treatment of heart, blood vessel, lung, and blood diseases; and sleep disorders. The Institute also administers national health education campaigns on women and heart disease, healthy weight for children, and other topics. NHLBI press releases and other materials are available online at

About the National Human Genome Research Institute (NHGRI): NHGRI is one of the 27 institutes and centers at the National Institutes of Health. The NHGRI Extramural Research Program supports grants for research, and training and career development at sites nationwide. Additional information about NHGRI can be found at

About the National Institutes of Health (NIH): NIH, the nation's medical research agency, includes 27 Institutes and Centers and is a component of the U.S. Department of Health and Human Services. NIH is the primary federal agency conducting and supporting basic, clinical, and translational medical research, and is investigating the causes, treatments, and cures for both common and rare diseases. For more information about NIH and its programs, visit

NIH…Turning Discovery Into Health®



Call for Papers

October 17, 2017

NIPS Workshop on Machine Learning for Health (NIPS ML4H 2017)

What parts of Healthcare are Ripe for Disruption by Machine Learning Right Now?

A workshop at the Thirty-First Annual Conference on Neural Information Processing Systems (NIPS 2017).

Friday, December 8, 2017

Long Beach Convention Center, Long Beach, CA, USA

Please direct questions to:

NOTE 2017/09/28: NIPS 2017 workshop registrations are now sold out. If you have not registered, you may still submit a paper. During submission, please indicate as the "corresponding author" an author who will attend, or who could attend in the unlikely event that more registrations become available.


  • Mon Oct 30, 2017: Submission deadline at 11:59pm
  • Fri Nov 10, 2017: Acceptance notification (Poster or Spotlight+Poster)
  • Thu Nov 16, 2017: NIPS deadline to cancel registration (with full refund)
  • Fri Dec 01, 2017: Final papers posted online (with permission)
  • Fri Dec 08, 2017: Workshop


The goal of the Machine Learning for Health Workshop (NIPS ML4H 2017) is to foster collaborations that meaningfully impact medicine by bringing together clinicians, health data experts, and machine learning researchers. We aim to build on the success of the last two NIPS ML4H workshops which were widely attended and helped form the foundations of a new research community.

This year’s program emphasizes identifying previously unidentified problems in healthcare that the machine learning community hasn't addressed, or seeing old challenges through a new lens. While healthcare and medicine are often touted as prime examples for disruption by AI and machine learning, there has been vanishingly little evidence of this disruption to date. To interested parties who are outside of the medical establishment (e.g. machine learning researchers), the healthcare system can appear byzantine and impenetrable, which results in a high barrier to entry. In this workshop, we hope to reduce this activation energy by bringing together leaders at the forefront of both machine learning and healthcare for a dialog on areas of medicine that have immediate opportunities for machine learning. Attendees at this workshop will quickly gain an understanding of the key problems that are unique to healthcare and how machine learning can be applied to address these challenges.

The workshop will feature invited talks from leading voices in both medicine and machine learning. Invited clinicians will discuss open clinical problems where data-driven solutions can make an immediate difference. The workshop will conclude with an interactive panel discussion where all speakers respond to questions provided by the audience.

From the research community, we welcome short paper submissions highlighting novel research contributions at the intersection of machine learning and healthcare. Accepted submissions will be featured as poster presentations and (in select cases) as short oral spotlight presentations.


Researchers interested in contributing should upload short, anonymized papers of up to 4 pages in PDF format by Monday, October 30, 2017, 11:59 PM in the timezone of your choice.

Please submit via our ML4H EasyChair website:

Papers should adhere to the NIPS conference paper format, via the NIPS LaTeX style file:

Workshop papers should be at most 4 pages of content, including text and figures. Additional pages containing only bibliographic references can be included without penalty.

Relevant Topics

Submitted papers should describe innovative machine learning research focused on relevant problems in health and medicine. This can mean new models, new datasets, new algorithms, or new applications. Topics of interest include but are not limited to reinforcement learning, temporal models, deep learning, semi-supervised learning, data integration, learning from missing or biased data, learning from non-stationary data, model criticism, model interpretability, causality, model biases, and transfer learning.

Peer Review and Acceptance Criteria

All submissions will undergo double-blind peer review. It will be up to the authors to ensure the proper anonymization of their paper. Do not include any names or affiliations. Refer to your own past work in the third person.

Accepted papers will be chosen based on technical merit and suitability to the workshop's goals. All accepted papers will be included in one of two poster presentation sessions on the day of the workshop. Some accepted papers will be invited to give short oral spotlight presentations at the workshop.

Registration and Attendance

To promote community interaction, we hope at least one presenting author has registered and can attend the workshop. However, because NIPS workshop registration has sold out, we encourage all researchers to submit a paper regardless of their registration status.

Accepted papers whose authors cannot attend will at least be listed on our website. It is unlikely that we will be able to create new registration spots for authors of accepted papers, but we are exploring possibilities. If your paper is accepted and you cannot attend due to registration or other issues, please contact us after you are accepted and we'll find solutions on a case-by-case basis. Acceptance notifications will go out a few days before the NIPS deadline for full refunds.

Copyright for Accepted Papers

This workshop will be informally published online but not officially archived. This means:

  • Authors will retain full copyright of their papers.

  • Acceptance to NIPS ML4H 2017 does not preclude publication of the same material in another journal or conference.

We encourage (but do not require) accepted papers to be posted on arXiv. With author permission, we will post links to accepted short papers on our workshop website.

Our workshop does allow submission of papers that are under review or have been recently published in a conference or a journal. Authors should clearly state any overlapping published work at time of submission.





Monday, September 11 at 1:00PM

National Science Foundation Director Dr. France Anne-Dominic Córdova will deliver the second Annual Donald Lindberg and Donald West King Lecture, cosponsored by the National Library of Medicine, the Friends of the National Library of Medicine (FNLM) and the American Medical Informatics Association (AMIA), on Monday, September 11, 2017 at 1:00 p.m. in the Lister Hill Center Auditorium, Building 38A. The lecture, which honors recently retired NLM Director Dr. Lindberg and former NLM Deputy Director for Research and Education Dr. King, is titled, “Computation and Biomedicine: New Possibilities for Longstanding Challenges.”

Dr. Córdova is an American astrophysicist and the 14th director of the National Science Foundation, the only government agency charged with advancing all fields of scientific discovery, technology innovation, and science, technology, engineering and mathematics (STEM) education. Previously, she was the eleventh President of Purdue University and served as NASA’s chief scientist. The event will also be videocast.

A sign language interpreter will be provided. Light refreshments sponsored by the FNLM will follow the lecture.

Sponsored by:
Milton Corn, MD
Deputy Director for Research and Education

For more information:
Kathy Cravedi
Office of Communications & Public Liaison
