All BD2K Events

  • 29

    2016 BD2K All Hands Meeting & Science Symposium

    November 29, 2016

    2016 BD2K All Hands Meeting & Science Symposium

    November 29 - December 1, 2016

    (TBD) Washington, DC Metro Area

    Details Forthcoming

    The 2016 BD2K All Hands Meeting & Science Symposium brings together researchers, educators, developers, and trainees from all of the BD2K initiative grants. The goals of the All Hands Meeting are to showcase the research being conducted by BD2K sponsored programs and to build a cohesive BD2K consortium that maximizes synergies between participants.

    The 2016 All Hands Meeting & Science Symposium will be a three-day event. The BD2K All Hands Meeting (November 29-30) is open to BD2K grantees and NIH staff only; however, the plenary sessions will be videocast and made available to the public. The BD2K Science Symposium (December 1) will be open to the public, with the aim of fostering relationships with leading scientists and researchers whose work spans a diverse spectrum of biomedical, computational, and quantitative sciences across the field of data science.

    For a look at what happened at last year's meeting, please visit:
    2015 BD2K All Hands Meeting

    Details for the 2016 All Hands Meeting & Science Symposium are forthcoming. This webpage will be updated as new developments are finalized. For specific inquiries, please contact Sonynka Ngosso or Tonya Scott.

  • 8

    Refining the Concept of Scientific Inference When Working with Big Data

    June 8, 2016


    The Committee on Applied and Theoretical Statistics presents: "Refining the Concept of Scientific Inference When Working with Big Data" - a 2-day workshop on the challenges of applying scientific inference to big data. This workshop is sponsored by the NIH Big Data to Knowledge (BD2K) initiative and the National Science Foundation.

    Date:  June 8-9, 2016
    Location:  Keck Center, Room 100
    500 Fifth Street, NW, Washington, DC

    This workshop will be professionally webcast. If you intend to join us in person or online, please register by Monday, June 6th. For more information, please visit the workshop website.

  • 20

    How Far Can We Trust Our Data?: The Interpretation, Value, and Use of Drug Discovery Data and the Need for a Bigger Data Approach

    May 20, 2016

    The Office of the Associate Director for Data Science invites you to attend the following lecture as part of the NIH Frontiers in Data Science Lecture Series:

    TITLE:  "How Far Can We Trust Our Data?:  The Interpretation, Value, and Use of Drug Discovery Data and the Need for a Bigger Data Approach"

    SPEAKER:  Terry Richard Stouch, Ph.D.

    DATE:  Friday, May 20, 2016, 2:00pm–3:00pm EST

    LOCATION:  Porter Neuroscience Research Center, Building 35A, Room 640, NIH Main Campus, Bethesda, MD

    This lecture will be videocast here.  


    How reliable is our drug discovery data? Is it qualitative or quantitative? Is it a hard and fast signpost to direct our efforts or just a hint of what might be? In fact, the percentage of drug discovery data in which we can have high confidence is surprisingly low. And, the common range of "true" error in any measurement is many times larger than the conventional error of measurement. Assays can be difficult in the best of situations and the demand for assays of low cost, speed, high throughput, and utilizing minuscule quantities of compound adds additional compromises. The picture is still more complicated by a myriad of confounding factors that can influence results and may not have consistent influence across assays, compounds, targets, and other factors.

    This presentation will address the author's 3-to-10 and 20% rules that capsulize the true error of drug discovery data, the percentage that can be used with high confidence, and how these rules were determined. Examples of surprising and confounding factors from the experience of many laboratories and companies around the world will be highlighted and lead to a discussion of the need for a "bigger" data of drug discovery that can help improve our understanding and increase the value of our results. The use of stringently-determined predictive models as an adjunct to wet assays for validation and triage will be shown. The importance of data presentation leading to proper interpretation by users will also be discussed.


    Dr. Terry Richard Stouch has been working in depth with drug discovery data for over 30 years in order to accelerate and increase the success of drug discovery. A user of the data himself, he collaborates closely not only with others who use the data, but more importantly, with those who generate the data. He brings biological, physical chemical, statistical, and computational experience to the evaluation of assays and data. In addition to addressing the raw data, his extensive collaborations have impressed on him the importance of focused data presentation and interpreted metadata to guide interpretation by data-swamped users of data.

    His experience in drug discovery research spans large pharma and biotech. He specializes in drug design, pharmaceutical data analysis, predictive modeling, property prediction, bio-molecular structure, and molecular modeling and simulation. His pharmaceutical research efforts span most therapeutic areas. He has participated in placing eight compounds into clinical trials.


    Individuals with disabilities who need sign language interpreters and/or reasonable accommodation to participate in this event should contact Kristan Brown at 301-402-9827 and/or the Federal Relay (1-800-877-8339) at least 5 business days prior to the event.

  • 29

    Data Science and Medicine: What's Possibly at the Cutting Edge?

    April 29, 2016

    The Office of the Associate Director for Data Science invites you to attend the following webinar as part of the NIH Frontiers in Data Science Lecture Series:

    TITLE:  "Data Science and Medicine: What's Possibly at the Cutting Edge?"

    SPEAKER:  Anthony Goldbloom, Co-Founder and CEO of Kaggle

    DATE:  Friday, April 29, 2016, 1:00pm-2:00pm EST

    This webinar has been recorded for the purpose of sharing content for public use. To watch the full YouTube video presentation, please click here.


    Kaggle hosts data science competitions. Data scientists download data and upload solutions to very difficult problems. Kaggle has collaborated with the NIH to use data science to solve healthcare and medical research problems ranging from using data science to diagnose heart failure from fMRIs (by measuring ejection fraction) to predicting seizures from EEG data. This talk will introduce data science competitions and show some of the surprising things at the cutting edge of medical research.


    Anthony Goldbloom is a Co-Founder and CEO of Kaggle. In 2011 and 2012, Forbes Magazine named Anthony as one of the 30 Under 30 in Technology. In 2013, the MIT Tech Review named him one of the Top 35 Innovators Under 35, and the University of Melbourne has awarded him an Alumni of Distinction Award. He holds a first class honors degree in Econometrics from the University of Melbourne. Anthony has published in The Economist and the Harvard Business Review.


    Dr. Andrew Arai, Senior Investigator, National Heart, Lung, and Blood Institute (NHLBI) 

    This thought-provoking two-part presentation introduces how data science competitions can help to discover solutions to complex biomedical problems. In Part I, Anthony Goldbloom presents an overview of Kaggle’s methodology for designing, conducting, and evaluating data science competitions in medical research. He illustrates it with case studies of recent biomedical research collaborations, including diagnosing and predicting heart failure, seizures, and diabetic retinopathy. In Part II, Dr. Andrew Arai presents the results of an NHLBI collaboration with Kaggle, which conducted a machine learning competition that analyzed MRI imaging data to identify heart damage indicators that can assist with heart attack prediction.


    Individuals with disabilities who need sign language interpreters and/or reasonable accommodation to participate in this event should contact Kristan Brown at 301-402-9827, or the Federal Relay (1-800-877-8339), at least 5 business days prior to the event.

  • 25

    Big Data to Knowledge Multi-Council Working Group - May 2016

    April 25, 2016

    Notice is hereby given of a meeting of the Big Data to Knowledge (BD2K) Multi-Council Working Group.

    This teleconference meeting will be open to the public as indicated below. Individuals who plan to attend and need special assistance, such as sign language interpretation or other reasonable accommodations, should notify the contact person listed below in advance of the meeting.

    Name of Working Group:  Big Data to Knowledge Multi-Council Working Group

    Date:  April 25, 2016

    Open Session:  11:00am - 12:30pm EST

    Agenda:  Discussion will review current Big Data to Knowledge (BD2K) activities and newly proposed BD2K initiatives.

    Open Session Presentations:

    Closed Session: 12:45pm - 3:00pm EST

    Agenda:  Discussion will focus on review of proposed Funding Plans for BD2K Funding Opportunity Announcements.

    Place:  Teleconference
    This meeting is open to the public, but is being held by teleconference only.  No physical meeting location is provided for any interested individuals to listen to committee discussions.

    Join WebEx Meeting

    Meeting number: 622 421 867

    Meeting password: 1234

    Dial-in: 1-877-668-4493

    This is a listen ONLY meeting. Please submit any questions or comments via email to the contact person listed below.

    Contact Person:  Tonya Scott, phone:  301-402-9817.

    To view the Federal Register meeting announcement: Office of the Director, National Institutes of Health - Notice of Meeting



  • 13

    Interactive Visual Discovery in Event Analytics: Electronic Health Records and Other Applications

    April 13, 2016

    The Office of the Associate Director for Data Science invites you to attend the following event as part of the NIH Frontiers in Data Science Lecture Series:

    TITLE:  "Interactive Visual Discovery in Event Analytics: Electronic Health Records and Other Applications"

    SPEAKER:  Ben Shneiderman, Ph.D.

    DATE:  Wednesday, April 13, 2016, 11:00am–12:00pm EST

    LOCATION:  Porter Neuroscience Research Center, Building 35A, Room 610, NIH Main Campus, Bethesda, MD

    This lecture will be videocast here.


    Event Analytics is rapidly emerging as a new topic to extract insights from the growing set of temporal event sequences that come from medical histories, e-commerce patterns, social media log analysis, cybersecurity threats, sensor nets, online education, sports, etc. This talk reviews Dr. Shneiderman's decade of research on visualizing and exploring temporal event sequences to view compact summaries of thousands of patient histories represented as time-stamped events such as strokes, vaccinations, or admissions to an emergency room.  His current work on EventFlow supports point events, such as heart attacks or vaccinations, and interval events, such as medication episodes or long hospitalizations. Demonstrations cover visual interfaces to support hospital quality control analysts who ensure that required procedures were carried out. Dr. Shneiderman will show how domain-specific knowledge and problem-specific insights can lead to sharpening the analytic focus so as to enable more successful pattern and anomaly detection.


    Ben Shneiderman is a Distinguished University Professor in the Department of Computer Science at the University of Maryland (UM). He is a Fellow of the AAAS, ACM, IEEE, and NAI, and a Member of the National Academy of Engineering, in recognition of his pioneering contributions to human-computer interaction and information visualization.


    Individuals with disabilities who need sign language interpreters and/or reasonable accommodation to participate in this event should contact Kristan Brown at 301-402-9818 and/or the Federal Relay at 1-800-877-8339, at least five business days prior to the event.

  • 5

    Translating from Bench to Bedside and Back – Challenges and Opportunities from a Data Science Perspective

    April 5, 2016

    The Office of the Associate Director for Data Science invites you to attend the following lecture as part of the NIH Frontiers in Data Science Lecture Series:

    TITLE: "Translating from Bench to Bedside and Back – Challenges and Opportunities from a Data Science Perspective"

    SPEAKER: John Shon, Ph.D.

    DATE: Tuesday, April 5, 2016, 1:00pm-2:00pm

    LOCATION: Natcher Conference Center, Building 45, Room F1/F2, NIH Main Campus, Bethesda, MD

    Lecture will be videocast here.


    As the cost of sequencing decreases, the generation of sequence data, and more importantly, genetic variation data, increases at a rate that exceeds our ability to fully understand it. The data is also increasingly being generated in clinical contexts, providing the potential to enlighten our mechanistic understanding of disease and therapy. From a data science perspective, Dr. Shon will describe how research and clinical operational  contexts both facilitate and limit the use of next generation sequencing data for discovery, clinical care, and translational research purposes. Dr. Shon will also review promising approaches coupling the generation of NGS data with clinical data for the systematic application of translational knowledge for precision medicine.


    John Shon is VP of Bioinformatics and Data Sciences at Illumina. In this role he leads a global team of bioinformatics scientists in developing algorithms and methods for Illumina NGS instruments and assays. As part of the Enterprise Informatics business unit, he also leads bioinformatics for clinical interpretation and translational informatics software. Before joining Illumina, Dr. Shon spent over a decade in large pharmaceutical companies, most recently as VP of Informatics, Research IT, and External Innovation at Janssen Pharmaceuticals (a division of J&J), where he supported R&D, clinical development, and Janssen Diagnostics teams. At Roche, Dr. Shon led informatics groups in translational research for target discovery, biomarker selection, drug safety, and personalized healthcare.


    Individuals with disabilities who need sign language interpreters and/or reasonable accommodation to participate in this event should contact Kristan Brown at 301-402-9818 and/or the Federal Relay (1-800-877-8339) at least 5 business days prior to the event.

  • 14

    Models and Data in Biomedicine: What’s Real and What’s Noise? And, Why Should We Care?

    March 14, 2016

    The Office of the Associate Director for Data Science invites you to attend the following lecture as part of the NIH Data Science Distinguished Seminar Series:

    TITLE: "Models and Data in Biomedicine: What’s Real and What’s Noise? And, Why Should We Care?"

    SPEAKER: Dr. Carlos Bustamante, Ph.D., Inaugural Chair of Stanford University's new Department of Biomedical Data Science

    DATE: Monday, March 14, 2016, 1:00pm – 2:00pm EST

    LOCATION: Lipsett Auditorium, Building 10, NIH Main Campus, Bethesda, MD

    This lecture will be videocast here.


    If you think of a scatterplot of data overlaid with a model for the data and ask practitioners from different fields, “what’s noise and what’s real?,” the answers may surprise you. To a biologist, the data will almost surely be “what’s real” and the model is a poor approximation to the “truth.” To a physicist, the model is probably “what’s real” and the data is just a noisy realization of an underlying true physical process that we are attempting to study. As we think about the biomedical data enterprise in the 21st century and the massive amounts of data we generate (and want to analyze!), we need to support multiple world views and have guidance on how to translate noisy data and noisy models into actionable information. Dr. Bustamante's presentation will draw upon several examples from Population Genetics (a field very rich in theory) and Genomics (a field not so rich in theory and much more data driven) to illustrate these points. It will also touch upon reproducible research and the question of how funding agencies need to support ecosystems for collaborative research, including data producers, consortia, and so-called “research parasites” that may want to use the data in ways that go beyond what the original experimental designers envisioned.


    Dr. Carlos Bustamante is a population geneticist whose research focuses on analyzing genome-wide patterns of variation within and between species to address fundamental questions in biology, anthropology, and medicine. From 2002 to 2009, he was on the faculty at Cornell University, in the Departments of Statistical Sciences and Biological Statistics and Computational Biology, where he was promoted to full professor in 2008. Since 2010, he has been on the faculty in the Department of Genetics at the Stanford University School of Medicine.

    He has received multiple honors and awards including a Marshall-Sherfield Fellowship (2001-2), the Sloan Research Fellowship (2007), and a John D. and Catherine T. MacArthur Fellowship (2010). He has trained over 50 post-doctoral fellows and graduate students as primary advisor and co-authored over 130 papers. Much of his research is located at the interface of computational biology, mathematical genetics, and evolutionary genomics. His most current research focuses on human population genomics and global health, including developing statistical, computational, and genomic resources for enabling trans- and multi-ethnic genome-wide association and medical sequencing studies of complex biomedical traits. He is one of the Principal Investigators of the recently announced $25M ClinGen project to build the country's National Database of Clinically Relevant Genomic Variants.


    This lecture is part of a full day of scheduled events and activities for the second annual NIH Pi Day, which celebrates the intersection between the quantitative and biomedical sciences. Pi Day is an annual international celebration of the irrational number Pi, 3.14..., on March 14. On Pi Day and every day, NIH recognizes the importance of building a diverse biomedical workforce with the quantitative skills required to tackle future challenges. For more information, visit the event page.


    Individuals with disabilities who need sign language interpreters and/or reasonable accommodation to participate in this event should contact Kristan Brown at 301-402-9818 or the Federal Relay (800-877-8339) at least 5 business days prior to the event.

  • 9

    Open Science as a Social Machine

    February 9, 2016

    The Office of the Associate Director for Data Science invites you to attend the following lecture as part of the NIH Frontiers in Data Science Lecture Series:

    TITLE: "Open Science as a Social Machine"

    SPEAKER: Dr. Barend Mons, Ph.D.

    DATE: Tuesday, February 9, 2016, 10:00am-11:00am

    LOCATION: Natcher Conference Center, Room F1/F2

    This lecture will be videocast here.


    Barend Mons is Chair of the European Commission's High Level Expert Group for the European Open Science Cloud (EOSC). The EOSC is meant to be a supporting expert infrastructure for Open Science. In this presentation, Dr. Mons will cover the aspects of open and participatory science in which community curation and annotation of data is key. He will emphasize the joint responsibility for data stewardship in Open Science. He will explain the concepts of Nanopublication, the Explicitome, and the concept of FAIR (Findable, Accessible, Interoperable, and Re-usable) data and other research objects with an emphasis on machine actionability of published research objects. Finally, Dr. Mons will outline the future developments of social machines in science and how users and producers of data merge into knowledge creation communities where man-machine interaction is key. Examples will be from his own field: Human Genetics.


    Barend Mons is a Molecular Biologist by training (Ph.D., Leiden University, 1986) and spent over 15 years in Malaria research. After that, he gained experience in computer-assisted knowledge discovery, which remains his research focus. He spent time with the European Commission (1993-1996) and with the Netherlands Organization for Scientific Research (NWO). Dr. Mons has also co-founded several spin off companies. Currently, Dr. Mons is Professor of Biosemantics at the Human Genetics Department of Leiden University Medical Center, is Head of Node for ELIXIR-NL at the Dutch Techcentre for Life Sciences, Integrator of Life Sciences at the Netherlands eScience Center, and Board member of the Leiden Centre of Data Science. In 2014, Dr. Mons initiated the FAIR data initiative, and in 2015, was appointed Chair of the European Commission's High Level Expert Group for the European Open Science Cloud. For Dr. Mons' publication and citation record for the FAIR data initiative, see: and for Nanopublications, see:


    Individuals with disabilities who need sign language interpreters and/or reasonable accommodation to participate in this event should contact Kristan Brown at 301-402-9816 and/or the Federal Relay (1-800-877-8339) at least 5 business days prior to the event.

  • 29

    High-Performance Integrated Virtual Environment (HIVE): A Regulatory NGS Data Analysis Platform

    January 29, 2016

    The Office of the Associate Director for Data Science invites you to attend the following lecture as part of the NIH Frontiers in Data Science Lecture Series:

    TITLE:  “High-Performance Integrated Virtual Environment (HIVE): A Regulatory NGS Data Analysis Platform”

    DATE:  Friday, January 29, 2016, 10:00AM – 11:00AM

    LOCATION:  NIH Main Campus, Building 35A, Room 640

    This lecture will be videocast here.

    SPEAKERS:  Vahan Simonyan, Ph.D., HIVE Lead Scientist at the U.S. Food and Drug Administration (FDA) and Raja Mazumder, Ph.D., HIVE Lead Scientist at George Washington University (GW)


    The abundance of miscellaneous high-performance computational platforms available across academia, the healthcare industry, and in government organizations isn't doing much to close the gap between research and regulatory analytics.  Extra iterations for drug, device, and biologics approval processes are causing a significant cost increase for medical product development.  High-Performance Integrated Virtual Environment (HIVE), co-developed by FDA and GW, presents a great opportunity for serving as a bridge.  It is authorized as a regulatory NGS data analysis platform and provides unique capability for healthcare stakeholders to look into NGS data from the regulatory perspective of FDA.

    As a distributed storage and computation environment and a multicomponent cloud infrastructure, HIVE provides secure web access for authorized users to deposit, retrieve, annotate, and compute biomedical big data and to analyze the outcomes using web interface visual environments appropriately built in collaboration with internal and external end users.  In addition to the initial HIVE applications to next generation sequencing, the current universe of HIVE projects covers tailor-made applications involving dimensionality analysis, federated and integrated data mapping, modeling, and simulations that are applicable to basic research, biostatistics, epidemiology, clinical studies, post-market evaluation, manufacturing consistency, environmental metagenomics, outbreak detection, and more.


    Vahan Simonyan, Ph.D., is the HIVE Lead Scientist at FDA and an author of more than 50 scientific publications in quantum physics and chemistry, nanotechnology, biotechnology, population dynamics, and bioinformatics. The technology developed by Dr. Simonyan and the code base donated to the U.S. Government launched HIVE at FDA. This proved an enormous success, yielding a regulatory-compliant R&D IT platform capable of handling peta-scale data from sequencing projects, post-market analytics, and clinical and preclinical data analysis. Currently, Dr. Simonyan's collaborations span more than 80 medium to large research and regulatory projects with scientists from government organizations, large healthcare consortia, and academia.

    Raja Mazumder, Ph.D. is the HIVE Lead Scientist at GW, an Associate Professor of Biochemistry and Molecular Medicine, and Director of the McCormick Genomic & Proteomic Center at GW. Prior to joining GW, Raja was faculty at Georgetown University where he worked on the UniProt project as a Team Lead with colleagues from the European Bioinformatics Institute and Swiss Institute of Bioinformatics. Prior to Georgetown, Raja worked at the National Center for Biotechnology Information (NCBI) as a Bioinformatics Scientist.


    Individuals with disabilities who need sign language interpreters and/or reasonable accommodation to participate in this event should contact Kristan Brown at 301-402-9816 and/or the Federal Relay (1-800-877-8339) at least 5 business days prior to the event.

  • 11

    Big Data to Knowledge Multi-Council Working Group - January 2016

    January 11, 2016

    Notice is hereby given of a meeting of the Big Data to Knowledge Multi-Council Working Group.

    The teleconference meeting will be open to the public as indicated below. Individuals who plan to attend and need special assistance, such as sign language interpretation or other reasonable accommodations, should notify the contact person listed below in advance of the meeting.

    Name of Working Group:  Big Data to Knowledge Multi-Council Working Group

    Date:  January 11, 2016

    Open Session:  11:00am - 12:00pm EST

    Agenda:  Discussion will review current Big Data to Knowledge (BD2K) activities and newly proposed BD2K initiatives.

    Open Session Presentations:

    Place:  Teleconference

    This meeting is open to the public but is being held by teleconference only. No physical meeting location is provided for any interested individuals to listen to committee discussions. Any individual interested in listening to the meeting discussions must call: 1-866-692-3158 and use Passcode: 2956317, for access to the meeting.

    Closed Session:  12:10pm - 3:30pm

    Agenda:  Discussion will focus on review of proposed Funding Plans for BD2K Funding Opportunity Announcements.

    Contact Person:  Tonya Scott, phone: 301-402-9817.

    Information is also available on the Office of the Associate Director for Data Science's home page, where an agenda and any additional information for the meeting will be posted when available.

    Office of The Director, National Institutes of Health; Notice of Meeting

  • 12

    2015 BD2K All Hands Meeting

    November 12, 2015

    The NIH Big Data to Knowledge (BD2K) initiative seeks to establish a digital ecosystem for biomedical research. The 2015 BD2K All Hands Meeting will bring together researchers, educators, developers, and trainees from all of the BD2K initiative grants. The goals of the All Hands Grantee Meeting are: 1) to bring together all participants and grantees in BD2K to showcase research being conducted in the BD2K programmatic areas, and 2) to build a cohesive BD2K consortium that maximizes synergies between participants. The meeting will also highlight the strategic mission of BD2K and the NIH Office of the Associate Director for Data Science (ADDS). 

    WHEN:  November 12 – 13, 2015

    WHERE:  Natcher Conference Center, Building 45, NIH Main Campus, Bethesda, MD

    WHO CAN ATTEND:  This meeting is for BD2K grantees, BD2K program staff, and NIH staff only. The plenary sessions are available for public view at:

    Day 1 Videocast

    Day 2 Videocast

    REGISTRATION:  All persons wishing to attend this event should register (registration is now closed).

    REASONABLE ACCOMMODATION:  Individuals with disabilities who need sign language interpreters and/or reasonable accommodation to participate in this event should contact Sonynka Ngosso at 301-402-9816 and/or the Federal Relay at 1-800-877-8339. Requests should be made at least 5 business days in advance of the event.

    LODGING OPTIONS:  BD2K has reserved room blocks at the following local hotels. Please use the links provided to reserve a room in one of these blocks.

    Hyatt Regency Bethesda:
    The cut-off date is Thursday, October 15, 2015

    Hilton Double Tree Bethesda:
    The cut-off date is Friday, October 16, 2015

    Bethesda Marriott:
    The cut-off date is Tuesday, October 20, 2015

    For additional information, please visit the event registration page.


  • 2

    Women in Data Science Conference

    November 2, 2015

    This one-day technical conference aims to inspire, educate, and support women in the field, from those just starting their journey to those who are established leaders in industry, academia, government, and NGOs.

    The Inaugural Women in Data Science Conference

    Date: November 2nd, 2015

    Location: Stanford University Palo Alto, CA


    Special early bird registration rates are available now through September 20th.
    Space is limited, and the conference is expected to sell out; we highly encourage attendees to register early and reserve a spot.

    This conference will feature:

      *   Tech Talks from industry experts from around the world
      *   Panels showcasing the wide range of career paths in the field, including female entrepreneurs in data science
      *   Lunchtime "unconference" break-out sessions
      *   Reception and networking

    This conference is co-sponsored by the BD2K Mobilize Center. Additional sponsors include: the Stanford Institute for Computational and Mathematical Engineering (ICME), Stanford Department of Statistics, Stanford Office of the President, and Computer Forum. This conference is possible thanks to the generous funding provided by Walmart.

  • 28

    Data-level Metrics

    October 28, 2015

    This webinar will be co-sponsored by NCI as part of the CBIIT Speaker Series.

    Martin Fenner has been the DataCite Technical Director since August 2015. From 2012 to 2015 he was technical lead for the PLOS Article-Level Metrics project. Martin has a medical degree from the Free University of Berlin and is a Board-certified medical oncologist.

    Abstract: The DataONE repository network, California Digital Library, and Public Library of Science (PLOS) worked from October 2014 to October 2015 on an NSF-funded project to explore metrics, including citations, downloads, and social media, for about 150,000 datasets. This presentation will summarize the major hurdles in making this work, the most important findings, and some ideas for going forward, including implementation as a production service.

    Date/Time: October 28, 2015, 11AM - 12PM

    Location: Webinar

    On-site location: NCI, Shady Grove room 2W910 - 912

    For more information, please visit the event page.



  • 15

    Big Privacy: Policy Meets Data Science

    October 15, 2015

    It is time to register for the Privacy Symposium!

    With the advent of high-throughput methods in biomedical research, the drive for precision medicine, and the advances in computational methods that foster "big data science," many commentators have expressed concern about how to promote biomedical science while respecting people's privacy. Biomedical research data may be subject to different privacy laws and regulations depending on the type of institution holding or using the data, the type of data, who funds the research, the state in which the research is conducted, and other factors. Biomedical researchers are generally required to protect patient and research participant privacy, while at the same time researchers are encouraged or explicitly required to share data with the scientific community. In some cases privacy protections can impede science, but in some cases data sharing can expose research participants or patients to informational risk. This half-day symposium will examine legal, policy, and technical issues at the intersection of data privacy and data science. 


    Title: Big Privacy: Policy Meets Data Science

    A Symposium sponsored by the Center for Predictive Computational Phenotyping (CPCP), an NIH Center of Excellence for Big Data Computing

    Date/Time: October 15, 2015, 1:00 pm - 5:00 pm

    Location: Wisconsin Institutes For Discovery on the UW-Madison Campus, DeLuca Forum, 330 N. Orchard St. 

    Registration: Registration is required for this event. Please go to

    This event is free.

  • 9

    BD2K: California Big Data Brain Workshop

    October 9, 2015

    A highly interactive opportunity to present your software, big data discoveries, and big data resources and to discuss how to leverage our California-based BD2K efforts into further consortia activities for large-scale biomedical science and training. All BD2K grant awardees based in California and their non-CA-based collaborators are invited to join this meeting.

    Date/Time: October 9th-10th

    Location: Palm Springs, California at the JW Marriott Resort


    Audience: This event is limited to California Center-based BD2K grantees, the BD2K CA Centers collaborators, NIH BD2K staff, and other invited participants. 

    Registration: REQUIRED see event website.


  • 30

    NIH Common Data Elements (CDE) Initiatives Overview Workshop

    September 30, 2015

    Planning Committee:

    OD:  Phil Bourne, Jennie Larkin, Leslie Derr, Angel Horton, Sonynka Ngosso

    NLM:  Betsy Humphreys, Mike Huerta, Lisa Lang, Jerry Sheehan

    NCI:   Warren Kibbe, Sherri De Coronado, Dianne Reeves, Denise Warzel

    NCATS:   Elaine Collier

    NIEHS:   Cindy Lawler


    Executive Summary

    BD2K, BMIC, NLM, NCI, NCATS, and NIEHS jointly organized a workshop for NIH staff to explore the role of CDEs in NIH Data Sharing. This workshop convened 40 representatives of the NIH community that were interested in CDEs with the goals of:

    • Support NIH-wide understanding of current activities and opportunities related to CDEs.
    • Identify current barriers/challenges for the adoption and use of CDEs by NIH-funded researchers, both intramural and extramural.
    • Identify possible ways to modify development, implementation, and use of CDEs to increase adoption and value to research.
    • Identify incentives and opportunities for involvement of relevant communities in CDE development, use, and re-use.
    • Develop evaluation plans for CDEs to test their assumed utility.
    • Identify opportunities to improve coordination in the development of CDEs for research use and in infrastructure for developing and making them accessible.
    • Determine how best to support CDE activities in the context of BD2K.

    NIH Common Data Element (CDE) Initiatives Overview Pre-Workshop Webinar

    Prior to the workshop, a preparatory webinar was held on September 8, 2015. This webinar included presentations on several ongoing CDE programs by NIH staff who are engaged in the BMIC CDE Working Group.

    Webcast Recording and Meeting Slides

  • 21

    Recent Developments in Artificial Intelligence - Lessons from the Private Sector

    September 21, 2015

    Please join us for a Frontiers in Data Science Seminar. Andrew Moore is the Dean of the School of Computer Science at Carnegie Mellon University. His areas of research and expertise include decision and control algorithms, statistical machine learning, artificial intelligence, robotics, and statistical computation for large volumes of data. Andrew Moore previously served as the VP of Engineering at Google Pittsburgh, where he was responsible for the retail segment: Google Shopping. Andrew was involved with a number of Google/University activities, two examples of which were Google Sky (in collaboration with CMU, Hubble Space Telescope Center and University of Washington) and the Android SkyMap app.

    Photograph of Andrew Moore Dean of Computer Science at Carnegie Mellon University

    Andrew Moore, Ph.D.

    Dean of Computer Science at Carnegie Mellon University

    Title: Recent Developments in Artificial Intelligence - Lessons from the Private Sector

    September 21, 2015 12:00-1:00 pm
    Lipsett Auditorium, NIH Main Campus Bldg 10

    This talk is co-sponsored by the Office of the Associate Director for Data Science and the National Library of Medicine.

    Abstract: Andrew Moore will discuss some of the big developments in computer science from the perspective of someone crossing over from industry to academia. He will talk about roadmaps for AI-based consumer and advice products in the commercial world and contrast them with some of the potentially viable roadmaps in healthcare. Andrew Moore will also touch on entity stores (aka knowledge graphs), question answering and ultra-large data center architectures.

    Videocast Information for this lecture will be available soon.


  • 21

    JHU DaSH - Data Science Hackathon

    September 21, 2015

    Join a team of data scientists to tackle real-world, cutting-edge problems!

    The NIH Big Data to Knowledge (BD2K) program and the NIH Library are pleased to join the Johns Hopkins (JHU) Bloomberg School of Public Health Department of Biostatistics in announcing the first JHU DaSH – Data Science Hackathon.

    This event will be an opportunity for data scientists and data scientists-in-training to get together and hack on real-world problems collaboratively and to learn from each other. The DaSH will feature data scientists from government, academia, and industry presenting problems and describing challenges in their respective areas. There will also be a number of networking opportunities where attendees can get to know each other. We think this will be a fun event and we encourage people from all areas, including students (graduate and undergraduate), to attend.

    Location: Baltimore, MD

    Dates: September 21-23, 2015

    Event Website:

    Application (NIH Staff and Trainees): This event requires application. NIH staff or trainees who would like to attend should complete the application at (no later than Aug 14th) rather than the one on the website. For questions, contact Lisa Federer in the NIH Library.

    Application (Non-NIH): This event requires application. Non-affiliates of NIH should apply directly through

  • 16

    Data Analysis with Pipes

    September 16, 2015

    Please join us for a Frontiers in Data Science Seminar.  Hadley Wickham, Ph.D., Chief Scientist at RStudio and Adjunct Assistant Professor at Rice University is the author of several of the most revolutionary, influential, and popular software packages for the R statistical software environment including dplyr, ggplot2, reshape2, and numerous others.

    Photograph of Hadley Wickham

    Hadley Wickham, Ph.D.

    Chief Scientist, RStudio and Adjunct Assistant Professor, Rice University

    Title: Data Analysis with Pipes

    September 16, 2015 2:30-3:30 pm
    NIH Bldg 40 Room 1201/1203

    This talk is co-sponsored by the Office of the Associate Director for Data Science and the National Cancer Institute.

    Abstract: Over the last year and a half, three things have had a profound impact on how I develop tools for data analysis: Rcpp, writing the Advanced R book, and the pipe operator (%>%, from magrittr). In this talk, I'll focus on the pipe operator and how it’s influenced the development of tidyr, dplyr and ggvis, the next generation of reshape2, plyr and ggplot2. Come along to learn about why I think pipelines are awesome and see how pipelines + tidyr, dplyr, and ggvis can make your data analysis fast, fluent and fun.
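
    As a rough illustration of the pipeline idea in the abstract, the sketch below chains small functions left to right. It is written in Python rather than R, it is not magrittr or dplyr, and the `pipe` helper, the step functions, and the sample records are all hypothetical names made up for this example.

```python
from functools import reduce


def pipe(value, *steps):
    """Pass `value` through each step in order, similar in spirit to x %>% f %>% g."""
    return reduce(lambda acc, step: step(acc), steps, value)


# Hypothetical example data: (sample name, dose in mg) records.
records = [("a", 10), ("b", 25), ("c", 40), ("d", 5)]


def keep_high_dose(rows):
    # Keep only records with a dose of at least 10 mg.
    return [row for row in rows if row[1] >= 10]


def doses(rows):
    # Drop the names and keep the numeric dose column.
    return [dose for _, dose in rows]


def mean(values):
    return sum(values) / len(values)


# Reads top to bottom as: filter, then select, then summarize.
mean_high_dose = pipe(records, keep_high_dose, doses, mean)
print(mean_high_dose)
```

    Each step takes the previous result as its only argument, so the analysis reads in the order it runs instead of as nested calls such as `mean(doses(keep_high_dose(records)))`.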

    This event will not be videocast but remote participants may join the live webmeeting at
    Call-in number: 1-(888) 971-0934, passcode: 32359385# 

  • 15

    MD2K Center of Excellence All Hands Meeting

    September 15, 2015

    The MD2K Center of Excellence Annual All Hands Meeting

    Dates: September 15-16, 2015

    Location: Memphis, TN


    This meeting will be videocast live. The program agenda, list of attendees, and information on how to join the meeting remotely are available at the meeting website

  • 20

    MD2K Seminar Series: Introduction to First Person Vision

    August 20, 2015

    Recent progress in miniaturizing digital cameras and improving battery life has created a growing market for wearable cameras, exemplified by products such as GoPro and Google Glass. At the same time, the field of computer vision, which is concerned with the automatic extraction of information about the world from images and video, has also made rapid progress due to the increasing availability of image data, increases in computational power, and the emergence of machine learning methods such as deep learning. The analysis of video captured from body-worn cameras is an emerging subfield of computer vision known as First Person Vision (FPV). FPV provides new opportunities to model and analyze human behavior, create personalized records of visual experiences, and improve the treatment of a broad range of mental and physical health conditions. In this talk I will provide an introduction to some of the concepts and methods from computer vision which underlie the analysis of first person videos. In particular, I will focus on the automatic analysis of video to track the motion of the camera and recover the 3D geometry of the scene, recognize activities, and detect and recognize objects of interest. This seminar will also briefly discuss the role of visual attention in FPV. The presentation won’t assume any prior knowledge of computer vision. The second presentation will focus on specific FPV technologies in the context of the BD2K Mobile Sensor Data-to-Knowledge Center (MD2K).

    Title: MD2K Seminar Series: Introduction to First Person Vision (Seminar 2-of-2)

    Date/Time:  Thursday, August 20 – 3:00 pm CT

    Presented by: Dr. James M. Rehg, Professor, College of Computing - Georgia Institute of Technology, Deputy Director, MD2K Center of Excellence for Mobile Sensor Data-to-Knowledge


    *The first part of the Seminar: Introduction to First Person Vision is archived on YouTube at

    Learning Objectives: Following the presentation, attendees will be able to:

    • Describe some basic analysis goals for first person video and identify some of the challenges posed by automatic video analysis
    • Summarize the relationship between the movement of a body-worn camera in 3D, the motion induced in a video sequence, and methods for estimating video motion
    • Outline a basic approach to activity recognition in first person video using either object or motion features, including the major system components and sources of error


    R. Szeliski, Computer Vision: Algorithms and Applications, Chapters 6, 7, and 8
    E-book -
    H. Wang and C. Schmid, Action Recognition with Improved Trajectories. In Proc. IEEE Intl. Conf. on Computer Vision (ICCV 13), pp. 3551-3558, Sydney, Australia, Dec 2013.
    Paper -
    Source code -
    H. Pirsiavash and D. Ramanan. Recognizing Activities of Daily Living in First-Person Camera Views. In Proc. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR 12), Providence, RI, June 2012.
    Paper -
    Source code and dataset -

    About the Presenter: James M. Rehg (pronounced "ray") is a Professor in the School of Interactive Computing at the Georgia Institute of Technology, where he is co-Director of the Computational Perception Lab (CPL) and Director of the Center for Behavioral Imaging. He received his Ph.D. from CMU in 1995 and worked at the Cambridge Research Lab of DEC (and then Compaq) from 1995-2001, where he managed the computer vision research group. He received an NSF CAREER award in 2001 and a Raytheon Faculty Fellowship from Georgia Tech in 2005. He and his students have received best student paper awards at ICML 2005, BMVC 2010, Mobihealth 2014, and Face and Gesture 2015, and a 2013 Method of the Year Award from the journal Nature Methods. Dr. Rehg serves on the Editorial Board of the Intl. J. of Computer Vision, and he served as the Program co-Chair for ACCV 2012 and General co-Chair for CVPR 2009, and will serve as Program co-Chair for CVPR 2017. He has authored more than 100 peer-reviewed scientific papers and holds 25 issued US patents. His research interests include computer vision, machine learning, pattern recognition, and robot perception. Dr. Rehg is the lead PI on an NSF Expedition to develop the science and technology of Behavioral Imaging, the measurement and analysis of social and communicative behavior using multi-modal sensing, with applications to developmental disorders such as autism. He also serves as the Deputy Director of the NIH Center of Excellence on Mobile Sensor Data-to-Knowledge (MD2K). See and  for details.

    The MD2K Seminar Series is a service of the MD2K Center of Excellence for Mobile Sensor Data-to-Knowledge

  • 14

    NIH Data Science Distinguished Seminar Series: BRAIN/BD2K Seminar

    August 14, 2015

    Towards solutions to experimental and computational challenges in neuroscience

    Christof Koch, Ph.D.
    President and Chief Scientific Officer, Allen Institute for Brain Science
    Emery N. Brown, M.D., Ph.D.
    Professor of Computational Neuroscience and Health Sciences and Technology, Department of Brain and Cognitive Sciences
    MIT-Harvard Division of Health Sciences and Technology

    Drs. Koch and Brown will describe the computational or experimental challenges associated with Big Data in their respective domains of neuroscience. From the basic to applied realms, science is being transformed by the collection of data on increasingly finer resolutions, both spatially and temporally. Storing, accessing, and analyzing these data create numerous challenges as well as opportunities. 

    Location: Masur Auditorium, Building 10, NIH Main Campus, Bethesda, MD

    Videocast: This event will be videocast. To view the videocast please go to

    Attending the seminar: This is a public event at the National Institutes of Health. All individuals interested in the seminar may attend. If this will be your first time visiting the NIH we strongly encourage you to review the visitor information at and allow extra time for security and transit. Individuals with disabilities who need Sign Language Interpreters and/or reasonable accommodation to participate in this event should contact Sonynka Ngosso, at (301) 402-9816. Requests should be made at least 5 business days in advance of the event.

    About the Speakers:

    Photograph of Christof Koch
    Christof Koch, Ph.D. is the President and Chief Scientific Officer of the Allen Institute for Brain Science. His research interests include elucidating the biophysical mechanisms underlying neural computation, understanding the mechanisms and purpose of visual attention, and uncovering the neural basis of consciousness and the subjective mind. Dr. Koch has published extensively, and his writings and interests integrate theoretical, computational and experimental neuroscience. His most recent book, Consciousness: Confessions of a Romantic Reductionist, blends science and memoir to explore topics in discovering the roots of consciousness. Stemming in part from a long-standing collaboration with the late Nobel Laureate Francis Crick, Koch authored the book The Quest for Consciousness: A Neurobiological Approach. He has also authored the technical books Biophysics of Computation: Information Processing in Single Neurons and Methods in Neuronal Modeling: From Ions to Networks, and served as editor for several books on neural modeling and information processing.

    Photograph of Emery Brown

    Emery N. Brown, M.D., Ph.D. is the Warren M. Zapol Professor of Anaesthesia at Harvard Medical School; an anesthesiologist at Massachusetts General Hospital (MGH); and the Edward Hood Taplin Professor of Medical Engineering and Computational Neuroscience at Massachusetts Institute of Technology. Dr. Brown received his B.A. (magna cum laude) in Applied Mathematics from Harvard College, his M.A. and Ph.D. in statistics from Harvard University and his M.D. (magna cum laude) from Harvard Medical School. Dr. Brown completed his internship in internal medicine at the Brigham and Women’s Hospital and his residency in anesthesiology at MGH.

    Dr. Brown is an anesthesiologist-statistician whose experimental research has made important contributions to understanding the neuroscience of how anesthetics act in the brain to create the states of general anesthesia, using the EEG to accurately monitor the anesthetic state and devising new approaches to precisely control the anesthetic state. Dr. Brown is also widely recognized for his statistics research in which he has developed statistical methods to analyze dynamic processes in neuroscience.

    Dr. Brown served on the NIH BRAIN Initiative Working Group and is a member of the International Anesthesia Research Society Board of Trustees. Dr. Brown is the recipient of an NIH Director’s Pioneer Award, an NIH Director’s Transformative Research Award, the 2011 Jerome Sacks Award from the National Institute of Statistical Science, a 2015 Guggenheim Fellowship in Applied Mathematics and the American Society of Anesthesiologists 2015 Excellence in Research Award.

    He is a fellow of the American Statistical Association, the IEEE, the American Association for the Advancement of Science and the American Academy of Arts and Sciences. Dr. Brown is the first and only anesthesiologist to be elected a member of all three branches of the National Academies: the National Academy of Medicine (formerly the Institute of Medicine), the National Academy of Sciences and the National Academy of Engineering.

  • 13

    Harnessing Big Data to Stop HIV

    July 13, 2015

    The Division of AIDS, NIMH Division of AIDS Research, NIH Big Data to Knowledge, and the Bill and Melinda Gates Foundation are hosting a meeting, Harnessing Big Data to Stop HIV, to further explore the potential for Big Data approaches to answer questions in HIV research. Over one and a half days, sessions will address examples of Big Data research, legal and ethical issues, and statistical and methodological challenges and solutions. Following a series of talks on these topics, there will be break-out sessions focused on five critical research areas: (1) HIV among women and girls, (2) HIV among men who have sex with men, (3) identification of those recently infected with HIV, (4) the HIV continuum of care, and (5) ethical, legal and policy challenges in big data research on HIV. In the breakout groups, participants will work together to define research questions that could be addressed with Big Data approaches, as well as new challenges that would arise in the use of Big Data techniques in each of these areas.

    Registration is required and webcasting will be available. This meeting is a companion to the new funding announcement NIH PA-273, Harnessing Big Data to Stop HIV, which is being solicited by NIAID, NCI, NIDA, and NIMH.

    Location: 5601 Fishers Lane, Bethesda Maryland




    Day 1:
    Day 2:

    The organizing committee:

    Pim Brouwers, NIMH               Sam Garner, NIAID

    David Burns, NIAID                    Rosemary McKaig, NIAID

    Liza Dawson, NIAID                   Joana Roe, NIAID

    Elizabeth Flanagan, NIAID       Carolyn Williams, NIAID

  • 13

    Big Data Course for Computational Medicine (Mayo Clinic)

    July 13, 2015

    Precision medicine’s promise to deliver the right treatment at the right time relies on our ability to extract information from high-dimensional data sets that combine traditional clinical data in electronic health records with data generated by high-throughput technologies. To meet this challenge, new approaches for data representation, integration, analysis, visualization and sharing need to be developed collaboratively by quantitative scientists, biomedical researchers, clinicians, and bioethicists. This joint Mayo Clinic and University of Minnesota week-long Big Data Coursework for Computational Medicine (BDC4CM) is funded by the U.S. National Institutes of Health (NIH). BDC4CM will emphasize how to navigate the interface between research and practice by offering participants in-depth lectures, case studies and hands-on training from leading researchers in academia and industry. This short course will be held at the Mayo Clinic in Rochester, MN and is supported by an award from the BD2K Enhancing Training program. Additional information about this course can be found at

  • 6

    OHSU BD2K Skills Course (Oregon Health Sciences University)

    July 6, 2015

    The OHSU BD2K Skills Courses provide a series of training opportunities for a variety of learners. A set of interactive on-line preparatory elements with one-on-one mentoring from data scientists is in place to provide initial competence. For all students, we utilize an in-person collaborative case study focused on a set of big data challenges over a week of in-depth instructions. For advanced students, we have selected extremely challenging problems to promote new methodologies in big data science.

    The next offering of the OHSU BD2K Skills Course will be held July 6-10th, 2015, 9am-5pm daily at Oregon Health Sciences University (OHSU) in Portland, OR and is focused on early learners and undergraduate interns. This course will cover the basics of data science in research including:

    • Defining a Scientific Problem
    • Data identification and Resources
    • Data Wrangling
    • Methods Tools and Analysis
    • Scientific Communication

    See the OHSU BD2K Skills July 2015 Flyer for more detail. Mentors should contact Nicole Vasilevsky.

  • 6

    1st Summer Institute in Statistics for Big Data (University of Washington)

    July 6, 2015

    The 1st Summer Institute in Statistics for Big Data (SISBID 2015) consists of a series of two-and-a-half day workshops (modules) designed to introduce biologists, quantitative scientists, and statisticians to modern statistical techniques for the analysis of Biological Big Data. The format will involve formal lectures, computing labs, and hands-on case studies. The instructors are world-class faculty with expertise in all aspects of Biological Big Data. Participants are encouraged to enroll in multiple modules.

    Dates: July 6-22, 2015

    Location: University of Washington in Seattle, Washington



    Five modules are available during the Institute:

    • Accessing Biomedical Big Data - July 6-8, 2015
    • Visualization for Biomedical Big Data - July 8-10, 2015
    • Supervised Methods for Statistical Machine Learning - July 13-15, 2015 (REGISTRATION FULL)
    • Unsupervised Methods for Statistical Machine Learning - July 15-17, 2015
    • Reproducible Research for Biomedical Big Data - July 20-22, 2015

  • 24

    Precision Medicine 2015: Patient Driven

    June 24, 2015

    President Obama launched the Precision Medicine Initiative on January 30th. That event led us to organize a conference at HMS on Precision Medicine. The theme of this inaugural conference of a planned series is Precision Medicine: Patient Driven.

    There will be a lot of interesting and interested people at the conference. It will start with a keynote by Matt Might and include talks by Krishna Yeshwant, Linda Avey and Arlene Sharpe. In this 1.5 day meeting there will be expert panels on how we are going to pay for precision medicine, on the ethical and regulatory challenges, delivering precision medicine to the point of care, and several inspiring examples of patient-driven successes.

    Location: Joseph Martin Conference Center, Harvard Medical School

    Sponsor: BD2K Centers - Patient Centered Information Commons


    Agenda: Preliminary Agenda


  • 8

    Causal Discovery from Biomedical Data

    June 8, 2015

    The Center for Causal Discovery (CCD)  Summer Short Course on Causal Discovery from Biomedical Data.

    This intense (and fun), 4-day, hands-on learning event will introduce data scientists and biomedical investigators at the graduate level and above who understand basic statistical principles to:

    • Major topics in causal graphical modeling
    • Machine learning of causal models from biomedical data
    • Advances in biomedical research through causal discovery

    The CCD short course is directed by Dr. Richard Scheines, a leader in the field of statistical causal models and a pioneer in education technology.  Attendees will also learn from and work with other CCD members who are experts in data science and biomedical research at the University of Pittsburgh, Carnegie Mellon University, and the Pittsburgh Supercomputing Center.  For more information on the course, please see the attached announcement and visit the CCD Summer Short Course


    Short Course Dates: Monday, June 8 – Thursday, June 11, 2015

    Location: Carnegie Mellon University, Pittsburgh, PA

    Registration Deadline: Friday, May 15, 2015 – closed at 75 registrants; RE-OPENED temporarily (25 additional slots)

    Online Registration:

    Costs: No registration fee, but attendees will be responsible for travel, housing (discount dorm and hotel rooms available), and most meals

  • 21

    MD2K Webinar with Emre Ertin: Contactless Physiological Sensing in the Mobile Environment

    May 21, 2015

    Please join us for the newest installment in the MD2K Webinar Series:


    Contactless Physiological Sensing in the Mobile Environment using Ultrawideband Radio-frequency Probes


    Presented by:

    Emre Ertin, Ph.D.
    Sensor Platform Technologist, MD2K Center

    Research Associate Professor

    Department of Electrical and Computer Engineering

    The Ohio State University


    About the topic:


    Physiological monitoring in the mobile environment can provide visibility into the health status of individuals such as cardio-respiratory state, psychological health, addictive behavior, and patterns of social interaction. Physiological monitoring today, however, requires wearing ECG electrodes or respiration belts, and is therefore only suitable for small-scale research studies with short-term data collection in the field. In this talk, we will review our recent efforts in the development of non-contact radio-frequency (RF) sensors for unobtrusive monitoring of physiology in the mobile environment that can enable large-scale research studies into the potential causes of complex diseases and risky behavior. The talk will give a brief overview of the EasySense sensor design, a low-power micro ultrawideband (UWB) radar platform for monitoring of body composition and heart-lung motion. We will also present algorithms for learning and exploiting the subspace structure of high-dimensional data from the EasySense sensor for detecting and analyzing heart and lung motion.


    About the speaker:


    Emre Ertin, Ph.D., is a Research Associate Professor with the Department of Electrical and Computer Engineering at The Ohio State University. He received the B.S. degree in Electrical Engineering and Physics from Bogazici University in Turkey in 1992, the M.Sc. degree in Telecommunication and Signal Processing from Imperial College, U.K. in 1993, and the Ph.D. degree in Electrical Engineering from Ohio State in 1999. From 1999 to 2002 he was with the Core Technology Group at Battelle Memorial Institute. His current research interests are biomedical sensor design and statistical signal processing with application to sensor networks and mobile health.


    To join the Meeting:


    To join via Browser:


    To join with Lync:



    MD2K Webinars are a service of the MD2K Center of Excellence for Mobile Sensor Data-to-Knowledge, a NIH Big Data to Knowledge Center of Excellence for Big Data Computing

  • 7

    Network of BioThings Hackathon

    May 7, 2015

    During this hackathon at the Scripps Hazen Campus in La Jolla, CA we will work on projects relating to any of the major challenges of biomedical big data.

    These may include:

    • Locating, gaining access to, standardizing, and documenting data, software tools, and APIs
    • Developing effective, robust, reproducible and interoperable means of sharing data and software tools
    • Developing new methods for analyzing biomedical big data
    • Training researchers for analyzing biomedical big data

    1st BD2K / 3rd Network of BioThings Hackathon

    When: May 7-9, 2015

    Where: The Scripps Research Institute (TSRI) Hazen Campus in La Jolla, CA


    Registration: Register Here

    Fees: $30.00. Max 50 participants - first come first served! Includes food & drinks, eligibility for prizes, and a fancy t-shirt.

    Scholarships: Several scholarships will be available to promote education and diversity. Please fill out this form to be considered for a travel award or scholarship.

    This event is sponsored by the Heart of Data Science BD2K Center of Excellence, the CEDAR BD2K Center of Excellence, and the NHLBI Proteomics Center / COPaKB, and is supported by the International Society for Biocuration, Sage Bionetworks, the European Bioinformatics Institute, the BioCaddie BD2K Data Discovery Index, the San Diego Center for Systems Biology, and Geek Girls.

    Please send questions or comments about this event to the Network of Biothings mailing list at!forum/network-of-biothings or to gstupp at scripps


  • 30

    Webinar: Micro-randomized Trials for Just-in-time Adaptive Intervention Development

    April 30, 2015

    The Mobile Data to Knowledge (MD2K) Center is offering a webinar by Dr. Susan Murphy titled "Micro-randomized Trials for Just-in-time Adaptive Intervention Development," to be held Thursday, April 30 at 4:00 pm CT (5pm ET). This seminar is being offered as part of a BD2K-funded Center of Excellence.

    Micro-randomized trials are trials in which individuals are randomized 100's or 1000's of times over the course of the study. The goal of these trials is to assess the impact of momentary interventions, e.g. interventions that are intended to impact behavior over small time intervals. We discuss the design and analysis of these types of trials with a focus on their use in developing JITAIs in mobile health.
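
    To make the repeated-randomization idea concrete, here is a minimal simulation sketch. It is not the design or analysis method presented in the webinar: the outcome model, the constant effect size, the 0.5 randomization probability, and all function names are assumptions chosen for illustration, and the estimate is a naive difference in means that ignores within-person correlation.

```python
import random


def simulate_mrt(n_people, n_points, p_treat, effect, seed=0):
    """Simulate a toy micro-randomized trial.

    Every person is re-randomized at every decision point. The short-term
    outcome is standard normal noise plus `effect` when treatment was
    delivered (an assumed, deliberately simple outcome model).
    """
    rng = random.Random(seed)
    treated, control = [], []
    for _ in range(n_people):
        for _ in range(n_points):
            gets_treatment = rng.random() < p_treat
            outcome = rng.gauss(0.0, 1.0) + (effect if gets_treatment else 0.0)
            (treated if gets_treatment else control).append(outcome)
    return treated, control


def difference_in_means(treated, control):
    # Naive estimate of the momentary treatment effect.
    return sum(treated) / len(treated) - sum(control) / len(control)


# 50 people, each randomized 200 times, gives 10,000 randomizations in total.
treated, control = simulate_mrt(n_people=50, n_points=200, p_treat=0.5, effect=0.3)
estimate = difference_in_means(treated, control)
print(round(estimate, 2))
```

    Because each person contributes many randomizations, even a modest number of participants yields a large number of treated and untreated moments to compare.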

    Join the webinar at:

    Susan A. Murphy, Ph.D., is the H.E. Robbins Distinguished University Professor of Statistics, Professor of Psychiatry and Research Professor, Institute for Social Research, University of Michigan. She directs the Statistical Reinforcement Learning Lab at the University of Michigan. Her research concerns clinical trial design and the development of data analytic methods for informing multi-stage decision making in health, in particular for (1) constructing individualized sequences of treatments (a.k.a., adaptive interventions) for use in informing clinical decision making and (2) constructing real-time individualized sequences of treatments (a.k.a., Just-in-Time Adaptive Interventions) delivered by mobile devices. Dr. Murphy has developed a formal model of this decision-making process and an innovative design for clinical trials called the Sequential Multiple Assignment Randomized Trial (SMART) that allows researchers to test the efficacy of adaptive interventions. In 2014, she was elected a member of the National Academies’ Institute of Medicine, and in 2013, she was selected as a MacArthur Fellow.


  • 29

    Webinar for RFA-CA-15-006 (BD2K) Advancing Biomedical Science Using Crowdsourcing and Interactive Digital Media (UH2)

    April 29, 2015

    The NIH recently issued the Funding Opportunity Announcement (FOA) RFA-CA-15-006 "Big Data to Knowledge (BD2K) Advancing Biomedical Science Using Crowdsourcing and Interactive Digital Media (UH2)." An applicant informational webinar will be held on April 29, 2015, from 2:00 p.m. to 3:00 p.m. (Eastern Daylight Time) to provide information about this FOA to assist prospective applicants. NIH staff will discuss the FOA's goals and objectives, the review process, and address questions. The webinar is open to all prospective applicants, but participation in the webinar is not a prerequisite to applying.

    To participate in the webinar, use the information provided below:

    Webinar Site:

    Interactive Media FOA Webinar Slides

    Meeting number: 733 334 360

    Meeting Password: BigData1@

    Please direct all inquiries to:

    David J. Miller, Ph.D.
    National Cancer Institute (NCI)
    Telephone: 240-276-6210

    Click here for more details on the NIH Calendar

  • 20

    Accelerating Cross-Sectoral Collaboration on Data in Climate, Education and Health

    March 20, 2015

    Hosted by the National Institutes of Health (NIH), the National Oceanic & Atmospheric Administration (NOAA), the Office of Science and Technology Policy (OSTP), and The Governance Lab (The GovLab/NYU)

    This workshop will bring together government agencies, companies, data scientists and academics. This diverse set of participants will harness various competencies and areas of expertise to address existing knowledge gaps in the nascent field of data sharing. The focus of the discussion will be on the following issues:


    • Existing mechanisms of cross-sectoral data sharing
    • Factors currently driving cross-sectoral data sharing (including incentives to participate in data collaboratives; the value and impact of sharing for institutions and society at large; and methods and techniques for mitigating risks in data sharing)
    • Legal dimensions of cross-sectoral data sharing (including an examination of existing legal conventions that govern data sharing)
    • Best practices in forming and furthering data collaboratives


    The goal of the workshop is to identify and begin to address existing knowledge gaps for each of these areas. In doing so, the workshop will seek to deepen the value proposition of cross-sectoral sharing; foster greater participation and coordination among corporations and public organizations; and more generally enable the use of data and data sharing towards the greater public good.

    This workshop is open to invited participants only.

  • 13

    Pi Day at NIH

    March 13, 2015

    The National Institutes of Health will hold a Pi Day Celebration on the NIH main campus in Bethesda, MD and online on Pi Day Eve, March 13, 2015. The goal of the NIH Pi Day Celebration is to increase awareness across the biomedical science community of the role that the quantitative sciences play in biomedical science.


    Event URL:

    Videocast: Event will be videocast LIVE on the Web at

  • 25

    NIH BD2K Workshop on Community-Based Data and Metadata Standards

    February 25, 2015

    Chairs: Melissa Haendel, Ph.D. and Christopher Chute, M.D., Dr.P.H.
    NIH Lead Organizers: Cindy P. Lawler, Ph.D.

    Executive Summary:

    BD2K is formulating approaches to encourage development and facilitate the use of data-related (including metadata) standards more broadly across the biomedical research community and is, therefore, interested in the issues involved in developing Community-Based Standards (CBS). The goals of this workshop are to identify:

    • Effective approaches, processes, and activities that could advance the community-based standards landscape (e.g., creating a collaborative workspace or an advising structure toward standards development, extension, or adoption).
    • Gaps in community-based data standards of relevance to biomedical research, including real use-cases (e.g., emerging fields and technologies, or research domains with multiple existing data standards that could benefit from additional work, integration and/or reconciliation).
    • Lessons learned from existing CBS efforts, particularly examples with field-tested processes and infrastructure or known examples of failures by CBS efforts.
    • Common challenges in CBS development (e.g., methods for community engagement or building interoperability with other related standards).
    • Considerations for evaluating progress and milestones to assess data standards development and utility.
    • Effective approaches for addressing the need to sustain useful standards, and to update existing standards as a field develops.

    This workshop is open to invited participants only.

  • 11

    BD2K EHR Data Methodologies for Clinical Research: Perspectives from the Field Think Tank

    December 11, 2014

    Co-Chairs: Michael Kahn, M.D., Ph.D. and David Madigan, Ph.D.
    NIH Lead Organizers: Elaine Collier, M.D. and Gina Wei, M.D., M.P.H.

    This think tank convened a small number of experts specifically to address methods for optimizing the robustness and use of data from the Electronic Health Records (EHR) for a variety of clinical research purposes that fall within NIH’s domain. Given the potential broad scope of this topic, participants were asked to focus primarily on issues related to the use of EHR on the ‘back end’ (i.e. the imperfect data as currently collected), rather than strategies to improve the quality of EHR data collected on the ‘front end’ (e.g., data entered by clinicians). Experts in accessing EHR data and experts in study design and analysis methods for research using EHR data presented the challenges, solutions, and needs based on their experience and knowledge of the field.

    Workshop Report: EHR Data Methods Workshop Report


    Agenda (with links to slide presentations): EHR Data Methods Workshop

    Contacts: Elaine Collier and Gina Wei

  • 8

    NIH BD2K Think Tank: Game Developers and Biomedical Researchers

    December 8, 2014

    December 8-9, 2014 
    Co-Chairs: Ben Sawyer and Markus Covert, Ph.D.
    NIH Lead Organizers: David Miller, Ph.D. and Jennifer Couch, Ph.D.

    Report | Agenda | Participant List

    As a component of the BD2K program, the National Institutes of Health is hosting a diverse group of game developers and biomedical researchers in a think tank exploring research games and the application of game methods and technologies for biomedical research. The purpose of this think tank is to explore the opportunities in and begin to address challenges of how these two communities – Game Developers and Biomedical Researchers – currently collaborate, exchange data science & visualization expertise, and develop games for enabling and performing biomedical research that addresses important science and health issues that affect everyone. This day and a half meeting will focus discussions on the following themes: 1) the technical and social infrastructure that enables Game Developers and Scientific Researchers to first find each other and then create new games, tools, and interfaces to research, 2) the common elements across biomedical research problems and games that both communities can address, and 3) the marketplace for matching games-amenable problem holders to solution providers.


    Join Day 1 WebEx (Dial In #: 1-240-276-6338) 
    Meeting ID #: 732 503 587 
    Password: Dcb@12345 

    Join Day 2 WebEx (Dial In #: 1-240-276-6338) 
    Meeting ID #: 731 351 883 
    Password: Dcb@12345 

    Twitter: #BD2K #ResearchGames

  • 2

    NIH BD2K Joint Kick-Off Meeting

    November 2, 2014

    Co-Chairs: Lisa Brooks & Ron Margolis

    Meeting Agenda

    The intention of this meeting is for the DDICC and BD2K Center investigators to discuss the goals of their consortia and how to collaborate with each other, other BD2K projects, and the NIH Commons.


  • 3

    ADDS Data Science Workshop

    September 3, 2014

    Introduction | BD2K Overview | Training Overview | NIH Commons Overview

    The goal of this workshop was to gather a group of external experts in biomedical data science, including some members of the original Data and Informatics Working Group, to discuss the future of data science at NIH. The information gathered at the meeting will be used to chart future efforts of the newly-formed ADDS office and the BD2K program.


    Agenda: ADDS Meeting Agenda

    Participant List: ADDS Meeting Attendees


  • 12

    BD2K Software Discovery Workshop

    May 12, 2014

    Co-Chairs: Dr. Owen White and Dr. Asif Dhar

    Workshop Report | Agenda | Participant List

    The Software Discovery workshop explored the challenges and opportunities associated with citing, tracking, and sharing biomedical software. We were interested in gaining an understanding of approaches for making software easier to locate via computer-readable metadata, digital identifiers, and other methods. In addition, the workshop focused on identifying the needs of biomedical software users and developers as they seek to find, cite, and use these tools in biomedical research. Finally, we identified potential barriers and incentives to the adoption and use of these different discovery, citation, and tracking methods. The workshop was organized around three major sessions: Finding and Tracking Software; Software Citation and Other Incentives; and Software Reproducibility.



    Twitter Feed: #bd2kSDW

    Videocast Day 1:

    Videocast Day 2:

  • 13

    Applicant Information Webinar for RFA-HG-14-001 BD2K-LINCS-Perturbation Data Coordination and Integration Center (DCIC) (U54)

    January 13, 2014

    Chairs: Dr. Jennie Larkin & Dr. Ajay Pillai

    An Applicant Information Webinar will be held on Monday, January 13, 2014, from 1:00 – 2:30 pm ET, to provide information about the Library of Integrated Network-Based Cellular Signatures (LINCS) FOA to prospective applicants. This FOA seeks applications to develop a data coordination and integration center (DCIC) that will address the opportunities and challenges provided by two major NIH efforts: Big Data to Knowledge (BD2K) and LINCS. The NIH expects that this BD2K-LINCS DCIC will focus on perturbagen-response data and signatures while ensuring that the resulting resources are effectively utilized by the community by addressing challenges related to biomedical Big Data. A successful DCIC will ensure consistent annotation of data and tools generated within the LINCS program; incorporate (without replicating databases) relevant non-LINCS perturbation data into the LINCS resource; support integration of relevant data, signatures, and tools to allow for seamless exploration of the LINCS program’s output by a broad range of biomedical researchers; support linkages to outside knowledge bases, data portals, and resources; support training in perturbation-data science skills; build innovative access and query tools to disparate databases hosting multiple data types; and disseminate the resulting tools and resources to the broad range of biomedical researchers. Related FOAs include RFA-RM-13-013, soliciting applications for LINCS data and signature generation, and RFA-HG-13-009, soliciting applications for BD2K Centers of Excellence.

    Funding Opportunities:

    Information on how to participate in the webinar:
    Registration: None
    Audio: Dial: 1-800-779-8174
               Participant Code: 38-60-157
    Slides with background information that will be displayed during the webinar will be available at  shortly before the webinar begins. A document with the questions and responses addressed during the webinar, and an audio recording of the webinar will also be made available on this website.

  • 25

    Frameworks for Community-Based Standards Efforts

    September 25, 2013

    Co-Chairs: Susanna Sansone, Ph.D. and David Kennedy, Ph.D.

    Workshop Summary | Workshop Report

    The overall goal of this workshop is to learn what has worked and what has not worked in community-based standards efforts. Participants will have experience in leading specific community-based standards initiatives. Prior to the workshop, participants will be asked to answer in writing specific questions regarding formulating, conducting, and maintaining such efforts. This information will be used to facilitate focused and actionable discussion at the workshop. Issuance of a Request for Information soliciting comment from the broader community on some of the key issues addressed in the workshop is currently envisioned.

    Agenda: Frameworks for Community-Based Standards Efforts (PDF 40.7KB)
    Participant List: Roster of Invited Participants (PDF 32KB)

  • 12

    Applicant Information Webinar for RFA HG-13-009 Centers of Excellence for Big Data Computing in the Biomedical Sciences (U54)

    September 12, 2013

    Chair: Dr. Carson Loomis


    This webinar will provide information about the FOA (RFA HG-13-009) to prospective applicants. NIH staff will provide an overview of the FOA and answer questions. The webinar is open to all prospective applicants. Participation in the webinar is not a prerequisite for applying and is not required for a successful application.

    Potential applicants are encouraged to submit their questions or comments prior to the meeting. Afterwards, the webinar slides and a summary of the questions and answers will be posted on the site.



    Click here for Frequently Asked Questions (FAQs) and answers; this information may be updated without additional notice.


    Webinar Video:

    Video Transcript: Centers of Excellence for Big Data Computing in Biomedical Sciences (U54) Webinar (PDF 71KB)


  • 11

    Enabling Research Use of Clinical Data

    September 11, 2013

    Co-Chairs: Robert M. Califf, M.D. and Daniel R. Masys, M.D.

    Agenda | Biographies | Workshop Report | Workshop Site

    This workshop will identify actionable steps that NIH can take (alone and with others) to enable research use of clinical data, e.g., in pragmatic clinical trials, observational studies, and genome-phenome relationships using electronic health records and other clinical data. In particular, we will consider needs for: 1) research and development of new technologies and methods; 2) common infrastructure to enable the future research scenarios; and 3) policy changes necessary to facilitate progress.



    Videocast Day 1:

    Videocast Day 2:

  • 21

    NIH Data Catalog

    August 21, 2013

    Chair: Francine Berman, Ph.D.

    Workshop Summary | Agenda | RFI Response Summary | Participant List

    This workshop seeks to identify the least duplicative and burdensome, and most sustainable and scalable method to create and maintain an NIH Data Catalog. An NIH Data Catalog would make biomedical data findable and citable, as PubMed does for scientific publications, and would link data to relevant grants, publications, software, or other relevant resources. The Data Catalog would be integrated with other BD2K initiatives as part of the broad NIH response to the challenges and opportunities of Big Data and seek to create an ongoing dialog with stakeholders and users from the biomedical community.


  • 29

    Workshop on Enhancing Training for Biomedical Big Data

    July 29, 2013

    Co-Chairs: Karen Bandeen-Roche, Ph.D. and Zak Kohane, M.D., Ph.D.

    RFI (NOT-HG-13-003) | RFI Summary | Workshop Report

    This workshop provided recommendations that will guide NIH staff in the development of long- and short-term training initiatives, which aim to prepare and empower the biomedical research community to take full advantage of Big Data. The workshop (a) identified the knowledge and skills needed by individuals and by collaborating teams to work productively with biomedical Big Data, and (b) discussed resources and programs needed to help both trainees and practicing scientists acquire the identified knowledge and skills.


    Agenda: Workshop on Enhancing Training for Biomedical Big Data Agenda (PDF 149KB)

    Videocast Day 1:

    Videocast Day 2:
