Toward an Ethical Framework for AI in Biomedical and Behavioral Research: Transparency for Data and Model Reuse Workshop

Wednesday, January 31, 2024

January 31–February 2, 2024

The goal of this workshop was to explore and assess the landscape of ethical artificial intelligence (AI) in the biomedical research context by gathering expert input on opportunities and challenges with respect to the ethical use and reuse of data and models in the AI development cycle. The interactive workshop aimed to:

  • Begin to develop transparency guidelines for NIH awardees using, developing, or contributing to AI.
  • Identify tools and capability gaps for ethical and transparent AI.
  • Identify trends and future states.

This workshop was convened by the NIH Office of Data Science Strategy (ODSS) and the NIH-wide AI Ethics Working Group. ODSS leads implementation of the NIH Strategic Plan for Data Science through scientific, technical, and operational collaboration with the institutes, centers, and offices that comprise NIH.

Read the Co-Chair and Breakout Lead report of the workshop, including important opportunities to enhance transparency for researchers and others using, developing, or contributing to AI.

Agenda

Day 1: Wednesday, January 31, 2024
Time | Presentation
12:30 p.m. – 1:00 p.m. | Registration and Networking (lunch not provided or available to purchase on day 1)
1:00 p.m. – 1:05 p.m. | Welcome and Logistics
Laura Biven, Ph.D.
Data Science Technical Lead, Office of Data Science Strategy (ODSS), National Institutes of Health (NIH)
1:05 p.m. – 1:15 p.m. | Welcoming Remarks
Susan K. Gregurick, Ph.D.
Director of the Office of Data Science Strategy, Associate Director for Data Science, ODSS, NIH
1:15 p.m. – 1:35 p.m. | Workshop Goals and Expectations; Stakeholder Mapping/Setting Expectations
Laura Biven, Ph.D.
Data Science Technical Lead, ODSS, NIH
1:35 p.m. – 1:55 p.m. | Introduction to Transparency
Julia Stoyanovich, Ph.D.
Associate Professor, Department of Computer Science and Engineering, Tandon School of Engineering, and the Center for Data Science, New York University
1:55 p.m. – 2:15 p.m. | What Could Go Wrong?
Tina Hernandez-Boussard, Ph.D.
Professor of Medicine, Biomedical Data Science, of Surgery and, by courtesy, of Epidemiology and Population Health, Stanford University School of Medicine
2:15 p.m. – 2:30 p.m. | Break
2:30 p.m. – 3:50 p.m. | Use Case Breakout Session #1: Exploring Use Cases and Stakeholder Mapping

  • Synthetic Data: Room 270-A
  • Data Sharing for General Reuse: Room 270-B
  • Multimodal Data: Room 260-F
  • Foundation Models: Room 280-A
  • Proxy Variables: Room 150-A
3:50 p.m. – 4:05 p.m. | Break
4:05 p.m. – 4:45 p.m. | Plenary Readout and Discussion
4:45 p.m. – 5:00 p.m. | Summary Remarks and Day 1 Closing
Aaron Lee, M.D., M.S.C.I.
Associate Professor, Department of Ophthalmology, University of Washington
Day 2: Thursday, February 1, 2024
Time | Presentation
8:30 a.m. – 9:00 a.m. | Networking, Coffee
9:00 a.m. – 9:05 a.m. | Welcome
Laura Biven, Ph.D.
Data Science Technical Lead, ODSS, NIH
9:05 a.m. – 9:25 a.m. | The Current State of Data and Model Transparency
Aaron Lee, M.D., M.S.C.I.
Associate Professor, Department of Ophthalmology, University of Washington
9:25 a.m. – 9:35 a.m. | Morning Plenary Session: Recap and Expectations
Laura Biven, Ph.D.
Data Science Technical Lead, ODSS, NIH
9:35 a.m. – 9:50 a.m. | Break
9:50 a.m. – 11:15 a.m. | Use Case Breakout Session #2: Achieving Transparency and Mapping Capability Gaps

  • Synthetic Data: Room 270-A
  • Data Sharing for General Reuse: Room 270-B
  • Multimodal Data: Room 260-F
  • Foundation Models: Room 280-A
  • Proxy Variables: Room 150-A
11:15 a.m. – 11:30 a.m. | Break
11:30 a.m. – 12:10 p.m. | Plenary Readout
12:10 p.m. – 1:15 p.m. | Lunch Break
1:15 p.m. – 1:45 p.m. | Plenary Session by NIH to Provide Guidance for Upcoming Session (NIH speaker TBD)
1:45 p.m. – 2:00 p.m. | Break
2:00 p.m. – 3:15 p.m. | Use Case Breakout Session #3: Guidance Session

  • Synthetic Data: Room 270-A
  • Data Sharing for General Reuse: Room 270-B
  • Multimodal Data: Room 260-F
  • Foundation Models: Room 280-A
  • Proxy Variables: Room 150-A
3:15 p.m. – 3:45 p.m. | Break
3:45 p.m. – 4:45 p.m. | Plenary Readout Session and Discussion
Hybrid discussion with virtual and in-person participants
4:45 p.m. – 5:00 p.m. | Day 2 Closing Remarks
Aaron Lee, M.D., M.S.C.I.
Associate Professor, Department of Ophthalmology, University of Washington
5:00 p.m. – 5:30 p.m. | Closed Session for Co-leads and Breakout Chairs
Day 3: Friday, February 2, 2024
Time | Presentation
8:15 a.m. – 8:30 a.m. | Networking, Coffee
8:30 a.m. – 8:45 a.m. | Recap of Day 2
Tina Hernandez-Boussard, Ph.D.
Professor of Medicine, Biomedical Data Science, of Surgery and, by courtesy, of Epidemiology and Population Health, Stanford University School of Medicine
8:45 a.m. – 9:45 a.m. | Continued Plenary Discussion on Transparency Guidance and Capability Gaps
Hybrid discussion with virtual and in-person participants.
9:45 a.m. – 10:00 a.m. | Break
10:00 a.m. – 11:00 a.m. | Use Case Breakout Session #4: Future Trends (5 breakout rooms)

  • Synthetic Data: Room 150-A
  • Data Sharing for General Reuse: Room 270-A
  • Multimodal Data: Room 270-B
  • Foundation Models: Room 280-A
  • Proxy Variables: Room 260-F
11:00 a.m. – 11:15 a.m. | Break
11:15 a.m. – 12:00 p.m. | Plenary Readout Session and Discussion
Hybrid discussion with virtual and in-person participants.
12:00 p.m. – 12:15 p.m. | Closing Plenary Talk and Thanks (NIH speakers and co-chairs)
12:15 p.m. | Workshop Adjourns
12:15 p.m. – 1:15 p.m. | Lunch (Participants are welcome to stay for lunch)

Breakout Groups

In-person participants were given a choice of five breakout themes to foster interactive discussions around key areas. Descriptions of each of the breakout sessions and their framing are outlined below.

Summaries

  1. Proxy Variables: 
    In this breakout session, we will examine the use of proxy variables in algorithms. Proxy variables are used (intentionally or unintentionally) in place of another variable that has a true causal relationship with the outcome; they are correlated with that variable but do not themselves carry the causal relationship. A notable example is the use of race and ethnicity in prediction models, as many experts believe that these variables are often oversimplified proxies for such variables as genetic ancestry or complex environmental and social factors. Another example is the use of health care costs as a proxy for health care needs (Obermeyer 2019): because less money is spent on Black patients who have the same level of need as White patients, the examined algorithm falsely concluded that Black patients are healthier than equally sick White patients.
  2. Synthetic Data: 
    This breakout focuses on synthetic data—how they are generated, how they might be used, and how they could have both positive and negative impacts on human health. We will discuss specific considerations for the need for and challenges related to synthetic data, including realism, bias, degradation, ethical concerns, and generalizability. Where possible, specific examples will be discussed and used in developing best practices and guiding principles for ethical use and transparency of synthetic data.

  3. Multimodal Data: 
    This breakout will discuss using multimodal data within artificial intelligence (AI) model development, validation, and translation for clinical implementation (e.g., combining structured data, such as diagnoses, with unstructured data, such as text or images). This will include specific considerations for the need for and challenges of generating and linking data in relation to ethics, bias, privacy, and transparency when combining complex, multimodal data with such details as time-course relevance.

  4. Foundation Models: 
    Foundation models and the closely related large language models (LLMs), such as ChatGPT, have sparked a huge wave of innovations combining the use of AI with the capabilities of these models to integrate and deliver information. This breakout series will explore key concepts behind these models, as well as the implications when creating and using these models in multiple settings involving the clinician, patient, researcher, developer, and community as a whole. Anticipated areas of discussion include such topics as transparency, ethics, privacy, ownership, and reliability.

  5. Data Sharing for General Reuse: 
    Data sharing has great potential to accelerate scientific innovation; however, it occurs without knowledge of how, whether, or by whom the data will be reused. Responsible reuse of shared data for AI requires technical, operational, ethical, privacy, and regulatory considerations to assess whether the data are fit for purpose.
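
The proxy-variable concern described under breakout theme 1 can be illustrated with a small, deterministic sketch. It is not the algorithm examined in the cited study: the cohort, the group labels "A" and "B", and the spending rates are all invented for illustration. The sketch assumes two groups with identical distributions of true need, where group B receives less spending at the same level of need, and compares who is selected for extra care when patients are ranked by cost versus by need.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    group: str   # hypothetical group label, "A" or "B"
    need: int    # true health need (toy scale, 1 to 10)
    cost: float  # observed annual spending, used as the proxy


def make_patients(spend_per_unit_need):
    """Build a toy cohort in which every group has the same needs (1..10)."""
    patients = []
    for group, rate in spend_per_unit_need.items():
        for need in range(1, 11):
            patients.append(Patient(group, need, need * rate))
    return patients


def select_top(patients, key, k):
    """Return the k patients ranked highest on the given attribute."""
    ranked = sorted(patients, key=lambda p: getattr(p, key), reverse=True)
    return ranked[:k]


def share_of_group(selected, group):
    """Fraction of the selected patients who belong to the given group."""
    return sum(p.group == group for p in selected) / len(selected)


# Assumption for illustration only: group B receives 35% less spending
# than group A at the same level of need.
cohort = make_patients({"A": 1000.0, "B": 650.0})

by_cost = select_top(cohort, "cost", k=6)  # ranking on the proxy
by_need = select_top(cohort, "need", k=6)  # ranking on the true target

print("Group B share when ranking by cost:", share_of_group(by_cost, "B"))
print("Group B share when ranking by need:", share_of_group(by_need, "B"))
```

In this toy cohort, ranking by need selects the two groups equally, while ranking by cost selects mostly group A even though need is identical by construction. The point is only that a proxy can carry a disparity into a ranking; real data would require a far more careful analysis.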

Workshop Leads

Tina Hernandez-Boussard, Ph.D., M.P.H., M.S. — Professor of Medicine
Stanford Medicine, Stanford University

Tina Hernandez-Boussard, Ph.D., M.P.H., M.S., is an Associate Dean of Research and Professor of Medicine (Biomedical Informatics), Biomedical Data Sciences, Surgery, and Epidemiology & Population Health (by courtesy) at Stanford University. Her background and expertise are in the fields of biomedical informatics, health services research, and epidemiology. In her current work, Dr. Hernandez-Boussard develops and evaluates artificial intelligence (AI) technology to accurately and efficiently monitor, measure, and predict health care outcomes. She is a dedicated advocate of ethical AI practices.

Aaron Y. Lee, M.D., M.Sci. — Associate Professor, University of Washington School of Medicine

Aaron Y. Lee, M.D., M.S.C.I., is an associate professor and vitreoretinal surgeon at the University of Washington, Department of Ophthalmology, and the recent recipient of the C. Dan and Irene Hunter Endowed Professorship. He completed his undergraduate studies at Harvard University and his medical training at Washington University in St. Louis. He chairs the American Academy of Ophthalmology Information Technology Steering Committee. He currently serves as an Associate Editor for both Translational Vision Science and Technology and Ophthalmology Science, and on the Editorial Board for the American Journal of Ophthalmology and Nature Scientific Reports. He has published over 175 peer-reviewed manuscripts and is known as a leader in the field of artificial intelligence and ophthalmology. Dr. Lee's research is focused on the translation of novel computational techniques in machine learning to uncover new disease associations and mechanisms from routine clinical data, including electronic health records and imaging.

Ansu Chatterjee, Ph.D. — Sinha Ennovate Endowed Chair Professor, The University of Maryland Baltimore County

Dr. Snigdhansu (Ansu) Chatterjee is the Sinha Ennovate Endowed Chair Professor at the University of Maryland, Baltimore County. Until recently, he was a Professor in the School of Statistics at the University of Minnesota, the Director of the Institute for Research in Statistics and its Applications (IRSA, http://irsa.stat.umn.edu/), an interdisciplinary data science institute at the University of Minnesota, and a Data Scholar with the National Institutes of Health (NIH). His research interests include theoretical foundations of data sciences; high dimensional data geometry; Bayesian and other conditional inferential techniques; digital twins; selection bias, small samples in surveys, and related topics; studies on ethics, fairness, diversity, privacy, and representativeness in data sciences; and applications of data science techniques in multiple domains, including precision medicine and climate change and its downstream effects.

Julia Stoyanovich, Ph.D., M.S. — Institute Associate Professor, Tandon School of Engineering, New York University

Julia Stoyanovich, Ph.D., M.S., is Institute Associate Professor of Computer Science and Engineering, Associate Professor of Data Science, Director of the Center for Responsible AI, and member of the Visualization and Data Analytics Research Center at New York University. Dr. Stoyanovich’s goal is to make "Responsible AI" synonymous with "AI." She works toward this goal by engaging in academic research and education and technology policy, and by speaking to practitioners and members of the public about the benefits and harms of AI. Dr. Stoyanovich’s research interests include AI ethics and legal compliance, as well as data management and AI systems. In addition to academic publications, she has written for The New York Times, The Wall Street Journal, LA Times, The Hill, and Le Monde. Dr. Stoyanovich has been teaching courses on responsible data science and AI for students, practitioners, and the general public. She is a co-author of "Data, Responsibly," an award-winning comic book series for data science enthusiasts, and "We are AI," a comic book series for a general audience. Dr. Stoyanovich is engaged in technology policy and regulation in the United States and internationally, having served on the New York City Automated Decision Systems Task Force, by mayoral appointment, among other roles. She received her M.S. and Ph.D. in computer science from Columbia University and a B.S. in computer science and in mathematics & statistics from the University of Massachusetts Amherst. Dr. Stoyanovich’s work has been generously supported by the National Science Foundation (NSF), Pivotal Ventures, and Meta Responsible AI, among others. She is a recipient of the NSF CAREER Award and a Senior Member of the Association for Computing Machinery (ACM).

Caroline Chung, M.D., M.Sc., FRCPC, CIP — Vice President, Chief Data Officer
The University of Texas MD Anderson Cancer Center

Caroline Chung, M.D., M.Sc., FRCPC, CIP, is Vice President and Chief Data Officer and Director of Data Science Development and Implementation of the Institute of Data Science in Oncology at The University of Texas MD Anderson Cancer Center (MD Anderson). She is a clinician–scientist and an Associate Professor in Radiation Oncology and Diagnostic Imaging, with a clinical practice focused on central nervous system malignancies and a computational imaging lab focused on quantitative imaging and modeling to detect and characterize tumors and toxicities of treatment to enable personalized cancer treatment. Motivated by challenges observed in her own clinical and research pursuits, Dr. Chung has developed and leads institutional efforts to enable quantitative measurements for clinically impactful utilization and interpretation of data through a collaborative team science approach, including the Tumor Measurement Initiative (TMI) at MD Anderson.

Internationally, Dr. Chung leads several multidisciplinary efforts to improve the generation and utilization of high-quality quantitative data to drive research and impact clinical practice, including in her roles as Vice Chair of the Radiological Society of North America Quantitative Imaging Biomarker Alliance; Co-Chair of the Quantitative Imaging for Assessment of Response in Oncology Committee of the International Commission on Radiation Units and Measurements; and member of the National Academies of Sciences, Engineering, and Medicine–appointed committee addressing Foundational Research Gaps and Future Directions for Digital Twins. Beyond her clinical, research, and administrative roles, Dr. Chung enjoys serving as an active educator and mentor, with a passion for supporting the growth of diversity, equity, and inclusion in science, technology, engineering, and math, including through her role as Chair of Women in Cancer (www.womenincancer.com), a nonprofit organization that is committed to advancing cancer care by encouraging the growth, leadership, and connectivity of current and future oncologists, trainees, and medical researchers.

Maia Hightower, M.D., M.P.H., M.B.A. — Former Executive Vice President and Chief Digital Transformation Officer, University of Chicago Medicine

Maia Hightower, M.D., M.P.H., M.B.A., is the CEO and co-founder of Equality AI and former Executive Vice President and Chief Digital Transformation Officer at University of Chicago Medicine. Dr. Hightower is a leading voice on the intersection of health care, digital transformation, and health equity. She is a champion for responsible AI, ensuring that the digital future of health care is equitable and just.

At the heart of Dr. Hightower’s mission is Equality AI, an early-stage, investor-backed health care tech startup. In a world where data scientists have become an integral part of the care team, Equality AI’s cloud-based, responsible AI life-cycle management platform—encompassing machine learning (ML) build, evaluate, monitor, and risk management—empowers health care with trustworthy AI, ensuring quality care for all.

She is a four-time C-suite physician executive with 15 years of executive leadership spanning health care IT, medical affairs, and population health across four academic medical centers, clinically integrated networks, and accountable care organizations.

Dr. Hightower received her B.A. from Cornell University and her M.D. and M.P.H. from the University of Rochester School of Medicine, followed by residencies in internal medicine and pediatrics at the University of California, San Diego. She also holds an M.B.A. from The Wharton School of the University of Pennsylvania.

Sajid Hussain, Ph.D. — Associate Vice Provost for Innovation & Information Technology (CTO) and Discipline Coordinator of Data Science, Fisk University

Dr. Sajid Hussain is Associate Vice Provost for Research, Innovation, & Technology at Fisk University. In 2009, he joined Fisk University as an Associate Professor in the Department of Mathematics and Computer Science. Prior to Fisk, he worked as Assistant Professor and then Associate Professor at Acadia University, Canada, from 2005 to 2009. He received a Ph.D. in Electrical Engineering from the University of Manitoba, Canada, in 2004.

Dr. Hussain is interested in applying machine learning techniques for interdisciplinary research projects related to healthcare and social justice. He is also interested in energy-efficient communication protocols and security techniques for mobile, ubiquitous, and pervasive applications. He has published more than 80 refereed journal, conference, and workshop papers. His research is financially supported by several grants and contracts, such as NSF INCLUDES National Data Science Alliance, NDSA (2217346), NSF Implementation Award (1817282), NSF IUSE (2235861), NIH/BD2K-R25 Diversity, Army, and Environment Management Department of Energy.

He has co-organized several journal special issues, conferences, and workshops and is a senior member of IEEE.

H.V. Jagadish, Ph.D. — Director, Michigan Institute for Data Science, University of Michigan

H.V. Jagadish, Ph.D., is the Edgar F. Codd Distinguished University Professor and Bernard A. Galler Collegiate Professor of Electrical Engineering and Computer Science at the University of Michigan in Ann Arbor and Director of the Michigan Institute for Data Science. Prior to 1999, he was Head of the Database Research Department at AT&T Labs in Florham Park, New Jersey.

Dr. Jagadish is well known for his broad-ranging research on information management and has more than 200 major papers and 38 patents, with an H-index of 101. He has been a fellow of the ACM, "The First Society in Computing," since 2003 and of the American Association for the Advancement of Science since 2018. He currently chairs the board of the Academic Data Science Alliance and previously served on the board of the Computing Research Association (2009–2018). He has been an Associate Editor for ACM Transactions on Database Systems (1992–1995), Program Chair of the ACM Special Interest Group on Management of Data (SIGMOD) annual conference (1996), Program Chair of the Intelligent Systems for Molecular Biology Conference (2005), a trustee of the VLDB (Very Large DataBase) foundation (2004–2009), Founding Editor-in-Chief of the Proceedings of the VLDB Endowment (2008–2014), and Program Chair of the VLDB Conference (2014). Since 2016, he has been the Editor of the Springer (previously Morgan & Claypool) Synthesis Lecture Series on Data Management. Among his many awards are the David E. Liddle Research Excellence Award (at the University of Michigan) in 2008, the ACM SIGMOD Contributions Award in 2013, and the Distinguished Faculty Achievement Award (at the University of Michigan) in 2019. His popular MOOC on Data Science Ethics is available on both EdX and Coursera.

Jayashree Kalpathy-Cramer, Ph.D. — Chief of Artificial Medical Intelligence in Ophthalmology, University of Colorado School of Medicine

Jayashree Kalpathy-Cramer, Ph.D., is an Endowed Chair in Ophthalmic Data Sciences and the Founding Chief of the Division of Artificial Medical Intelligence in the Department of Ophthalmology at the University of Colorado School of Medicine. She leads the development and translation of novel AI methods into effective patient care practices at the Sue Anschutz-Rodgers Eye Center.

Previously, she was an Associate Professor of Radiology at Harvard Medical School, where she was actively involved in data science activities with a focus on medical imaging. Her research interests span the spectrum from novel algorithm development to clinical deployment. She is passionate about the potential that machine learning and mathematical modeling have to improve the quality of access to health care in the United States and worldwide. Dr. Kalpathy-Cramer has authored more than 250 peer-reviewed publications, has written more than a dozen book chapters, and is a co-inventor on a dozen patents. She graduated from the Indian Institute of Technology in Bombay, India, with a degree in electrical engineering and received her Ph.D. from Rensselaer Polytechnic Institute, also in electrical engineering. She returned to academia after almost a decade in the semiconductor industry, with a research pivot toward health care. Since then, her lab has been funded by the National Institutes of Health (NIH), National Science Foundation, and European Union.

Vincent Liu, M.D., M.Sc. — Senior Research Scientist, Kaiser Permanente

Vincent Liu, M.D., M.Sc., is a Senior Research Scientist at the Kaiser Permanente (KP) Division of Research and the Regional Medical Director of the Hospital Advanced Analytics program at The Permanente Medical Group. He is an expert in the application of AI/ML methods to real-world health data and health care delivery, particularly for acute illness, including sepsis. He oversees the implementation of predictive analytics and AI/ML tools for a population of 4.5 million members in KP Northern California, the impact of which has been recognized through national and international awards for safety, innovation, and quality. He also leads the KP Augmented Intelligence in Medicine and Healthcare Initiative (KP AIM-HI), a $5 million program designed to support health systems to rigorously evaluate AI/ML implementation in diverse settings. He has co-authored 200 scholarly manuscripts in peer-reviewed journals and book chapters and has served as an expert advisor for NIH, the National Academy of Medicine, the National Quality Forum, and the National Committee for Quality Assurance. Dr. Liu maintains a clinical practice in pulmonary critical care medicine at the KP Santa Clara Medical Center and is the Informatics Track Director of the KP Division of Research Delivery Science Fellowship.

Courtney Lyles, Ph.D. — Director, Center for Healthcare Policy and Research, UC Davis Health

Courtney Lyles, Ph.D., is the Director of the UC Davis Health Center for Healthcare Policy and Research (CHPR) and Professor in the Department of Public Health Sciences in the UC Davis School of Medicine. A trained health services researcher, Dr. Lyles has expertise in health equity, digital health and informatics, and implementation science. In her research, she designs and evaluates new digital programs and platforms to support patients and families, as well as clinical workflows, with an emphasis on participant- and community-engaged methods and chronic disease prevention and treatment. She was a 2022–2023 Visiting Researcher at Google on its health equity team.

Shazia M. Siddique, M.D., M.S.H.P. — Assistant Professor in Gastroenterology 
University of Pennsylvania Perelman School of Medicine

Shazia Siddique, M.D., M.S.H.P., is an Assistant Professor in Gastroenterology at the University of Pennsylvania Perelman School of Medicine. Dr. Siddique is an NIH-funded health services and health equity researcher. Her research, clinical expertise, and quality efforts aim to promote the integration of evidence-based practices into clinical care. She is the Associate Director of Scientific Research for the Center for Evidence-Based Practice at the University of Pennsylvania and serves on the Clinical Guidelines Committee for the American Gastroenterological Association. Additionally, she serves as Director for Research for the Penn Center for Healthcare Improvement and Patient Safety (CHIPS) program. Most recently, she was the senior author on the newly released report funded by the Agency for Healthcare Research and Quality’s Evidence-based Practice Center Program, titled "The Impact of Healthcare Algorithms on Racial and Ethnic Disparities in Health and Healthcare."

Eric Stahlberg, Ph.D. — Director, Cancer Data Science Initiatives, Frederick National Laboratory for Cancer Research

Eric Stahlberg, Ph.D., directs cancer data science initiatives at the Frederick National Laboratory for Cancer Research. He has been instrumental in establishing the Frederick National Laboratory’s high-performance computing (HPC) initiative and in assembling collaborative teams across multiple complex organizations to advance predictive oncology. Dr. Stahlberg has played a leadership role in many key partnerships, including forming the collaboration between the National Cancer Institute and the U.S. Department of Energy to accelerate progress in precision oncology and computing. Among the collaborative initiatives are IMPROVE (Innovative Methods and New Data for Predictive Oncology Model Evaluation), the Predictive Oncology Model and Data Clearinghouse (modac.cancer.gov), and ATOM (Accelerating Therapeutics for Opportunities in Medicine). He has led program efforts establishing foundations for digital twin applications in cancer, and now for advancing personalized medicine for all individuals through virtual human models and digital twin approaches. Among leading-edge initiatives, he also co-organizes the annual Computational Approaches for Cancer at SC conferences, the HPC Applications of Precision Medicine workshops at ISC, and cross-government workshops with the U.S. Food and Drug Administration. Most recently, he spearheaded the first Virtual Human Global Summit in October 2023, exploring the range of issues inherent in developing and appropriately employing the use of predictive health models and personal data within a digital twin approach.

Dr. Stahlberg has undergraduate degrees in chemistry, computer science, and mathematics and a Ph.D. in theoretical chemistry from The Ohio State University. He has been recognized as one of the FCW top 100, received the President’s Award from the Frederick National Laboratory for Cancer Research, and has received the distinguished alumni award from his alma mater, Wartburg College.

Colin G. Walsh, M.D., M.A., FAMIA, FACMI, FIAHSI — Associate Professor
Vanderbilt University Medical Center

Colin Walsh, M.D., M.A., FAMIA, FACMI, FIAHSI, is an Associate Professor of Biomedical Informatics, Medicine, and Psychiatry at the Vanderbilt University Medical Center. He is an internist whose research includes predictive decision support to enable prevention, scalable phenotyping for precision medicine, and population health informatics.

NIH Committee

Laura Biven, Ph.D.
Office of Data Science Strategy, National Institutes of Health

Susan Gregurick, Ph.D.
Associate Director for Data Science
Office of Data Science Strategy, National Institutes of Health

Christine Cutillo
Office of Data Science Strategy (ODSS)

Deborah Duran
National Institute on Minority Health and Health Disparities (NIMHD)

Samson Gebreab
Office of Data Science Strategy (ODSS), AIM-AHEAD

Christopher Kinsinger
Office of Strategic Coordination (OSC) – NIH Common Fund

Haluk Resat
Office of Strategic Coordination (OSC) – NIH Common Fund

David Resnik
National Institute of Environmental Health Sciences (NIEHS)

Asif Rizwan
National Heart, Lung, and Blood Institute (NHLBI)

July Data Sharing and Reuse Seminar

Friday, July 12, 2024

Dr. Satra Ghosh will present "The Transformative Potential and Challenges of Open Data and Technologies in Neuroscience" on July 12, 2024, at 12 p.m.

About the Seminar

Open data can democratize neuroscience research by cultivating an ecosystem for neuroscientists that includes robust platforms and tools to share, analyze, and interpret data, transforming it through an information landscape to knowledge. This connection can help accelerate the discovery and understanding of neural systems and their impact on human health and behavior. These efforts involve necessary ethical considerations, call for diverse perspectives, and require sustainable stewardship of neuroscience resources. This talk will use examples from various Brain Research Through Advancing Innovative Neurotechnologies® (BRAIN) Initiative projects—such as the Distributed Archives for Neurophysiology Data Integration (DANDI) platform, BRAIN Initiative Cell Atlas Network (BICAN), BRAIN Initiative Connectivity Across Scales (BRAIN CONNECTS) program, and others—to illustrate how an open ecosystem can foster a community in which data and knowledge are closely connected and form a computable substrate for exploration, discovery, and translation.

About the Seminar Series

The seminar is open to the public and registration is required each month. Individuals who need interpreting services and/or other reasonable accommodations to participate in this event should contact Janiya Peters at 301-670-4990. Requests should be made at least five days in advance of the event.

The National Institutes of Health (NIH) Office of Data Science Strategy hosts this seminar series to highlight exemplars of data sharing and reuse on the second Friday of each month at noon ET. The monthly series highlights researchers who have taken existing data and found clever ways to reuse the data or generate new findings. A different NIH institute or center will also share its data science activities each month.

June Data Sharing and Reuse Seminar

Friday, June 14, 2024

Dr. Michael Schatz will present "BioDIGS: BioDiversity and Informatics for Genomics Scholars" on June 14, 2024, at 12 p.m.

About the Seminar

Soil and soil organisms are essential for sustaining life, as they mediate many biological processes we rely on for food, fibers, and planetary health. Surveys of biodiversity identify soil as the single most diverse habitat on Earth and indicate that a single gram of soil may contain hundreds of millions to billions of bacterial, archaeal, and eukaryotic cells. Soil species play critical roles in promoting both healthy (e.g., nutrient and nitrogen transport, probiotics) and dysbiotic (e.g., pathogens, antibiotic resistance) environments, yet the vast majority of species remain uncharacterized, and their biological potential remains unknown.

Addressing this critical need, we have launched BioDIGS as a collaborative soil metagenome project to sample and analyze soil biodiversity with a focus on understanding how such biodiversity affects human health. To reach the broadest range of environments and participation, we partner with the Genomic Data Science Community Network (GDSCN) to complete the sampling and analysis. GDSCN was established in 2020 to improve the diversity and accessibility of genomics research and education. It includes more than 25 faculty members at community colleges, historically Black colleges and universities, Hispanic-serving institutions, Tribal colleges and universities, and related institutions. BioDIGS engages GDSCN faculty and students at all stages, from experimental design and collection through computational analysis. All data and workflows are available in Galaxy and the NHGRI AnVIL, which allows collaborative and scalable analysis for all institutions. Complementing the research, BioDIGS serves as a catalyst for a variety of professional development opportunities, classroom trainings, and curricula spanning the genomic data science life cycle.

Through BioDIGS, we have collected soil from more than 100 sites across the United States, selected to represent a variety of managed (e.g., lawns, fields, public parks) and unmanaged (e.g., dense forest, dense underbrush) areas. In addition to performing short- and long-read DNA sequencing, we submit the samples for heavy metal, pH, and other soil measurements. We further augment our data set with more than 3,000 public soil metagenomes to present one of the most comprehensive studies of soil biodiversity ever attempted. Our results highlight significant associations between metagenome diversity and heavy metal content, especially lead and arsenic across urban sites. Using long-read sequencing, we have assembled complete genomes and high-quality metagenome-assembled genomes for more than 100 novel species, as well as gigabases of novel gene sequences. Finally, we detect the presence of antimicrobial resistance genes and microbial pathways for plant and animal signaling molecules, highlighting the complex relations across kingdoms, derived from the soil metagenomes.

About the Speaker

Michael Schatz is the Bloomberg Distinguished Professor of Computer Science and Biology at Johns Hopkins University, Co-director of the National Human Genome Research Institute (NHGRI) Genomic Data Science Analysis, Visualization, and Informatics Lab-space (AnVIL), and co-founder of the Genomic Data Science Community Network. His research is at the intersection of computer science, biology, and biotechnology and focuses on the development of novel algorithms and computing systems for human genetics, comparative genomics, and personalized medicine. For this work, he received the 2015 Alfred P. Sloan Foundation Fellowship and a 2014 National Science Foundation CAREER award and—with Telomere-to-Telomere Consortium co-leads Adam Phillippy, Karen Miga, and Evan Eichler—was named by Time magazine as one of the most influential people in the world in 2022 (TIME100). Schatz received his Ph.D. and M.S. in Computer Science from the University of Maryland in 2010 and 2008, respectively, and his B.S. in Computer Science from Carnegie Mellon University in 2000. More information is available on his laboratory website: http://schatz-lab.org.

About the Seminar Series

The seminar is open to the public and registration is required each month. Individuals who need interpreting services and/or other reasonable accommodations to participate in this event should contact Janiya Peters at 301-670-4990. Requests should be made at least five days in advance of the event.

The National Institutes of Health (NIH) Office of Data Science Strategy hosts this seminar series to highlight exemplars of data sharing and reuse on the second Friday of each month at noon ET. The monthly series highlights researchers who have taken existing data and found clever ways to reuse the data or generate new findings. A different NIH institute or center will also share its data science activities each month.

2024 NIH ODSS AI Supplement Program PI Meeting

Wednesday, March 27, 2024

In 2024, the NIH Office of Data Science Strategy (ODSS) held its second meeting of AI supplement awardees. The two-day virtual meeting was designed to foster a cohesive National Institutes of Health (NIH) AI community by bringing together Principal Investigators (PIs) from the FY22 and FY23 ODSS AI supplement programs. It gave participants a platform to exchange insights on their projects, celebrate accomplishments, discuss best practices, share lessons learned, and engage in collaborative discussions.

Read a report of the meeting.

Attendees

Principal Investigators from the following AI supplement programs were invited:

  • NOT-OD-22-065 – FY2022 Request for ODSS Funds to Advance the Ethical Development and Use of AI/ML in Biomedical and Behavioral Sciences (also known as FY22 AI-Ethics program)
  • NOT-OD-22-067 – FY2022 Request for ODSS Funds to Support Collaborations to Improve the AI/ML Readiness of NIH-Supported Data (also known as FY22 AI-Readiness program)
  • NOT-OD-23-082 – FY2023 Request for ODSS Funds to Support Collaborations to Improve the AI/ML Readiness of NIH-Supported Data (also known as FY23 AI-Readiness program)

Featured Speakers

Susan Gregurick, Ph.D. — Associate Director for Data Science, NIH; Director, ODSS

Dr. Susan K. Gregurick was appointed Associate Director for Data Science and Director of ODSS at the NIH on September 16, 2019. Under Dr. Gregurick’s leadership, the ODSS leads the implementation of the NIH Strategic Plan for Data Science through scientific, technical, and operational collaboration with the institutes, centers, and offices that comprise NIH. Dr. Gregurick received the 2020 Leadership in Biological Sciences Award from the Washington Academy of Sciences for her work in this role. She was instrumental in the creation of the ODSS in 2018 and served as a senior advisor to the office until being named to her current position.

Laura Biven, Ph.D. — Lead, Integrated Infrastructure and Emerging Technologies, NIH ODSS

Since joining NIH in 2020, Dr. Laura Biven has led the Integrated Infrastructure and Emerging Technologies (IIET) branch in ODSS. She is responsible for strategic planning, coordination, and oversight of programs that integrate independently managed cloud data resources across the NIH to advance NIH’s vision for an integrated, FAIR biomedical data ecosystem. She also oversees multidisciplinary NIH-wide programs that focus on integrating computational, mathematical, and biomedical research communities around emerging technologies such as artificial intelligence and machine learning (AI/ML), quantum computing, and digital twins.

Christine Cutillo — Health Data Scientist for AI Ethics, NIH ODSS

Christine Cutillo is a Health Data Scientist for AI Ethics in the ODSS, within the Integrated Infrastructure and Emerging Technologies (IIET) unit. She is responsible for multidisciplinary NIH-wide AI Ethics efforts – focusing on integrating ethics through the AI lifecycle in biomedical research applications. She is highly interested in data-driven scientific and health care innovations that are transparent, ethical, and designed for and with the patient/end user. Prior to joining ODSS, Christine was the Data Science Lead in the Office of the Director at the National Center for Advancing Translational Sciences (NCATS) for three years.

Agenda

2024 NIH ODSS AI Supplement Program PI Meeting
Day 1
Time | Presentation
11am-12:25pm ET

WELCOME
Dr. Laura Biven, Lead, Integrated Infrastructure and Emerging Technologies, NIH ODSS

NIH OFFICE OF DATA SCIENCE STRATEGY OVERVIEW
Dr. Susan Gregurick, Associate Director for Data Science, NIH; Director, ODSS
Download Slides

ODSS AI ACTIVITIES OVERVIEW & FUTURE VISION 
Ms. Christine Cutillo, Health Data Scientist for AI Ethics, NIH ODSS 
Download Slides

BEGINNING OF THE MEETING POLL 
The poll explored various relevant topics at the forefront of NIH AI.
Ms. Christine Cutillo, Health Data Scientist for AI Ethics, NIH ODSS

12:25-1:25pm ET

BREAKOUT SESSION 1
During this breakout session, participants from the FY22 NOT-OD-22-065 and NOT-OD-22-067 programs gave 10-minute lightning presentations that discussed the motivation, achievements, best practices, lessons learned, and future plans of their awarded AI projects.

TRACK A

Dr. Alex Federman (Moderator), Professor of Medicine, Icahn School of Medicine at Mount Sinai; Dr. Jalayne Arias, Associate Professor, Georgia State University
A Qualitative Examination of Patients’ and Clinicians’ Perspectives on AI-driven Automated Screening for Cognitive Impairment
Download Slides

Dr. Andrew Schaefer, Professor, Rice University
Implementation of a Public Data Challenge for MRI-Guided Tumor Segmentation in Head and Neck Cancer Patients
Download Slides

Dr. Lu Tang, Professor, Texas A&M University; Dr. Jinsil Hwaryoung Seo, Associate Professor, Texas A&M University; Dr. Sophia Fantus, Assistant Professor, University of Texas at Arlington
Improving AI Alzheimer Researchers’ Knowledge, Attitudes and Practices of AI Ethics
Download Slides

Dr. James V. Lacey Jr., Professor, City of Hope
Strategies for Improving the Readiness of Large-scale Cohort Data for AI/ML 
Download Slides

TRACK B

Dr. Kyung Sung (Moderator), Associate Professor, University of California, Los Angeles
Detection and Localization of Prostate Cancer: A Structured Multi-Scale Multiparametric MRI Database for AI/ML Research
Download Slides

Mr. Trenton Chang, Ph.D. Candidate, University of Michigan
Measuring and Mitigating the Impact of Biases in Laboratory Testing on Machine Learning Models
Download Slides

Dr. Levi Waldron, Professor, City University of New York; Dr. Sehyun Oh, Assistant Professor, City University of New York
Improving FAIRness and AI/ML readiness of Bioconductor data resources
Download Slides

Dr. Maya Sabatello, Associate Professor of Medical Sciences, Columbia University
Blind/Disability and Intersectional Biases in E-Health Records (EHRs) of Diabetes Patients

Dr. Cathy Wu, Professor and Director, Data Science Institute, University of Delaware
UniProt Knowledgebase to Enable AI/ML Readiness and Applications
Download Slides

1:25-1:35pm ET

BREAK

1:35-2:35pm ET

BREAKOUT SESSION 2
During this breakout session, participants from the FY22 NOT-OD-22-065 and NOT-OD-22-067 programs gave 10-minute lightning presentations that discussed the motivation, achievements, best practices, lessons learned, and future plans of their awarded AI projects.

TRACK A

Dr. Bankole Olatosi (Moderator), Associate Professor, University of South Carolina
Framing the Ethical-Framework Guided Metric Tool – Lessons Learned
Download Slides

Dr. Shivakeshvan Ratnadurai Giridharan, Instructor, Burke Neurological Institute
Development of Deep Learning-based Kinematic Data Acquisition
Download Slides

Dr. Amber Simpson, Associate Professor/Canada Research Chair, Queen's University; Ms. Rohan Faiyaz Khan, PhD Student, Queen's University
Ethical Development of Colorectal Cancer Imaging Biomarkers
Download Slides

Dr. Rebecca McNeil, Senior Research Statistician, RTI International
Enabling AI/ML Readiness and Modernization of Longitudinal Pregnancy and Cardiovascular Health Data: Lessons Learned
Download Slides

TRACK B

Dr. Jennifer Wagner (Moderator), Assistant Professor of Law, Policy, and Engineering and Anthropology, Penn State University
A Synopsis of the PREMIERE Ethics Supplement
Download Slides

Dr. Alex Wagner, Principal Investigator, Nationwide Children’s Hospital
Application of Genomic Knowledge Standards to the Genome Aggregation Database
Download Slides

Dr. Jessica Sperling, Director, Office of Evaluation and Applied Research Partnership, Duke University; Dr. Whitney Welsh, Research Scientist, Duke University
Machine Learning and The Ethics of Use: Patient and Provider Perspectives on Utilizing Prediction Models in Medical Care
Download Slides

Dr. Benjamin Vincent, Associate Professor of Medicine, University of North Carolina at Chapel Hill 
ASTOR: Alliance Standardized Translational ‘Omics Resources
Download Slides

2:35-2:45pm ET

BREAK

2:45-3:45pm ET

BREAKOUT SESSION 3
During this breakout session, participants from the FY22 NOT-OD-22-065 and NOT-OD-22-067 programs gave 10-minute lightning presentations that discussed the motivation, achievements, best practices, lessons learned, and future plans of their awarded AI projects.

TRACK A

Dr. Abhinav Jha (Moderator), Assistant Professor, Washington University
Uncertainty Quantification of AI-Based Imaging Algorithms: The Need and Methods
Download Slides

Dr. Sriram Neelamegham, Professor/PI, University at Buffalo, State University of New York; Dr. Rudiyanto Gunawan, Associate Professor, State University of New York - Buffalo
Systems Biology of Glycosylation: Extending Mechanistic Analysis Toward AI
Download Slides

Dr. Josiah Couch, Postdoctoral Research Fellow, Beth Israel Deaconess Medical Center
Beyond Class Balance: Dataset Diversity and Model Performance in Deep-Learning Classification Tasks
Download Slides

Dr. Alexey Terskikh, Associate Professor, Sanford Burnham Prebys Medical Discovery Institute
ImAge Quantitates Ageing and Rejuvenation

TRACK B

Dr. Qing Zeng-Treitler (Moderator), Director, Biomedical Informatics Center, George Washington University
Shedding Light on the Black Box: Using Explainable AI to Enhance Clinical Research
Download Slides

Dr. Ranjan Ramachandra, Research & Development Engineer, University of California San Diego
Development of Software for the Optimization and Normalization of 3D Electron Microscopic Data Acquisition to Facilitate Use and Reuse of AI/ML-Based Image Analysis Tools
Download Slides

Dr. Joan Casey, Assistant Professor of Environmental and Occupational Health Sciences, University of Washington School of Public Health; Dr. Danielle Braun, Principal Research Scientist, Harvard T.H. Chan School of Public Health
Approaches for AI/ML Readiness for Wildfire Exposures
Download Slides

Dr. Amit Majumdar, Division Director, Associate Professor, University of California San Diego
Implementation of Provenance Metadata on Neuroscience Gateway – A Platform for Neuroscience Software Dissemination
Download Slides

Dr. Samantha Krening, Assistant Professor, The Ohio State University
An Automated AI/ML Platform for Multi-Researcher Collaborations for a NIH BACPAC Funded Spine Phenome Project

3:45-3:55pm ET

BREAK

3:55-4:55pm ET

BREAKOUT SESSION 4
During this breakout session, participants from the FY22 NOT-OD-22-065 and NOT-OD-22-067 programs gave 10-minute lightning presentations that discussed the motivation, achievements, best practices, lessons learned, and future plans of their awarded AI projects. Participants from the FY23 NOT-OD-23-082 program gave 10-minute lightning presentations on their project motivation, plan, and expected outcome.

TRACK A

Dr. Danton Char (Moderator), Associate Professor, Stanford Medicine
Development of a Method for Identifying Ethical Considerations Arising from Healthcare AI Deployments
Download Slides

Dr. Jaehee Kim, Assistant Professor, Cornell University
Towards AI/ML-Enabled Molecular Epidemiology of Mycobacterium Tuberculosis

Dr. Jodyn Platt, Associate Professor, University of Michigan
Attitudes of Cancer Patients About the Use of AI in Clinical Care: A Nationwide Survey
Download Slides

Dr. Katherine Yates, Rheumatology Fellow, University of North Carolina at Chapel Hill
Development of an AI/ML-Ready Knee Ultrasound Dataset in a Population-Based Cohort 
Download Slides

Dr. Yann Le Guen, Senior Biostatistician, Stanford University
PREcision Care In Cardiac ArrEst - ICECAP (PRECICECAP)
Download Slides

TRACK B

Dr. Clifton Fuller (Moderator), Professor, UT MD Anderson Cancer Center
Leveraging MRI applications for FAIR and Open (Re)Use
Download Slides

Dr. Alaa Youssef, Post-Doctoral Scholar, Stanford University School of Medicine
Ethical Considerations in the Design and Conduct of Clinical Trials of AI: A Qualitative Study of Investigators' Experiences with Autonomous AI for Diabetic Retinopathy
Download Slides

Dr. Bofan Song, Associate Research Professor, University of Arizona
Improving AI/ML-Readiness of Data Generated from NIH-Funded Research on Oral Cancer Screening
Download Slides

Dr. Bobbie-Jo Webb-Robertson, Division Director, Biological Sciences, Pacific Northwest National Laboratory
Generating AI/ML-Ready Data for Type 1 Diabetes
Download Slides

Dr. Diana Vera Cruz, Bioinformatician, University of Chicago; Dr. Romuald Girard, Assistant Professor, University of Chicago
Optimizing Diagnostic and Prognostic Biomarkers of CASH using Machine Learning
Download Slides

4:55-5pm ET

CLOSE
Ms. Christine Cutillo, Health Data Scientist for AI Ethics, NIH ODSS

2024 NIH ODSS AI Supplement Program PI Meeting
Day 2
Time | Presentation
11am-11:10am ET

INTRODUCTION
Ms. Christine Cutillo, Health Data Scientist for AI Ethics, NIH ODSS

11:10am-12:10pm ET

BREAKOUT SESSION 5
This interactive breakout session, led by NIH program officers, gave participants time to discuss AI barriers, challenges, and opportunities (including novel ideas and future directions). 

TRACK A

Ms. Christine Cutillo (Moderator), NIH ODSS
Dr. Jennifer Couch (Moderator), NIH NCI

TRACK B

Dr. Brad Bower (Moderator), NIH NIBIB
Dr. Deborah Duran (Moderator), NIH NIMHD

TRACK C

Dr. Haluk Resat (Moderator), NIH OD
Dr. Tamara Litwin (Moderator), NIH OD

12:10-12:20pm ET

BREAK

12:20pm-1:20pm ET

BREAKOUT SESSION 6
During this breakout session, participants from the FY22 NOT-OD-22-065 and NOT-OD-22-067 programs gave 10-minute lightning presentations that discussed the motivation, achievements, best practices, lessons learned, and future plans of their awarded AI projects. 

TRACK A

Dr. Keith Feldman (Moderator), Assistant Professor, Children's Mercy Kansas City
Consideration of Geospatial Distribution in the Measurement of Study Cohort Representativeness and Data Quality
Download Slides

Dr. Cole Vonder Haar, Assistant Professor, Ohio State University
Behavioral Phenotyping of Risky Decision-Making After TBI in a Rat Model Enables Evaluation of Statistical Methodology
Download Slides

Dr. Pilhwa Lee, Lecturer, Morgan State University
Algorithmic Bias in Single Cell Analysis: A Study of Optimal Transport and Sinkhorn Divergence
Download Slides

Dr. Evelyn Hsieh, Associate Professor of Medicine/Chief of Rheumatology, Yale School of Medicine/VA Connecticut Healthcare System; Mr. Dax Westerman, Senior Data Scientist, Vanderbilt University Medical School 
Enabling AI/ML-Readiness of Data from Dual-Energy X-ray Absorptiometry (DXA) Images via Optical Character Recognition (OCR) and Deep Learning
Download Slides

TRACK B

Dr. Mark Musen (Moderator), Professor, Stanford University
Metadata for the Masses: Making CEDAR Portable and Cloud-Based
Download Slides

Dr. Matteo D'Antonio, Assistant Professor, UC San Diego
Using Ancestry-Agnostic Approaches for Genome-Wide Association Studies and Polygenic Risk Scores
Download Slides

Dr. James Anderson, Senior Software Design Engineer, University of Utah - Moran Eye Center
Retinal Circuitry - Improving AI Readiness of Existing Retinal Connectomes
Download Slides

Dr. Ron Alkalay, Associate Professor in Orthopedic Surgery, Beth Israel Deaconess Medical Center
Application of AI/ML Models for Musculoskeletal Spine Research in Patients with Metastatic Spinal Disease: Successes and Challenges
Download Slides

Ms. Jessica Gjonaj, Research Coordinator, NYU Grossman School of Medicine
NYU-Moi Data Science for Social Determinants Training Program
Download Slides

1:20-1:30pm ET

BREAK

1:30-2:30pm ET

BREAKOUT SESSION 7
During this breakout session, participants from the FY22 NOT-OD-22-065 and NOT-OD-22-067 programs gave 10-minute lightning presentations that discussed the motivation, achievements, best practices, lessons learned, and future plans of their awarded AI projects. 

TRACK A

Professor Stephanie Kraft (Moderator), Assistant Professor, Seattle Children's Research Institute
Advancing Equity in AI-Enabled Mobile Health Tools: Community-Informed Design Considerations
Download Slides

Dr. Shigang Chen, Professor, University of Florida
Making Parkinson's Disease Data AI-Ready for Cloud-Outsourced Machine Learning Research with Differential Privacy
Download Slides

Dr. David Linden, Associate Professor of Physiology, Mayo Clinic
Developing Computational Tools to Analyze the Structure of Nerve Cells in the Bowel to Better Understand Digestive Disease 
Download Slides

Ms. Deepa Krishnaswamy, Instructor in Radiology, Brigham and Women's Hospital
Generation and Dissemination of Enhanced AI/ML-ready Prostate Cancer Imaging Datasets for Public Use
Download Slides

TRACK B

Dr. Zhe Sage Chen (Moderator), Associate Professor, New York University Grossman School of Medicine
Generative AI for Interictal EEG-Based SUDEP Risk Assessment

Dr. Kristin Kostick-Quenet, Assistant Professor, Baylor College of Medicine
Patient-Centric Federated Learning: Automating Meaningful Consent to Health Data Sharing with Smart Contracts
Download Slides

Dr. Thomas Hampton, Research Scientist, Geisel School of Medicine at Dartmouth
RESPIRE: A Reusable Architecture for Domain Centric ‘Omics Data Sharing
Download Slides

Dr. Tanvi Bhatt, Professor, University of Illinois at Chicago
WalkVIZ: Development of a Comprehensive Tool to Process and Visually Analyze Gait Data
Download Slides

2:30-2:40pm ET

BREAK

2:40-3:40pm ET

BREAKOUT SESSION 8
During this breakout session, participants from the FY22 NOT-OD-22-065 and NOT-OD-22-067 programs gave 10-minute lightning presentations that discussed the motivation, achievements, best practices, lessons learned, and future plans of their awarded AI projects. Participants from the FY23 NOT-OD-23-082 program gave 10-minute lightning presentations on their project motivation, plan, and expected outcome.

TRACK A

Dr. Yanbin Yin (Moderator), Professor, University of Nebraska Lincoln
AI/ML Ready Carbohydrate Enzyme Gene Clusters in Human Gut Microbiome
Download Slides

Dr. Tezcan Ozrazgat Baslanti, Research Associate Professor, University of Florida
AI/ML Ready Data Enriched with Social Determinants of Health and Unstructured Text Data for Acute Kidney Injury Risk Prediction
Download Slides

Dr. Xiaoqian Jiang, Professor and Chair, University of Texas Health Science Center at Houston
Ethically Optimize Machine Learning Models with Real-World Data to Improve Algorithmic Fairness
Download Slides

Mr. Seha Ay, Graduate Student, Wake Forest School of Medicine
Applying Gerchberg-Saxton Algorithm on Biomedical Data to Mitigate Sampling Bias on Under-Represented Populations
Download Slides

TRACK B

Dr. David Gutman (Moderator), Associate Professor, Emory University
Piloting a Web-Based Neuropathology Image Resource for the ADRC Community: The Brain Digital Slide Archive
Download Slides

Dr. Andre Holder, Assistant Professor, Emory University
Battling Bias in Sepsis Prediction: Towards an Informed Understanding of EMR Data and Its Limitations
Download Slides

Dr. Vida Abedi, Associate Professor, Penn State University
Enhancing Imputation for Clinical Trials: The Path for a Flexible Toolkit
Download Slides

Dr. Vibhuti Gupta, Assistant Professor, School of Applied Computational Sciences, Meharry Medical College
AI/ML ready mHealth and wearables data for Dyadic HCT
Download Slides

3:40-3:50pm ET

BREAK

3:50-4:30pm ET

BREAKOUT SESSION 5 REPORT BACK

This session brought participants back together to see which AI barriers, challenges, and opportunities (discussed in the virtual rooms during Breakout Session 5) were the most prevalent or promising among peers.

Ms. Christine Cutillo, NIH ODSS
Dr. Jennifer Couch, NIH NCI
Dr. Brad Bower, NIH NIBIB
Dr. Deborah Duran, NIH NIMHD
Dr. Haluk Resat, NIH OD
Dr. Tamara Litwin, NIH OD

4:30-4:50pm ET

END OF THE MEETING POLL
The poll provided participants with the opportunity to give their feedback and enhance NIH AI activities.
Ms. Christine Cutillo, Health Data Scientist for AI Ethics, NIH ODSS
Download Slides

4:50-5pm ET

CLOSEOUT & ADJOURN
Ms. Christine Cutillo, Health Data Scientist for AI Ethics, NIH ODSS

Summit for Academic Institutional Readiness in Data Sharing (STAIRS)

Monday, August 5, 2024

The Data Curation Network will host the Summit for Academic Institutional Readiness in Data Sharing (STAIRS) on August 5-6, 2024!

Goals of the STAIRS Summit:

The STAIRS summit is designed to bring together data service providers, institutional repository (IR) managers, data curation professionals and other key stakeholders from across universities who support managing and sharing research data. We will use the summit to build up our communities of practice for institutionally based research data services and repositories in academic libraries, identifying common areas of need and exploring ways to strengthen connections between institutions. 

Who should apply:

We strongly encourage applicants from a range of institutions that vary in size, research activity, and level of development of services and infrastructure for research data management and sharing. We invite applications from all institutions regardless of Carnegie classification, including Historically Black Colleges and Universities, Hispanic-serving institutions, and other minority-serving institutions.

Attendees can expect topics to include:

  • What is the current state of institutionally based data services and repositories?
  • What are current and emerging expectations for data sharing, as identified in documents such as the Desirable Characteristics of Data Repositories for Federally-Funded Research, and how can we best incorporate them into institutional policies and practices? 
  • What challenges and opportunities are common to institutions and where could institutional data service providers work more closely together as a community to address them?

Learn more about the event and apply here

Learn more about the Data Curation Network webinar series and view past events.