March Data Sharing and Reuse Seminar

Friday, March 8, 2024

Dr. Kilian Pohl will present "Accelerating Neuroscience Discovery Using Shared Software and Data" on March 8, 2024, at 12 p.m.

About the Seminar

Sharing software and data has led to new discoveries in neuroscience and lowered the barriers for replication. Adequate power to promote discovery results from aggregating and repurposing well-curated data acquired by multiple sites. Studies based on NIAAA-funded NCANDA-A are exemplary of this sharing process. Since 2013, NCANDA-A has been collecting multimodal neuroscience data annually on 831 individuals (baseline age: 12–21 years). The data are uploaded and curated through a data management system called Scalable Informatics for Biomedical Imaging Studies (SIBIS) (https://github.com/sibis-platform). SIBIS relies principally on publicly available software to span the entire life cycle of electronic data (i.e., capture, harmonize, quality control, share, and analyze). This talk will review the design of SIBIS, identify the challenges in analyzing public multimodal data via machine-learning technology, and highlight research findings that resulted from overcoming those challenges. 

About the Speaker

In 2002, Kilian M. Pohl started sharing machine-learning software for the analysis of neuroscience data as part of his graduate research at the Massachusetts Institute of Technology and Harvard Medical School. Kilian is now a Professor in Psychiatry and Behavioral Sciences and, by courtesy, Electrical Engineering at Stanford University. He is the contact Principal Investigator of the Data Analysis Resource of the National Consortium on Alcohol and Neurodevelopment in Adolescence - Adulthood (NCANDA-A) and of the Computational Neuroscience Laboratory (CNSLAB). For neuroscience studies such as those conducted by NCANDA-A, the CNSLAB manages the data and creates machine-learning models to identify phenotypes that improve the mechanistic understanding, treatment, and prevention of neuropsychiatric disorders.

About the Seminar Series

The seminar is open to the public and registration is required each month. Individuals who need interpreting services and/or other reasonable accommodations to participate in this event should contact Janiya Peters at 301-670-4990. Requests should be made at least five days in advance of the event.

The National Institutes of Health (NIH) Office of Data Science Strategy hosts this seminar series to highlight exemplars of data sharing and reuse on the second Friday of each month at noon ET. The monthly series highlights researchers who have taken existing data and found clever ways to reuse the data or generate new findings. A different NIH institute or center will also share its data science activities each month.

NIH released RFI on Proposed Use of CDEs for NIH-Funded Clinical Research and Trials

Wednesday, February 21, 2024

Responses due April 20

The National Institutes of Health (NIH) released a Request for Information (RFI) on Proposed Use of Common Data Elements (CDEs) for NIH-Funded Clinical Research and Trials (NOT-OD-24-063). Responses are due April 20, 2024. 

NIH is requesting input on: a set of minimum core CDEs in the demographics/personal characteristics category; recommended CDEs in the clinical domains including autoimmune diseases and immune-mediated diseases; high-level CDEs for social determinants of health (SDoH) domains; tools and technologies that could enhance the use of NIH CDEs; and policies and governance that could facilitate and incentivize broader CDE usage in research and in data sharing and management.

This RFI is also an NIH effort to understand the challenges and opportunities in the use and development of CDEs in research and to inform appropriate NIH guidance and mechanisms to lower the barriers to CDE use and improve the ability to aggregate and integrate CDE-based data.

Interested parties may find additional information at: https://datascience.nih.gov/cde-rfi

Inquiries for this RFI should be directed to Belinda Seto, Ph.D., at cde-rfi@od.nih.gov.

April Data Sharing and Reuse Seminar

Friday, April 12, 2024

Mr. Andrew Smith will present "ELIXIR: Working Together to Accelerate the Understanding of Life" on April 12, 2024, at 12 p.m.

About the Seminar

ELIXIR is a pan-European research infrastructure for life science data. It recently published its Scientific Programme at https://elixir-europe.org/news/programme-2024-28, setting out its vision for the next 5 years. ELIXIR’s new strategic priorities acknowledge the importance of not only investing in science and technology but also building capacity and increasing participation in ELIXIR member countries. Already present in more than 20 countries across Europe, ELIXIR will work to:

  • Enable scientists across the globe to access and analyse life science data
  • Deliver services to support federated data management and analytics in life science
  • Equip national ELIXIR Nodes for successful long-term operations
  • Develop people and capacity to benefit science and society
     

About the Speaker

Andrew (Andy) Smith joined ELIXIR in 2011 to help establish the organization and support its progression from preparatory stages to permanence. Until June 2024, Andy is serving as Interim Director in addition to his role as Head of External Relations. 
As Head of External Relations, Andy manages ELIXIR’s engagement with Member States, funders, and policymakers. He also leads ELIXIR’s engagement with the EU institutions. His team is responsible for developing ELIXIR’s industry strategy and facilitating international collaborations between ELIXIR partners and global collaborators, including those in the United States. 
Andy has represented ELIXIR on the Organisation for Economic Co-operation and Development (OECD) and G7 Group of Senior Officials working groups on topics relating to open science and international collaboration. He is the coordinator of the EU-funded ELIXIR-STEERS project, which has a focus on software and workflow development best practices.
 


Announcing the 2023 DataWorks! Prize Winners

Thursday, February 1, 2024

In early 2023, ODSS partnered with the Federation of American Societies for Experimental Biology (FASEB) to launch the second annual DataWorks! Prize to highlight examples of innovative data sharing and reuse.

This year, 39 teams registered for the challenge to demonstrate their accomplishments. The 218 team members came from a wide variety of disciplines, including biochemistry, clinical research, genomics, immunology, molecular biology, neuroscience, and more.

Representatives from the grand prize-winning team will present at the Data Sharing and Reuse Seminar series on Friday, Feb. 9, 2024.

Grand Prize $100,000

CCC19

COVID-19 and Cancer: Catalyzing Collaboration

The COVID-19 and Cancer Consortium (CCC19) is a collaboration that collects data about patients with cancer who have been diagnosed with COVID-19.

Distinguished Achievement Award $50,000

IPop CATS

GeoPIPE: Reusing Open Data and Letting Data Flow

GeoPIPE is a pipeline for enriching open data streams with geospatial analyses and natural language processing.

 

Maryellen Giger’s Team

Sharable Curated, Diverse Medical Images at Scale

MIDRC is a collaboration to create an open curated, diverse commons for medical imaging AI research and a sequestered one for translation.

Exemplary Achievement Award $25,000

ASAP Discovery Consortium

An Open Pipeline for Antiviral Drug Discovery

To nucleate a global antiviral pipeline to prevent future pandemics, we created a new model for open science accelerated drug discovery.

 

Karen Yook’s Team

Making Data Useable While Publishing

microPublication Biology re-architects the publishing workflow by including curators to alleviate numerous obstacles in data reusability.

 

StrokeFAIR

StrokeFAIR: A Public Dataset and Analytical Tools

StrokeFAIR shares FAIR images, metadata, and analytical tools for acute brain stroke, democratizing avenues to perform reproducible reliable research.

Significant Achievement Award $12,500

Caltech Library

Naming Data Files Descriptively for Easier Reuse

A worksheet for creating file naming conventions to label research data descriptively and consistently.

February Data Sharing and Reuse Seminar

Friday, February 9, 2024

Mr. Alex VanHelene, Dr. Sanjay Mishra, Dr. Michael Rooney, and Dr. Jeremy L. Warner will present "COVID-19 and Cancer: Catalyzing Collaboration" on February 9, 2024, at 12 p.m.

About the Seminar

To understand and assess the uncertain effects of COVID-19 on people affected by cancer, CCC19 was founded in March 2020 and developed a robust and agile strategy to collect and disseminate prospective, granular, uniformly organized information on patients with cancer diagnosed with COVID-19 — at scale and as rapidly as possible. This systematic data sharing recipe included three key components: data sourcing and acquisition, data management, and data model sharing.

Taking inspiration from existing best practices, CCC19 sought to accelerate clinical research by facilitating data sharing amongst 126 cancer institutions across North America and eventually logged more than 19,000 cases – the largest registry of its kind. Data standardization is managed through existing clinical vocabularies whenever possible. Through continuous quality assurance of contributed data from participating institutions, CCC19 ensures compliance and standardization with registry-based research standards. Our knowledge is publicly accessible because of the direct features of REDCap that enable local reusability, and open code sharing on GitHub. To further emphasize best practice “recipes” to advance biological and biomedical research activities, all CCC19 publications, the data model, and derived variable code are publicly available. CCC19’s transparent and streamlined approach to data management demonstrates the power of data sharing practices to advance scientific discovery and human health.

About the Speakers

Mr. Alex VanHelene is a Clinical Research Assistant at Rhode Island Hospital. Dr. Sanjay Mishra is the Research Program Manager at Rhode Island Hospital and Coordinator of the COVID-19 and Cancer Consortium. Dr. Michael Rooney is a Radiation Oncology Resident at the University of Texas MD Anderson Cancer Center. Dr. Jeremy L. Warner is a Professor of Medicine at Brown University and Director of the Research Coordination Center of the COVID-19 and Cancer Consortium.


January Data Sharing and Reuse Seminar

Friday, January 12, 2024

Dr. Michelle Hribar will present "Common Data Models for Ophthalmology Research Collaboration" on January 12, 2024, at 12 p.m.

About the Seminar

Large, diverse datasets are necessary for building accurate and unbiased AI/ML models in research for vision and eye health, but challenges in data standardization have become a barrier to creating these datasets. Large NIH data generation projects such as All of Us, N3C, and Bridge2AI include minimal ophthalmic clinical or imaging data since these data elements are not yet a part of their underlying common data model (CDM): the OMOP CDM. To address this gap, the Eye Care and Vision Research workgroup was created within the Observational Health Data Science and Informatics (OHDSI) community. As part of her NIH DATA Scholar work, Dr. Hribar co-leads this workgroup. In this talk, she will discuss the standardization efforts and an example research use case that the workgroup has completed as well as the vision for future data models and infrastructure to support research in eye care and vision science.

About the Speaker

Dr. Hribar is an Assistant Professor of Medical Informatics and Clinical Epidemiology at Oregon Health & Science University, School of Medicine.


Announcing the Launch of the RADx Tribal Data Repository

Friday, December 1, 2023

American Indian and Alaska Native (AI/AN) communities across the nation were — and continue to be — disproportionately impacted by the COVID-19 pandemic. Health disparities among AI/AN communities include an undue burden of infections, lack of access to health care and increased hospitalizations, and higher death rates.

To address the disparities recognized in these communities, the NIH has focused on supporting research projects that can increase our overall understanding of COVID-19 and its effects on AI/AN communities. In response to the May 2020 Tribal Consultation for COVID-19, NIH incorporated Tribal input into the Rapid Acceleration of Diagnostics (RADx) Underserved Populations (RADx-UP) initiative. This initiative aims to accelerate innovation in developing and implementing testing strategies for COVID-19 based on community-engaged research.

This week, three years since the launch of RADx, I am incredibly honored and excited to announce the launch of the RADx Tribal Data Repository: Data for Indigenous Implementations, Interventions, and Innovations (RADx TDR: D4I).

RADx TDR: D4I will establish a data repository consistent with Tribal sovereignty for researchers and their collaborators interested in working with RADx data provided by American Indian and Alaska Native research participants to better understand and address the impact of COVID-19 and other health disparities. Specific activities will include education and training programs on best practices for responsible data sharing and access, and constructing a secure repository to support data storage, access, harmonization, and monitored sharing of data related to COVID-19 testing and vaccination.

In support of American Indian/Alaska Native researchers and other scientists working with those communities, RADx TDR: D4I will fund efforts working toward a better understanding of COVID-19 impact and provide data to allow for data-informed decisions and policy development in addressing the COVID-19 pandemic and potential future pandemics.

The RADx TDR: D4I is supported under an “Other Transaction Agreement” (OTA) managed by ODSS, with six collaborative awards. The awardees include Stanford University, the prime awardee, with Native BioData Consortium as project and research director, as well as the University of Wisconsin-Madison; The Ohio State University; the University of California, Santa Cruz; Arizona State University; and the University of Washington, Seattle.

As ODSS Director and NIH’s Associate Director for Data Science, I want to express my sincere gratitude to everyone who played a part in this project — my colleagues at the NIH Office of the Director, the NIH Tribal Health Research Office (THRO), the National Institute on Minority Health and Health Disparities (NIMHD), and especially the participants of the Tribal consultations for their guidance and collaboration on this trailblazing project.

ODSS is deeply committed to partnering with Tribal nations to support data science activities that improve the health of American Indian and Alaska Native communities. Across NIH, there is a growing number of Tribal health research efforts with an emphasis on trust, respect, and Tribal sovereignty. We look forward to the work of RADx TDR: D4I as we continue to understand and address the impacts of COVID-19 and other health disparities.

To view the NIMHD statement on this announcement, check out the NIMHD Director’s Letter: https://nimhd.nih.gov/about/directors-corner/messages/nih-launches-radx-tribal-data-repository.html

December Data Sharing and Reuse Seminar

Friday, December 8, 2023

Dr. Joaquin M. Espinosa will present "Being FAIR in the pan-omics era: lessons from the INCLUDE Project" on December 8, 2023, at 12 p.m.

About the Seminar

This presentation will discuss strategies and policies for effective sharing and reuse of large multidimensional datasets. Dr. Espinosa will discuss his experiences as a data generator, data analyst, collaborator, teacher, and mentor through the COVIDome Project, the Human Trisome Project, and the INCLUDE Data Hub. Dr. Espinosa will illustrate the power of sharing data ahead of publication and the need for user-friendly data sharing platforms and intuitive data visualization portals. His presentation will include real-life examples applicable to the study of COVID-19 and Down syndrome. He will also present on the importance of developing training and education opportunities for diverse stakeholders. Lastly, he will discuss the importance of international data collection and sharing at a global scale.

About the Speaker

Dr. Espinosa is the Executive Director of the Linda Crnic Institute for Down Syndrome and Professor of Pharmacology at the University of Colorado School of Medicine at the Anschutz Medical Campus. Dr. Espinosa received his Bachelor’s degree in Biology from the Universidad Nacional de Mar del Plata, Argentina, in 1994, and a PhD in Biology from the Universidad de Buenos Aires, Argentina, in 1999. Supported by a fellowship from the Pew Charitable Trusts, Dr. Espinosa completed his post-doctoral training at the Salk Institute for Biological Studies in La Jolla, California. In 2004, supported by a fellowship from the Leukemia and Lymphoma Society, he began his independent appointment at the University of Colorado Boulder, in the Department of Molecular, Cellular and Developmental Biology. In 2009, he was appointed to the Howard Hughes Medical Institute as an Early Career Scientist. At the Crnic Institute, Dr. Espinosa directs the Human Trisome Project, a pan-omics cohort study of the population with Down syndrome, which has enabled the design and launch of novel clinical trials to improve health outcomes in Down syndrome. Dr. Espinosa currently serves as the Leader of the Administrative and Outreach Core of the NIH INCLUDE Project Data Coordinating Center, a new data resource that aims to accelerate discoveries into the mechanisms underlying the increased risk of co-occurring medical conditions in people with Down syndrome.


“Todos Somos, Somos Uno: We Are All, We Are One!” ODSS Celebrates Hispanic Heritage Month

Thursday, September 28, 2023

Guest Blog written by Dr. Samson Gebreab, AIM-AHEAD Program Lead

In celebration of the history, culture, and contributions of Hispanics and Latinos, the NIH Office of Data Science Strategy (ODSS) is highlighting one of its flagship initiatives. The Artificial Intelligence/Machine Learning Consortium to Advance Health Equity and Researcher Diversity (AIM-AHEAD) program, launched in 2021, is increasing diversity in the AI/ML workforce and building a more inclusive research community to address health disparities and advance health equity.

AIM-AHEAD’s overall mission is to bring the benefit of AI/ML to all people of diverse backgrounds, especially those who may have been left out of the AI/ML research enterprise. Many historically underserved communities, including Hispanic and Latino communities, have not been well represented in the AI/ML workforce, datasets, research, and infrastructure development. The lack of representation can contribute to AI bias, leading to inaccurate clinical outcomes that may not reflect these underserved communities' health conditions or lived experiences.

ODSS recognizes that achieving diversity in the AI/ML workforce is critical to addressing the sources of AI bias contributing to health disparities and inequities. The AIM-AHEAD initiative provides a range of training opportunities across the academic continuum to increase the representation of Hispanic, Latino, and other underrepresented researchers in the AI/ML and data science space.

These training and fellowship programs include 15 Hispanic individuals. In recognition of Hispanic Heritage Month 2023, ODSS is pleased to share some of their perspectives in the video below.

 

 

AIM-AHEAD is also committed to using AI/ML to understand and address the varied factors driving the health disparities of Hispanic and Latino communities, including economic and healthcare access barriers, cultural factors, and lived experiences. In particular, the AIM-AHEAD program promotes community-centered AI/ML research projects that engage, empower, and closely collaborate with Hispanic and Latino community stakeholders when tackling their health challenges and needs:

  • An AIM-AHEAD-supported community-centered research project is a collaboration with ROSAesROJO that makes wellness and cancer prevention accessible to Hispanic/Latina women and their families in the United States. The researchers are working with the Bi-National Center at Texas A&M University and Hospital Mexico Americano in Nuevo Laredo, Mexico, to build a trilateral relationship to collect data and run a racially unbiased AI algorithm trial for breast cancer detection in the Mobile Mammogram vans.
  • AIM-AHEAD researchers, in partnership with Tepeyac Community Health Center and Clinic Chat LLC, are developing an artificially intelligent chatbot to facilitate improved access to cancer screening in English- and Spanish-speaking Hispanic/Latino populations in Colorado, which experience disparities in cancer screening, timely diagnosis, and access to treatment for several cancers in comparison to other demographic groups.

These training and community-centered pilot projects reflect a small sample of AIM-AHEAD program activities focused on Hispanic and Latino researchers and communities. During Hispanic Heritage Month and beyond, we encourage you to visit the AIM-AHEAD website to engage and learn more about how the program is leading the way to advance health equity using AI/ML by bringing together diverse datasets, researchers, and communities.

“Todos Somos, Somos Uno: We Are All, We Are One!”