
NIH Workshop on the Role of Generalist and Institutional Repositories to Enhance Data Discoverability and Reuse
UPDATE: On July 16, 2020, the workshop co-chairs and participating generalist repositories published a generalist repository comparison chart.
The Office of Data Science Strategy at the National Institutes of Health (NIH) and the National Library of Medicine hosted a workshop on the Role of Generalist Repositories to Enhance Data Discoverability and Reuse on Feb. 11–12, 2020. The workshop was held at the Lister Hill Auditorium on the NIH main campus in Bethesda, MD, and a workshop summary is available.
The primary goals of the workshop were for participants to:
- Learn how generalist repositories see themselves in the larger biomedical data repository landscape.
- Understand how institutional data repositories are creating suites of solutions for their researchers and how they see generalist repositories fitting into this landscape.
- Consider desired characteristics of data repositories and how they relate to institutional expectations of data storage and preservation solutions.
- Explore adoption of common infrastructure, standards, and federated search solutions to enable greater discoverability of NIH research data across federated data repositories.
- Address the role of data curators in ensuring that data and metadata are sufficiently well curated to enhance discovery and enable reuse.
Recordings are available for each day of the workshop (Day 1 and Day 2). Available presentations are accessible by clicking the name of the presentation in the agenda below.
Day 1 |
|
---|---|
Setting the Stage |
|
Keynote Address: A Blueprint for the Research Data Landscape (Sayeed Choudhury, Johns Hopkins University) |
|
Session 1: Introducing the Generalist and Institutional Repository Landscape | |
This session provided a quick introduction to multiple generalist repositories to establish a common understanding of how they operate. Each speaker introduced their platform and described its characteristics. | |
Vivli: A Global Clinical Trial Data Sharing Platform
Mendeley Data: Enhancing Data Discovery, Sharing, and Reuse
Building Policy-Compliant Infrastructure for Research Data
Community-Minded Data Publishing at Dryad
Zenodo: Specialists Welcome!
Dataverse: A Software, a Community, a Network of Repositories |
|
Session 2: Enabling Data Discovery | |
Given such a complicated repository landscape with data sets potentially located in any one of thousands of existing repositories—often with little metadata—discovering data sets can be difficult. Although users may know of relevant subject-specific repositories, the discoverability challenge is compounded when data sets are located in generalist or institutional repositories where a user might not think to look. This session explored techniques for enabling discovery of data in generalist and institutional repositories, including the development of a common metadata model for data, expert curation to enhance metadata, and linking of digital research objects through identifiers. | |
Dataset Metadata Model (DATMM): A Common Model to Drive Discovery and Adoption
The Role of Institutional Repositories in Data Discovery
PID Graphs: Muggle Scientists Develop Harry Potter “Marauder’s Map” Technology |
|
Session 3: Enabling Data Reuse | |
This session considered several aspects of data reuse, including two different “levels” of reuse and their implications for how much preparatory work the data require: (1) reusing data to repeat the findings of the publication with which the data are associated and (2) reusing data to address new scientific questions. The second level often also requires that the data be combined with data from other sources, sometimes of a similar type and sometimes of a different type. | |
What Researchers Need When Deciding Whether to Reuse Data: Experiences from Three Disciplines
Collaboration and Re-Use: Experiences with Institutional Data Catalogs
What Role Can Publishers Play in the Open Data Ecosystem? |
|
Breakout Groups: Identifying Common Practices in Discoverability and Reusability | |
Groups were asked to address specific challenges such as:
|
|
Recap of Day 1 and Recess | |
Day 2 |
|
Report Back from Day 1 Breakouts on Data Discovery and Data Reuse |
|
Session 4: Facilitating Reproducibility | |
This session focused on how generalist and institutional repositories support reproducibility of the findings of particular experiments and publications as another major use case for effective sharing of data. | |
Librarian Role in Facilitating Reproducibility through Repositories
Reproducible and Rigorous Research on Open Science Framework (OSF)
Perspectives on Reuse and Reproducibility from a Commercial Research Repository |
|
Session 5: Managing Technical and Cultural Change in Research | |
Data sharing in generalist and institutional repositories will become increasingly important as the NIH and other funders begin to require data sharing, but for some researchers, this is a significant change in how they work with their data. This session addressed the challenges in changing how scientific resources are managed, supported, and used—considering personal and institutional incentives and how to align goals with such drivers of behavior and perspective. | |
Managing Technical and Cultural Change in Research
Top Down, Bottom Up, and Everything In Between: ORCID’s Multifaceted Approach to Technical and Cultural Change
Operationalization of Open Science at the Montreal Neurological Institute—Lessons Learned
Generalist Repositories: NSF Policy and Perspective |
|
Closing Remarks and Adjournment
Wrap-up Slides |
This page last reviewed on September 9, 2022