
Exploring a Generalist Repository for NIH-funded Data
Background
In July 2019, NIH’s Office of Data Science Strategy (ODSS) established the NIH Figshare instance, a one-year pilot with the existing generalist repository Figshare, to determine how biomedical researchers might use a generalist repository to share and reuse NIH-funded data.
NIH’s overarching goal is to support a more seamless repository ecosystem to ensure that data and other digital objects resulting from NIH research can be stored and shared with the research community. While NIH encourages the use of domain-specific or institutional repositories where available, not all datasets have a logical home in one of these repositories. This pilot allowed ODSS to test the need for and utility of a generalist repository to fill these gaps in the biomedical data repository landscape.
As part of the pilot project, the NIH Figshare instance offered some additional features that NIH wanted to test beyond Figshare’s standard features:
- Search across NIH-funded data on Figshare.
- Support for larger datasets and data files of any type.
- Detailed, NIH-specific metadata to improve discoverability of research and direct links to NIH funding sources and publications.
- User support from a Figshare team member with expertise in biomedical research, including review of data files and descriptions to ensure the highest quality and greatest discoverability.
Learn More about Generalist Repositories
NIH encourages researchers to use a generalist repository that meets OSTP criteria when a domain-specific or institutional repository is not available.
Read Dr. Gregurick's blog post offering "Some Insights on the Roles and Uses of Generalist Repositories."
See a comparison of generalist repositories at FAIRsharing.org.
See how generalist repositories are helping researchers share COVID-19 data.
Learn how generalist repositories are enhancing data discoverability and reuse.
Key Takeaways
Overall, this pilot demonstrated that generalist repositories can help fill gaps in the repository landscape for NIH-funded researchers and confirmed the need for a generalist repository that can accept heterogeneous and large files. ODSS learned three key lessons from the pilot:
- Generalist repositories are growing. More researchers are depositing data and more publications are linking to generalist repositories.
- Researchers need more education and guidance on where to publish data and how to effectively describe datasets with detailed metadata.
- Metadata enhancement enables greater discoverability. Expert metadata review proved to be one of the most impactful and unique features of the pilot instance.
Compared with uploads to the main Figshare repository that indicated NIH funding, uploads to the NIH Figshare instance had titles twice as long and descriptions three times as long.
While the NIH Figshare instance is now an archive, the data is still discoverable and reusable.
Want to learn more about this project? Watch a video of Figshare's founder and CEO, Mark Hahnel, Ph.D., discuss project outcomes and lessons learned, as well as his thoughts on the future of data sharing, or read a summary in this NIH Record article.
Project Outcomes
Over the course of the one-year pilot, NIH assessed how the NIH Figshare instance was meeting researchers’ needs and how it was making an impact on data sharing and discovery. Key outcomes of the assessment are below.
- 366 users, 209 uploaded items, 1,499 GB of storage used.
- 30,167 total page views.
- Email campaigns and webinars accounted for 31% of new users.
- 22% of data in the NIH Figshare instance came from intramural researchers across 9 NIH institutes and centers.
- 72% of items uploaded were datasets; other item types include code, software, figures, multimedia files, workflows, and online resources.
- The repository instance includes data funded by 22 different NIH institutes and centers.
- 29 different public items in NIH Figshare have each been cited at least once, and five items have been cited twice.
- Compared with uploads to the main Figshare repository that indicated NIH funding, files in the NIH Figshare instance were 8 times the size and received 2.5 times as many views.
NIH Figshare Instance Case Studies
Five case studies take a deeper dive into the ways the instance made an impact:
- Flexible organization for large datasets: Storing and sharing x-ray scattering data on the NIH Figshare instance
  - This case study explores how James Fraser and Michael Thompson, researchers in the Fraser Lab at the University of California, San Francisco, uploaded x-ray scattering data to the NIH Figshare instance.
- Using an API to upload large datasets: Making the NIH Figshare instance part of the research lifecycle: a case study of sharing single cell databases in the Carpenter Lab at The Broad Institute of MIT and Harvard
  - This case study examines how Gregory Way, postdoctoral associate in the Carpenter Lab at The Broad Institute of MIT and Harvard, and his colleagues published single cell databases on the NIH Figshare instance using Figshare’s API.
- Collections of heterogeneous data: Using the NIH Figshare instance to make fMRI and eye movement data associated with a publication openly available
  - This case study explores how Michal Ramot, NIH intramural researcher and visiting fellow at the National Institute of Mental Health, and her colleagues published a collection of neuroimaging research in the NIH Figshare instance.
- Reusing and sharing a non-traditional output: Using the NIH Figshare instance to share a cholesterol calculator for reuse and further collaboration and development
  - This case study showcases how Dr. Alan Remaley and Maureen Sampson, intramural researchers at the National Heart, Lung, and Blood Institute, used NIH Figshare to share equation calculators they created for a novel way to calculate low-density lipoprotein cholesterol.
- Sharing materials supporting a publication from across repositories: Using NIH Figshare to collect supplementary material associated with a publication: a case study of Yosuke Tanigawa
  - This case study demonstrates how Yosuke Tanigawa and his colleagues at Stanford University used NIH Figshare to share supplementary data supporting an NHGRI-funded publication in PLoS Genetics. They grouped the data in a collection that also includes the PLoS supplementary materials for the same publication, which were likewise published on Figshare.
Read 10 use cases from the pilot
To learn more about the NIH Figshare instance, visit the “About” page of the archive. For technical questions about the NIH Figshare archive or using figshare.com to share NIH-funded research, consult this guide or contact info@figshare.com. For information about sharing NIH data using repositories, contact datascience@nih.gov.
This page last reviewed on May 25, 2021