During its first year, the Generalist Repository Ecosystem Initiative (GREI) has made noteworthy progress in fostering collaboration across the NIH generalist repository landscape. The GREI team has delivered not only on technical capabilities but also on community outreach and engagement, with a training webinar series, a community workshop, and conference presentations.
The GREI program brings together seven generalist repository awardees (Dataverse, Dryad, Figshare, Mendeley Data, Open Science Framework, Vivli, and Zenodo) to work together in a “coopetition” (competition and cooperation) model of collaboration that reduces the barriers to NIH data sharing, discovery, and reuse. The coopetition effort is organized into functional working groups focused on use cases, metadata and search, metrics, and community engagement, with the goals of enhancing interoperability across generalist repositories and supporting the data needs of research communities.
To engage the community in GREI’s first year, the repositories held a workshop in January 2023 with three aims: to examine generalist repositories as a key part of the evolving NIH data sharing landscape from the perspectives of the researcher, academic, and NIH communities; to offer training on use cases and best practices for data sharing and discovery in generalist repositories; and to gather community feedback to inform future GREI work.
The workshop came at an important time for data scientists. The 2023 NIH Data Management and Sharing Policy (DMSP) had just gone into effect, making it timely for researchers, and for the institutional staff who support them, to consider data sharing practices and resources. Researchers indicated a strong need for more resources specific to the DMSP, such as templates for data sharing, guidance on selecting the most appropriate repository for specific data types, and checklists for data sharing workflows and best practices. Workshop attendees also requested that the GREI team focus on developing new ways to give credit for open data sharing and reuse, as well as on making data discoverable without duplication, including through cataloging and connecting data.
“I’m quite proud of the progress we’ve made in GREI’s first year,” said Ishwar Chandramouliswaran, ODSS team lead for FAIR Data & Resources, who was instrumental in standing up and leading GREI. “We hope to take the momentum from year one and push GREI to new levels in the next year to not only lower barriers for sharing and reuse of NIH funded data, but more importantly to incentivize researchers to understand the value of sharing and data as a scholarly output in and of itself.”
Indeed, GREI is engaged in important work in its second year. The team recently published the first iteration of use cases for sharing data and searching for data in each of the GREI repositories, a catalog of use cases that will grow over time. The repositories are currently collaborating to determine a common core metadata schema, based on the DataCite schema, that each repository will adopt, enhancing interoperability and discoverability of datasets across repositories. This schema is expected to be published in summer 2023 and to be open for public comment and iteration, as well as for adoption beyond generalist repositories. Common metadata will also support the collection of enhanced and common metrics of data impact across these repositories, another GREI goal that supports tracking the impact of NIH-funded research data. Lastly, the team is engaging with key audiences, including data librarians, academic institutions, and specific biomedical research communities that have reached out, such as neuroscience, to provide training and outreach via webinars, conference presentations, and other resources, and to gather community feedback on their data sharing needs for generalist repositories. Community members can engage with GREI via [email protected] to ask questions, offer suggestions, and learn about upcoming events and new resources.
I am so happy to see programs like GREI flourishing under the sponsorship of our office. GREI is an excellent example of a program that touches many corners of the biomedical data science world. From bringing together the research community in workshops and trainings to creating the framework for the sharing and reuse of data, GREI is pushing for a more unified NIH. The more we work together like this, the more strides we can make in pushing data science forward and the more patients’ lives we can improve.