Friday, March 11, 2022
March Data Sharing and Reuse Seminar
Carole Goble, CBE FREng FBCS CITP and Frederik Coppens, Ph.D. will present "RDMkit, a Research Data Management Toolkit Built by the Community for the Community" at the monthly Data Sharing and Reuse Seminar on March 11 at 12 p.m. ET.
About the Seminar
Starting in 2023, the US National Institutes of Health (NIH) will require institutes and researchers receiving funding to include a Data Management Plan (DMP) in their grant applications, including plans for making their data publicly available. Similar mandates are already in place in Europe; for example, a DMP is mandatory in Horizon Europe projects involving data.
Policy is one thing - practice is quite another. How do we provide the necessary information, guidance and advice for our bioscientists, researchers, data stewards and project managers? There are numerous repositories and standards. Which is best? What are the challenges at each step of the data lifecycle? How should different types of data be handled? What tools are available? Research Data Management (RDM) advice is often too general to be useful, and specific information is fragmented and hard to find.
ELIXIR, the pan-national European Research Infrastructure for Life Science data, aims to enable research projects to operate “FAIR data first”. ELIXIR supports researchers across their whole RDM lifecycle, navigating the complexity of a data ecosystem that bridges from local cyberinfrastructures to pan-national archives and across bio-domains.
The ELIXIR RDMkit (https://rdmkit.elixir-europe.org) is a toolkit built by the biosciences community, for the biosciences community to provide the RDM information they need. It is a framework for advice and best practice for RDM and acts as a hub of RDM information, with links to tool registries, training materials, standards, and databases, and to services that offer deeper knowledge for DMP planning and FAIR-ification practices.
Since the RDMkit launched in March 2021, over 120 contributors have provided nearly 100 pages of content and links to more than 300 tools. Content covers the data lifecycle and specialized domains in biology, national considerations and examples of “tool assemblies” developed to support RDM. It has been accessed from over 123 countries, and the top of the access list is … the United States.
The RDMkit is already a recommended resource of the European Commission. The platform, editorial, and contributor methods helped build a specialized sister toolkit for infectious diseases as part of the recently launched BY-COVID project. The toolkit’s platform is the simplest we could manage - built on plain GitHub - and the whole development and contribution approach is tailored to be as lightweight and sustainable as possible.
In this talk, Carole and Frederik will present the RDMkit: its aims and context, content, community management, how folks can contribute, and our future plans and potential prospects for trans-Atlantic cooperation.
Data policy must be partnered with data practice. Our researchers need to be the best informed in order to meet these new data management and data sharing mandates.
About the Speakers
Professor Carole Goble, CBE FREng FBCS CITP
ELIXIR-UK, The University of Manchester, UK
Carole Goble is a Professor of Computer Science at the University of Manchester, UK, where she leads a team of Researchers, Research Software Engineers and Data Stewards. She has spent 25 years working on reproducible science, open data and method sharing, knowledge, data and metadata management and computational workflows in a range of disciplines, but mostly biomedicine. She has led many scientific and e-Infrastructure projects and resources at the national and European level and was an early pioneer of linked data approaches in the Life Sciences, founding the FAIRDOM RDM infrastructure that enables bioscientists to manage the research objects produced by their projects. She is Head of Node of ELIXIR-UK, a national node of ELIXIR, the pan-national European Research Infrastructure for Life Science data, and co-leads ELIXIR’s RDMkit, a data management toolkit for bioscientists and data stewards, as well as the WorkflowHub registry and the TeSS training portal. A long-time advocate of FAIR and Open Data, she serves as the UK representative on the G7 Open Science Working Group and is one of the authors of the original FAIR data principles paper. She is also the director of FAIR Computational workflows for the Workflow Community Initiative.
Frederik Coppens, Ph.D.
ELIXIR-Belgium, VIB-UGent Center for Plant Systems Biology, Belgium
Frederik Coppens is Head of Node for ELIXIR Belgium and is IT manager at the VIB-UGent Center for Plant Systems Biology. For more than a decade, he has focused on providing infrastructure and services for data in life sciences. Frederik is heading a multidisciplinary team focusing on FAIR data and reproducible data analysis. The team is involved in leading roles in many European projects, contributing to the development of the vision of the European Open Science Cloud. Frederik is co-leading RDMkit, the data management toolkit for bioscientists and data stewards developed by ELIXIR, and WorkflowHub, the ELIXIR registry for computational workflows. Frederik is a member of the Galaxy Executive Board and ELIXIR Belgium hosts a Belgian Galaxy instance. The team contributes to the further development of the Galaxy Research Environment, with a focus on facilitating access to and sharing of data and provenance of workflows.
Frederik was appointed as Belgian delegate to the Strategy Working Group on Data, Computing and Digital Research Infrastructures for ESFRI, 1 Million Genomes WG5 (ICT), the Flemish Supercomputer Center User Council, and the Flemish Open Science Board. More recently, access to and sharing of human (genomic) data have become a priority, with contributions to the establishment of a biobank and associated digital ecosystem in Belgium to link health and research data, aligned with the developments in Europe.
About the Seminar Series
The seminar is open to the public and registration is required each month. Individuals who need interpreting services and/or other reasonable accommodations to participate in this event should contact Rachel Pisarski at 301-670-4990. Requests should be made at least five days in advance of the event.
The National Institutes of Health (NIH) Office of Data Science Strategy hosts this seminar series to highlight exemplars of data sharing and reuse on the second Friday of each month at noon ET. The monthly series highlights researchers who have taken existing data and found clever ways to reuse the data or generate new findings. A different NIH institute or center will also share its data science activities each month.