Dr. Julia Lane will present "Data Search and Discovery: Building an Amazon.com for Data" at the monthly Data Sharing and Reuse Seminar on October 14, 2022 at 12 p.m. EDT.
About the Seminar
The call for better data and evidence for decision-making has become very real, as evidenced by the Federal Data Strategy and the passage of both the Foundations for Evidence-Based Policymaking Act (Evidence Act) and the CHIPS+ Act. The challenge to be addressed is finding out not just what data are produced but how they are used (in essence, to build an Amazon.com for data) so that both governments and researchers can quickly find the data and evidence they need. To paraphrase Lew Platt’s aphorism about HP: “If researchers knew what researchers know, they would be three times more productive.”
This talk will provide an overview of a massive effort over the past five years to find out how data are being used, what questions they are used to answer, and who the experts are, by mining information hidden in plain sight in the text of scientific publications, government reports and public documents.
Just as with Amazon, the results are enormously powerful. The pilot, which is sponsored by agencies such as NSF’s National Center for Science and Engineering Statistics (NCSES) and the Department of Education’s National Center for Education Statistics (NCES), has generated a prototype API and a dashboard so that, for example, agencies can document dataset use for Congress and the public, program managers can identify investment opportunities rapidly and researchers can more easily build on existing knowledge rather than redoing things from scratch.
About the Speaker
Julia Lane is a Professor at the NYU Wagner Graduate School of Public Service. She is founder or co-founder of many data initiatives that have served the public good, including the Longitudinal Employer-Household Dynamics Program at the Census Bureau; the STAR METRICS/UMETRICS program that led to the establishment of the Institute for Research on Innovation and Science at the University of Michigan; the New Zealand Integrated Data Infrastructure, which holds data from across various sectors; the NORC Data Enclave supporting research access to confidential data; the PatentsView project to increase the usability of patent data; and the Coleridge Initiative to use data more effectively in government decision-making. She currently serves on the Advisory Committee on Data for Evidence Building and the National AI Research Resource Task Force. Her most recent paper was published in Nature and used UMETRICS data.
About the Seminar Series
The seminar is open to the public and registration is required each month. Individuals who need interpreting services and/or other reasonable accommodations to participate in this event should contact Rachel Pisarski at 301-670-4990. Requests should be made at least five days in advance of the event.
The National Institutes of Health (NIH) Office of Data Science Strategy hosts this seminar series to highlight exemplars of data sharing and reuse on the second Friday of each month at noon ET. The monthly series highlights researchers who have taken existing data and found clever ways to reuse the data or generate new findings. A different NIH institute or center will also share its data science activities each month.