This month Dr. Susan Gregurick spoke with Dr. Daniel S. Katz of the University of Illinois Urbana-Champaign, where he’s chief scientist at the National Center for Supercomputing Applications (NCSA) and Research Associate Professor in Computer Science, Electrical & Computer Engineering, and the School of Information Sciences, to dive deep into the world of software sustainability. Dr. Katz is a founding member of the Research Software Alliance (ReSA), which convened in 2019 to support recognition of research software as a fundamental and vital component of research worldwide.
Dr. Susan Gregurick: Dan, you have had an amazing impact in the fields of computational sciences, high performance computing, and software sustainability. Today I want to talk specifically about your work in the Research Software Alliance (ReSA). But before we get there, how did you get involved in and motivated by software sustainability?
Dr. Daniel Katz: I started off with a Ph.D. in one computational science field. Over time, I moved into other fields of computational science, then data science, and then into computer science and information science. Since then, I have been trying to stay involved in the software world in a way that allows me to participate in the details, but also try to think about what the bigger picture is and think about how those two different levels work together.
For me, it seems like almost any time that you find people who are trying to make a change, they're usually trying to do that because of some personal experience, as well as some broader experience. That’s definitely the case for me. I've always thought the software that I was doing was being kind of under-recognized and underappreciated, and felt like it was more important than people generally thought.
Once I began working at the National Science Foundation (NSF), I had the opportunity to be on the other side of the project proposal process as a funder: I had something like $15 million that I could spend on proposed software projects. Part of my job was to walk around to other parts of NSF and convince people in other areas that they should put their money in with my money, so that together we could support more software, let's say in chemistry, materials science, bioinformatics, or some other area, that would enable better research in those areas. And I think that was reasonably successful; I was usually able to get somewhere between 30 and 50 percent more funding beyond my own and fund more things. I was reasonably happy with that. But when I would look at proposals, there were a lot of things that we were just never going to be able to fund because we had nowhere near the amount of funding that was needed, compared with the need for software. And that made me think: if we can't fund people directly to do this work, how can we support them in doing it?
Dr. Gregurick: Your story is so motivating. I completely resonate with the issues that you found at NSF, where you had too many good applications to the solicitations. But, as you know, we have limited pools of resources. Finding a way to make software sustainability, with limited resources, part of the culture in computational sciences and in academic institutions would be impactful.
Funding organizations like the NIH can use funding as one lever to get projects off the ground, but funding is not always the motivator. What are some other things that we can do to support researchers?
Dr. Katz: There are different things that lead to a researcher working on certain software. One of them is that funders can either mandate or encourage things like software citation. That could lead to more software citation in general, which will lead to more credit for the people that are actually developing the software that's being used. Or we can use the new Data Management and Sharing Policy as an example. When you get people in a review panel that say, “The data set that's going to be produced in this project could be incredibly valuable, but the data management plan doesn't say that they're going to make it available publicly. I don't think we should fund this because of that,” I think that really is powerful. And it lets other reviewers that are in that same panel start to think, “Oh, maybe that's the way we should be thinking.” I think there's a lot of things about review culture that the funders could encourage.
Dr. Gregurick: I was excited to see the FAIR for Research Software principles published as part of RDA. Can you tell me how FAIR4RS supports software sustainability?
Dr. Katz: If you think about the goal of sustainable software as being that when somebody's produced software, you want it to last and to be used over time, then you need to do the work to make that happen. You can think about all the different pieces of FAIR as keys to that. If you want the software to be used, is it findable? Is it accessible? Can somebody get to it so that they can use it? Is it interoperable? Can they use it as part of a larger package? Is it reusable? Can somebody read it and understand what happened and learn from it?
I would say that all of FAIR is supporting software sustainability, but on the other hand, it's not sufficient for software sustainability, because there are pieces of software sustainability that really aren't a part of FAIR. FAIR doesn’t say anything about “correctness.” It doesn’t say anything about “robustness,” about the software being “error-free,” about it being “citable.” So there are all these other pieces that also support sustainability beyond FAIR, but FAIR is an excellent starting point.
Dr. Gregurick: I love that idea that FAIR is necessary but not sufficient. The need for trustworthy, robust, and reproducible software is also such an important concept as well. I’m hoping you can also tell me a bit more about the Research Software Alliance (ReSA) and your motivation to stand up this important group. How does ReSA interact with the Research Data Alliance (RDA)?
Dr. Katz: ReSA is kind of interesting because when I started working in this space, it was in software citation. The software citation principles work was done under FORCE11, which I saw as the community that was looking at how publishing works and how scholarly communication happens. In 2018, around the time I started with FORCE11, I started doing some things with the Research Data Alliance. I saw lots of projects at RDA that were data-focused, but then also some activities that weren’t entirely data-focused. Michelle Barker, Neil Chue Hong, and I were all at an RDA meeting, thinking that we needed to do something that was like RDA, but focused on software. Michelle, at the time, was working for the Australian Research Data Commons and Neil was at the Software Sustainability Institute in the UK. We realized that there were all these different organizations that had an interest in software sustainability, including our own, which for me included NCSA, the US Research Software Engineer Association (US-RSE), and the US Research Software Sustainability Institute (URSSI). But we didn't have a way to coordinate these activities, to work together, or to learn from each other. ReSA was started to fill that gap.
There’s not a huge overlap between ReSA and RDA, but the two organizations have a formal relationship where we try to work on a number of things together, primarily related to some aspect of research software. We’re talking about having joint meetings of the funders forums of both organizations, which we’ll try out this year, starting at RDA’s P20 in March, and we’re thinking about how the two organizations can share infrastructure and learn from each other.
Dr. Gregurick: For those in the community who may not be familiar with ReSA, how can researchers participate? Also, how can funders like me participate in ReSA?
Dr. Katz: We're very happy to have individual researchers join the ReSA community. There’s a ReSA mailing list that comes out monthly, there are ReSA task forces, and there have been ReSA community calls. All of those things are open to anybody that wants to participate.
We're encouraging funders to become organizational members of ReSA now to participate in the funders forum and working groups that are looking at issues that funders have said are important to them. The funders forum is also a venue in which funding organizations that support research software can learn from each other and can collaborate.
The NIH Council of Councils held a meeting January 19 and 20, 2023, where Ishwar Chandramouliswaran (ODSS), Heidi Sofia (NHGRI), Lori Scott-Sheldon (NIMH), and I presented on our recent concept clearance: “Building Sustainable Foundations for Open Software and Tools in Biomedical and Behavioral Science.” The concept was approved by the Council of Councils and aligns with Dr. Katz’s overarching goal of making software more accessible. You can view the presentation here.