In June 2022, Harvard Data Science Review (HDSR) guest editors for the special theme on Changing the Culture on Data Management and Data Sharing in Biomedicine, Maryann Martone and Richard Nakamura, conducted an online interview with Dr. Lawrence Tabak, Acting Director of the National Institutes of Health (NIH) and Dr. Lyric Jorgenson, Acting Associate Director for Science Policy and the Acting Director of the Office of Science Policy at the NIH. The quartet talked about the importance of the new data management and sharing policy to the NIH and updates on the NIH’s activities and priorities since the April 2021 workshop on data management and sharing.
This interview is part of HDSR’s Conversations with Leaders series.
HDSR includes both an audio recording and written transcript of the interview below. The transcript that appears below has been edited for purposes of grammar and clarity.
Richard Nakamura (RN): [00:00:00] Welcome to Conversations With Leaders, a production from the Harvard Data Science Review. I'm Richard Nakamura, retired former director of the National Institutes of Health Center for Scientific Review, and former deputy director and scientific director of the National Institute of Mental Health.
Maryann Martone (MM): [00:00:18] And I'm Maryann Martone, professor emerita in the Department of Neurosciences at the University of California, San Diego. Today we are talking about the implementation of the data management and sharing policy issued by the National Institutes of Health and discussed at the National Academies of Sciences, Engineering, and Medicine meeting in April 2021. The policy goes into effect in January 2023. This interview is part of a special collection of articles for the Harvard Data Science Review to be released in summer of 2022 in anticipation of the data sharing and management plans due in grant applications starting January 2023.
RN: [00:01:01] We are joined today by Dr. Lawrence Tabak, acting director and longtime principal deputy director of the National Institutes of Health.
MM: [00:01:12] And Dr. Lyric Jorgenson, acting associate director for science policy and the acting director of the Office of Science Policy at the NIH.
RN: [00:01:24] So have there been any National Institutes of Health or federal policy changes on data sharing and management since the National Academies of Sciences, Engineering, and Medicine meeting of April [2021]?
Lyric Jorgenson (LJ): [00:01:40] I'm happy to take that one. There's been a lot of work that has gone on since we last met, since the Academies workshop took place, and I can talk about a good share of it. But in general the federal government is really committed to this topic and has a lot of interest in advancing these aims. I think Dr. Tabak could talk, at least from the elements of the COVID experience, about how this has brought this really into focus. I don't know if, Dr. Tabak, you'd like to start with that, and then I can talk about some specific NIH activities.
Lawrence Tabak (LT): [00:02:08] Well, I think the general principle of data sharing equals good science. It ensures transparency, it ensures a greater confidence in what scientists are doing. And obviously, it also extends the value of what scientists are doing. This came into very sharp relief during the pandemic, where it was essential for investigators to share their data as rapidly as possible. And I think, overwhelmingly, the scientific community did just that. And as a result, everybody benefited. You know, as you know, when you're dealing with something that's an unknown, the best way to solve the problems is for everybody to put their heads together. And that was done virtually and in rapid order. And it transcended academia, industry, and the government sectors, perhaps in an unprecedented way. But now that we've been able to do that, I'm not sure there's any way we would want to turn back. It has just proven to be too valuable.
LJ: [00:03:19] Some of the specifics directly related to the workshop: as Dr. Tabak mentioned, there is a real commitment to making this work. There's a real value in seeing real-time data sharing, frankly, as we saw with COVID. After the workshop, the value has been not only in sharing the data, but understanding how to make it accessible and usable. Just having the data available for people to peruse doesn't mean it's meaningful or useful. And again, we've learned a lot of this with COVID, and researchers have known this for some time. Just simply putting data out there does not mean it will be used or people can find it. So NIH has been thinking a lot in terms of implementing the data management and sharing policy, not to make it a checkbox of things that we ask researchers to do, but to make a meaningful culture shift in how information is used and accessed. So we've been working very closely with our Office of Data Science Strategy, headed up by Dr. Susan Gregurick, who's been doing amazing work in getting repositories to be accessible and interoperable and in compliance with FAIR standards. We're working very closely with our Office of Extramural Research (OER), which has been thinking about the additional incentives to be able to make this information accessible to the public and in central locations for sustainability purposes. So I think we've seen a lot here at NIH. The White House, as I mentioned, is also really committed to this and, just in the past couple of weeks, released some more guidance around repository characteristics, again, so that we are working in lockstep agreement with our partners to advance this forward in a way that is useful for the community.
LT: [00:04:49] And I think it's important to add that this policy didn't just arise in Bethesda; this came as a result of extensive consultations with investigators, with various institutions, with research participants. We did a special outreach to tribal nations, because of their sovereignty and unique issues that they are concerned about, and to many others to ensure that what we came up with was responsive to what is, of course, the very broad breadth and diversity of research. And so this was not done in a vacuum and hopefully does not come as a surprise to anybody.
RN: [00:05:35] What progress has been made on the issues raised at the meeting for changing the culture?
LT: [00:05:41] Well, you know, let me start. Changes in cultural shifts are difficult to measure, but people are not bashful when they comment about changes that the NIH undertakes. And I would say for the most part, we are already seeing a pretty reasonable buy-in, at least conceptually. I think everybody says, 'Well, this is a great idea. It's very important. But—' And you know it's going to follow '—the devil is in the detail.' And that's now what we're really working towards: really elaborating on some of those details to reassure people that this is all being done in a thoughtful way, in a way that is doable and that will not yield undue burden on anybody. But again, it's new. And so folks understandably have some level of disquiet. But I would say on balance, conceptually, everybody buys in. And now it's just a question of working through some of the detail.
LJ: [00:06:47] And as you all know from the meeting, we heard about some of those details that needed or would benefit from being further addressed, right? A decent amount of the meeting was devoted to those topics. So we have worked to take some of those issues raised at the meeting to really implement some solutions or at least some activities. So for instance, one of the questions is how do we share participant data in a way that is respectful and respects privacy and identifiability. Some researchers know this field really well; others will be a little bit newer to it. So we've been working on developing guidance for the community about how we protect participant privacy and respect autonomy in research. And so that is out for public comment now to be able to help researchers move forward and do that respectfully. We also have been working, as Dr. Tabak mentioned, with our tribal partners and have been putting out guidance for how to work with tribal communities and some of the unique considerations researchers may need to be thinking about in terms of moving forward with their work. We also have some additional FAQs that we've released and a beautiful new website. We heard at the community meeting as well that this information is relatively diverse in where it is located, to put it mildly. And so OER did a very beautiful job in putting this all together on a centralized site for NIH so it can be a one-stop shop for researchers to be able to know where this is located. Also, what we heard about at the meeting is that the policy is important, but balancing carrots and sticks is key. The culture change we're trying to move toward here really requires both. The policy has some sticks, of course, for compliance, but what are the incentives? So we heard a lot about having credit for data generators. We've been working to think about persistent identifiers and how we can give credit to researchers.
We've also been working with our journal publishers and others to be able to think about where those credit mechanisms can lie. And I think that the big thing that Dr. Tabak raised is there's a general consensus for wanting to do this, but it needs to be not burdensome. And we all understand why: because there is increasing burden, we need to make sure it's commensurate with benefit. And so we are really working on how to provide the least burdensome processes for investigators to be complying with this and streamlining, for instance, our GDS [Genomic Data Sharing] policy plan requirements with our data sharing plan requirements because no one wants to be submitting multiple plans based on unique data sets, right? So looking where we can reduce any of the burden that really is not valuable for researchers has been a real priority since this meeting.
LT: [00:09:12] And we're continuing to develop the resources that will enable compliance, to try and make things easier. Again, no one disagrees with the principles of FAIRness, right? Findability, accessibility, interoperability, and reusability. Nobody disagrees with that. In fact, everybody wants it. Where do I sign up? It's just a question of how best to implement that. And I think we are making good strides toward that goal.
RN: [00:09:43] I'd like to hear a little bit more about two things. One is ensuring equity of costs and benefits. There was a concern about that issue. And then ensuring compatibility of federal, international, and private sector approaches to sharing.
LT: [00:10:00] I'll let Lyric start, and then I can build on what she says.
LJ: [00:10:05] I'll tackle the first question around equity and cost and sharing. And this builds nicely on the point that Dr. Tabak raised: we really want a robust ecosystem of data sharing. We want data sharing and data to be considered a research output, just like a publication is a research output, recognizing, though, that it might take us a while to get that to be ingrained in the culture, and we want to make sure we keep researchers all moving in that direction. So there are some flexibilities in the policy for deliberate reasons to ensure we can lift all boats and not actually create inequities in our funding stream. So, for instance, having the FAIR principles required was a real conversation piece for us. As Dr. Tabak mentioned, we very much strive for that. But we also didn't want to cut people out of the ability to participate in NIH-funded research for not having the infrastructure in place to be able to do so. We want to help, again, help our researchers get there and make these infrastructure systems, which again is a lot of the work that our Office of Data Science Strategy is doing. So we've been thinking a lot in terms of, again, this is supposed to be a culture shift. It's not intended to be punitive. Of course we're prepared to take compliance actions, but how do we lift all boats to make science better?
LT: [00:11:18] And your question about compatibility both with other funders and international funders: obviously we can control what we can control and there are certain things that are out of our control. But certainly the expectations that we have are very much in alignment with the NSF [National Science Foundation], with the Wellcome Trust, and with the Gates Foundation. And I think another catalyst for this is increasingly, as you both know, the journals are requiring more robust sharing as a condition for publication. And so there's a bit of a push pull. The funding organizations lay out what their expectations are. The journals are getting in alignment with us. And that's the perfect mix because obviously investigators want to publish their work. And if they have to do this in order to publish it, that's sort of the ultimate incentive, if you will. But we are certainly in alignment with major funders. And again, we'll continue to refine this. As you know, we have significant collaborations with all of the major funders, and we'll continue to refine this as necessary.
RN: [00:12:39] Maryann?
MM: [00:12:41] Yes. Thank you. You touched on this a little bit in your previous answers, but at the workshop it was recognized that the policy requires the creation of the data management and sharing plan, and that needs to go through review. But a major topic was compliance and enforcement follow through. It was really felt by many of the participants that without enforcement, the policy would not succeed in changing the culture. So we'd like to ask you about what steps the NIH is going to take to determine whether researchers are actually following up with their stated plans.
LT: [00:13:12] Well, this is important. And so it'll become a part of what we've termed the term and condition of the award. As I'm sure you both know, this is the sort of agreement that one makes with the institution. So it's at the institutional level, and the compliance of this will be monitored at regular reporting intervals. And we've been pretty transparent that compliance of not only this issue, but those things that are in the terms and conditions can be taken into account for future funding decisions. Now, again, that's the ultimate stick, if you will, which hopefully one never has to use, but because it's at an institutional level, there is an inherent incentive institutionally to make sure that your faculty are adhering to what the expectations are. And that is one of the advantages of working at the institutional level: no one wants to jeopardize an entire institutional program. I don't know if there are additional details that you want to share, Lyric?
LJ: [00:14:23] I think that is a perfect answer from the NIH perspective. And one of the things from the workshop that we also heard is that data sharing will be an ecosystem-wide challenge for people to be thinking about. And so there is some work to try to make the plans publicly available. That's coming down the road. It probably won't happen right at launch. But again, there is some accountability and transparency among the research community themselves to be able to look at plans that should cite where the data should be posted so that they can go to those repositories to be able to find information. So for the compliance component, of course, NIH has its official mechanisms, as Dr. Tabak described, but there will also be some community involvement, which I think will help urge people to move in the right direction here.
MM: [00:15:07] Thank you very much. I think that answer will be reassuring to many. And finally, do you want to make an additional, overarching statement on the importance of data sharing to NIH and science?
LT: [00:15:18] Well, I mean, I can't overstate this. It's really the essence of what we do. And as we have now seen unambiguously during the pandemic, progress is dramatically accelerated when you have data sharing among all interested parties. We can't go back; we have to continue to forge forward. We want to be able to extract every usable bit and byte of the data that is generated. We want to maximize the public investment in science. And also, quite frankly, by sharing, you are reassuring people about the data, about the process. And at a time when, unfortunately, some people are articulating distrust or disquiet with science, I think it's even more important that we share data as one really good way of overcoming that distrust or disquiet. And so we are 1000% supportive of this. We're devoting tremendous amounts of energy and staff time to this and want very much to work with our communities to ensure that this is as successful as possible as soon as possible.
MM: [00:16:50] Dr. Jorgenson?
LJ: [00:16:52] You heard it from NIH leadership.
MM: [00:16:58] Thank you so much for sitting down with us today. And we look forward to future success of this policy and also getting the special issue out finally.
LT: [00:17:09] Well, good luck with that. I'm sure it's hanging over your heads and you want to finish it as soon as possible and then enjoy the rest of your summer. So, best of luck to both of you. And thanks very much for engaging with us today.
Richard Nakamura, Lawrence Tabak, and Lyric Jorgenson have no financial or non-financial disclosures to share for this interview. Maryann Martone is a founder of and has equity interest in SciCrunch Inc, a tech startup that provides tools and services in support of rigor and reproducibility.
©2022 Richard Nakamura, Maryann Martone, Lawrence Tabak, and Lyric Jorgenson. This interview is licensed under a Creative Commons Attribution (CC BY 4.0) International license, except where otherwise indicated with respect to particular material included in the interview.