Yesterday, during the annual meeting of the NIH PubMed Central (PMC) National Advisory Committee, NIH staff noted that they were working to include Department of Veterans Affairs (VA)-funded articles in the PMC database. It turns out that very quietly, in early March, the VA released its plan for policies ensuring public access to articles and data resulting from its funded research, as required by the February 2013 White House directive. The plan provides mechanisms for the agency to broadly share articles and research data while fully respecting the agency’s ethical and legal obligations to safeguard the privacy of veterans and other VA research subjects.
VA Plan for Articles: Partner with NIH’s PubMed Central
The VA’s public access plan calls for the agency to partner with the National Institutes of Health (NIH) to use PubMed Central (PMC) as the repository for articles. The plan calls PMC “a firmly established public-private partnership…that ensures that members of the public can read, download, and analyze final manuscripts or final published documents in digital form.”
The agency also notes that it was critical to them that the use of PMC also ensures that texts and their associated content will be stored in nonproprietary and/or widely distributed archival, machine-readable formats; provides access to persons with disabilities in accordance with Section 508 of the Rehabilitation Act of 1973; and enables interoperability with other Federal public access archival solutions and other appropriate archives.
All VA-funded researchers will be required to deposit their final peer-reviewed manuscripts into PMC upon acceptance in a peer-reviewed journal and make them available to the public with no longer than a 12-month embargo period. VA will also accept final published articles where allowed and will follow the NIH’s current format requirements.
Interestingly, unlike other agencies’ plans, the VA plan does contain a specific clause addressing the OSTP requirement that agencies provide stakeholders with a mechanism for petitioning the agency to “shorten or extend the allowable embargo period.” Presumably, the VA will follow the guidance that the NIH has established on this point, allowing stakeholders and funded researchers to provide data indicating how the public’s interest can be better served by changing the embargo period for their specific research.
The VA will also presumably follow the NIH’s current rules on reuse rights for articles in the PMC database, putting the onus on authors to ensure that they have secured sufficient rights to deposit articles, and limiting reuse rights for the bulk of the PMC collection to those currently allowed under fair use.
Finally, the VA indicates that they will provide a system to monitor compliance with the new policy over time, using data from PubMed Central and other sources.
This policy will take effect on October 1, 2015 for articles resulting from research funded by the VA Office of Research and Development, and on December 31, 2015 for articles resulting from research funded by Veterans Health Administration (VHA) Program Offices.
VA Plan for Research Data: A Phased Approach
The VA’s plan for research data calls for all investigators requesting funding to submit a Data Management Plan (DMP) outlining plans for managing and providing access to research data, or else provide a rationale as to why their research data cannot be made available. This puts the VA in line with all other agencies that are using DMPs to effectively set the default mode for all VA-generated research data to “open.”
What sets the VA’s plan apart from those of many other agencies is its explicit discussion of the need to balance safeguarding the privacy of Veterans with the mandate to develop public access data systems. Consequently, the VA will pursue a staged approach over a period of several years that will simultaneously advance three specific, fundamental goals:
- Making VA research data available to the public with the fewest constraints possible.
- Protecting the privacy of Veterans and meeting VA’s obligation to maintain confidentiality and security of health records, and associated personal information.
- Utilizing data sharing mechanisms that can be implemented at reasonable cost.
This means that the VA will phase in both the way research data is shared with the public and the kind of data that is shared. The VA will start by sharing digital data from VA-funded research through controlled public access mechanisms (i.e., through data use agreements (DUAs) and other written agreements) and move as expeditiously as possible toward fully open public access mechanisms that ensure the protection of Veterans’ identifiable private information.
To facilitate a move toward fully open data sharing, the VA will enlist federal and non-federal experts to develop an assessment of the privacy risks associated with matching “de-identified” health information with both publicly available and readily available private databases. This work is expected to be complete by the end of September 2016.
In terms of the specifics of what research data is to be shared, all VA-funded researchers will ultimately be required to share all digital data underlying the published results from all VA-funded research. This policy will go into effect on October 1, 2015 for all VA Office of Research and Development-funded research, and on December 31, 2015 for research funded by Veterans Health Administration Program Offices. Expansion of the policy to other categories of data will be considered upon completion of the privacy assessment described above.
Perhaps to be expected given the nature of the data generated by the agency’s research, the VA plan provides a more detailed blueprint for agency-maintained data repositories. The VA has concluded an initial research data inventory, and found that it currently holds 1.5 petabytes of research data in online locations, as well as 4-6 petabytes of research data stored offline on media such as tapes, CDs/DVDs, hard drives, thumb drives, etc. The agency also expects to generate an additional 500-700 terabytes of new data annually.
As a result, the VA tasked a team to explore the costs and features needed to leverage its existing data infrastructure to house all research data. The team recommended developing a solution architecture comprising more than 30 local research systems to service the application, database, and storage needs of active research studies, and centralized repository resources to provide retention and cataloging of all completed studies.
Finally, the VA joins the majority of the other U.S. agencies in noting that the Department will explore the development of a “research data commons” along with other departments and agencies, for storage, discoverability, and reuse of data, with a particular focus on making the data underlying peer-reviewed scientific publications resulting from federally funded scientific research available for free at the time of publication. They also note that they will work to adhere to the FAIR (Findable, Accessible, Interoperable, and Reusable) principles of data sharing in any environment that they participate in.