In response to the White House Office of Science and Technology Policy (OSTP) directive issued in February 2013, the U.S. Department of Health and Human Services (HHS) on Friday released a comprehensive set of plans outlining the steps that its five largest agencies will take to create policies expanding public access to the results of their funded research. The policies are expected to go into effect by the end of this year.
The HHS release includes plans to update existing policies at the National Institutes of Health (NIH) and the Centers for Disease Control and Prevention (CDC), as well as plans for new policies at the Food and Drug Administration (FDA) and the Agency for Healthcare Research and Quality (AHRQ). Additionally, the Assistant Secretary for Preparedness and Response (ASPR) is voluntarily developing a plan for its portfolio of funded research.
There is a lot of information packed into this single release. Fortunately, while each of the five agencies released separate plans, HHS first constructed a common set of guiding principles on which to base those plans, in order to provide continuity and minimize the complexity of the compliance burden on funded researchers. This document serves as a useful “meta plan,” and is a good starting point for drilling down into individual elements of each agency’s proposed implementation.
New Plans for Article Access
As required by the OSTP directive, the HHS Public Access plans focus on both articles and research data. The plans relating to article access policies are relatively homogeneous. All five agencies will use the NIH Manuscript Submission System to deposit articles into the PubMed Central database, which will serve as the focal point for compliance. The CDC will take the additional step of dually hosting its funded articles in the agency’s local repository, called CDC Stacks.
All five plans require funded researchers to deposit manuscripts of articles into PubMed Central upon acceptance by a peer-reviewed journal, to be made publicly available no later than 12 months after publication. Of particular note here is the discussion of HHS’s proposed mechanism to allow stakeholders to petition to change the embargo period. The discussion centers on providing mechanisms to shorten the embargo period, underscoring that in all cases, across all agencies, the 12-month embargo is a guideline, and the rights holder can set an embargo period of less than that at any time.
HHS indicates it will provide the means for both the public and internal operating or staff divisions to petition for shorter embargo periods – particularly for articles resulting from funding initiatives that are considered to have important scientific, public health or societal value requiring rapid communication. This is particularly important considering the variety of public health-related research conducted by the NIH, CDC, and FDA.
While the individual plans each note that their copyright and license provisions will be aligned with those currently in use by PubMed Central, which relies on authors to negotiate rights directly with publishers, the NIH plan does contain language which seems to indicate a possible significant shift in the Agency’s thinking on this crucial element. The NIH plan notes:
“NIH is also exploring the possibility of using the government use license specified in 45 CFR 74.36 to help make papers public. Under these terms, the government has a royalty-free, non-exclusive and irrevocable right to reproduce, publish or otherwise use the work for federal purpose and to authorize others to do so.”
Presumably, a shift by NIH to assert these rights under its Federal Purpose license would open up the entire collection of articles in PubMed Central for computational analysis, text and data mining – including those deposited by other agencies. This move would represent a giant step forward in expanding the utility of the more than three million articles currently contained in PubMed Central, and the estimated 200,000+ articles that will be deposited annually under these new plans.
Use of PubMed Central as a locus of deposit for the five agencies also ensures strong interoperability among article collections and the robust set of more than 50 data collections maintained by the NIH. The direct links to those databases, which are already interconnected through the use of common formats and archival frameworks, provide a unique opportunity to put the text of papers into new context by providing seamless movement between text, chemical structures, proteins, and other data. This provides researchers the immediate ability to integrate individual discoveries with other articles and datasets, opening up rich new areas of scientific exploration.
Data Access and Management in the Spotlight
Unsurprisingly, the HHS blueprint for providing robust access to and sharing of digital research data is not as tightly defined as the agencies’ plans to deal with research articles. In particular, unlike in the article arena, HHS does not have a common repository for research data, though it is envisioned that, ultimately, the agencies will utilize the health.gov portal to serve as the mechanism for the public to locate and access HHS-funded research data sets. HHS also notes that it currently lacks common standards for data management and archiving, as well as common requirements and enforcement practices for data sharing across agencies, and creating these will be a central focus of plans going forward.
This looseness does not, however, indicate a lack of commitment on the part of the department – or individual agencies – to move toward making the open sharing of research data a priority. For example, the new NIH Plan indicates that “the NIH intends to make public access to digital scientific data the standard for all NIH funded research,” expanding its data sharing policy well beyond the current requirement, which applies only to research grants of $500,000 or greater. And researchers across all agencies are now required to provide plans for data access and sharing as part of the grant proposal process, or explain why their data cannot or should not be shared — effectively setting a new default mode for research data across HHS.
Echoing a common theme from all of the U.S. agencies that have released their plans to create research data sharing policies to date, HHS notes that its policy development process will be an evolutionary one, and will be conducted in close, regular consultation with the research community. This approach is underscored by the use of terminology such as supporting “investigator initiated data management plans” throughout the document.
With the five agencies in different states of readiness, the Department’s strategy for proceeding with policy development is, by necessity, quite pragmatic. The first priority will be to conduct an assessment of data holdings across the five agencies to develop a better understanding of the type and location of data under HHS’s purview. This will be followed by the implementation of a Department-wide “Enterprise Data Inventory” that will be used as the catalogue for all HHS research data. This inventory will ultimately be accessible via the health.gov public platform. HHS will also focus on developing common metadata elements for research data across its agencies.
As noted earlier, HHS will require that all researchers develop data management plans, which represents a significant change for many of its agencies – several of which currently do not require such plans to be submitted. Additionally, data sharing, management, and archiving practices currently vary widely across the agencies, so the department also intends to develop best practices, and apply new common requirements for all data management plans.
Finally, HHS will take a page from the European Commission’s book, and utilize pilots in areas where technology is rapidly evolving, or where they feel they need additional experience in order to make sound management decisions. They point to areas such as optimizing attribution, use of digital object identifiers for research data, and alternate repository approaches as areas that are ripe for pilots.
Interestingly, as also noted by NASA in its plan, HHS indicates it will explore the development of a “research data commons” along with other departments and agencies – yet another indication that this idea appears to be gaining traction in the federal agency community.
The release of the HHS plan is a welcome signal of the momentum within the U.S. government toward the open sharing of the results of publicly funded research. In a letter accompanying the newly posted plan, HHS Secretary Sylvia Burwell noted:
“At an inflection point in history, we are well poised to strengthen our public access practices through modifying existing policies or creating new ones, and through leveraging existing platforms and tools in order to make sharing of federally funded research results a widespread practice for HHS funded researchers…Together we can accelerate the movement of science and research into the hands of as many as possible. American citizens deserve that and HHS intends to deliver on that promise.”
We in the SPARC community look forward to supporting HHS and its agencies as they work towards delivering on our common goals.