University of Pennsylvania’s Data Refuge Project

Open Access   ·   Open Data
Data Refuge organizers (left to right): Margaret Janz, Patricia Kim, Laurie Allen, & Bethany Wiggin.

With every presidential transition, priorities shift and access to information changes. Yet many in the scientific community were worried that Donald Trump’s administration would usher in a more dramatic shift than usual — particularly in the area of environmental policy — that could threaten the future of public climate change data.

A group of concerned scholars at the University of Pennsylvania launched a collaborative project, Data Refuge, to galvanize volunteers to copy and store federal climate change information in multiple, trusted locations.

“It began to feel extremely urgent and called for engagement of a much wider research and volunteer community,” says Bethany Wiggin, director of the Penn Program in the Environmental Humanities and one of the project organizers. “We felt we were trying to create an insurance policy.”

With the Penn team’s guidance, the idea of “data rescue” events was born. The focus was on securing persistent access to the data, preserving it, and providing education on open data and open science. As a result, 50 data rescues took place across the country between January and May, with students, professors, scientists, researchers and the broader tech community turning out to help.

The Data Refuge project put a spotlight on the need for a proactive and diverse effort to save endangered data. For expanding opportunities for service, increasing the visibility of the open movement, and working to improve access to information, SPARC honors Penn’s Data Refuge project with the June 2017 Innovator Award.

“The team at UPenn saw a problem and acted decisively and creatively to try and solve it,” says Heather Joseph, Executive Director of SPARC. “They also raised their odds of achieving success by being truly open and inclusive of participation of the widest possible community from the start – they are both pragmatic and inspirational.”

Although Data Refuge was initially triggered by the recent change in administrations, the effort need not be considered through a political lens, says Laurie Allen, assistant director for digital scholarship at Penn Libraries and a key organizer of the project. The shift from paper to digital resources means critical information doesn’t have the same guaranteed permanence at any time in its lifecycle. “The loss of data happens all the time,” she says. “The web that we rely on is really brittle.”

Allen says it was natural for the library to be a leader because its role is to ensure the community has access to information needed to do their work. “That’s why we were invented,” she says. “We were so excited to be involved because it hit at the core of our mission.”

From the beginning, the Data Refuge organizers reached out beyond campus. It was clear that the problem was larger than the team alone could handle, and a more sustainable model was needed. They were facing both the immediate push to harvest and curate data before it went down and the need to prepare for an uncertain, long-term future.

“We are worried that data will become at risk because of budget cuts. Researchers and libraries won’t be able to maintain certain data sets. Agencies won’t be collecting data because programs are slashed or eliminated,” says Prue Adler, associate executive director of the Association of Research Libraries, which is providing resources to support the work of Data Refuge.

Although the task was big, the Data Refuge team was fully committed to seeing it through.

“We knew it would be hard, tricky and complicated — and it was, but it was also very important,” says Margaret Janz, scholarly communication and data curation librarian at Penn and co-organizer of Data Refuge. They were buoyed by the enthusiastic response of volunteers who provided the free labor to make the data rescue events happen.

At Penn’s first event in mid-January, more than 250 people showed up at the library over the course of two days. “The energy was really high. People were really engaging with one another,” says Janz.

The turnout included men and women, students and senior professors, coders and volunteers from the community. The Data Refuge team trained participants in various working groups to harvest data, download data sets, and copy web pages, and developed a workflow structure that served as a framework for subsequent gatherings in cities across the country. There were no time requirements for those who showed up, but many put in long hours and some had to be told to leave at the end of the day. “I found so much community,” says Wiggin, who was amazed by the diversity of the activism. “I was overwhelmed with gratitude.”

Among the supporters joining the Data Refuge cause was the Mozilla Science Lab. Program Lead Stephanie Wright says her company believes in open data and it was a natural fit to be part of the effort, holding a data rescue event in Portland. “To see the upswell of volunteer activism has been so heartening. I’ve never seen so much passion around data from people outside of academia,” says Wright.

The welcoming nature of the Data Refuge events has fueled the movement, she added. “Anyone who wants to can participate and contribute,” says Wright. “I’m really glad [Penn] came up with the idea and opened it up for the community.”

Adler applauds Penn for its leadership with the initiative. “They are deeply collaborative,” says Adler. “[Penn] recognized what they could provide and developed tools that others could pick up and use or modify in a new direction. That held great value.”

The effort has attracted a wave of publicity with articles in major newspapers and even a segment on The Daily Show with Trevor Noah. “We caught a moment,” says Allen, who says the experience has been life-changing for her.

As a graduate student at Penn concerned about the importance of federal climate and environmental data, Patricia Kim was one of the early Data Refuge co-organizers. “I’m not sure many people are aware that with every new administration, budgets get redone, and priorities are re-set; that affects what federal websites and data continue to be maintained,” says the 26-year-old, who specializes in ancient art history and archaeology — a field that often relies on environmental data. “This affects students’ research, scientific research and different communities throughout the country.”

Part of her role has been to tell the stories about why Data Refuge is important. “We have to humanize the data or else it’s just a collection of different kinds of measurements that are illegible at best and forgotten at worst,” she says. The storytelling initiative has been supported with a grant from the National Geographic Society.

Kim says becoming involved in Data Refuge has made her a smarter scholar. “It’s forced me to think about what data are and what constitutes evidence in a more critical way and reflect on my own methods and belief system,” she says.

The women who organized Data Refuge were successful in part, says Wiggin, because they were willing to acknowledge that many different skill sets were needed. Just as the environmental humanities program is interdisciplinary and has been open with its scholarships since its inception, Wiggin says Data Refuge required a collaborative approach. “We cannot do this by ourselves,” says Wiggin. Channeling their concern into something tangible was therapeutic, she added.

After a whirlwind semester, the core Data Refuge organizers say they were both exhausted and exhilarated. Penn worked to broaden its coalition and craft a path forward. In May, representatives from Penn’s Data Refuge, ARL, and Mozilla held a meeting with members of the Libraries+ Network. The emerging group is a “dream team” with working groups and a timeline to move forward, says Adler of ARL, which she adds is committed to the cause for the long haul.

Along with Penn Libraries, Data Refuge’s institutional partners include University of Michigan Libraries, Internet Archive, Temple University Libraries, Environmental Data & Governance Initiative, ProjectARCC, Union of Concerned Scientists, and Climate Mirror.

Allen says seeing so many people give of themselves has been inspiring, and the commitment to open is sincere. “I signed up for this job because I believe people should have access to information,” she says. “The university understands this is important — the provost, IT — everyone has been aligned… This is not a partisan project. This is what we are here for as an institution — to make sure people can learn.”

-Caralee Adams

