Open Science has long held the promise of changing the world. Now, concrete examples are emerging that illustrate how open collaboration among researchers is doing just that.
The first-ever Open Science Prize, sponsored by a collaboration among the U.K.-based Wellcome Trust, the U.S. National Institutes of Health, and the Howard Hughes Medical Institute, encouraged researchers who used openness as an enabling strategy to develop innovative tools and services that could unleash the power of data to advance discovery and improve health around the world.
Over the course of last year, NIH and Wellcome Trust judges, with input from a panel of open science advisors, selected six finalist projects from a field of 96 solutions proposed by applicants in 45 countries. Each finalist team was given $80,000 to develop a prototype of its project. The prototypes were demoed in front of a packed house, and live streamed to viewers worldwide on the Web, at a meeting in January. The public was then invited to weigh in, and nearly 4,000 votes came in from 76 countries to narrow the field from six to three. The final winner was chosen by a review committee appointed by the prize sponsors.
On Feb. 28, the grand prize of $230,000 was awarded to Trevor Bedford, Richard Neher and the Real-time Evolutionary Tracking for Pathogen Surveillance and Epidemiological Investigation team that developed nextstrain.org. The project uses publicly available viral genomic data to create an online visual platform showing the real-time molecular epidemiology and evolutionary analysis of emerging epidemics, such as Ebola and Zika. Rather than waiting for publication of a paper that analyzes outbreaks in small geographic pockets, the project allows public health officials to track the spread of a virus on a map — in real time — as it happens globally.
“The winner really exemplified the global scale of Open Science and what can be done once you approach data sets from around the world to combat public health disease,” says Elizabeth Kittrie, strategic advisor for data and open science for the National Library of Medicine. “It is a model that is scalable. It can apply to a hospital outbreak or an outbreak across the country. It’s a beautiful, flexible technology and it allows us to see transmission patterns in a way that you could not do if you only had data from one country source or one source of researchers.”
The broad response was encouraging for science advocates and underscores the need to continue to push for open policies. “It shows we are not just interested in Open Science in the U.K. and the U.S., but across the world,” says Aki MacFarlane, Programme Officer in the Open Research team with Wellcome Trust.
The innovative project was spearheaded by Bedford, 35, a faculty member at the Fred Hutchinson Cancer Research Center in Seattle, and Neher, 37, an associate professor specializing in computational biology at the University of Basel in Switzerland. The two met in 2011 and initially worked on predicting the evolution of influenza. While they were chatting at a conference in 2014, the concept of sharing this kind of open data on the web in a useful, visual format seemed natural and obvious, says Bedford.
Over winter break, they hacked together a prototype, and the project started moving quickly. The two collaborated via Slack, GitHub, Skype and occasionally in person. With the time zone difference, one scientist would often pick up the work in the morning after the other had signed off for the day, creating a non-stop cycle, of sorts, on the project. Their skills complemented one another, with Neher’s strength as a programmer and Bedford working more on the visualization side.
“We were both very happy just making this a website and trying to make it useful, iterating features. Other academics may have thought it needed to be a paper,” says Bedford. “But it was rewarding to do something that was actually having an impact.”
Allison Black, a doctoral student in epidemiology at the University of Washington, is using nextstrain.org as she collects data on the Zika virus in the U.S. Virgin Islands. The platform allows her to see quickly how her results fit into patterns elsewhere in the world.
“It’s incredibly valuable to see how the different outbreaks are connected,” says Black. “Working with people who have a desire for Open Science made this project so much easier and more gratifying for me.” When things didn’t work, Black says, she had a community of people to troubleshoot with, and the research progressed faster. “I learned so much more from the project because I was able to work openly with so many people. It’s a huge benefit for trainees.” In public health, Black adds, it is an “absolute moral necessity” to share data openly.
Using genetic sequencing data from viruses and bacteria to understand how transmission occurs has become a fairly well-developed field. Because of publishing incentives, however, information has not always been shared in a timely fashion and studies have been geographically limited. For instance, Bedford says many recently published papers on Ebola focused on particular geographic study sites. “It’s hard to get an overall snapshot,” says Bedford. “The critical piece that is missing and we are trying to fill is information is not always published in a timely fashion.”
Neher agrees that the traditional model of writing a paper, sending it to a journal for review and perhaps waiting for it to be published six months later is too slow. “By the time that has happened, the epidemic is over. You are learning your lesson for the next one. You don’t really learn something that generates actionable insight into the current one,” says Neher. “We hope this site facilitates and catalyzes a change in the way we analyze pathogen genetic data.”
Instead of getting scientific credit for an article byline, contributors to nextstrain.org may get attention for their work by sharing it on Twitter or other outlets, says Bedford. Those who share their data also know they are providing useful information that can have an immediate impact on public health.
With the prize money, the team hired two additional people to work on the project in Seattle. The exposure has generated interest — even a tweet from Bill Gates (https://twitter.com/BillGates/status/841750279972352005) that read: “Fascinating…Nextstrain uses genetic data from viruses to help scientists track the spread of disease outbreaks,” with a map and link https://qz.com/920836/a-new-genetic-tool-maps-how-deadly-viruses-spread-around-the-world-in-real-time/?linkId=35474420.
For now, the team is working to improve the website. In the long term, the team leaders hope the project will be useful in a broader way to some public health entity, such as the Centers for Disease Control and Prevention, using a version of the prototype as part of a general surveillance system.
Colin Megill is a front-end developer on the team who has been converting code from the prototype to a modern web application. “This is the first time a real-time map is used where you can see the totality of the transmission,” he says. “It’s really amazing to give a tool back to the community.” The prize gave the team a lot of validation and motivation to take the functionality to the next level. Megill says engineers are under-utilized in helping scientists leverage their work and this project demonstrates the potential of such collaboration.
Funders of the prize believe the competition is a huge boost for science. “It’s really bringing to light the nascent ideas that researchers are thinking about, but not necessarily put out there yet,” says MacFarlane of Wellcome. “We’ve managed to bring some awareness to lots of things going on that we as funders and public were not aware of — which is great.”
Other Open Science Prize finalists included:
Fruit Fly Brain Observatory – allowing researchers to better conduct modeling of mental and neurological diseases by connecting data related to the fly brain. fruitflybrain.org
Open Neuroimaging Laboratory – advancing brain research by enabling collaborative annotation, discovery and analysis of brain imaging data. openneu.ro/start
MyGene2: Accelerating Gene Discovery with Radically Open Data Sharing – facilitating the public sharing of health and genetic data through integration with publicly available information. mygene2.org
OpenAQ: A Global Community Building the First Open, Real-Time Air Quality Data Hub for the World – providing real-time information on poor air quality by combining data from across the globe. openaq.org
OpenTrials FDA – enabling better access to drug approval packages submitted to and made available by the Food and Drug Administration. fda.opentrials.net
–Caralee Adams