Translator: Unique Collaborative Approach to Advancing Biomedical Data Sharing

Translational Science Highlight

  • NCATS is leading a collaboration that builds on the diverse expertise of academic and private-sector partners to create a unique data-mining, computational resource that will integrate many different types of biomedical information. Once the resource is completed, broad access to it will facilitate translational innovation in disease prevention, diagnosis and treatment.

In September 2017, Stephen Ramsey, Ph.D., saw an intriguing Twitter post about a new funding opportunity announcement (FOA). He clicked on the link, expecting to find a traditional government FOA. Instead, he learned that he would first have to solve a math puzzle and complete a series of computational tasks; only then could he gain access to the instructions to begin the application process.

Intrigued, Ramsey rounded up colleagues at Oregon State University and other institutions to get started on the challenge. Eight puzzles and a few days later, they finally accessed the FOA. NCATS was seeking applicants who could build reasoning tool prototypes — a “brain” — for the Biomedical Data Translator program.

Translator is NCATS’ unprecedented effort to create a computational resource that connects many kinds of biomedical information from many sources to help researchers generate new ideas for preventing, diagnosing and treating diseases. Once completed, Translator will be able to draw on data sources ranging from air quality measurements to electronic health records (EHRs) to answer questions such as “What diseases could aspirin treat?” and “What genetic conditions reduce your risk for osteoporosis?” and subsequently ask, “How is it doing that?”

Creating Translator requires exceptional collaboration. Eleven teams of scientists, scattered across the country at 20 institutions, constantly check in with each other about their progress as well as potential solutions to the obstacles they face.

The idea behind the FOA puzzle challenge was to have teams demonstrate they had the skills to build a reasoning tool before they could apply for funding. One of the puzzles was a problem in number theory, which is a branch of mathematics.

“Not being a number theorist, I was not sure how to efficiently solve the problem,” Ramsey said. “It would have taken me days.”

Ramsey decided to email David Koslicki, Ph.D., a mathematician colleague at Oregon State University. Koslicki sent back a solution within an hour. Now, Ramsey and Koslicki are co-principal investigators (co-PIs) for one of the five teams that are working to build Translator’s prototype reasoning tool.

Connecting Data and People

Scientists collaborate at the January 2018 NCATS Data Hackathon. From left to right: Greg McInnes, Stefano Rensi, Margaret Guo, Adam Lavertu, Matt Brush and Tyler Peryea. (OHSU)

Building Translator is an enormous task with many problems yet to be solved.

“We’re trying to figure out how to connect all the disparate knowledge we have,” said Melissa Haendel, Ph.D., Oregon Health & Science University, another Translator PI.

Biomedical knowledge comes in many forms, from individual patients’ records to abstracts of scientific journal articles. Even the same types of data can be represented in different ways. For example, different EHR software packages might record blood pressure differently.
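To make the blood pressure example concrete, here is a minimal sketch of how two differently shaped records could be brought into one common form. Both input formats, the field names `sbp` and `dbp`, and the `BloodPressure` class are hypothetical illustrations, not formats used by any particular EHR package or by Translator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BloodPressure:
    """Common representation (hypothetical): both values in mmHg."""
    systolic_mmhg: int
    diastolic_mmhg: int


def from_slash_string(text: str) -> BloodPressure:
    """Parse a reading stored as one string, e.g. "120/80" (assumed format)."""
    systolic, diastolic = text.strip().split("/")
    return BloodPressure(int(systolic), int(diastolic))


def from_separate_fields(record: dict) -> BloodPressure:
    """Parse a reading stored as two fields, "sbp" and "dbp" (assumed names)."""
    return BloodPressure(int(record["sbp"]), int(record["dbp"]))


# The same measurement, recorded two different ways, becomes comparable.
reading_a = from_slash_string("120/80")
reading_b = from_separate_fields({"sbp": 120, "dbp": 80})
print(reading_a == reading_b)
```

Real records would also differ in units, timing and measurement context, which this sketch leaves out.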

Each researcher in Translator brings particular skills and expertise. Haendel’s team specializes in making information from diverse sources comparable, a task that mirrors the way Translator team members themselves must figure out how to work together. One way the teams connect is through “hackathons,” events at which members gather in small groups to work on specific problems and periodically report to the other groups. A team might brainstorm how to make patient data publicly available without compromising privacy, for example.

Stanley Ahalt, Ph.D., a Translator PI from the University of North Carolina at Chapel Hill, is part of a team working on demonstration projects to show how Translator could find connections between different kinds of data.

“It’s really fun to be a part of something that’s intellectually stimulating, collaborating with people you like and respect, where you’re all making progress together,” Ahalt said.

Learning from Each Other

Translator needs not only to understand the user’s query but also to find the relevant knowledge sources, extract the right information and piece that information together into a narrative that the user can understand. Each of those tasks is a difficult problem on its own, and each needs the contributions of people from many backgrounds.

“I did not have any inkling of how difficult it is to define a disease,” Ahalt said. “The fact that we don’t have a clean way of describing what a disease is was kind of a shock to me.”

Ahalt’s team is working on identifying different types of asthma, based on whether the condition is related to genetics, to environmental exposures such as air pollution, or perhaps to some other medical problem that the patient is experiencing. Understanding these underlying differences should make it possible to tailor treatments to specific patients.

Koslicki, the mathematician who solved the number theory puzzle challenge, works mostly on developing better methods to analyze biological data. “Translator requires many technical, engineering and scientific problems to be addressed in one integrative system,” he said. The project needs people who understand web design, people who work with application programming interface solutions and people who understand biology.

Koslicki gave an example of one problem that the Translator collaboration solved: His team was interested in a database that contains information based on scientific results published in journal articles. The database contains 20 million different relationships, such as the fact that high salt intake is associated with high blood pressure. Koslicki’s team spent a while trying to figure out how to work with the information before realizing that another Translator team had already done the work. Had he followed the usual scientific path, with individual teams working alone on their projects, Koslicki would have had to duplicate that work, he said. Instead, his team could move on to the next problem.
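A relationship such as "high salt intake is associated with high blood pressure" is often pictured as a subject, predicate and object triple. The sketch below shows one simple way such triples could be indexed and looked up. The article does not describe how the database or the Translator teams actually store this information, so the structure, the function names and the predicate label `associated_with` are assumptions made for illustration.

```python
from collections import defaultdict

# Each relationship is a (subject, predicate, object) triple.
# The first triple is the example from the article; the second is a
# made-up placeholder, not real biomedical data.
TRIPLES = [
    ("high salt intake", "associated_with", "high blood pressure"),
    ("placeholder exposure", "associated_with", "placeholder condition"),
]


def build_index(triples):
    """Group objects by (subject, predicate) so lookups avoid a full scan."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[(subject, predicate)].append(obj)
    return index


def find_objects(index, subject, predicate):
    """Return every object linked to the subject by the given predicate."""
    return list(index.get((subject, predicate), []))


index = build_index(TRIPLES)
print(find_objects(index, "high salt intake", "associated_with"))
```

With millions of relationships, building an index like this once and sharing it is the kind of work one team can do on behalf of the others.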

“Typically, we only get to learn from other scientists’ work through a publication or a presentation,” Koslicki said. “You don’t get to dig in with them and learn how they did what they did. Our collaborations allow things to go very fast and enable us to share different parts of reasoning tools and try to integrate them into our own.”

A New Way of Advancing Science?

The Translator teams are completing feasibility assessments on an ambitious timeline of just a few years. The kind of tight, unusual collaboration that the teams have created seems to be the most efficient way to develop a reasoning tool in such a compressed time frame.

Beyond the efficiencies created by the unique collaborations, Translator may offer a new approach to advancing science. When the tool identifies connections that scientists haven’t thought of before, research teams can evaluate them, choose promising hypotheses and design studies to investigate further.

Ahalt has introduced the idea to colleagues at his university, and they are considering whether a Translator-like project could help them find new ways to tackle complex issues, such as the opioid epidemic.

“I believe Translator can change the way science is being done,” Ahalt added.

Posted May 2018