NCATS Furthers Efforts to Create a Data Ecosystem to Explore Disease Connections

Translational Science Highlight

  • A computational tool that links the growing volume of disparate biomedical data types would transform translational science by giving researchers access to diverse and comprehensive disease and biology information.

Could more effective targets and treatments for common and rare diseases already exist, but have eluded researchers because disparate pieces of information were not connected? 

“Enabling comprehensive discovery of commonalities among diseases would turbo-charge the translational process,” said NCATS Director Christopher P. Austin, M.D. “This is the goal for our ambitious Biomedical Data Translator program.”

Powerful new technologies are reshaping the biomedical research landscape, enabling scientists to map and decipher the 3 billion chemical letters that make up the human genome and unravel the molecular mysteries of all kinds of diseases. It is becoming possible to identify the hundreds of environmental stimuli and chemicals people are exposed to each day. Electronic medical records contain warehouses of patient information, and clinical databases house details on genomic and environmental variants that can affect disease susceptibility.

While these technologies produce massive amounts of data, the pace of data generation has largely outstripped researchers’ ability to make sense of the results. Ideally, scientists could easily mine data from different sources to gain new insights into disease causes and biology as well as the relationship between disease biology and clinical signs and symptoms. However, disconnected data sources and a lack of understanding of how disparate data types (such as genomic, cellular and patient data) relate to each other have slowed progress.

Through its Translator program, NCATS is supporting research to develop a computational platform that enables connections among conventionally siloed data types. Translator aims to bring these together in an ecosystem that will reveal complex relationships that help scientists better understand disease and generate hypotheses and treatment options.

“If successful, Translator will help scientists use data more effectively,” said Noel Southall, Ph.D., a program leader in the NCATS Division of Preclinical Innovation. “This tool will bring into focus relationships among data, enabling the researcher to develop hypotheses based on connections that otherwise were not apparent.”

An inability to see connections among data can be frustrating, especially in translational science. For example, researchers might find, through a drug screen, that a drug known to treat one disease is useful against another disease. When scientists dig deeper into the literature, they often find that the connection between that drug and that disease was already documented.

“Researchers spend a lot of time rediscovering things we already knew because the connection between knowledge sources doesn’t exist,” Southall said. “We’re frequently unable to see the relationships among different data types that can affect clinical outcomes.” 
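The kind of hidden connection Southall describes can be illustrated with a toy example. The sketch below is not Translator's design, which the article does not specify. It assumes two hypothetical, made-up sources, one listing drug targets and one listing gene-disease associations, and joins them on the shared gene identifier so that a drug-disease link present in neither source alone becomes visible. All record values and function names are placeholders invented for illustration.

```python
from collections import defaultdict

# Hypothetical records standing in for two siloed sources (assumed, not real data).
drug_targets = [("drug_A", "gene_1"), ("drug_B", "gene_2")]
gene_diseases = [
    ("gene_1", "disease_X"),
    ("gene_1", "disease_Y"),
    ("gene_3", "disease_Z"),
]


def link_drugs_to_diseases(drug_targets, gene_diseases):
    """Join the two sources on the gene they share.

    Returns a dict mapping each drug to a set of (disease, gene) pairs,
    where the gene is the evidence connecting the drug to the disease.
    """
    # Index the second source by gene so each lookup is cheap.
    diseases_by_gene = defaultdict(set)
    for gene, disease in gene_diseases:
        diseases_by_gene[gene].add(disease)

    links = defaultdict(set)
    for drug, gene in drug_targets:
        # .get avoids creating empty entries for genes with no known disease.
        for disease in diseases_by_gene.get(gene, ()):
            links[drug].add((disease, gene))
    return dict(links)


links = link_drugs_to_diseases(drug_targets, gene_diseases)
```

Here `drug_A` is linked to two diseases through `gene_1`, while `drug_B` gets no entry because its target appears in only one source. Real biomedical sources rarely share identifiers this cleanly, which is part of the interoperability problem the article describes.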

Feasibility Testing

When NCATS awarded its initial Translator funding in fall 2016, 11 teams were charged with working together to develop “demonstration” projects aimed at assessing whether a biomedical data translator could be created. There are now 12 teams determining the feasibility of building Translator by focusing their efforts on data integration and analysis.

Super-resolution image of killer T cells surrounding a cancer cell. (Alex Ritter, Jennifer Lippincott-Schwartz and Gillian Griffiths/National Institutes of Health Photo)

“We need these projects to give us a sense of what is possible and how to move forward,” Southall said.

For each demonstration project, awardees are identifying data sources that would be needed for a comprehensive Translator and designing ways to combine the data to address a specific biomedical problem.

The projects are wide-ranging. Some researchers are examining the impact of environmental exposures on the onset or worsening of disease. Others are evaluating whether the approach could help patients for whom existing methods have failed to identify the origin of disease symptoms. Still other scientists focus on understanding relationships between common and rare diseases.

The demonstration projects are especially collaborative. “A single group cannot build the Translator,” said Christine M. Colvis, Ph.D., NCATS Drug Development Partnership Programs director. “Scientifically and technically, the research teams have realized that to make the data interoperable and gain understanding from them, they must work closely and take advantage of complementary expertise.”

In one project, for example, three groups of researchers with different areas of expertise have combined their resources to study Fanconi anemia, a rare, inherited disease that affects the bone marrow and can cause developmental problems and cancer. Data analytics experts at the University of California, San Diego, and scientists at the Renaissance Computing Institute and the University of North Carolina at Chapel Hill, who have experience working with environmental data, are collaborating with clinicians and scientists from Oregon Health & Science University in Portland. Together, they are testing a computational strategy to study how variations in genes and environmental exposures might play roles in the disease.

Transcending the Languages of Biology and Disease

Translator might ultimately change how scientists and clinicians think about disease and treatment. Physicians and biologists tend to think of disease in different ways and speak different languages. Physicians diagnose and treat disease based on signs and symptoms affecting specific organs, while biomedical researchers often consider disease in terms of molecular changes in specific proteins, pathways or cell types.

“Speaking different languages sets up translational roadblocks to seeing relationships among data,” Colvis said. “If we could redefine diseases based on the data we bring together in Translator, we might see connections among illnesses not based on what organ systems are involved but rather the biology underlying those conditions. Translator has the potential to get a basic scientist to begin to think about downstream clinical connections in a different way.”

Translator queries could run from the more general to the very specific. The system might be asked, for example, to identify all diseases with a particular symptom that affect a particular cell type. Or a query could request all molecular pathways that, when disrupted, lead to malfunctions of cellular structures in a particular organ in people with specific genomic characteristics.
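To make the first kind of query in this section concrete (all diseases with a particular symptom that affect a particular cell type), here is a minimal sketch. It assumes facts are stored as simple subject-relation-object triples. The article does not say how Translator represents or queries data, so the relation names, disease names and functions below are hypothetical placeholders.

```python
# Hypothetical facts as (subject, relation, object) triples; all values are placeholders.
triples = [
    ("disease_A", "has_symptom", "fatigue"),
    ("disease_A", "affects_cell_type", "T_cell"),
    ("disease_B", "has_symptom", "fatigue"),
    ("disease_B", "affects_cell_type", "neuron"),
    ("disease_C", "has_symptom", "rash"),
    ("disease_C", "affects_cell_type", "T_cell"),
]


def subjects_with(triples, relation, obj):
    """Return every subject that has the given relation to the given object."""
    return {s for s, r, o in triples if r == relation and o == obj}


def diseases_by_symptom_and_cell_type(triples, symptom, cell_type):
    """Diseases that show the symptom AND affect the cell type (set intersection)."""
    with_symptom = subjects_with(triples, "has_symptom", symptom)
    in_cell_type = subjects_with(triples, "affects_cell_type", cell_type)
    return sorted(with_symptom & in_cell_type)


matches = diseases_by_symptom_and_cell_type(triples, "fatigue", "T_cell")
```

Only `disease_A` satisfies both conditions in this toy data. The clinical fact (symptom) and the biological fact (cell type) sit side by side in one structure, so a single query can span both vocabularies.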

Supporting High-Risk, High-Reward Research

NCATS funds Translator through its Cures Acceleration Network (CAN), which includes Other Transaction authority (OTA). With CAN and OTA, NCATS has more funding flexibility to support high-risk, high-reward programs such as Translator.

In one case, a team realized after six months that its demonstration project idea wouldn’t work. But the “failure” turned out to be helpful.

“The idea was very innovative and made us think differently about how we consider data and how to build Translator,” Southall said. “We are assessing feasibility, and it’s okay if some of the things we try fail as long as we learn from the experience, including which directions work and which don’t.”

“We’re still trying to figure out what’s possible,” he continued. “We originally thought about Translator as a ‘big database lookup.’ If you put all the facts and databases in one place, eventually you could keep asking questions and stitch together answers.” Now, less than one year later, Southall and his colleagues are exploring a somewhat different model that would also enable researchers to infer data that are not even available yet.

Posted August 2017