Our Impact on Big Data

Our teams find ways to transform massive amounts of data into health solutions faster than ever.

Turning Data Into Solutions

We have more and more opportunities to collect, produce and access research data. But drawing meaning from all that information is still difficult. Data sets are often vast, complex and not linked together in useful ways.

To speed the translation of research observations into health solutions, we have invested heavily in new approaches to learn from large sets of complex data. Our efforts have shown that connecting data from people’s electronic health records can deliver public health insights in weeks, rather than months or years. We have further demonstrated the power of synthesizing data from a variety of sources to spur therapeutic leads through the drug development pipeline. And we have created tools that others use to make these connections and share the results faster.

Big data offers big opportunities across the entire spectrum of translational science. Our newer initiatives apply data science tools like machine learning to mine unexplored chemical space for new therapeutic molecules and use clinical data to speed rare disease diagnoses.


Impact Stories

Scientists Identify Characteristics to Better Define Long COVID

Using machine learning, researchers found patterns in electronic health record data to better identify those likely to have the condition.

Registry Reveals COVID-19 Risks in Sickle Cell

NCATS-funded CTSA Program researchers developed a collaborative registry that collects data on COVID-19 illness in people with sickle cell disease.


Big Data Activities

We invest in data science approaches and tools that connect, explore, analyze and share large and disconnected datasets to produce meaningful insights that speed new health solutions.

National COVID Cohort Collaborative (N3C)

We created and maintain this national resource of clinical and other data to answer critical research questions about COVID-19 health outcomes.

Biomedical Data Translator

This program funds projects that integrate existing medical and biological data from different sources to quickly and easily reveal valuable connections and insights about diseases, including potential treatments.

OpenData Portal

Our OpenData Portal is an open and accessible platform for sharing COVID-19-related drug repurposing data and experiments involving approved drugs.


Big Data News

Artificial Intelligence Is Advancing Clinical Research and Data Quality

April 25, 2023 - NCATS News

NCATS awardees use small business funding and CTSA Program grants to commercialize customizable natural language processing software.

COVID-19 Hospitalization May Increase Heart Failure Risk

July 15, 2022 - NCATS News

A large, retrospective study using National COVID Cohort Collaborative (N3C) data revealed that people who had been hospitalized with COVID-19 were more likely to develop heart failure during their

NCATS BioPlanet: A Resource for Discovery

March 23, 2022 - NCATS News

Researchers can better study how compounds and drugs affect cells with an NCATS resource that combines all pathways in human cells into one database.