Announcement: Access to the COVID-19 Data Analytics Platform Is Open

Latest News

National COVID Cohort Collaborative (N3C) Data Enclave

  • More than 4.5 million COVID-19-positive patients and more than 14 billion rows of data are included in this enclave. Apply for N3C access today.
  • The public-facing N3C Cohort Exploration Dashboard provides high-level information about the N3C cohort and N3C Data Enclave.
  • More than 300 projects are underway using the enclave data to examine associations between COVID-19 patient outcomes and social determinants of health. View the current list of publications.
  • Study shows a link between immune dysfunction and an increased risk for COVID-19 breakthrough infection.

September 2, 2020

A Resource Unlike Any Other

The NCATS N3C maintains one of the largest collections of clinical data related to COVID-19 symptoms and patient outcomes in the United States. Having access to a centralized enclave of this magnitude allows research teams to study, probe and answer clinically important questions about COVID-19 that they could not have answered previously.

With stewardship from NCATS, more than 70 institutions, including more than 40 Clinical and Translational Science Awards (CTSA) Program hubs, worked together to build this extensive database to help researchers study COVID-19 and identify potential treatments as the pandemic evolves.

  • Harmonized data
    • The N3C platform translates the different ways that contributing hospitals store patient data into a single, common format to enable combined “apples-to-apples” analyses.
  • Robust in scale and scope
    • Currently, 89 sites across the country have agreed to transfer diverse data from individuals tested for COVID-19, including demographics, symptoms, laboratory test results, procedures, medications, medical conditions, physical measurements and more.
    • By marshalling the national reach of the CTSA Program network, N3C is ensuring that the data represent the diversity of the country so researchers can understand and address geographic and population disparities during the pandemic.
  • Powerful analytics capabilities
    • The platform is built to enable machine-learning approaches and rigorous statistical analyses to identify connections and patterns more quickly than is possible through traditional methodologies. These advanced analytics approaches can lead to the simultaneous exploration of multiple questions — and to the revelation of likely answers — on a powerful scale.
  • Centralized and secure
    • The data reside and remain in NCATS’ secure, cloud-based database, which is certified through the Federal Risk and Authorization Management Program (FedRAMP). FedRAMP provides standardized assessment, authorization and continuous monitoring of cloud products and services. This certification ensures the validity of the data while protecting patient privacy.
    • Three levels of protected data are included for analysis:
      • Limited Data Set (LDS): Consists of patient data that retain the following protected health information —
        • dates of service
        • patient ZIP code
      • De-identified Data Set: Consists of patient data from the LDS with the following changes —
        • Dates of service are algorithmically shifted to protect patient privacy.
        • Patient ZIP codes are truncated to the first three digits or removed entirely if the ZIP code represents fewer than 20,000 individuals.
      • Synthetic Data Set: Consists of data computationally derived from the LDS that statistically resemble patient information but are not actual patient data.
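
The two changes that turn an LDS record into a de-identified record can be illustrated with a short Python sketch. This is not N3C's actual implementation: the function names, the record layout, the way a date offset is supplied, and the assumption that population is looked up by three-digit ZIP prefix are all hypothetical. Only the two rules themselves (shifted dates of service, ZIP codes truncated to three digits or removed below 20,000 individuals) come from the description above.

```python
from datetime import date, timedelta
from typing import Dict, Optional

# Threshold taken from the description above; everything else is illustrative.
POPULATION_THRESHOLD = 20_000


def truncate_zip(zip_code: str, zip3_population: Dict[str, int]) -> Optional[str]:
    """Keep only the first three digits, or return None for sparsely populated areas.

    Assumption: population counts are keyed by three-digit ZIP prefix.
    Unknown prefixes are treated as population 0 and therefore removed.
    """
    zip3 = zip_code[:3]
    if zip3_population.get(zip3, 0) < POPULATION_THRESHOLD:
        return None
    return zip3


def shift_date(service_date: date, offset_days: int) -> date:
    """Shift a date of service by a fixed number of days.

    Assumption: the offset is chosen elsewhere (for example, once per patient)
    and passed in; how N3C derives its shift is not described in the text.
    """
    return service_date + timedelta(days=offset_days)


def deidentify(record: dict, offset_days: int, zip3_population: Dict[str, int]) -> dict:
    """Apply both rules to a hypothetical record with 'service_date' and 'zip' keys."""
    return {
        "service_date": shift_date(record["service_date"], offset_days),
        "zip": truncate_zip(record["zip"], zip3_population),
    }


# Made-up population figures for two three-digit ZIP prefixes.
example_population = {"559": 250_000, "036": 12_000}
example_record = {"service_date": date(2020, 9, 2), "zip": "55905"}
deidentified = deidentify(example_record, offset_days=-30, zip3_population=example_population)
```

Applying one offset to every date belonging to the same patient would keep the intervals between visits intact, which is one reason a date shift is often preferred over removing dates. Whether N3C takes that approach is not stated here.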

Learn more about N3C data including data stewardship and protections and the requirements for accessing different levels of data.

Hongfang Liu, Ph.D., program director for informatics at Mayo Clinic, is experienced in using clinical data for translational science research and to improve health care delivery. Dr. Liu explains how collaborating with experts in other disciplines to build the N3C Data Enclave will advance the science behind COVID-19 to deliver health care interventions and treatments.

View all I Am Translational Science videos.

A Powerful Tool for Researchers

There are more than 250 projects underway to explore a range of questions. Access the complete listing of projects that have been submitted through the Data Use Request (DUR) process and were approved by the N3C Data Access Committee.

Researchers interested in accessing the data will need to register with N3C and submit a DUR for review by the N3C Data Access Committee.

New Challenge

The Biomedical Advanced Research and Development Authority, in partnership with NCATS, the Eunice Kennedy Shriver National Institute of Child Health and Human Development, and the Health Resources and Services Administration’s Maternal and Child Health Bureau, is sponsoring a challenge competition that will use de-identified electronic health record data to develop, train and validate computational models that can predict severe COVID-19 complications in children. These models will equip health care providers with the information and tools they need to identify pediatric patients at risk. Health record data will be provided through the N3C Data Enclave. Learn more about the Challenge.

The N3C is a partnership among the NCATS-supported CTSA Program hubs, the National Center for Data to Health (CD2H), and NIGMS-supported Institutional Development Award Networks for Clinical and Translational Research (IDeA-CTR), with overall stewardship by NCATS.

More information:

If you have questions about the N3C, please email NCATS_N3C@nih.gov.

Last updated on March 6, 2024