
Phenotypic Drug Discovery Resource

Through a unique collaboration with Eli Lilly and Company’s Open Innovation Drug Discovery (OIDD) program, NCATS researchers can access Lilly’s panel of human disease-relevant assays, which are cell-based testing systems that help scientists explore the effects of small molecules on disease-related molecular processes. The approach, called phenotypic drug discovery, involves searching for agents that change cell function and is a common method for lead discovery in drug development. Access the data.

Background

The collaborative team screened part of the NCATS Pharmaceutical Collection (NPC), which includes 3,800 approved and investigational medicines, against the assay panel offered through Lilly’s OIDD platform. The panel consists of assays relevant to Lilly’s areas of focus, such as cardiovascular diseases, diabetes, cancer and endocrine disorders, and is designed to help researchers uncover new starting points for early drug discovery.

Testing approved drugs in the OIDD assay panel broadens scientific knowledge of the biology of these compounds. Results from this effort may enable researchers to better predict treatment outcomes, design new drug combinations, and potentially reveal new uses for existing drugs, a time- and cost-saving approach called repurposing.

Preliminary results from nearly 2,500 of the NPC compounds are available online. These experiments are described in detail in a paper published July 15, 2015, in PLOS ONE. In addition, Lilly plans to screen the NPC compounds further using new phenotypic assay modules, and those data also will be available here.

Access the Data*

Choose a data format:

  1. Processed CSV data files and individual assay datasets
  2. Raw data
  3. Data in the format of a Semantic Knowledge Network

*Data files last updated January 2016.

Download CSV Data Files and Sets

Access individual assay datasets via PubChem AID 1117321.

Download Raw Data

  • Well_level_data_10022014_with_pubchem_info (TXT - 13MB): This file contains experimental data for each individual plate well, normalized between 0 and 100 percent using negative and positive controls, respectively. For screening assays, it contains one row for each screening result (each unique combination of assay, compound, concentration and run date). For concentration-response assays, it contains 10 rows for each experiment (each unique combination of assay, compound and run date). The 10 concentrations analyzed as a dose-response experiment share a common value for the CALC_RESULT_ID field. Screening results do not have CALC_RESULT_ID defined because those results are not analyzed via dose-response curves.
  • Curve_fit_results_10022014_with_pubchem_info (TXT - 684KB): This file contains the IC50 values derived by fitting the concentration-response curve data from the well-level file above. Users can align results from both files via the CALC_RESULT_ID field.
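
The alignment of the two raw files via CALC_RESULT_ID can be sketched as follows. This is a minimal illustration, not the procedure used to produce the files: only CALC_RESULT_ID is a field name taken from the descriptions above, while the tab delimiter, the other column names (ASSAY, COMPOUND, CONCENTRATION, NORMALIZED_RESULT, IC50) and all values are assumptions made up for the example. Check the header row of the downloaded files before adapting it.

```python
import csv
import io
from collections import defaultdict

# Hypothetical excerpts standing in for the two raw files. Only CALC_RESULT_ID
# is a documented field; the delimiter, other columns and values are invented.
WELL_LEVEL_TXT = (
    "ASSAY\tCOMPOUND\tCONCENTRATION\tNORMALIZED_RESULT\tCALC_RESULT_ID\n"
    "assay_A\tcmpd_1\t0.1\t5.0\t1001\n"
    "assay_A\tcmpd_1\t1.0\t48.0\t1001\n"
    "assay_A\tcmpd_1\t10.0\t97.0\t1001\n"
    "assay_B\tcmpd_2\t10.0\t62.0\t\n"  # screening row: CALC_RESULT_ID is empty
)
CURVE_FIT_TXT = (
    "CALC_RESULT_ID\tIC50\n"
    "1001\t1.1\n"
)


def read_rows(text):
    """Parse tab-delimited text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def align_curves(well_rows, fit_rows):
    """Attach the well-level points of each experiment to its curve-fit row."""
    points = defaultdict(list)
    for row in well_rows:
        calc_id = row["CALC_RESULT_ID"].strip()
        if not calc_id:
            continue  # screening results are not part of any fitted curve
        points[calc_id].append(
            (float(row["CONCENTRATION"]), float(row["NORMALIZED_RESULT"]))
        )
    return {
        fit["CALC_RESULT_ID"]: {
            "fit": fit,
            "points": sorted(points.get(fit["CALC_RESULT_ID"], [])),
        }
        for fit in fit_rows
    }


aligned = align_curves(read_rows(WELL_LEVEL_TXT), read_rows(CURVE_FIT_TXT))
for calc_id, entry in aligned.items():
    print(calc_id, entry["fit"]["IC50"], len(entry["points"]), "points")
```

In the real files, each concentration-response experiment would be expected to contribute 10 points per CALC_RESULT_ID, so a count other than 10 is a useful sanity check when adapting this sketch.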

Download OIDD Semantic Knowledge Network Data

The OIDD Semantic Knowledge Network encodes the experimental relationships between NPC compounds and OIDD assays in a standardized format that can be exported into other databases and information sources. It also encodes relationships among other experimental data, including ChEMBL dataset mappings and the classification of assays using the BioAssay Ontology (BAO).

Users can download the Semantic Knowledge Network data, represented in the Resource Description Framework (RDF) and saved in the Turtle (.ttl) format. The beta release of these files, which includes everything in the first table below, provides the default graph of a knowledge network and one Web Ontology Language (OWL) file comprising one BAO module. In addition to the OIDD phenotypic assay data, protein targets and bioactivity associations from the ChEMBL database are included to facilitate queries that integrate phenotypic and biochemical data.

| File Name | Type & Size | Description |
|---|---|---|
| bao_vocabulary_assay | OWL - 96KB | BAO module with bioassay class hierarchy |
| chembl_activity | TTL - 9MB | ChEMBL activity associations on protein targets |
| chembl_cco | TTL - 60KB | ChEMBL core ontology |
| chembl_target | TTL - 288KB | ChEMBL protein targets |
| npcpd2_assay | TTL - 4KB | Assay labels and links |
| npcpd2_bao | TTL - 7KB | Manually curated BAO classifications |
| npcpd2_substance | TTL - 174KB | Substance links |
| pubchem_assay | TTL - 9KB | PubChem RDF, includes titles, measure group associations |
| pubchem_endpoint | TTL - 160KB | PubChem RDF, includes endpoints, activity results |
| pubchem_substance | TTL - 8MB | PubChem RDF, includes compound identifiers (PubChem CIDs), measure groups, assay identifiers (PubChem AIDs) |
| pubchem_vocabulary | OWL - 12KB | PubChem module with bioactivity terms |

Field Descriptions for CSV Data Files

Compound Meta Data

| Field Name | Description |
|---|---|
| OIDD_ID | Unique substance identifier assigned by Eli Lilly OIDD on receipt of substance; identical to NCGC_ID |
| NCGC_ID | Unique substance identifier assigned by NCATS; identical to identifier used in the NPC |
| PUBCHEM_SID | PubChem substance identifier |
| TRIVIAL NAME | Trivial name of drug |
| SMILES | SMILES chemical structure representation, taken from PubChem |
| PUBCHEM_CID | PubChem parent compound identifier |

Assay Meta Data

| Field Name | Description |
|---|---|
| OIDD_ASSAY | Unique assay identifier assigned by Eli Lilly OIDD |
| NAME | Name of the assay |
| METHOD | Broad category of assay type |
| TECHNOLOGY | Technology used to implement the assay |
| PROJECT | Primary therapeutic application for assay |
| SUBPROJECT | Therapeutic application subcategory |
| CELL_LINE | Short identifier of cell line used |
| CELL_LINE_DESCRIPTION | Description of cell line used |
| ROLE | Screen, Primary, Secondary, Confirmatory or Profiling |
| RESULT_TYPE | Type of screen: IC50, EC50, %STIM, %INH |
| DESIRED_RESULT | Whether a positive outcome for the assay is ACTIVE or INACTIVE |
| SCREENING_THRESHOLD | Cut-off for determining ACTIVE/INACTIVE (not applicable to profiling assays) |

Assay Result Meta Data

| Field Name | Description |
|---|---|
| OIDD_ASSAY | Unique assay identifier assigned by Eli Lilly OIDD |
| NCGC_ID | Unique substance identifier assigned by NCATS; identical to identifier used in the NPC |
| CONCENTRATION | Drug concentration in uM (%stimulation and %inhibition assays only) |
| RESULT_TYPE | Type of screen: IC50, EC50, %STIM, %INH |
| RESULT_PREFIX | Prefix for RESULT indicating =, < or > specified value |
| RESULT | Assay result value (IC50, EC50, %stimulation, %inhibition) |
| RESULT_UOM | % or uM |
| OUTCOME | Assay outcome: ACTIVE or INACTIVE |
| RUN_DATE | Date the assay was performed |
| TEST_TYPE | Single-Point (SP) or Concentration-Response Curve (CRC) |
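
As a rough illustration of how the three field groups relate, the sketch below joins assay results to compound and assay metadata through the shared NCGC_ID and OIDD_ASSAY keys and reads RESULT together with its RESULT_PREFIX. The column names come from the field descriptions above; the split into three comma-delimited inputs, the identifiers and every value are assumptions invented for the example.

```python
import csv
import io

# Hypothetical rows. Column names follow the field descriptions; the comma
# delimiter, identifiers and values are made up for illustration only.
COMPOUNDS_CSV = """NCGC_ID,TRIVIAL NAME,PUBCHEM_CID
NCGC-EXAMPLE-1,example drug A,111
NCGC-EXAMPLE-2,example drug B,222
"""
ASSAYS_CSV = """OIDD_ASSAY,NAME,RESULT_TYPE
ASSAY-X,example assay,IC50
"""
RESULTS_CSV = """OIDD_ASSAY,NCGC_ID,CONCENTRATION,RESULT_TYPE,RESULT_PREFIX,RESULT,RESULT_UOM,OUTCOME,TEST_TYPE
ASSAY-X,NCGC-EXAMPLE-1,,IC50,=,0.25,uM,ACTIVE,CRC
ASSAY-X,NCGC-EXAMPLE-2,,IC50,>,20,uM,INACTIVE,CRC
"""


def index_by(text, key):
    """Read CSV text and index its rows by the given key column."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(text))}


def describe_results(results_text, compounds, assays):
    """Build one readable line per result row, joined to its metadata."""
    lines = []
    for row in csv.DictReader(io.StringIO(results_text)):
        compound = compounds[row["NCGC_ID"]]   # join on substance identifier
        assay = assays[row["OIDD_ASSAY"]]      # join on assay identifier
        # RESULT_PREFIX qualifies RESULT: "=" exact, "<" or ">" a bound.
        value = f'{row["RESULT_TYPE"]} {row["RESULT_PREFIX"]} {row["RESULT"]} {row["RESULT_UOM"]}'
        lines.append(
            f'{compound["TRIVIAL NAME"]} | {assay["NAME"]} | {value} | {row["OUTCOME"]}'
        )
    return lines


compounds = index_by(COMPOUNDS_CSV, "NCGC_ID")
assays = index_by(ASSAYS_CSV, "OIDD_ASSAY")
report = describe_results(RESULTS_CSV, compounds, assays)
for line in report:
    print(line)
```

Keeping RESULT_PREFIX alongside RESULT matters when filtering or averaging: a value reported with > or < is a bound rather than a measured potency, so treating it as an exact number would be misleading.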

Last updated: 02-05-2016