Through a unique collaboration with Eli Lilly and Company’s Open Innovation Drug Discovery (OIDD) program, NCATS researchers accessed Lilly’s panel of human disease-relevant assays, which are cell-based testing systems that help scientists explore the effects of small molecules on disease-related molecular processes. The approach, called phenotypic drug discovery, involves searching for agents that change cell function and is a common method for lead discovery in drug development.
Background
The collaborative team screened part of the NCATS Pharmaceutical Collection (NPC), which includes 3,800 approved and investigational medicines, against the assay panel offered through Lilly’s OIDD platform. The panel consisted of assays relevant to Lilly’s areas of focus, such as cardiovascular diseases, diabetes, cancer and endocrine disorders, and was designed to help researchers uncover new starting points for early drug discovery.
Testing approved drugs in the OIDD assay panel broadens scientific knowledge of the biology of these compounds. Results from this effort may enable researchers to better predict treatment outcomes, design new drug combinations, and potentially reveal new uses for existing drugs, a time- and cost-saving approach called repurposing. Results from nearly 2,500 of the NPC compounds are available online. These experiments are described in detail in a paper published July 15, 2015, in PLOS One.
Access the Data*
Choose a data format:
- Processed CSV data files and individual assay datasets
- Raw data
- Data in the format of a Semantic Knowledge Network
*Data files last updated January 2016.
Download CSV Data Files and Sets
- Compound meta data (CSV - 294KB) (see field descriptions)
- Assay experiment results (CSV - 4MB) (see field descriptions)
Access individual assay datasets via PubChem AID 1117321.
Download Raw Data
- Well_level_data_10022014_with_pubchem_info (TXT - 13MB): This file contains experimental data for each individual plate well, normalized between 0 and 100 percent using negative and positive controls, respectively. For screening assays, it contains one row for each screening result (each unique combination of assay, compound, concentration and run date). For concentration-response assays, it contains 10 rows for each experiment (each unique combination of assay, compound and run date). The 10 concentrations analyzed as a dose-response experiment share a common value for the CALC_RESULT_ID field. Screening results do not have CALC_RESULT_ID defined because those results are not analyzed via dose-response curves.
- Curve_fit_results_10022014_with_pubchem_info (TXT - 684KB): This file contains the IC50 values derived by fitting the concentration-response curve data from the file above. Users can align results from both files via the CALC_RESULT_ID field.
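The alignment of the two raw files described above can be sketched as follows. This is a minimal illustration, not the method used by the authors. It assumes both files are tab-delimited with a header row. Only `CALC_RESULT_ID` is taken from the file descriptions; the delimiter, the `CONCENTRATION`, `NORMALIZED_RESULT` and `IC50` column names, and all sample values are assumptions made for the example.

```python
import csv
import io
from collections import defaultdict

# Hypothetical stand-ins for the two raw files. Only CALC_RESULT_ID is a
# documented field; the other column names and every value are made up.
# A real dose-response experiment has 10 rows; 3 are shown for brevity.
WELL_LEVEL_TXT = (
    "CALC_RESULT_ID\tCONCENTRATION\tNORMALIZED_RESULT\n"
    "101\t0.1\t5.0\n"
    "101\t1.0\t48.0\n"
    "101\t10.0\t93.0\n"
    "\t10.0\t37.0\n"  # screening row: CALC_RESULT_ID is not defined
)
CURVE_FIT_TXT = (
    "CALC_RESULT_ID\tIC50\n"
    "101\t1.1\n"
)


def read_rows(text, delimiter="\t"):
    """Parse delimited text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def group_wells_by_curve(well_rows):
    """Group well-level rows by CALC_RESULT_ID, skipping screening rows."""
    groups = defaultdict(list)
    for row in well_rows:
        calc_id = (row.get("CALC_RESULT_ID") or "").strip()
        if calc_id:  # screening results leave this field empty
            groups[calc_id].append(row)
    return groups


def align(well_rows, fit_rows):
    """Pair each curve-fit row with the well-level rows that share its ID."""
    groups = group_wells_by_curve(well_rows)
    return [
        (fit, groups.get(fit["CALC_RESULT_ID"].strip(), []))
        for fit in fit_rows
    ]


aligned = align(read_rows(WELL_LEVEL_TXT), read_rows(CURVE_FIT_TXT))
for fit, wells in aligned:
    print(fit["CALC_RESULT_ID"], fit["IC50"], len(wells))
```

To use this with the downloaded files, replace the in-memory strings with the file contents and adjust the delimiter and column names to match the actual headers.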
Download OIDD Semantic Knowledge Network Data
The OIDD Semantic Knowledge Network encodes the experimental relationships between NPC compounds and OIDD assays in a standardized format that can be exported into other databases and information sources. It also encodes relationships among other experimental data, including ChEMBL dataset mapping and classification of assays using the BioAssay Ontology (BAO).
Users can download Semantic Knowledge Network data represented in the Resource Description Framework (RDF) and serialized in the Turtle (.ttl) format. The beta release of these files, which includes everything in the first table below, provides the default graph of a knowledge network and one Web Ontology Language (OWL) file comprising one BAO module. In addition to the OIDD phenotypic assay data, protein targets and bioactivity associations from the ChEMBL database are included to facilitate queries that integrate phenotypic and biochemical data.
- Semantic Knowledge Network data (ZIP - 2MB)
File Name | Type & Size | Description |
---|---|---|
bao_vocabulary_assay | OWL - 96KB | BAO module with bioassay class hierarchy |
chembl_activity | TTL - 9MB | ChEMBL activity associations on protein targets |
chembl_cco | TTL - 60KB | ChEMBL core ontology |
chembl_target | TTL - 288KB | ChEMBL protein targets |
npcpd2_assay | TTL - 4KB | Assay labels and links |
npcpd2_bao | TTL - 7KB | Manually curated BAO classifications |
npcpd2_substance | TTL - 174KB | Substance links |
pubchem_assay | TTL - 9KB | PubChem RDF, includes titles, measure group associations |
pubchem_endpoint | TTL - 160KB | PubChem RDF, includes endpoints, activity results |
pubchem_substance | TTL - 8MB | PubChem RDF, includes compound identifiers (PubChem CIDs), measure groups, assay identifiers (PubChem AIDs) |
pubchem_vocabulary | OWL - 12KB | PubChem module with bioactivity terms |
Field Descriptions for CSV Data Files
Compound Meta Data
Field Name | Description |
---|---|
OIDD_ID | Unique substance identifier assigned by Eli Lilly OIDD on receipt of substance; identical to NCGC_ID |
NCGC_ID | Unique substance identifier assigned by NCATS; identical to identifier used in the NPC |
PUBCHEM_SID | PubChem substance identifier |
TRIVIAL NAME | Trivial name of drug |
SMILES | SMILES chemical structure representation, taken from PubChem |
PUBCHEM_CID | PubChem parent compound identifier |
Assay Meta Data
Field Name | Description |
---|---|
OIDD_ASSAY | Unique assay identifier assigned by Eli Lilly OIDD |
NAME | Name of the assay |
METHOD | Broad category of assay type |
TECHNOLOGY | Technology used to implement the assay |
PROJECT | Primary therapeutic application for assay |
SUBPROJECT | Therapeutic application subcategory |
CELL_LINE | Short identifier of cell line used |
CELL_LINE_DESCRIPTION | Description of cell line used |
ROLE | Screen, Primary, Secondary, Confirmatory or Profiling |
RESULT_TYPE | Type of screen: IC50, EC50, %STIM, %INH |
DESIRED_RESULT | Whether a positive outcome for the assay is ACTIVE or INACTIVE |
SCREENING_THRESHOLD | Cut-off for determining ACTIVE/INACTIVE (not applicable to profiling assays) |
Assay Result Meta Data
Field Name | Description |
---|---|
OIDD_ASSAY | Unique assay identifier assigned by Eli Lilly OIDD |
NCGC_ID | Unique substance identifier assigned by NCATS; identical to identifier used in the NPC |
CONCENTRATION | Drug concentration in uM (%stimulation and %inhibition assays only) |
RESULT_TYPE | Type of screen: IC50, EC50, %STIM, %INH |
RESULT_PREFIX | Prefix for RESULT indicating =, < or > specified value |
RESULT | Assay result value (IC50, EC50, %stimulation, %inhibition) |
RESULT_UOM | % or uM |
OUTCOME | Assay outcome: ACTIVE or INACTIVE |
RUN_DATE | Date the assay was performed |
TEST_TYPE | Single-Point (SP) or Concentration-Response Curve (CRC) |
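
As an illustration of how the assay result fields fit together, the sketch below reads rows shaped like the assay experiment results file and keeps ACTIVE results reported in uM whose value is at or below a cutoff. The field names come from the table above. The sample rows, the identifier values, the date format and the 10 uM cutoff are assumptions made for the example, not real OIDD results. How to treat censored values (a RESULT_PREFIX of `>` or `<`) is a judgment call; here a `>` result is never counted as meeting the cutoff, because the true value lies above the reported number.

```python
import csv
import io

# Made-up rows using the documented field names. Identifiers, dates and
# values are placeholders for illustration only.
SAMPLE_CSV = """OIDD_ASSAY,NCGC_ID,CONCENTRATION,RESULT_TYPE,RESULT_PREFIX,RESULT,RESULT_UOM,OUTCOME,RUN_DATE,TEST_TYPE
ASSAY_A,CPD_1,,IC50,=,0.5,uM,ACTIVE,2014-01-01,CRC
ASSAY_A,CPD_2,,IC50,>,20,uM,INACTIVE,2014-01-01,CRC
ASSAY_A,CPD_3,,IC50,<,0.01,uM,ACTIVE,2014-01-01,CRC
ASSAY_B,CPD_1,10,%INH,=,85,%,ACTIVE,2014-01-02,SP
"""


def potent_actives(rows, max_um=10.0):
    """Return (assay, compound, prefixed result) for potent ACTIVE rows."""
    hits = []
    for row in rows:
        if row["OUTCOME"] != "ACTIVE":
            continue
        if row["RESULT_UOM"] != "uM":
            continue  # percent results are not potencies
        prefix = row["RESULT_PREFIX"].strip()
        if prefix == ">":
            continue  # true value is above the reported number
        if float(row["RESULT"]) <= max_um:
            hits.append(
                (row["OIDD_ASSAY"], row["NCGC_ID"], prefix + row["RESULT"])
            )
    return hits


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
hits = potent_actives(rows)
for assay, compound, result in hits:
    print(assay, compound, result, "uM")
```

The returned NCGC_ID values can then be looked up in the compound meta data file, which uses the same identifier.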