About Informatics Research and Scientific Applications in the Division of Preclinical Innovation Informatics Core

Informatics plays a key role in organizing, processing, analyzing, and interpreting the large quantity and variety of data generated in translational research. The Informatics (IFX) Core, led by Ewy Mathé, Ph.D., within the Division of Preclinical Innovation (DPI) covers a wide array of expertise, including bioinformatics, cheminformatics, clinical informatics, data science, software engineering, UI/UX research and design, and project management. IFX Core team members collaborate extensively with informaticians embedded in other branches and cores of DPI and with other colleagues across DPI to produce methodologies, resources and software for the translational research community.


IFX Core Mission

The mission of the NCATS IFX Core is to produce data-driven decisions and accelerate translation by:

  • Using novel or existing methods to identify the biological and chemical mechanisms that underlie diseases (including rare diseases) and their development, drug mechanisms of action, and treatment response
  • Improving use and interpretation of metabolomics and other omics datasets by developing new methods or enhancing the application of existing methods
  • Producing open, comprehensive resources to accelerate translational research efforts, spanning ingredient/drug regulatory information, target annotations, disease annotations and molecular/omic phenotypes
  • Producing tools for the analysis and interpretation of complex high-throughput datasets
  • Developing, optimizing and testing models to prioritize targets and therapeutic opportunities, and identifying repurposed drugs through collaborations with NCATS’ DPI branches
  • Enhancing transparency and open research by adhering to user-centric and FAIR (Findable, Accessible, Interoperable, Reusable) best practices
  • Assessing productivity of the translational research process
  • Expanding the use and understanding of informatics in translational research through workshops, training and mentoring

Who We Are

Our work is highly collaborative, and we take a team science approach in which everyone contributes meaningfully to translational research projects. We are also excited to host a number of trainees, from Ph.D. students to postdoctoral fellows, who bring fresh ideas to our work. Learn more about the informatics scientists in DPI.

What We Do

Efforts within the IFX Core fall into three main components:

  • Building Standards, Knowledge Sources and Software: Integration, curation and public rendering to support analysis of various types of experimental and curated datasets.
  • Translational Data Analytics: Development of custom workflows and new methodologies to help interpret complex, large-scale datasets, including multi-omic and clinical data.
  • Scientific Computing Services and Research: Development, maintenance and deployment of cheminformatics and bioinformatics workflows/pipelines and of web and mobile apps that disseminate our methods and data; collaborative bioinformatics and cheminformatics work with non-informaticians (within and beyond NCATS).

These three components are coordinated through our governance structure, and work in one component informs work in the others. See below for our publications.


Whom We Work With

Within NCATS

Other Institutes and Centers at NIH

Other Partners and Collaborators

Select Publications

Featured Publications



Gonzalez E, Jain S, Shah P, et al. Development of robust QSAR models for CYP2C9, CYP2D6, and CYP3A4 catalysis and inhibition. Drug Metab Dispos. 2021 Sep;49(9):822-832. doi: 10.1124/dmd.120.000320. PMID: 34183376.

Pace BS, Perrine S, Li B, et al. Benserazide racemate and enantiomers induce fetal globin gene expression in vivo: studies to guide clinical development for beta thalassemia and sickle cell disease. Blood Cells Mol Dis. 2021 Jul;89:102561. doi: 10.1016/j.bcmd.2021.102561. PMID: 33744514.

Christov PP, Kim K, Jana S, et al. Optimization of ether and aniline based inhibitors of lactate dehydrogenase. Bioorg Med Chem Lett. 2021 Jun 1;41:127974. doi: 10.1016/j.bmcl.2021.127974. PMID: 33771585.

Gorshkov K, Chen CZ, Bostwick R, et al. The SARS-CoV-2 cytopathic effect is blocked by lysosome alkalizing small molecules. ACS Infect Dis. 2021 Jun 11;7(6):1389-1408. doi: 10.1021/acsinfecdis.0c00349. PMID: 33346633.

Martinez NJ, Braisted JC, Dranchak PK, et al. Genome-edited coincidence and PMP22-HiBiT fusion reporter cell lines enable an artifact-suppressive quantitative high-throughput screening strategy for PMP22 gene-dosage disorder drug discovery. ACS Pharmacol Transl Sci. 2021 Jun 10;4(4):1422-1436. doi: 10.1021/acsptsci.1c00110. PMID: 34423274.

Murai Y, Jo U, Murai J, et al. SLFN11 inactivation induces proteotoxic stress and sensitizes cancer cells to ubiquitin activating enzyme inhibitor TAK-243. Cancer Res. 2021 Jun 1;81(11):3067-3078. doi: 10.1158/0008-5472.CAN-20-2694. PMID: 33863777.

Ryu S, Chu PH, Malley C, et al. Human pluripotent stem cells for high-throughput drug screening and characterization of small molecules. Methods Mol Biol. 2021 Jun 15. Epub ahead of print. PMID: 34128205.

Siramshetty V, Williams J, Nguyễn ÐT, et al. Validating ADME QSAR models using marketed drugs. SLAS Discov. 2021 Jun 26:24725552211017520. doi: 10.1177/24725552211017520. PMID: 34176369.

Son J, Huang S, Zeng Q, et al. JIB-04 has broad-spectrum antiviral activity and inhibits SARS-CoV-2 replication and coronavirus pathogenesis. bioRxiv [Preprint]. 2021 Jun 4:2020.09.24.312165. doi: 10.1101/2020.09.24.312165. PMID: 32995798.

Tan MSY, Koussis K, Withers-Martinez C, et al. Autocatalytic activation of a malarial egress protease is druggable and requires a protein cofactor. EMBO J. 2021 Jun 1;40(11):e107226. doi: 10.15252/embj.2020107226. PMID: 33932049.

Wiedmann M, Dranchak PK, Aitha M, et al. Structure-activity relationship of ipglycermide binding to phosphoglycerate mutases. J Biol Chem. 2021 Jan-Jun;296:100628. doi: 10.1016/j.jbc.2021.100628. PMID: 33812994.

Chen Y, Tristan CA, Chen L, et al. A versatile polypharmacology platform promotes cytoprotection and viability of human pluripotent and differentiated cells. Nat Methods. 2021 May;18(5):528-541. doi: 10.1038/s41592-021-01126-2. PMID: 33941937.

John JN, Sid E, Zhu Q. Recurrent neural networks to automatically identify rare disease epidemiologic studies from PubMed. AMIA Annu Symp Proc. 2021 May 17;2021:325-334. PMID: 34457147.

Shamim K, Xu M, Hu X, et al. Application of niclosamide and analogs as small molecule inhibitors of Zika virus and SARS-CoV-2 infection. Bioorg Med Chem Lett. 2021 May 15;40:127906. doi: 10.1016/j.bmcl.2021.127906. PMID: 33689873.

Henderson MJ, Trychta KA, Yang SM, et al. A target-agnostic screen identifies approved drugs to stabilize the endoplasmic reticulum-resident proteome. Cell Rep. 2021 Apr 27;35(4):109040. doi: 10.1016/j.celrep.2021.109040. PMID: 33910017.

Li J, Wu R, Yung MMH, et al. SENP1-mediated deSUMOylation of JAK2 regulates its kinase activity and platinum drug resistance. Cell Death Dis. 2021 Apr 1;12(4):341. doi: 10.1038/s41419-021-03635-6. PMID: 33795649.

Li S, Zhao J, Huang R, et al. Profiling the Tox21 chemical collection for acetylcholinesterase inhibition. Environ Health Perspect. 2021 Apr;129(4):47008. doi: 10.1289/EHP6993. PMID: 33844597.

Mansouri K, Karmaus AL, Fitzpatrick J, et al. CATMoS: collaborative acute toxicity modeling suite. Environ Health Perspect. 2021 Apr;129(4):47013. doi: 10.1289/EHP8495. PMID: 33929906.

Rohde JM, Karavadhi S, Pragani R, et al. Discovery and optimization of 2H-1λ2-Pyridin-2-one inhibitors of mutant isocitrate dehydrogenase 1 for the treatment of cancer. J Med Chem. 2021 Apr 22;64(8):4913-4946. doi: 10.1021/acs.jmedchem.1c00019. PMID: 33822623.

Temprosa M, Moore SC, Zanetti KA, et al. COMETS Analytics: an online tool for analyzing and meta-analyzing metabolomics data in large research consortia. Am J Epidemiol. 2021 Apr 22:kwab120. doi: 10.1093/aje/kwab120. PMID: 33889934.

Thomas A, Takahashi N, Rajapakse VN, et al. Therapeutic targeting of ATR yields durable regressions in small cell lung cancers with high replication stress. Cancer Cell. 2021 Apr 12;39(4):566-579.e7. doi: 10.1016/j.ccell.2021.02.014. PMID: 33848478.

Cheng YS, Roma JS, Shen M, et al. Identification of antifungal compounds against multidrug-resistant Candida auris utilizing a high-throughput drug-repurposing screen. Antimicrob Agents Chemother. 2021 Mar 18;65(4):e01305-20. doi: 10.1128/AAC.01305-20. PMID: 33468482.

Fecho K, Balhoff J, Bizon C, et al. Application of MCAT questions as a testing tool and evaluation metric for knowledge graph-based reasoning systems. Clin Transl Sci. 2021 Mar 20. doi: 10.1111/cts.13021. PMID: 33742785.

Hu X, Shrimp JH, Guo H, et al. Discovery of TMPRSS2 inhibitors from virtual screening. bioRxiv [Preprint]. 2021 Mar 17:2020.12.28.424413. doi: 10.1101/2020.12.28.424413. PMID: 33398276.

Khaled HG, Feng H, Hu X, et al. A high-throughput screening to identify small molecules that suppress huntingtin promoter activity or activate huntingtin-antisense promoter activity. Sci Rep. 2021 Mar 17;11(1):6157. doi: 10.1038/s41598-021-85279-2. PMID: 33731741.

Li S, Zhang L, Huang R, et al. Evaluation of chemical compounds that inhibit neurite outgrowth using GFP-labeled iPSC-derived human neurons. Neurotoxicology. 2021 Mar;83:137-145. doi: 10.1016/j.neuro.2021.01.003. PMID: 33508353.

Liao G, Ye W, Heitmann T, et al. Identification of small-molecule inhibitors of human inositol hexakisphosphate kinases by high-throughput screening. ACS Pharmacol Transl Sci. 2021 Mar 3;4(2):780-789. doi: 10.1021/acsptsci.0c00218. PMID: 33860201.

Bobrowski T, Chen L, Eastman RT, et al. Synergistic and antagonistic drug combinations against SARS-CoV-2. Mol Ther. 2021 Feb 3;29(2):873-885. doi: 10.1016/j.ymthe.2020.12.016. PMID: 33333292.

Jain S, Siramshetty VB, Alves VM, et al. Large-scale modeling of multispecies acute toxicity end points using consensus of multitask deep learning methods. J Chem Inf Model. 2021 Feb 22;61(2):653-663. doi: 10.1021/acs.jcim.0c01164. PMID: 33533614.

Jo U, Murai Y, Chakka S, et al. SLFN11 promotes CDT1 degradation by CUL4 in response to replicative DNA damage, while its absence leads to synthetic lethality with ATR/CHK1 inhibitors. Proc Natl Acad Sci U S A. 2021 Feb 9;118(6):e2015654118. doi: 10.1073/pnas.2015654118. PMID: 33536335.

Le Manach C, Dam J, Woodland JG, et al. Identification and profiling of a novel diazaspiro[3.4]octane chemical series active against multiple stages of the human malaria parasite Plasmodium falciparum and optimization efforts. J Med Chem. 2021 Feb 25;64(4):2291-2309. doi: 10.1021/acs.jmedchem.1c00034. PMID: 33573376.

Lynch C, Sakamuru S, Huang R, Niebler J, Ferguson SS, Xia M. Characterization of human pregnane X receptor activators identified from a screening of the Tox21 compound library. Biochem Pharmacol. 2021 Feb;184:114368. doi: 10.1016/j.bcp.2020.114368. PMID: 33333074.

Song G, Lee EM, Pan J, et al. An integrated systems biology approach identifies the proteasome as a critical host machinery for ZIKV and DENV replication. Genomics Proteomics Bioinformatics. 2021 Feb 19:S1672-0229(21)00025-5. doi: 10.1016/j.gpb.2020.06.016. PMID: 33610792.

Wu L, Huang R, Tetko IV, Xia Z, Xu J, Tong W. Trade-off predictivity and explainability for machine-learning powered predictive toxicology: an in-depth investigation with Tox21 data sets. Chem Res Toxicol. 2021 Feb 15;34(2):541-549. doi: 10.1021/acs.chemrestox.0c00373. PMID: 33513003.

Avram S, Bologa CG, Holmes J, et al. DrugCentral 2021 supports drug discovery and repositioning. Nucleic Acids Res. 2021 Jan 8;49(D1):D1160-D1169. doi: 10.1093/nar/gkaa997. PMID: 33151287.

Chen CZ, Shinn P, Itkin Z, et al. Drug repurposing screen for compounds inhibiting the cytopathic effect of SARS-CoV-2. Front Pharmacol. 2021 Jan 25;11:592737. doi: 10.3389/fphar.2020.592737. PMID: 33708112.

Dorjsuren D, Eastman RT, Wicht KJ, et al. Chemoprotective antimalarials identified through quantitative high-throughput screening of Plasmodium blood and liver stage parasites. Sci Rep. 2021 Jan 22;11(1):2121. doi: 10.1038/s41598-021-81486-z. PMID: 33483532.

Eicher T, Chan J, Luu H, Machiraju R, Mathé EA. Self-organizing maps with variable neighborhoods facilitate learning of chromatin accessibility signal shapes associated with regulatory elements. BMC Bioinformatics. 2021 Jan 30;22(1):35. doi: 10.1186/s12859-021-03976-1. PMID: 33516170.

Peryea T, Southall N, Miller M, et al. Global Substance Registration System: consistent scientific descriptions for substances related to health. Nucleic Acids Res. 2021 Jan 8;49(D1):D1179-D1185. doi: 10.1093/nar/gkaa962. PMID: 33137173.

Sheils TK, Mathias SL, Kelleher KJ, et al. TCRD and Pharos 2021: mining the human proteome for disease biology. Nucleic Acids Res. 2021 Jan 8;49(D1):D1334-D1346. doi: 10.1093/nar/gkaa993. PMID: 33156327.



Tong ZB, Braisted J, Chu PH, Gerhold D. The MT1G gene in LUHMES neurons is a sensitive biomarker of neurotoxicity. Neurotox Res. 2020;38(4):967-978. doi:10.1007/s12640-020-00272-3. PMID:32870474.

Zhu Q, Nguyen D, Grishagin I, et al. An integrative knowledge graph for rare diseases, derived from the Genetic and Rare Diseases Information Center (GARD). J Biomed Semantics. 2020;11(1):13. doi:10.1186/s13326-020-00232-y. PMCID:PMC7663894.

Sheils T, Mathias S, Kelleher K, et al. TCRD and Pharos 2021: Mining the human proteome for disease biology. Nucleic Acids Res. 2020;gkaa993. doi:10.1093/nar/gkaa993. PMID:33156327.

Zhu Q, Nguyen DT, Sid E, Pariser A. Leveraging the UMLS as a data standard for rare disease data normalization and harmonization. Methods Inf Med. 2020. doi:10.1055/s-0040-1718940

Peryea T, Southall N, Miller M, et al. Global Substance Registration System: consistent scientific descriptions for substances related to health. Nucleic Acids Res. 2020;gkaa962. doi:10.1093/nar/gkaa962. PMID:33137173.

Zhu Q, Nguyen DT, Alyea G, Hanson K, Sid E, Pariser A. Phenotypically similar rare disease identification from an integrative knowledge graph for data harmonization: Preliminary study. JMIR Med Inform. 2020;8(10):e18395. doi:10.2196/18395. PMID:33006565.

Patt A, Demoret B, Stets C, et al. MDM2-dependent rewiring of metabolomic and lipidomic profiles in dedifferentiated liposarcoma models. Cancers (Basel). 2020;12(8):2157. doi:10.3390/cancers12082157. PMID:32759684. PMCID:PMC7463633.

Tristan CA, Ormanoglu P, Slamecka J, et al. Robotic high-throughput biomanufacturing and functional differentiation of human pluripotent stem cells. bioRxiv. Preprint posted online 2020;2020.08.03.235242. doi:10.1101/2020.08.03.235242. PMID:32793899. PMCID:PMC7418713.

Zhao Y, Man-Un Ung P, Zahoránszky-Kőhalmi G, et al. Identification of a G-protein-independent activator of GIRK channels. Cell Rep. 2020;31(11):107770. doi:10.1016/j.celrep.2020.107770.

Brimacombe KR, Zhao T, Eastman RT, et al. An OpenData portal to share COVID-19 drug repurposing data in real time. bioRxiv. Preprint posted online 2020;2020.06.04.135046. doi:10.1101/2020.06.04.135046. PMID:32511420. PMCID:PMC7276055.

Eicher T, Kinnebrew G, Patt A, et al. Metabolomics and multi-omics integration: A survey of computational methods and resources. Metabolites. 2020;10(5):202. doi:10.3390/metabo10050202. PMID:32429287. PMCID:PMC7281435.

Siramshetty VB, Nguyen DT, Martinez NJ, Simeonov A, Southall NT, Zakharov A. Critical assessment of artificial intelligence methods for prediction of hERG channel inhibition in the “big data” era. ChemRxiv. Epub April 16, 2020. doi:10.26434/chemrxiv.12119040.v1.

Shah P, Siramshetty VB, Zakharov AV, Southall NT, Xu X, Nguyen DT. Predicting liver cytosol stability of small molecules. J Cheminform. 2020;12:21. Published online 2020 April 7. doi:10.1186/s13321-020-00426-7. PMCID:PMC7140498.

Chu PH, Chen G, Kuo D, et al. Stem cell-derived endothelial cell model that responds to tobacco smoke like primary endothelial cells. Chem Res Toxicol. 2020;33(3):751-763. doi:10.1021/acs.chemrestox.9b00363. PMID:32119531.

Ellis CR, Racz R, Kruhlak NL, et al. Evaluating kratom alkaloids using PHASE. PLoS One. 2020;15(3):e0229646. doi:10.1371/journal.pone.0229646. eCollection 2020. PMID:32126112.

Sheils T, Mathias SL, Siramshetty VB, et al. How to illuminate the druggable genome using Pharos. Curr Protoc Bioinformatics. 2020;69(1):e92. doi:10.1002/cpbi.92. PMID:31898878.

Zahoránszky-Kőhalmi G, Wan KK, Godfrey AG. Hilbert-curve assisted structure embedding method. ChemRxiv. Preprint posted online Feb 28, 2020. doi:10.26434/chemrxiv.11911296.v1.

Godfrey AG, Michael SG, Sittampalam GS, Zahoránszky-Kőhalmi G. A perspective on innovating the chemistry lab bench. Front Robot AI. 2020;7:24. doi:10.3389/frobt.2020.00024.

Zahoránszky-Kőhalmi G, Sheils T, Oprea TI. SmartGraph: A network pharmacology investigation platform. J Cheminform. 2020;12:5. doi:10.1186/s13321-020-0409-9.

Ansbro MR, Itkin Z, Chen L, et al. Modulation of triple artemisinin-based combination therapy pharmacodynamics by Plasmodium falciparum genotype. ACS Pharmacol Transl Sci. 2020 Nov 2;3(6):1144-1157. doi:10.1021/acsptsci.0c00110. PMID: 33344893.

Shah P, Siramshetty VB, Zakharov AV, Southall NT, Xu X, Nguyen DT. Predicting liver cytosol stability of small molecules. J Cheminform. 2020 Apr 7;12(1):21. doi: 10.1186/s13321-020-00426-7. PMID:33431020.

Zahoránszky-Kőhalmi G, Sheils T, Oprea TI. SmartGraph: a network pharmacology investigation platform. J Cheminform. 2020 Jan 21;12(1):5. doi: 10.1186/s13321-020-0409-9. PMID:33430980.



Peryea T, Katzel D, Zhao T, Southall N, Nguyen DT. MOLVEC: Open source library for chemical structure recognition. In Abstracts of Papers of the American Chemical Society. Vol. 258. Washington, D.C.: American Chemical Society; 2019.

Fecho K, Ahalt SC, Arunachalam S, et al.; Biomedical Data Translator Consortium. Sex, obesity, diabetes, and exposure to particulate matter among patients with severe asthma: Scientific insights from a comparative analysis of open clinical data sources during a five-day hackathon. J Biomed Inform. 2019;100:103325. doi:10.1016/j.jbi.2019.103325. PMID:31676459. PMCID:PMC6953386.

Huang R, Zhu H, Shinn P, et al. The NCATS Pharmaceutical Collection: A 10-year update. Drug Discov Today. 2019;24(12):2341-2349. doi:10.1016/j.drudis.2019.09.019. PMID:31585169.

Zakharov AV, Zhao T, Nguyen DT, et al. Novel consensus architecture to improve performance of large-scale multitask deep learning QSAR models. J Chem Inf Model. 2019;59(11):4613-4624. doi:10.1021/acs.jcim.9b00526. PMID:31584270.

Chen Y, Tristan CA, Chen L, et al. A versatile polypharmacology platform promotes cytoprotection and viability of human pluripotent and differentiated cells. bioRxiv. Preprint posted online Oct 22, 2019. doi:10.1101/815761.

Southall NT, Natarajan M, Lau LPL, et al.; IRDiRC Data Mining and Repurposing Task Force. The use or generation of biomedical data and existing medicines to discover and establish new treatments for patients with rare diseases — recommendations of the IRDiRC Data Mining and Repurposing Task Force. Orphanet J Rare Dis. 2019;14(1):225. doi:10.1186/s13023-019-1193-3. PMID:31615551.

Southall NT. Freedom of Information Act access to an investigational new drug application. ACS Pharmacol Transl Sci. 2019;2(6):497-500. doi:10.1021/acsptsci.9b00056. eCollection 2019 Dec 13. PMID:32259081.

Solinski HJ, Dranchak P, Oliphant E, et al. Inhibition of natriuretic peptide receptor 1 reduces itch in mice. Sci Transl Med. 2019;11(500):eaav5464. doi:10.1126/scitranslmed.aav5464. PMID:31292265. PMCID:PMC7218920.

Huang R, Grishagin I, Wang Y, et al. The NCATS BioPlanet — An integrated platform for exploring the universe of cellular signaling pathways for toxicology, systems biology, and chemical genomics. Front Pharmacol. 2019;10:445. doi:10.3389/fphar.2019.00445. eCollection 2019. PMID:31133849.

Austin CP, Colvis CM, Southall NT. Deconstructing the translational Tower of Babel. Clin Transl Sci. 2019;12(2):85. doi:10.1111/cts.12595. Epub 2018 Nov 9. PMID:30412342.

Gorshkov K, Chen CZ, Marshall RE, et al. Advancing precision medicine with personalized drug screening. Drug Discov Today. 2019;24(1):272-278. doi:10.1016/j.drudis.2018.08.010. PMID:30125678.



Kearney SE, Zahoránszky-Kőhalmi G, Brimacombe KR, et al. Canvass: A crowd-sourced, natural-product screening library for exploring biological space. ACS Central Science. 2018;4(12):1727-1741. doi:10.1021/acscentsci.8b00747. PMID:30648156.

Zhou W, Sun W, Yung MMH, et al. Autocrine activation of JAK2 by IL-11 promotes platinum drug resistance. Oncogene. 2018;37(29):3981-3997. doi:10.1038/s41388-018-0238-8. PMID:29662190. PMCID:PMC6054535.

Oprea TI, Bologa CG, Brunak S, et al. Unexplored therapeutic opportunities in the human genome. Nat Rev Drug Discov. 2018;17(5):317-332. doi:10.1038/nrd.2018.14. PMID:29472638.

Tong ZB, Huang R, Wang Y, et al. The Toxmatrix: Chemo-genomic profiling identifies interactions that reveal mechanisms of toxicity. Chem Res Toxicol. 2018;31(2):127-136. doi:10.1021/acs.chemrestox.7b00290. PMID:29156121.

Coussens NP, Braisted JC, Peryea T, Sittampalam GS, Simeonov A, Hall MD. Small-molecule screens: A gateway to cancer therapeutic agents with case studies of Food and Drug Administration–approved drugs. Pharmacol Rev. 2017;69(4):479-496. doi:10.1124/pr.117.013755. PMID:28931623. PMCID:PMC5612261.

Nguyen DT, Mathias S, Bologa C, et al. Pharos: Collating protein information to shed light on the druggable genome. Nucleic Acids Res. 2017;45(D1):D995-D1002. doi:10.1093/nar/gkw1072. PMID:27903890. PMCID:PMC5210555.