Artificial Intelligence Is Advancing Clinical Research and Data Quality
April 25, 2023
Collecting and analyzing large amounts of clinical and biological data allows for better patient enrollment, effective engagement, efficient trials and high-quality results. Also, strong clinical and molecular data sets have allowed us to improve the predictive modeling of drugs and biological processes.
Yet all the data in the world is useless unless you can make sense of it. Analysts must be able to look for patterns and trends that are important for health. Unfortunately, the descriptions found in patient health records, reports, scientific literature, and other notes are not standardized and are hard to analyze. That is where natural language processing (NLP) software comes in. NLP uses machine learning to search narrative text, identify the relevant content and organize it into standardized information. Most NLP products need a lot of work before they can be used in medical applications. An off-the-shelf, customizable product can speed discovery and simplify the process.
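To make the idea of turning narrative text into standardized information concrete, here is a minimal sketch. It swaps in a simple dictionary lookup for the machine learning that real NLP products use, and the lexicon, the function name and the sample note are all hypothetical, chosen only for illustration.

```python
import re

# Hypothetical mini-lexicon mapping phrases found in notes to one standard term.
# Real systems rely on trained models and large clinical vocabularies instead.
LEXICON = {
    "heart attack": "myocardial infarction",
    "high blood pressure": "hypertension",
    "htn": "hypertension",
}


def standardize(note):
    """Return the standardized concepts mentioned in a free-text note."""
    found = []
    for phrase, concept in LEXICON.items():
        # Match whole words only, ignoring case.
        pattern = r"\b" + re.escape(phrase) + r"\b"
        for match in re.finditer(pattern, note, flags=re.IGNORECASE):
            found.append(
                {"text": match.group(0), "start": match.start(), "concept": concept}
            )
    # Report mentions in the order they appear in the note.
    return sorted(found, key=lambda item: item["start"])


mentions = standardize("Pt has HTN and a prior heart attack.")
concepts = [m["concept"] for m in mentions]
```

In this toy example, two differently worded mentions in the note are mapped to the standard terms "hypertension" and "myocardial infarction", which is the kind of structured output an analyst can count and compare across records.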
Hua Xu, Ph.D., an NCATS Small Business Innovation Research (SBIR) grantee, specializes in NLP in health. Xu is now a professor and Assistant Dean for Informatics at Yale School of Medicine and was formerly Associate Dean of the School of Biomedical Informatics at The University of Texas Health Science Center at Houston (UTHealth Houston). He and his team have been developing NLP methods for years. UTHealth Houston’s Center for Clinical & Translational Sciences (CCTS) is one of 60 NCATS Clinical and Translational Science Awards (CTSA) Program institutions in the country. Around 2015, a research team working in the CCTS needed NLP support for their projects, so they contacted Xu. Their partnership led to the development of a new NLP software package called the Clinical Language Annotation, Modeling, and Processing (CLAMP) toolkit.
CLAMP is unlike many custom NLP systems. It is a modular system made up of multiple parts that support different NLP tasks in the medical field. It has a workbench-style interface that both data scientists and non-data scientists can use. The interface allows users to select, build and set up modules to meet their needs.
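The modular design described above can be pictured with a short sketch. This is not CLAMP's actual interface or code. The module names, the term list and the `build_pipeline` helper are all hypothetical, and the sketch only illustrates the general idea of selecting small parts and chaining them into a task-specific pipeline.

```python
def split_sentences(doc):
    """Hypothetical module: split the raw text into rough sentences."""
    doc["sentences"] = [s.strip() for s in doc["text"].split(".") if s.strip()]
    return doc


def tag_terms(terms):
    """Hypothetical module factory: flag sentences containing any listed term."""
    def module(doc):
        doc["terms"] = [
            term
            for sentence in doc["sentences"]
            for term in terms
            if term in sentence.lower()
        ]
        return doc
    return module


def build_pipeline(modules):
    """Chain the selected modules so each one works on the previous one's output."""
    def run(text):
        doc = {"text": text}
        for module in modules:
            doc = module(doc)
        return doc
    return run


# Assemble a pathology-flavored pipeline from the available parts.
pathology_pipeline = build_pipeline(
    [split_sentences, tag_terms(["carcinoma", "margin"])]
)
result = pathology_pipeline("Invasive ductal carcinoma. Margins are clear.")
```

A different specialty would reuse `split_sentences` and swap in its own term list or a new module, which is the appeal of a modular approach: common steps are shared and only the specialized parts change.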
“It’s an ‘app (application) orchard’ approach. If you need an NLP pipeline to extract cancer pathology terms, for example, if we already have that functionality, you can use it from our library. If you want a pipeline to extract specialized rheumatology terms, right now we may not have that specific function in our library, but we’ll develop it, and we’ll add it into the ‘orchard’ for others to use,” explains Frank Manion, Ph.D., Vice President for Innovations at Melax Tech and former Chief Informatics Officer at the University of Michigan Comprehensive Cancer Center. “Ideally, we’d like to have the community involved in that process.”
As CTSA Program researchers became more involved in the project, word of CLAMP spread throughout the network, helped by the team’s participation in CTSA-sponsored NLP challenges. Research teams outside of the CTSA Program also began using CLAMP. Soon, Xu’s lab started getting requests from outside companies.
“Having academic connections to the CTSA Program as a whole has been valuable because it tells us where innovations are needed and what directions we should be looking at for the product now and in the future,” says Manion. The technology transfer office at UTHealth Houston suggested licensing CLAMP from the university to academic and commercial parties. As interest grew, Xu formed a spin-off company called Melax Tech that licensed exclusive rights to CLAMP from the university. Melax Tech then led the distribution of CLAMP to industry users.
Small Business Innovation Research Funding With NCATS
Xu and Manion credit NCATS’ CTSA Program, the SBIR grant, the program officers, and the many resources NCATS offers to grantees for helping CLAMP cross the commercialization finish line.
“With the SBIR program, small businesses need a site to implement their projects. We worked with CTSA Program sites to demonstrate how our system could help with those CTSA-related efforts. That’s actually a good trial. In fact, we recommend SBIR applicants work with local CTSA Program sites to test out their product together for the SBIR grant application,” Xu says.
The NCATS SBIR Program helped Melax Tech test the product and perform research and development to improve and expand CLAMP. Product testing led to the development of MERIC NLP, a cloud-based version of CLAMP. The purpose of MERIC NLP is to help improve health care research through text extraction and standardization.
“The SBIR grant changed our thinking,” Manion says. “It let us use the funding to really construct the building blocks to our future infrastructure or a future product suite.”
Melax Tech has worked with 650 health care and life science organizations. These include academic research centers, contract research organizations, insurance companies, and biotechnology and pharmaceutical companies. The company has grown from two employees to more than 20 in four years. As it looks to the future, Melax Tech aims to grow its “app orchard” by creating more turn-key NLP products and expanding its reach to more life science companies and areas of need.
“Winning the SBIR grant gave us a lot of confidence,” says Xu. “We began to see more real opportunities come up, and people were coming to us. They needed this kind of NLP application, and now three years later, we feel like this is a big market, and we can do many things to become a more successful company.”