AI-Powered inClinico Aims To Solve The Phase 2-To-3 Transition Problem

By Deborah Borfitz 

August 24, 2023 | Insilico Medicine recently broke the news that one of its platforms, powered by artificial intelligence (AI), succeeded in predicting the outcomes of several phase 2 clinical trials (Clinical Pharmacology & Therapeutics, DOI: 10.1002/cpt.3008). The forecasts appear in date-stamped articles that have been deposited on preprint servers over the past seven years, and the lengthy prospective validation exercise is one of the major ways the company has set itself apart from other AI solutions on the market, according to President Alex Aliper, Ph.D.

Aliper and Jan Szollos, senior director of business development at Insilico Medicine, teamed up last week for a webinar on inClinico, the transformer-based AI software platform used for predicting and optimizing study outcomes. The tool, available in software as a service (SaaS) form as of November 2022, has been described by CEO Alex Zhavoronkov, Ph.D., as the company’s oldest and most important project. It is integrated with other Insilico Medicine tools, including PandaOmics (a tool for novel target identification, prediction, indication expansion, and prioritization, and a good fit for many types of biomarker discovery projects) and Chemistry42 (a generative chemistry system for novel drug design), that together constitute the end-to-end drug discovery platform named Pharma.AI.

Among the many “sweet spots” identified for inClinico’s pharma customers are triaging preclinical and discovery-stage projects, identifying red flags in ongoing or planned clinical trials, and conducting post-mortem analysis at the conclusion of studies to look for improvement opportunities, says Aliper. For investors and financial institutions, the platform is equally well suited for due diligence purposes when evaluating the early-stage pipelines of smaller biotech firms, or for placing bets on the stocks most likely to spike (typically by 40%-50%).

One of the crowning achievements of Insilico Medicine has been in identifying a novel, first-in-class drug for treatment of idiopathic pulmonary fibrosis, validating it as a preclinical candidate in 18 months, and then rapidly proceeding to clinical studies, notes Szollos, highlighting the role of inClinico when it came to risk-minimizing decisions related to target choice and trial design.

As was widely reported, it took the company only about nine months to initiate first-in-human studies following successful results in preclinical studies. “The program is now in phase 2 in the U.S. and China, and all of it was achieved for a fraction of the cost and extremely truncated timeline compared to the traditional R&D approach,” Szollos says. 

Three Scores In One 

The primary objective of inClinico is to predict the phase 2 to phase 3 clinical trial transition, the point where approximately 90% of drug development projects fail, says Szollos. This often culminates in trillions of dollars in wasted effort, a problem that AI is uniquely positioned to remedy by predicting clinical trial outcomes and thus avoiding many of the pricey failures.

Three AI engines—one each for scoring a trial’s design, target choice, and patient eligibility criteria—are used by inClinico to produce a meta score indicating the overall likelihood of success for any trial, he says. The training data came from more than 55,000 phase 2 clinical trials registered on  

In terms of its ROC AUC (area under the curve of the receiver operating characteristic), inClinico has demonstrated 88% overall predictive performance during prospective validation, reports Aliper. The platform’s forecasting approach is intended to provide some useful insights before a trial is initiated. 
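To make the metric concrete, ROC AUC can be read as the probability that a randomly chosen successful trial receives a higher score than a randomly chosen failed one. The short Python sketch below computes it that way. The outcomes and scores are made-up illustrative values, not inClinico outputs, and the function name `roc_auc` is an assumption introduced here.

```python
def roc_auc(labels, scores):
    """Share of (success, failure) pairs in which the success is scored higher.

    Ties count as half a win. labels are 1 (success) or 0 (failure).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one success and one failure")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical trial outcomes and predicted probabilities of success
outcomes = [1, 0, 1, 1, 0, 0]
predicted = [0.9, 0.3, 0.4, 0.8, 0.6, 0.2]

auc = roc_auc(outcomes, predicted)
print(f"ROC AUC: {auc:.2f}")
```

On this toy data, eight of the nine success-failure pairs are ranked correctly, so the AUC comes out at about 0.89. A value of 0.5 would mean the scores rank trials no better than chance.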

If a trial appears destined to fail, for instance, the logical next question is why, Aliper says. Likewise, inClinico can extract handy information once a trial is running, “in terms of what can be adjusted and what are the overall prospects of this asset.”  

In addition to these types of observations, Aliper continues, the platform allows users to explore the larger clinical landscape as well as score their own portfolio of trials or discovery projects. As an example, they could forecast how likely a mechanism of action is to work against a specific indication of interest without running the trial to find out.  

The trial design engine ingests different features of clinical protocols as inputs, he says, among them number of patients, accrual rate, blinding mechanism, number of arms, and racial composition. One of the big value-adds observed here is how the participation of clinical sites in certain countries can affect the clinical trial design score, says Aliper. 

In neuroscience trials in the United States, for example, the placebo effect appears to be “more pronounced and prominent” relative to other geographies, he says. These sorts of statistical features are captured by the model and can be interpreted in the context of a specific therapeutic area to optimize clinical trial site composition. 

A second and more important component—at least from the standpoint of the phase 2 to phase 3 transition problem—is the target choice score. This model wraps up many different modalities of data into a giant and “extremely complicated” biomedical knowledge graph to come up with a predictive score about the outcome of discovery- and clinical-stage projects. 

Given only a target and disease of interest, inClinico can generate a score that is much more likely to impact phase 2 clinical trial outcome prediction than trial design, as covered in the latest published study. This serves to underscore that lack of efficacy is the primary driver of clinical trial failures, Aliper says.

Interestingly, inClinico was able to predict the success of a first-in-class factor B inhibitor (LNP023) in a rare disease known as paroxysmal nocturnal hemoglobinuria with no prior information on the clinical relevance of the mechanism of the drug's action in the disease. 

The biomedical knowledge graph used by inClinico is a bit different than the one used in PandaOmics, since the target choice score is not present on that platform, Aliper says. With PandaOmics, the biomedical knowledge graph is more linked to causal relationships between genes and disease and a few methods are combined to produce an aggregate score. 

Patient eligibility criteria, the final component of inClinico, considers comorbidities and other features of trial participants to come up with a score that likewise stands alone but also factors into meta scoring of a trial, says Aliper. The rationale for using an ensemble of scoring engines is to provide the “impacts of individual features on the probability of success” rather than just giving black-box predictions.
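How the three engine scores are combined into the meta score was not described in the webinar. Purely as an illustration of why an ensemble is easier to interpret than a single black-box number, the Python sketch below assumes a simple weighted average. The function name, the weights, and the example scores are all hypothetical, and the heavier weight on target choice only mirrors the article's point that it matters most.

```python
def meta_score(scores, weights):
    """Combine per-engine scores into one number (ASSUMED weighted average).

    Returns the combined score and each engine's share of it, so the
    stand-alone scores stay visible alongside the aggregate.
    """
    if scores.keys() != weights.keys():
        raise ValueError("scores and weights must cover the same engines")
    total_weight = sum(weights.values())
    contributions = {
        name: weights[name] * scores[name] / total_weight for name in scores
    }
    return sum(contributions.values()), contributions


# Hypothetical engine outputs and weights, not taken from inClinico
engine_scores = {"trial_design": 0.29, "target_choice": 0.80, "eligibility": 0.60}
engine_weights = {"trial_design": 0.2, "target_choice": 0.6, "eligibility": 0.2}

meta, parts = meta_score(engine_scores, engine_weights)
print(f"meta score: {meta:.3f}")
for name, share in parts.items():
    print(f"  {name}: {share:.3f}")
```

Because each engine's share is reported next to the total, a low meta score can be traced back to the component responsible for it.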

Validation Exercises 

inClinico has continued to ingest additional clinical trials at other stages, thereby expanding its biomedical knowledge graph, says Aliper. To date, the platform has taken in over 150,000 clinical trials covering 41,000 small molecules and biologics and 22,000 different conditions and indications. 

“From very early on in the history of inClinico, we understood that retrospective validation is not sufficient to validate the platform,” he continues. Prospective-level validation would also be needed, as soon as possible, and they were going to have to get creative. 

Insilico Medicine began by applying a quasi-prospective validation approach in which it trained the inClinico predictive models on trials from 1995 to 2017 and validated them on trial outcomes read out between 2018 and 2021. “We were mimicking prospective validation by using time stamps,” Aliper explains.
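The time-stamp idea amounts to splitting the data by date rather than at random, so the model never sees outcomes from the validation period. A minimal Python sketch of such a split follows. The `Trial` record, its field names, and the example trials are assumptions made for illustration, and only the year boundaries come from the description above.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    trial_id: str       # hypothetical identifier
    readout_year: int   # year the outcome became known
    succeeded: bool


def quasi_prospective_split(trials, train_start=1995, train_end=2017, valid_end=2021):
    """Train on earlier readouts, validate on strictly later ones."""
    train = [t for t in trials if train_start <= t.readout_year <= train_end]
    valid = [t for t in trials if train_end < t.readout_year <= valid_end]
    return train, valid


# Made-up records for demonstration only
trials = [
    Trial("T-001", 2003, True),
    Trial("T-002", 2017, False),
    Trial("T-003", 2018, True),
    Trial("T-004", 2021, False),
    Trial("T-005", 2023, True),   # outside both windows, so left out
]

train_set, valid_set = quasi_prospective_split(trials)
print([t.trial_id for t in train_set])
print([t.trial_id for t in valid_set])
```

A random split could leak information from later trials of the same drug or target into training, which is the weakness a date-based split is meant to avoid.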

The platform’s target choice score significantly outperformed trial design in the quasi-prospective validation studies in different disease areas, he says. The trial design score on its own had limited utility but when combined with the target choice score boosted overall predictive performance to over 80% in most therapeutic areas and, in some cases, to over 90%. 

For all trials, meta score performance hit a maximum of 88%, Aliper summarizes. For first-in-class trials, performance understandably dropped a bit to about 72% ROC AUC, due to the lack of prior knowledge for the mechanism of action. This is still “way beyond any established benchmarks” for assessing innovative products. 

Prediction performance metrics also showed the target choice score closely tracked with the meta score for both overall and first-in-class phase 2 trials, he notes. The ROC AUC was, respectively, 84% and 70% in the two categories for the target choice score. 

In 2016, two years after the company’s founding, Insilico Medicine deposited its first date-stamped article in the medRxiv archive forecasting trial outcomes, says Aliper. It also began actively publishing in bioRxiv, the open-access preprint repository for the biological sciences.

In partnership with big pharma, Insilico Medicine performed a successful pilot in 2019 where it achieved “very good” prediction performance on a subset of active phase 2 clinical trials, he notes. To put forward yet more prospective validation studies, the following year it published several more studies covering trials in the Novartis and Roche pipelines that again put inClinico in a positive light. 

Virtual Bets 

Besides pharma, Insilico Medicine has been running many pilots with hedge funds and banks. Last year, it published predictions made by the inClinico platform on over 40 actively running phase 2 clinical studies being conducted by small- and mid-cap biotech companies whose very existence is heavily dependent on the outcomes of those trials, Aliper says.  

This prospective case study involved examining the performance of inClinico’s forecasts as the results of each trial were made public, as well as placing “virtual bets” on the associated stocks, says Aliper. Only companies that had available option contracts were included in the study, so “call” options could be placed on those where success was predicted and “put” options on those where failure was expected, and investigators sought out the cheapest option contracts available at the time.

For the 14 trials that have been read out so far, he reports, 11 (79%) have been predicted correctly by inClinico based on the stock price. Aliper refers here to a chart corresponding to the small, option-based portfolio over nine months, with performance benchmarked against some of the frequently used biotech and biopharma exchange-traded funds. Over that period, inClinico achieved a 35% time-weighted return. 
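Two pieces of arithmetic sit behind these figures: the hit rate (11 of 14) and the time-weighted return, which by its standard definition chains together the returns of successive sub-periods so that the timing of cash flows does not distort the result. The Python sketch below shows both. The sub-period returns are invented for illustration and are not the portfolio's actual figures.

```python
def hit_rate(correct, total):
    """Fraction of read-out trials that were predicted correctly."""
    return correct / total


def time_weighted_return(period_returns):
    """Chain sub-period returns: (1 + r1) * (1 + r2) * ... - 1."""
    growth = 1.0
    for r in period_returns:
        growth *= 1.0 + r
    return growth - 1.0


# Figures reported in the webinar
print(f"hit rate: {hit_rate(11, 14):.0%}")

# Hypothetical sub-period returns for a small option portfolio
example_periods = [0.10, -0.05, 0.12]
twr = time_weighted_return(example_periods)
print(f"time-weighted return: {twr:.2%}")
```

Note that chaining is not the same as adding: the three example periods sum to 17%, while the compounded figure is slightly higher.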

Aliper points out that no real positions were taken on the stock because of the “intrinsic uncertainty” about the readout dates. While the duration of the contracts covered the announced readout dates, companies often do not report results of their trials or do so quietly in their quarterly filings.  

Investors and financial institutions may well want to use inClinico for actual trading purposes, he later adds. But since the performance of the model is not 100%, they’ll want to hedge against this risk. 

Toggling back to use cases for enabling and optimizing trials, Aliper highlights recent “de-black boxing” efforts of Insilico Medicine. For a phase 2 trial run by Corcept Therapeutics that read out last December, inClinico revealed how the trial design score responded to the impact of individual features on the resulting probability as measured by a chat bot. 

A trial design score of 29% could thus be dissected to discover the positive and negative contributing features, he explains. For example, inClinico reduced the probability of trial success based on the anticipated enrollment of 70 people. SaaS users can similarly add or subtract different feature components of their trials to see the effect on the AI-derived trial design score.  

User Experience 

The pay-as-you-go SaaS version of the inClinico platform sits in the cloud, is simple to use, and accessible to anyone via their browser, says Aliper. Users can filter out their trials of interest using a “dynamically adjusting table” based on a long list of features such as trial sponsor, drug name, target, indication, disease of interest, therapeutic area, and trial phase. 

A page is produced for each trial that turns up, summarizing the overall layout of the study as well as the desired details. An added utility of the platform is the ability to add a trial to a watch list so it can be monitored for updates, he points out. The platform also provides the means to submit human feedback to enhance future iterations of the model.  

Further, inClinico offers users a view of the competitive landscape in terms of other companies working in the same therapeutic area (e.g., fibromyalgia) and the clinical success rate (in this case, 27%). Insights specific to the target choice score include government grants and total funding, plus metrics linking the mechanism of action to a disease of interest. 

For the trial design score component, users are presented with a SHAP plot (a standard method for explaining individual predictions) indicating the features contributing to success, failure, or uncertainty. Eligibility criteria are also listed for each trial, notably the inclusion and exclusion criteria that can be used as the basis of analysis, says Aliper.
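For readers unfamiliar with SHAP, the idea is to split a prediction into per-feature contributions measured against an average case. In the special case of a linear model with independent features, each contribution is just the weight times the feature's distance from its average, which the Python sketch below computes by hand. The model, weights, baseline, and trial values are all hypothetical, and inClinico's actual model is not linear as far as the webinar indicates. The enrollment of 70 simply echoes the Corcept example mentioned earlier.

```python
def linear_score(weights, bias, features):
    """Additive score (for example, log-odds of success) from a linear model."""
    return bias + sum(weights[name] * features[name] for name in weights)


def linear_attributions(weights, baseline, features):
    """Per-feature contribution relative to an average (baseline) trial.

    For a linear model with independent features these equal the SHAP values.
    """
    return {
        name: weights[name] * (features[name] - baseline[name]) for name in weights
    }


# Hypothetical weights, baseline, and trial; none come from inClinico
weights = {"enrollment": 0.01, "num_arms": -0.2, "double_blind": 0.5}
bias = -1.0
baseline = {"enrollment": 150, "num_arms": 2, "double_blind": 0.6}
trial = {"enrollment": 70, "num_arms": 2, "double_blind": 1}

attributions = linear_attributions(weights, baseline, trial)
for name, value in sorted(attributions.items(), key=lambda kv: kv[1]):
    print(f"{name:>12}: {value:+.2f}")
```

The contributions add up exactly to the gap between this trial's score and the baseline score, which is the property that lets a SHAP plot show which features pushed a prediction up or down.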

The “score new trial” feature on the main page is noteworthy, he adds. Users can tap it to dynamically assess any trial of their choice, at minimum by naming the target (e.g., enzyme regulating cholesterol) and the disease (e.g., coronary artery disease) to get a target choice score (93%, in this case). Optionally, the target can be profiled by specifying the drug name or sponsor.  

Likewise, users can profile preclinical stage projects using this “score new trial” tool by entering the indication, mode of action, effectiveness against a specified disease, and therapeutic area, says Aliper. 

The next release of inClinico this fall will introduce “other exciting utilities” on the platform, he adds. “We are now in the stage where we want to partner to improve the platform and... hear feedback about the tool and its utility.” 

Although partnerships have up to now been the main type of collaboration around inClinico, Szollos says, the SaaS option is the most straightforward, with pricing based on the number of users as well as the length of the subscription. inClinico can also produce customized reports on a fee-for-service basis when companies are looking to run portfolio triage on many trials.

Continual Improvement 

Moving forward, inClinico will be ingesting other published libraries of clinical trials other than what appears in, says Aliper, although what’s available there can empower predictions regarding any targeted molecule and single-agent clinical trial.  

Insilico Medicine is not claiming predictive prowess when it comes to the outcome of trials using combination therapies, such as a small molecule coupled with an antibody drug, he says, since this is not yet supported by prospective validation. But this is an area that the company is actively exploring, as well as the utility of the platform for biologic drugs. 

To make predictions even more interpretable, inClinico will soon have some causality built into the target choice score, Aliper says. This will align the platform a bit more with some of the scores in PandaOmics, albeit with training specific to the phase 2 to phase 3 transition. 

The development team also plans to increase the “granularity” of endpoints, in terms of both measurement and statistical significance, he says. “In general, we want to ingest input from experts in finance and pharma and make [inClinico’s] utility... and the user experience as high as possible for the SaaS solution of the platform.” 

Most immediately, the SaaS version of inClinico will be equipped to simulate different trial designs, arms and dosages, Aliper reveals during the Q&A. Up to now, simulations of different trial design protocols have been done strictly through the partnership or service options to see how well recommendations by the model match up with perspectives of clinical development teams. The new capability will be added in a matter of days.