Phesi Highlights Disconnect Between Protocols and Outcomes, Warns of AI Repetition
By Allison Proffitt
June 2, 2026 | An analysis released by Phesi suggests that systematically reusing historical protocol templates, without patient data and context to guide protocol design, leads AI to scale flaws rather than solve them.
That’s the argument Gen Li, founder and president of Phesi, outlined to Clinical Research News.
The Phesi analysis considered 600,000 clinical trial protocols from the past 15 years and found that fewer than one-third of them could be linked to publicly documented patient data and outcomes.
Some of those protocols are, surely, for studies that are still recruiting or are still in the data processing phase, Li acknowledged, but not the majority. “Please realize the magnitude,” Li said. “The portion of these that are still recruiting patients or are in the process of being published is a very small portion of this data.”
So what happened to the missing data from all of these trials? Maybe trials were unable to recruit the needed patients, Li theorizes. Maybe the results were not successful and the trial was abandoned. And while those are troublesome trends in general, that’s not Li’s focus.
His concern is that those fruitless protocols are being used as AI inputs to design future protocols—perpetuating errors that will either lead to costly protocol amendments or abandoned trials.
Using failed protocols as templates has happened in the past, but AI makes the problem more pervasive. “AI will not be able to give us meaningful answers in the areas we don’t have sufficient human knowledge,” Li said.
The Phesi team dug deeper into breast cancer specifically, which it calls the world’s most studied disease over the past year, and found that just 31% of protocols are linked to trial outcomes. “Even in the most data-rich disease area, high volumes of research do not automatically create the reliable, outcome-driven evidence base needed for future trial design or AI models. This is a systemic issue in drug development,” the report authors write.
Data-Informed, Nuance-Aware AI
The solution, Li argues, is a more data-informed approach, akin to that of human protocol writers who considered the full context and outcomes, both strengths and weaknesses, before using a protocol as a template for future studies. AI still has a role to play, but only with rich data that includes nuance. “A clinical protocol is effectively a business plan for an investment of tens or hundreds of millions of dollars,” the report authors write, “so AI must be guided by the right data foundations.”
Phesi is compiling that rich dataset and now has data on more than 400 million patients, Li explains, gathered from disease registries, observational studies, real-world datasets, clinical trials, and other sources. The company argues that real-world data can improve protocol design by better characterizing the patient; optimizing trial design based on successful protocols; offering precision in site selection; and enabling digital twins that could potentially reduce the size of the control arm or replace it completely.
“The gap between what protocols are designed to do and what actually happens in patients is the missing link in both current clinical development processes and emerging AI approaches,” the authors write in the report. “Datasets must account for the full patient population, not just narrow subsets from late-phase trials or large protocol datasets disconnected from patient treatment outcomes.”






