Real-World Data Enhanced Clinical Trial Design

Contributed Commentary by Dan O’Connor

May 26, 2021 | The challenges associated with executing successful clinical trials are well documented and increasing, and they keep promising treatments from reaching patients faster. However, clinical research is evolving. The industry is seeing an unprecedented opportunity to speed the delivery of new therapies to patients, especially those with life-threatening conditions, through novel applications of real-world data.

With expanding support from the FDA, the use of clinical data routinely collected at the point of care, known as real-world data (RWD), is becoming a promising tool for research teams. As regulators approve more therapies via accelerated pathways and novel trial designs (e.g., single-arm studies), post-approval study commitments are becoming more important than ever. Sponsors and regulators must understand how these products work in the real world, outside the tightly controlled settings of the clinical trial.

Interest in real-world data has skyrocketed, but knowing when and how to use it has been an adoption hurdle for the industry. Three use cases are particularly well suited to real-world data: synthetic or external control arms (SCA/ECA) to replace or augment standard controls; precision registries for adaptive trial design; and clinical trial site feasibility to improve patient enrollment and recruitment.

Synthetic Control Arms

Synthetic controls, or external comparators, get the most attention in the real-world data space because they promise to eliminate the need for a traditional control arm. This type of trial design is especially important in disease areas with high mortality, where it would be unethical to put a patient on placebo. In a recent discussion, Dr. Aaron Kamauu, a clinical informatics expert and pioneer in the application of RWD across the drug development lifecycle, stated, “for some indications, it can be unethical to randomize patients into a placebo or standard-of-care control arm. This can be particularly true for patients with a medical condition that has high morbidity and mortality, and where there is no current specific treatment option. In such cases,” he continues, “a single-arm clinical trial where all patients go on treatment can be supplemented with a synthetic control arm, as long as the data is of sufficient quality and quantity to be appropriately comparable to the trial treatment arm.”

Drawing on his experience developing successful SCAs for regulatory submissions, Dr. Kamauu cites another benefit: SCAs can ease the logistical and resource challenges of enrolling enough patients within an often-tight window of time, especially in rare diseases, where available sample sizes are much smaller.

Synthetic control arms have already been used in several significant regulatory decisions as primary and supplemental evidence. For example, Roche leveraged an SCA to meet European Union coverage requirements for marketing Alecensa (alectinib) in 20 European markets. Roche also used a synthetic control arm to inform EU pricing decisions by showing Alecensa’s performance relative to a standard of care. This brought Alecensa’s coverage forward by 18 months in several European countries.

While SCAs can be challenging, they can be an effective alternative to a standard control when a disease is predictable and depends on a standard of care that has a lengthy track record. The time and resource benefits make this a high-potential tool on the clinical development team’s workbench.
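
To make the comparability condition concrete, the following is a minimal sketch of one common balance check, the standardized mean difference (SMD), applied to a single baseline covariate. It is an illustration under stated assumptions, not the method used in any submission mentioned here: the ages are invented, the function name is hypothetical, and the 0.1 threshold is a widely used rule of thumb rather than a figure from this commentary.

```python
import math
from statistics import mean, variance

def standardized_mean_difference(trial_values, external_values):
    """Absolute SMD for one baseline covariate (each group needs >= 2 values)."""
    pooled_sd = math.sqrt((variance(trial_values) + variance(external_values)) / 2)
    difference = abs(mean(trial_values) - mean(external_values))
    if pooled_sd == 0:
        # No spread in either group: balanced only if the means coincide.
        return 0.0 if difference == 0 else math.inf
    return difference / pooled_sd

# Hypothetical baseline ages, invented for illustration only.
trial_age = [61, 58, 66, 70, 63]       # single-arm trial patients
external_age = [62, 59, 65, 71, 64]    # candidate external control patients

smd_age = standardized_mean_difference(trial_age, external_age)
BALANCE_THRESHOLD = 0.1  # assumed rule of thumb, not from the article
needs_review = smd_age > BALANCE_THRESHOLD
```

In practice a team would repeat a check like this across every relevant covariate, and would typically apply matching or weighting before judging an external cohort comparable.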

Precision Research Registries for Trial Design

Observational studies have historically served a crucial role in helping researchers understand unmet needs, the natural history of disease, care pathways, and value arguments for payors. These types of data are undergoing something of a renaissance in their application. Because of the well-documented implications of failed pivotal trials, cohort definitions are becoming more important. Trial design teams can pull signals about various patient populations to improve the chances of running a successful pre-registration trial and to articulate clearly to regulators what they will commit to study post-approval. These “precision registries” allow researchers to stratify populations by various phenotypic or genotypic features generally not available in legacy databases.

Trial Site Feasibility

One of the key early design decisions is where to conduct the trial. Site feasibility requires sizable resource commitments. The Tufts Center for the Study of Drug Development determined that an average of 8 months is required to move from pre-visit to site initiation, and an average of 6 to 8 weeks of this time is needed to complete feasibility questionnaires. According to an article in Clinical Research, the outcomes are often costly procedural steps, less-than-ideal site selection, and expensive study delays that can equate to $600,000 to $8 million for each day that a trial postpones product development and launch. Delays in determining study feasibility, among other forms of delay, create a detrimental cascade effect that makes it difficult to meet study milestones. Add the resource burden on the research site, and site feasibility becomes an especially burdensome part of the evidence creation process.

The industry is leveraging two RWD-enhanced techniques to improve site selection and patient selection at a site. The typical approach queries a structured database of administrative or “coded” information about the patient population. Newer approaches leverage AI-powered technologies to mine physician-observed narrative text in the medical record. This novel approach uses biomedically tuned natural language processing (NLP) to mine and analyze large populations across care settings and health care organizations, surfacing patients based on their phenotypic fit. Others are using the same technology as a verification step between consenting a patient and incurring the cost of a physician work-up. Inserting this real-world-data-based step into the process can bring fewer patients into a work-up while yielding a higher ratio of qualified enrollees.
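
As a rough illustration of the structured-data approach, the sketch below counts distinct candidate patients per site from coded records and ranks the sites. Everything in it is assumed for the example: the record layout, the site names, the function name, and the ICD-10-style codes are hypothetical, and a real feasibility query would run against a governed clinical database rather than an in-memory list.

```python
from collections import Counter

# Hypothetical coded records: (patient_id, site, diagnosis_code).
records = [
    ("p1", "Site A", "C50.9"),
    ("p2", "Site A", "C50.9"),
    ("p3", "Site B", "C50.9"),
    ("p4", "Site B", "E11.9"),
    ("p2", "Site A", "C50.9"),  # repeat encounter for the same patient
]

def candidates_per_site(records, target_codes):
    """Count distinct patients per site with at least one target code."""
    seen = set()
    counts = Counter()
    for patient_id, site, code in records:
        if code in target_codes and (patient_id, site) not in seen:
            seen.add((patient_id, site))
            counts[site] += 1
    return counts

# Sites ordered from most to fewest candidate patients.
ranked_sites = candidates_per_site(records, {"C50.9"}).most_common()
```

Counting patients rather than encounters matters here, since repeat visits would otherwise inflate a site's apparent enrollment potential.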

Novel Techniques, Novel Challenges

While RWD-enhanced trial design has demonstrated positive impact for researchers and patients alike, the challenges in extracting real-world data are also real. For data to accelerate research, they must be trusted. 

While coded data in medical records are useful for shaping a general view of inclusion criteria, the whole patient record needs consideration when shaping the other side of that coin: the exclusion criteria, i.e., why a particular patient, or a site altogether, may not be suitable for a study. Using NLP to interpret complete patient records enables researchers to home in on the types of patients and populations that are optimal for an arm of a study. For example, this could distinguish triple-negative breast cancer patients from breast cancer patients in general, or surface an uncoded procedure that would eliminate a patient’s eligibility.
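
The idea of screening narrative text for both sides of the coin can be sketched with simple pattern matching. This is a deliberately naive stand-in, not biomedically tuned NLP: it ignores negation (“no evidence of ...”), abbreviations, and context, and the notes, the exclusion phrase, and the function name are all hypothetical.

```python
import re

# Inclusion signal: a subtype that coded data may record only as breast cancer.
TNBC_PATTERN = re.compile(r"triple[- ]negative", re.IGNORECASE)
# Exclusion signal: a hypothetical uncoded procedure named only in the note.
EXCLUSION_PATTERN = re.compile(r"prior radiation therapy", re.IGNORECASE)

def screen_note(note_text):
    """Flag inclusion and exclusion mentions in one free-text note."""
    return {
        "tnbc_mentioned": bool(TNBC_PATTERN.search(note_text)),
        "exclusion_mentioned": bool(EXCLUSION_PATTERN.search(note_text)),
    }

# Hypothetical notes, invented for illustration only.
notes = {
    "p1": "Pathology consistent with triple-negative breast cancer.",
    "p2": "Invasive ductal carcinoma, ER positive. Prior radiation therapy to left breast.",
}

screened = {patient_id: screen_note(text) for patient_id, text in notes.items()}
```

Both patients might carry the same general breast cancer code, yet only the note text separates the likely candidate from the one who would be ruled out.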

Complete Patient Record Information Needed for NLP to be Fully Utilized

But even with today’s incredible advancements in integrating health data sources and platforms, one crucial challenge remains to be solved: extracting the data so that they are analytics-ready. To use NLP to its fullest, we need to be able to feed the most complete patient record information into a model and apply AI. Patient medical records are often gigantic and can be difficult to access because they are locked away in the media tab as a TIFF or PDF. Nonetheless, they contain valuable text, whether in a scanned dictation or a hand-scribbled physician note. This text can be “lifted from the page” with optical character recognition (OCR) and then interpreted. Once unlocked, real-world data will open the door to asking intelligent questions in unprecedented ways. With a complete picture, research is empowered to answer both “why we should” and “why we shouldn’t” do this.

Dan currently serves as Ciox Real World Data’s SVP of Growth, bringing 15 years of health consulting and buy-side software and services investing to the business.  Prior to his role in the emerging RWD business unit, he was Chief Strategy Officer for Ciox, responsible for corporate strategy, mergers and acquisitions and strategic partnerships across the enterprise.  He catalyzed the development of the real-world efforts at Ciox in 2018 and led the fundraising efforts that ended with a foundational equity investment in Ciox from Merck in 2019. He can be reached at
