How Real-World Datasets Stack Up to Randomized Controlled Trials: Two Pilots Could Help Inform Regulatory Guidance

By Deborah Borfitz 

February 4, 2020 | Researchers at Brigham & Women’s Hospital (BWH), in partnership with data analytics company Aetion, have been working on a real-world data (RWD) pilot project under the umbrella of RCT DUPLICATE since May 2018 with the financial and leadership support of the U.S. Food and Drug Administration (FDA). The project’s initial goal was replicating results of 30 randomized controlled trials (RCTs) using health insurance claims data, which expanded last April to predicting the results of seven ongoing phase IV trials, says Jessica Franklin, co-director of RCT DUPLICATE as well as a biostatistician in the division of pharmacoepidemiology and pharmacoeconomics at BWH.  

“The clinical areas we’ve focused on for replication—cardiovascular, diabetes, respiratory, and osteoporosis—have been largely determined by the kinds of outcomes that can be captured in claims… any kind of major health event that sends you to the hospital, such as a heart attack, stroke, hip fracture, or exacerbation of asthma or COPD [chronic obstructive pulmonary disease],” says Franklin. “Claims are not nearly as good at collecting other kinds of outcomes, like whether or not pain or other patient symptoms are resolved.”  

Starting with claims made sense from several perspectives, including the data’s widespread availability, Franklin says. “Pharmacoepidemiologists have a long history of working with insurance claims to evaluate safety of medications.” The FDA has also been using claims data for several decades to make safety decisions, and such data form the basis of the agency’s Sentinel Initiative for rapidly assessing safety signals. 

How RWD might best be deployed in a regulatory submission is an open question. “We know claims can’t do everything, but we want to provide the FDA with some evidence of the areas where it is fit for use,” says Franklin. The project began Oct. 1, 2017, and will continue through 2020 and possibly longer. 

The FDA was involved in selecting the trials targeted for replication and has been engaging with researchers throughout the replication process, Franklin says. At the study’s conclusion, results will be turned over to the FDA via the Aetion Evidence Platform. The collaboration is effectively a test run of what the science and processes might look like if the FDA were to accept an observational study, based on claims, as evidence for an approval decision on a supplemental new drug application.  

Both the 30 published studies and the seven ongoing trials are in the process of being replicated and won’t be identified until it is known whether those targeted for inclusion can in fact be reproduced, says Franklin. The plan is to release the identities of the targeted trials, along with results, as replications are completed.  

Protocols for seven of the 30 completed trials, all diabetes studies, are posted on clinicaltrials.gov. Five more trials will be added in the next few weeks, she adds. Full results are expected to be available online by the end of 2020. 

Evidence Platform 

Aetion—cofounded by Sebastian Schneeweiss, Franklin’s counterpart at RCT DUPLICATE—has been a key enabler of the FDA-sponsored pilot project, Franklin says. The Aetion Evidence Platform allows users, including those unfamiliar with the data and without any programming knowledge, to manipulate real-world healthcare data. 

Claims datasets are massive, and doing any sort of safety or effectiveness study would otherwise require specialized technical staff to manage and analyze the data, she continues. Given that the project with the FDA involves managing data and results for 37 studies at once, rather than one at a time as is typically done, it would not even have been undertaken without Aetion. 

The platform’s user-friendly graphical interface makes it easy for researchers to explore what information is available in the database and build a study based on their knowledge of the clinical area and patient population, Franklin says. It also discourages investigators from doing the wrong types of analyses.  

All RCT DUPLICATE projects to date have used the Aetion platform to facilitate data management and results analysis, she says. These include a replication project, funded by the National Heart, Lung, and Blood Institute, focused on published RCTs of cardiovascular outcomes that are not part of a regulatory approval submission, as well as four earlier pilot studies looking at cardiovascular outcomes in type 2 diabetes and rheumatoid arthritis (RA) patients. 

Over the next two years, RCT DUPLICATE will be tapping some “new and improved” data sources for its prospective replication projects, such as more standardized electronic health records (EHRs) integrated across care sites, says Franklin. Oncology-focused EHRs would be well suited to its planned expansion into cancer, which will require information on staging at diagnosis as well as outcomes such as progression-free survival. “The need for RWD in cancer is huge,” she adds, noting that many of the new treatments being approved are for rare tumor subtypes.  

Disease registries are also of interest for trial replication purposes. “Rheumatoid arthritis [RA] is another area where we have a lot of questions about drug effectiveness,” Franklin says. “The relevant outcomes are all patient-reported, including how many inflamed joints they have and their quality of life, and that information isn’t captured in claims. But they’re collected nicely and systematically in an RA registry.” 

OPERAND Initiative 

In 2017, OptumLabs teamed up with the Multi-Regional Clinical Trials Center of BWH and Harvard to see if results of RCTs could be replicated in real-world settings. The partnership gave birth to the OPERAND (Observational Patient Evidence for Regulatory Approval and uNderstanding Disease) initiative, which last year awarded grants to researchers at Brown University and Harvard Pilgrim Health Care Institute to independently replicate two trials—ROCKET for atrial fibrillation and LEAD-2 for diabetes (DOI: 10.1111/dom.12012)—using different combinations of claims and EHR data from the OptumLabs Data Warehouse. 

OPERAND is an “important methodological opportunity” to understand the ability of observational data to generate reliable results as measured by their similarity to those of RCTs, says OptumLabs Chief Scientific Officer William Crown. A technical expert panel created during the initial phase of the project designed the study and selected two winners from among its academic research partners submitting bids on the work, based in part on their familiarity with the data and their statistical programming acumen. 

The initiative is being jointly sponsored by Amgen, AstraZeneca, Merck, Optum, Pfizer, Sanofi and UCB Biosciences, he adds.  

The FDA and life science companies are not the only audiences for replication trial findings, Crown says. The statistical and methodological lessons can also be applied to the evaluation of surgical interventions, care organization models, and the benefit design of insurance plans. 

ROCKET and LEAD-2 were chosen for a variety of reasons related to data volume and quality, says Crown. The two trials are in therapeutic areas with plenty of patients and established therapies. They were also completed recently enough to be timely, but long enough ago that the products under study had achieved substantial market uptake. 

Brown and Harvard Pilgrim researchers separately worked from datasets in their own virtual sandbox created by OptumLabs, which prevents external linkage to other datasets that might create re-identification risk, Crown explains. It also keeps the studies blinded. The OptumLabs Data Warehouse contains a medical claims database representing 137 million lives plus EHR data on about 90 million people—all de-identified and linked at the individual level as well as to multiple other sources of data, including Area Health Resources Files—but for this research project the data pull was limited to relevant clinical and claims data generated over the past decade. 

The replication work by Brown and Harvard Pilgrim was recently completed, Crown says. Researchers are now looking at the heterogeneity of treatment effects in real-world patient populations by systematically relaxing the trials’ inclusion and exclusion criteria. 
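
To make that procedure concrete, here is a minimal, hypothetical sketch of relaxing eligibility criteria over a claims-derived cohort in Python with pandas. The column names, thresholds, and data are invented for illustration and are not drawn from the OPERAND protocols; a real analysis would re-estimate the treatment effect in each expanded cohort rather than just count patients.

```python
import pandas as pd

# Hypothetical claims-derived cohort; columns and thresholds are invented
# for illustration and are not taken from the OPERAND protocols.
cohort = pd.DataFrame({
    "age": [52, 67, 74, 81, 45, 70],
    "egfr": [85, 40, 55, 28, 90, 60],            # kidney function measure
    "prior_stroke": [False, True, False, True, False, False],
})

# Trial-like (strict) eligibility: age 50-75, eGFR >= 30, no prior stroke.
criteria = {
    "age_50_75": cohort["age"].between(50, 75),
    "egfr_ge_30": cohort["egfr"] >= 30,
    "no_prior_stroke": ~cohort["prior_stroke"],
}

strict = pd.concat(criteria, axis=1).all(axis=1)
print(f"strict cohort: n = {strict.sum()}")

# Systematically relax one criterion at a time and watch the cohort grow;
# each expanded cohort would then be re-analyzed for treatment effects.
for dropped in criteria:
    kept = [series for name, series in criteria.items() if name != dropped]
    relaxed = pd.concat(kept, axis=1).all(axis=1)
    print(f"without {dropped}: n = {relaxed.sum()}")
```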

Results of the replication studies and follow-on analyses will be submitted for publication this year, Crown says. At a workshop hosted by the Duke-Margolis Center for Health Policy last summer, he presented blinded, high-level results from the replication of the ROCKET trial indicating that estimates fell within the 95% confidence interval of the original study. Clinical trial and observational study results were likewise well aligned for LEAD-2. 
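
The agreement check Crown describes is simple to state: the replication “agrees” when its point estimate lands inside the original trial’s 95% confidence interval. The numbers below are made up purely to illustrate the check and are not the actual ROCKET or LEAD-2 results.

```python
# Illustrative only: made-up numbers, not actual ROCKET or LEAD-2 results.
rct_estimate, rct_ci = 0.88, (0.75, 1.03)  # RCT hazard ratio and its 95% CI
rwd_estimate = 0.92                        # point estimate from the RWD replication

lo, hi = rct_ci
agrees = lo <= rwd_estimate <= hi
print(f"RWD estimate {rwd_estimate} inside RCT 95% CI {rct_ci}: {agrees}")
```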

Data Complexities 

RCTs and observational studies have different strengths and weaknesses, notes Crown. “Randomized studies are strongest from the standpoint of statistical validity, provided they are well designed. Sometimes you can’t randomize because of the ethical issues associated with not providing a patient with a treatment, but the big issue is that clinical trials lack evidence about the treatment population... because they’re designed to eliminate as many confounders as possible to get more precise estimates of treatment effect.” 

Inclusion and exclusion criteria in trials are typically strict, and follow-up periods are relatively short, so rare adverse events may not be seen, Crown continues. “One of the beauties of observational studies is that they typically involve much larger sample sizes [than RCTs] and follow people forward in the data over long periods of time. They also allow you to include the population that was actually treated with the drug with all their complexities, all their medical comorbidities and all their different age ranges.” 

Data analysis is also a lot more complicated when RWD is being used. “With clinical trials, randomization balances the comparison groups based on what can be seen in the data and what can’t,” says Crown. “The analysis is almost trivial; it’s just a simple statistical test.” Combining claims and EHR data supplies variables needed to control for clinical severity that claims alone lack, but it also introduces new issues, such as censoring (i.e., incomplete information about survival time), which requires sophisticated analytic methods to address.  
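
As a rough sketch of the contrast Crown draws, the simulated example below pairs a trial-style analysis (a simple test on the outcome by treatment group) with a survival model that handles right-censoring and adjusts for a measured confounder. The data are simulated, and the Cox proportional hazards model via the lifelines library is just one common choice, not necessarily what the OPERAND teams used.

```python
import numpy as np
import pandas as pd
from scipy.stats import chi2_contingency  # the "simple statistical test"
from lifelines import CoxPHFitter         # survival model for censored data

rng = np.random.default_rng(0)
n = 2000

# Simulate an observational cohort: older patients are both more likely
# to be treated (confounding) and more likely to have the event.
age = rng.normal(65, 10, n)
treated = rng.random(n) < 1 / (1 + np.exp(-(age - 65) / 10))
hazard = 0.2 * np.exp(0.03 * (age - 65)) * np.where(treated, 0.8, 1.0)
event_time = rng.exponential(1 / hazard)
censor_time = rng.exponential(4, n)       # patients drop out of the data
df = pd.DataFrame({
    "age": age,
    "treated": treated.astype(int),
    "time": np.minimum(event_time, censor_time),
    "event": (event_time <= censor_time).astype(int),
})

# RCT-style analysis: with randomization, a simple 2x2 test would suffice.
stat, p, dof, expected = chi2_contingency(pd.crosstab(df["treated"], df["event"]))
print(f"naive chi-square p-value: {p:.4f}")

# Observational analysis: model censoring explicitly and adjust for the
# measured confounder with a Cox proportional hazards model.
cph = CoxPHFitter()
cph.fit(df, duration_col="time", event_col="event", formula="treated + age")
cph.print_summary()
```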

Learning Opportunities 

One big learning opportunity with OPERAND is better understanding results variability based on the combined use of claims and EHR data, says Crown. EHRs contain information that claims databases lack, including most of the clinical outcomes of interest as well as important variables such as body mass index and blood pressure readings. Different statistical methods can also influence estimates. 

“With OPERAND, we’re putting a big focus on how the decision-making of researchers affects the results,” says Crown. “Even though two research groups are replicating the same two trials, we want to know how they are making decisions about the variables they put in their models as well as the specific statistical techniques they use.” Researcher decision-making is a known area of variability that is not often studied. 

The FDA will continue to develop robust methods for safety surveillance with clinical data in the mix, a resource that hasn’t been available in large enough quantities in the past, Crown says. The agency will likely be evaluating emerging real-world evidence use cases—as well as RCTs, real-world randomized trials and analysis of observational datasets—in the context of its totality-of-evidence approach to decision-making based on risks and benefits.   

Efforts to predict rather than replicate clinical trial results have been limited because of the difficulty in finding trials that are underway for products already on the market, says Crown. One of the first, most visible such efforts was the Catheter Ablation Versus Anti-Arrhythmic Drug Therapy for Atrial Fibrillation (CABANA) trial that involved a concurrent observational study (DOI: 10.1093/eurheartj/ehz085) by researchers at Mayo Clinic and Tufts Medical Center using data from the OptumLabs Data Warehouse.