Using Real-World Data In Rare Disease Research

By Deborah Borfitz 

March 10, 2020 | Agencies around the world hold highly consistent views on the use of real-world data (RWD) to support regulatory decision-making—especially when it comes to synthetic control arms in rare disease studies. The barriers have less to do with regulators than with data access, quality, and completeness, and with the technical skills required to harmonize, link, and enrich disparate datasets, based on a pair of case study presentations made at the 11th Annual Summit for Clinical Ops Executives (SCOPE) in Orlando.  

The 21st Century Cures Act prompted the U.S. Food and Drug Administration (FDA) to issue draft guidance last year on how to submit documents using RWD and real-world evidence (RWE), including concrete examples, says Leanne Larson, MHA, corporate vice president and worldwide head of real-world evidence for Parexel. Earlier this year, China’s National Medical Products Administration (NMPA) finalized guidance documents on the same topic. The European Medicines Agency has similar work underway, and the UK’s National Institute for Health and Care Excellence has just announced its interest in how to use RWE in making reimbursement decisions.  

“Early engagement with regulators is key,” says Larson, who presented on a global, single-arm phase II oncology study with a synthetic control arm. Multiple real-world sources of primary and secondary data are typically needed to answer all research questions, and “they may not match your purpose specifically.” 

Working with RWE is difficult with many unknowns, Larson says, and requires a “feasibility assessment of the data” to ensure it represents enough of the right information about the right patients. Invariably, some intended data sources will get dropped along the way, she adds.  

The NMPA has done a particularly good job of describing the how-tos of assessing the quality and appropriateness of RWE based on the relevance and reliability of the underlying data, says Larson, as well as the limitations and considerations in constructing external controls. For example, parallel external controls (groups of patients treated during the same time period) are considered superior to historical controls (groups of patients treated at an earlier time), and a statistical analysis plan is required. 

Making the Match 

Design of the synthetic comparator cohort for the oncology study took about six months, Larson says. The initial plan was to use secondary data and existing datasets only, involve the U.S. plus three European countries, and match the external control 1:1 against the study’s 100-patient treatment arm. 

The final study approach was to use three secondary data sources (all electronic health records) in the U.S. and do on-site chart reviews in Europe, due largely to data availability and consistency issues, she says. The sites were not the same ones involved in the clinical trial. The 1:1 matching ratio was achieved in 15 months, “significantly faster” than the 24 months it took just to enroll patients to the phase II treatment arm. 

The patient “matching funnel” was based on nine covariates, says Larson, making it so restrictive that “not even a nearest neighbor match was good enough.” Among the key challenges were mapping vendor data into formats that could be used for statistical analysis and designing a case report form to match the electronic health record structure as much as possible without the benefit of natural language processing. 
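The funnel effect Larson describes can be illustrated with a minimal sketch of 1:1 exact matching: each added covariate shrinks the pool of eligible external-control candidates. The function and data below are purely illustrative assumptions, not the study’s actual matching algorithm or covariates.

```python
# Hypothetical sketch of a 1:1 exact-matching "funnel" over covariates.
# All field names and records are invented for illustration.

def match_funnel(treated, candidates, covariates):
    """Return 1:1 exact matches; each candidate is used at most once."""
    used = set()
    matches = {}
    for t in treated:
        for i, c in enumerate(candidates):
            if i in used:
                continue
            # A candidate qualifies only if every covariate matches exactly.
            if all(t[cov] == c[cov] for cov in covariates):
                matches[t["id"]] = c["id"]
                used.add(i)
                break
    return matches

treated = [{"id": "T1", "stage": "III", "line": 1, "ecog": 0}]
candidates = [
    {"id": "C1", "stage": "III", "line": 1, "ecog": 1},
    {"id": "C2", "stage": "III", "line": 1, "ecog": 0},
]

# Two covariates leave both candidates eligible (first match wins);
# adding a third excludes C1 — the funnel narrows with each covariate.
print(match_funnel(treated, candidates, ["stage", "line"]))          # {'T1': 'C1'}
print(match_funnel(treated, candidates, ["stage", "line", "ecog"]))  # {'T1': 'C2'}
```

With nine covariates applied this way, even large candidate datasets can collapse to a handful of matches, which is consistent with the 7,000-patient dataset yielding only three usable patients.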

One 7,000-patient dataset produced only three patients after all the criteria were applied, making it pointless to go through the exercise of pulling data, Larson continues. “We ran into that a lot.” The major reason was that patients had to be on monotherapy rather than the standard of care that involves a combination of therapeutic agents. 

“Vendor database quality and contracting were significant bottlenecks,” she says. “Getting access to data is always challenging and a few [vendors] dropped out or changed their data access policy in the middle of the project” to disallow external research use. 

Project management was also complicated, since specific expertise was needed to manage both site-based and sourced data collection, says Larson. Early and ongoing FDA engagement was critical, and its top concern was that the two cohorts be “as similar as possible—the closer to a randomized controlled trial, the better.” Statistical analysis drove all other activities. 

“Beware of over-expanding a study’s scope,” Larson says. “It can spin out of control quickly.” 

The oncology study had a “good outcome” and results are now under review by the FDA, she adds. Agency representatives needed education on the impact of their requests, a situation she expects will change because of the FDA’s ongoing collaboration with Flatiron Health.   

Universal Registry 

A second case study on the use of RWD in non-interventional rare disease research was presented by Aaron Berger, senior director of real-world evidence in the clinical operations division of UBC. His focus was on the company’s technical expertise and RWE IT architecture that enables fit-for-purpose application of RWD. 

A rare disease affects fewer than 200,000 people in the U.S. and has variability among patients in terms of demographics, comorbidities, and disease etiology, Berger says. A universal registry increases the number of patients with the same genetic mutation to improve understanding of important disease subgroups and fill the data gaps in individual registries across different countries. 

Merging registries is a tricky undertaking that involves deduplicating patients so each is represented only once—especially critical for a rare disease registry, where each patient’s data carries relatively high weight, says Berger. Another challenge is creating a common data structure when data elements differ by coding structure, level of completeness, and source. Further, data gaps exist across contributing registries because not all registries collect the outcomes of interest. 

UBC’s Hemolytic Disease X (HDX) Universal Registry creates a fuller picture of patients and their diagnostic journey by aggregating data from four existing registries, says Berger. After deduplication, the one larger registry held information on 1,849 patients. The technical skillset required to create it includes identity management, data harmonization, and RWD linkage and enrichment. 

A matching algorithm was used for identity management, which involved HealthVerity replacing protected health information with tokens, Berger says. A gap analysis was done across 367 data variables in the registry to identify exact matches and data that could be inferred.  
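The tokenization-and-matching step can be sketched in a simplified form: identifying fields are replaced with a derived token, and records from multiple registries are merged on that token so each patient appears once. The hashing scheme, field choices, and salt below are illustrative assumptions, not HealthVerity’s actual method.

```python
# Illustrative sketch of token-based identity management and deduplication.
# The token derivation here is an assumption for demonstration only.
import hashlib

def tokenize(record, salt="registry-demo"):
    """Derive a privacy-preserving token from normalized identifying fields."""
    key = "|".join([record["first"].lower(), record["last"].lower(),
                    record["dob"], salt])
    return hashlib.sha256(key.encode()).hexdigest()

def deduplicate(registries):
    """Merge records from several registries, keeping one record per token."""
    merged = {}
    for registry in registries:
        for rec in registry:
            merged.setdefault(tokenize(rec), rec)  # first occurrence wins
    return list(merged.values())

reg_a = [{"first": "Ana", "last": "Diaz", "dob": "1990-01-02"}]
reg_b = [{"first": "ana", "last": "DIAZ", "dob": "1990-01-02"},
         {"first": "Bo", "last": "Lee", "dob": "1985-05-06"}]

# Ana appears in both registries but normalizes to the same token,
# so three input records collapse to two unique patients.
print(len(deduplicate([reg_a, reg_b])))  # 2
```

In a production pipeline, the tokenization would happen at the data holder before any records leave the source, so the matching party never sees protected health information.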

The universal registry was then linked and enriched with RWD available through HealthVerity to provide an extended view of patient symptoms prior to diagnosis and to fill in missing data, says Berger. The linkage to HealthVerity medical and pharmacy data, relative to registry data alone, reduced the amount of missing data on multiple lab tests and treatments. The before-and-after patient counts were, respectively, 1,794 and 1,805 for hemoglobin and 1,791 and 1,822 for hematocrit. The numbers also went up across five HDX treatments after the linkage.  

Importantly, longitudinal claims data provides access to symptom patterns in the years prior to a confirmed HDX diagnosis while HDX registry data offers access to disease progression and treatment effectiveness and safety in confirmed patients, Berger says.