Beyond FAIR: Building Platforms for Cultural Changes of Data Sharing

By Allison Proffitt  

June 21, 2022 | “How do you do this FAIR-ification of data across multiple entities, multiple modalities within a single disease, and then multiple diseases? And then what happens when you do this? To make data FAIR just to claim that it was FAIR? Then what?”

At the Bio-IT World Conference & Expo last month, Alex Sherman, Director of the Center for Innovation and Bioinformatics, Neurological Clinical Research Institute (NCRI) at Massachusetts General Hospital, explored what clinical research can look like practically beyond just checking boxes for findable, accessible, interoperable, and reusable.  

FAIR has plenty of challenges in the clinical space, Sherman said. Health data are messy and poorly structured. Data are seldom FAIR in an electronic health record. There are no generally accepted technical and semantic standards, and privacy laws and regulations change by geography and institution.

Efforts must focus not only on creating FAIR data, Sherman argued, but on building standardized practices and a culture of data sharing as well.

Sherman’s focus is neurological conditions—so many of the solutions and platforms he shared are branded “neuro”—but he emphasized that the model is applicable to many disease areas. “It could be neuro, cardio, it could be onco. The approach is quite generic and certainly disease-area agnostic.”  

The solution, Sherman argued, is platforms, which are ubiquitous across our lives: “Amazon, Netflix, banking, you name it! But in clinical research—let alone the clinical trials space—we are purely vertical. Which makes absolutely no sense,” Sherman said.  

He compared the current study startup process to building a new football stadium—ground up—for every game. “At the end of the game you demolish the stadium and wait for the next game,” he said. From protocol development to site selection to IRB approval and all the way to study close, the traditional approach starts each process anew for every trial.

Platforms for Trials 

“It takes years and years to even start a research project. In many diseases—and I’ll be talking about ALS or Lou Gehrig’s Disease—patients don’t have time. Not only because they die, but also they progress so fast, they will become ineligible for clinical trials.”  

Generally, study sponsors are looking for early disease patients, but also fast progressors so there is measurable signal, Sherman said. For many patients, these criteria block them from trial participation.  

“What we tried to do at Mass General—it was done before in several diseases—is to create a platform trial,” Sherman said. The platform trial was created with a single set of eligibility criteria. Patients were enrolled into the trial and then randomized into multiple companies’ drug regimens. Each regimen was further randomized into groups.  

“The beauty of this is not just that you try multiple drugs simultaneously. But the beauty of this is that because the eligibility criteria are the same for all arms—for all regimens—you can share controls. In this case you don’t have to randomize 1:1. You can randomize 3:1,” sharing the control arms across multiple therapies. 
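The arithmetic behind shared controls can be sketched in a few lines. The snippet below is an illustrative toy, not the HEALEY trial's actual randomization scheme: participant IDs, regimen names, and the simple coin-flip allocation are all assumptions made for the example.

```python
import random

def assign(participants, regimens, ratio=3, seed=0):
    """Randomize each participant to one regimen, then ratio:1
    active:placebo within that regimen. Placebo participants
    from every regimen form a single shared control pool."""
    rng = random.Random(seed)
    arms = {r: {"active": [], "placebo": []} for r in regimens}
    for p in participants:
        r = rng.choice(regimens)
        if rng.randrange(ratio + 1) < ratio:  # e.g. 3-in-4 chance of active
            arms[r]["active"].append(p)
        else:
            arms[r]["placebo"].append(p)
    # Shared controls: pool the placebo arms across all regimens,
    # so each active arm is compared against the combined pool.
    shared_controls = [p for r in regimens for p in arms[r]["placebo"]]
    return arms, shared_controls
```

The payoff is in the denominator: with 3:1 randomization only about a quarter of enrollees receive placebo, versus half under separate 1:1 trials, while every active arm still gets a full-sized comparator from the pooled controls.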

The HEALEY ALS Platform Trial is a perpetual adaptive trial that is run out of the Healey Center for ALS Research at Mass General, and the approach has shown an advantage over traditional trials. Sherman reported that the trial was able to test ten therapies in four years with 2,000 patients compared to a traditional timeline of 12 years and 3,600 patients.

PRO-ACT and Data Modeling 

Platformization can be successful across clinical research in other areas as well, Sherman said, and he has demonstrated the success of a platform approach with citizen science.  

His team approached pharma and asked them to donate the data from their past ALS trials. “Companies like Teva [Pharmaceuticals], Novartis, Regeneron, Sanofi, Biogen gave us data—to my surprise, actually—we ended up in ALS with 23 clinical datasets and 11,000 subjects.”  

The result is PRO-ACT, the Pooled Resource Open-Access ALS Clinical Trials dataset. This dataset is available to any researcher, and has been used to test eligibility criteria and to model things like disease progression, staging, survival, and more.

“But before opening this dataset to the world,” Sherman said, “we created a crowdsourcing challenge.” They gave three months of data and asked for predictions of longitudinal patient outcomes. More than 11,000 “solvers” from 63 countries downloaded the data in the challenge, and three winners were named.

“None of the winners had anything to do with medicine!” Sherman said. One winning team, a company that had modeled odors of perishable foods, spun out a bioinformatics company that now models neurodegenerative disease.

The project got a lot of attention—including winning both a 2013 Bio-IT World Best Practices Award and a 2016 Clinical Informatics News Best Practices Award as well as the 2021 Healey International Prize—but Sherman insists the approach is “quite generic and disease agnostic.”  

The best outcome, Sherman said, is that the PRO-ACT project prompted more pharma to donate ALS research data. “That’s a very important cultural shift that never happened before,” Sherman observed. “I wonder why [other] diseases do not use this approach,” he said.

Beyond Clinical Trials 

Observational studies are particularly valuable, Sherman said, because they draw more heterogeneous participants than clinical trials do. The team wanted a better way to capture and integrate real-world ALS data from EHRs, natural histories, omics, biomarkers, image banks, and other parts of their ecosystem, so they built a series of platforms for these data. NeuroGUID is a global unique patient ID; NeuroBANK is an accelerated research environment; NeuroBIO is a distributed BioRepository; NeuroPRO is a patient-reported outcomes portal; and NeuroSHARE is the platform for data sharing.

For each platform, the goal is to facilitate some part of the clinical research process in a way that drives a culture of repeated collaboration and data sharing.  

For example, the NeuroGUID platform is for patient identification, but because global unique identifiers that would track patient data across multiple projects, modalities, and even conditions are not allowed in all geographies, the SigNET platform issues unique clinical research identifiers, or tokens, for patients—one per patient per study. “Only the server knows how to collect all this data and knows how to serve as an interpreter,” Sherman said of linking the tokens. “It’s a very powerful approach to merge data across multiple modalities.”
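The one-token-per-patient-per-study idea can be illustrated with a small sketch. Everything here is hypothetical—the key, function names, and HMAC-based scheme are assumptions for the example, not SigNET’s actual implementation—but it captures the property Sherman described: two studies see unlinkable tokens for the same patient, and only the server holding the key can act as the interpreter that merges them.

```python
import hashlib
import hmac

# Hypothetical key held only by the central linkage server.
SERVER_KEY = b"secret-held-only-by-the-linkage-server"

def issue_token(patient_guid: str, study_id: str) -> str:
    """One opaque token per patient per study. The same patient
    gets different tokens in different studies, so individual
    sites cannot link records on their own."""
    msg = f"{patient_guid}|{study_id}".encode()
    return hmac.new(SERVER_KEY, msg, hashlib.sha256).hexdigest()[:16]

def link(records_by_study, guid_directory):
    """Server-side merge: only the key holder can recompute each
    study's tokens and regroup records under the global ID."""
    merged = {}
    for study_id, records in records_by_study.items():
        for token, payload in records.items():
            for guid in guid_directory:
                if issue_token(guid, study_id) == token:
                    merged.setdefault(guid, []).append(payload)
    return merged
```

The design choice worth noting is that linkage power lives in exactly one place: any party without the server key sees only unrelated tokens, which is what lets the approach operate across geographies that forbid a single global patient identifier.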

The NeuroBANK ALS ecosystem tracks and consents the types of data gathered in each ALS study, so that as participants move through the studies, data can be shared between them. “It makes absolutely no sense to ask for demographics, medications, or disease history again,” Sherman said.

Finally, NeuroSHARE is NCRI’s global platform for compliant clinical research information tracking and sharing. The goal is to merge various data resources across the players—participants, researchers, sites, consortia, data analysts, companies, and more—and all the institutional and government legal regulations. The process can be cumbersome, even if the data itself is FAIR.

Sherman recommended beginning with standard consent form language. “The idea is that when you start the project or when you start a data bucket, you have to start from the very beginning: from patients, from consent forms, from how you plan to share data even years later.” Especially in ALS, he said, there’s no way to come back and reconsent years later.  

In all of these examples, Sherman is driven by a fundamental rule: “When you talk about clinical data, think of the patients.”