PRO-ACT’s Big Dataset Vital to ALS Research

By Ann Neuer

May 24, 2016 | Few diseases are as terrifying as amyotrophic lateral sclerosis (ALS), a progressive neurodegenerative illness that is fatal and seriously lacking in effective therapies. There have been several large ALS clinical trials, but the results have been discouraging, yielding only one therapy: riluzole, which was approved by the Food and Drug Administration more than 20 years ago for slowing disease progression.

Determined to redefine the treatment landscape for ALS, the Neurological Clinical Research Institute (NCRI) at Massachusetts General Hospital, in collaboration with Prize4Life, a not-for-profit dedicated to discovering treatments and cures for ALS, has created a platform to drive innovation in ALS research. Known as PRO-ACT (Pooled Resource Open-Access ALS Clinical Trials), the platform leverages the power of big data by integrating data from 23 ALS clinical trials, representing 10,700 de-identified patients, and creating the world’s largest harmonized open-access ALS dataset. This resource is critical because the disease has orphan status in the US, meaning fewer than 200,000 Americans suffer from ALS, which makes clinical trials difficult to design and to recruit for. With PRO-ACT, however, researchers can gain sharper insight into ALS mechanisms, design better ALS clinical trials that may require fewer subjects, and improve clinical care.

This was the thinking of NCRI and Prize4Life, which submitted a summary of PRO-ACT to the 2016 Clinical Informatics News Best Practices Awards Competition and were named the winner in the Clinical Data Intelligence category. Their winning entry was announced at the recent Summit for Clinical Ops Executives (SCOPE) in Miami, Florida.

Maya Bronfeld, Ph.D., Prize4Life Scientific Officer, comments, “Winning the Best Practices Award was a wonderful affirmation that PRO-ACT is fulfilling its task of bridging the gap between clinical research and big data science, and shining a light on ALS disease.”  Similarly, Alex Sherman, Director, Strategic Development & Systems, at NCRI, adds, “Winning at SCOPE validates our work in data aggregation to create a huge resource for researchers to find cures for ALS, and for other neurological diseases, as well.”

The resource Sherman is describing is the aggregated data from the nearly two dozen ALS trials, much of which had remained untapped, as companies generally do not release those data.  “Whether a trial is successful or unsuccessful, a company might abandon that therapeutic area, meaning that the datasets sit and remain unused,” Sherman explains.  But, with PRO-ACT, data from those trials now have great utility, as they have been merged, anonymized, and made available for research.  

For example, at various study visits, blood may be drawn for so-called safety labs, which assess the safety of the study drug and detect changes in specific blood levels that may be linked to it. By pooling this information, PRO-ACT has in excess of 2.5 million records of lab values alone. Researchers using this PRO-ACT dataset reported in Neurology (DOI: 10.1212/WNL.0000000000000951) that certain values, such as higher levels of creatinine and uric acid at baseline, are predictive of disease progression, and ultimately survival, in ALS. They are among the 600-plus researchers who have been attracted to the dataset and are seeking a more targeted understanding of the disease and of how to structure clinical trials, including the choice of appropriate outcome measures.

To develop the PRO-ACT platform, NCRI designed a customizable ALS-specific Common Data Structure (CDS), which is a specialized format for organizing and storing data in a standardized manner so they can be analyzed. Then, data dictionaries from multiple trials were reviewed, data relationships between those dictionaries and the CDS were identified, and data rules were developed. Next, data were imported based on those rules. The platform is very flexible, enabling the assignment of individual data fields to various Common Data Elements (CDEs). This is a powerful feature that will allow future interoperability, scalability, and the eventual registration of CDEs with regulatory authorities.
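The general idea of rule-based harmonization can be illustrated with a minimal Python sketch. It is not PRO-ACT's actual implementation or schema: the field names (`SUBJ`, `CREAT_MGDL`, `creatinine_umol_l`), the `MappingRule` class, and the `harmonize` function are all hypothetical, chosen only to show how per-trial rules might map differently named fields, recorded in different units, onto one common element. The one real-world figure used is the standard conversion of creatinine from mg/dL to µmol/L (multiply by 88.4).

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


def identity(value: Any) -> Any:
    """Default conversion: the source value is already in the common form."""
    return value


@dataclass(frozen=True)
class MappingRule:
    """Hypothetical rule linking one trial-specific field to a common element."""
    source_field: str
    common_element: str
    convert: Callable[[Any], Any] = identity


def harmonize(records: List[Dict[str, Any]],
              rules: List[MappingRule],
              trial_id: str) -> List[Dict[str, Any]]:
    """Apply a trial's mapping rules to its records, skipping missing values."""
    harmonized = []
    for record in records:
        row: Dict[str, Any] = {"trial_id": trial_id}
        for rule in rules:
            value = record.get(rule.source_field)
            if value is not None:
                row[rule.common_element] = rule.convert(value)
        harmonized.append(row)
    return harmonized


# Invented example: trial A reports creatinine in mg/dL, trial B in umol/L.
TRIAL_A_RULES = [
    MappingRule("SUBJ", "subject_id", str),
    MappingRule("CREAT_MGDL", "creatinine_umol_l", lambda v: v * 88.4),
]
TRIAL_B_RULES = [
    MappingRule("patient", "subject_id", str),
    MappingRule("creatinine", "creatinine_umol_l"),
]

trial_a = [{"SUBJ": 101, "CREAT_MGDL": 1.0}]
trial_b = [
    {"patient": "B-7", "creatinine": 70.0},
    {"patient": "B-8", "creatinine": None},  # missing lab value
]

pooled = (harmonize(trial_a, TRIAL_A_RULES, "A")
          + harmonize(trial_b, TRIAL_B_RULES, "B"))
for row in pooled:
    print(row)
```

Keeping the rules as data, separate from the import routine, is one way a new trial could be added by writing a new rule list rather than new import code; whether PRO-ACT is organized this way is not stated in the article.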

Orchestrating this effort was no small task. As Sherman explains, “We went to companies such as Teva, Sanofi, Novartis, and Regeneron Pharmaceuticals, and asked for their ALS datasets, along with the data dictionaries to explain their definitions.  The data were in multiple languages, and some lacked usable data dictionaries.  Each company had aggregated the data in its own way, but with our Common Data Structure and CDEs, we were able to create common denominators, essentially harmonizing the data.”

With this foundation, Prize4Life is working on new features for PRO-ACT, including a data cleaning and pre-processing tool with methodological suggestions and code files, which Bronfeld explains will appeal to both clinical and data science members of the PRO-ACT community. This effort continues Prize4Life’s active involvement in research using PRO-ACT, which has spawned publications in top medical journals and global education efforts at several universities that use the ALS dataset to teach data analysis. Prize4Life has also sponsored two crowd-sourcing initiatives, bringing together more than 1,300 participants from 64 countries to develop predictive algorithms for ALS. One challenge sought to stratify ALS patients into subgroups, and the other involved developing an algorithm for all patients. Six teams were awarded prizes, and the results were impressive, including the formation of Origent Data Sciences, which is dedicated to the application of predictive modeling for ALS clinical trials. Several of its models are currently in use.

“We are at the early stages of big data, and we are continuing to expand PRO-ACT with new data from clinical trials.  When companies contact us to create an ALS protocol, we are asking them to commit to giving us data to be included in the dataset. This is an ongoing process that will help with data aggregation and expand this major resource for researchers,” Sherman remarks.