Merck Outlines Structure of Saama Partnership on AI for Clinical Data

By Allison Proffitt

February 15, 2023 | AI for clinical data was the subject of several presentations at the recent Summit for Clinical Ops Executives event in Orlando, Fla. SCOPE speakers from across big pharma shared their use cases in applying artificial intelligence and machine learning to various parts of the clinical research process.

Christopher Lamplugh, Associate VP and Head, and Rakesh Maniar, Head of eClinical Technologies, Global Data Management and Standards, Clinical Trial Operations, both of Merck, shared their experiences with one specific use case: exploring training, accuracy, and the precision of outputs when using AI to validate clinical data.

Lamplugh was quick to concede that Merck has always been a very conservative company, but the 2020 pandemic changed many things: launching some new trends and greatly accelerating others. Compared to 2018 numbers, Merck’s visit counts were up 100% by 2022; database locks were up 500%. “Not only is there a lot of context in what COVID has made us rethink, but also the velocity, variety, and volume of data [has increased]. We just can’t deal with it all.”

Lamplugh credits the Merck IT teams for pushing the company toward thinking very strategically about data initiatives and mentioned several ongoing digital and eClinical innovations including chatbots for CRAs, natural language processing for financial disclosure forms, and bot review for electronic trial master files (eTMF).

But at SCOPE, Lamplugh and Maniar chose to dig into Merck’s Smart Data Query—SDQ—efforts in partnership with Saama. (Saama now sells the product as Smart Data Quality.)

Emerging technologies are tricky to implement, Lamplugh pointed out. Merck developed a digital experimentation framework to help guide these explorations. It begins with a step to assess options and ideate, followed by lean due diligence; proof-of-concept projects that call for a go/no-go decision; full due diligence assessing risk, privacy, vendor validation, and change management planning; piloting; and finally launching at scale.

Merck’s existing data validation workflow is cumbersome and old, Lamplugh said, identifying data discrepancies through a process that is “time consuming, very antiquated, and requires a lot of human capital to conduct.” But it’s not unique, he argued: similar processes are still in use across sponsors and CROs.

To assess how the SDQ approach with Saama would be different, Merck compared the current process to what data validation AI could offer. To do that, the company started small, training the SDQ on 10 historical oncology studies. Then the team added a test set of five historical studies and two active studies to compare what the AI-powered SDQ predicted about the studies to what was actually happening.
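One way to picture the comparison between what the model predicted and what actually happened is as a precision and recall calculation over flagged records. The sketch below is illustrative only: the record identifiers, the `precision_recall` helper, and the choice of metrics are assumptions, not details of Merck's or Saama's evaluation.

```python
def precision_recall(predicted: set, actual: set) -> tuple:
    """Compare model-flagged records with records that actually drew a query."""
    true_positives = len(predicted & actual)
    # Precision: share of model flags that matched a real discrepancy.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of real discrepancies that the model flagged.
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical record IDs from a held-out test study (not real data).
predicted = {"rec-01", "rec-02", "rec-03", "rec-04"}
actual = {"rec-02", "rec-03", "rec-04", "rec-05", "rec-06"}

precision, recall = precision_recall(predicted, actual)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case three of the four model flags match a real discrepancy, and three of the five real discrepancies are caught. Keeping the test studies separate from the training studies is what makes such a comparison meaningful.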

Training the Model

Training the model involves many rounds of “teach cycles,” explained Rakesh Maniar. Each cycle includes defining subcategories, creating a ground truth by labeling or annotating training datasets, developing a model, then assessing that model’s behavior and retraining. Merck found through the process that machine learning is trainable for both simple and complex subcategories, but it doesn’t replace all the rule-based checks, he said, especially for unique scenarios that are not in the historical data and so haven’t yet been presented to the model.

“Human-in-the-loop is very important for our industry. In the drug development process, we want to make sure that anything we submit has gone through human evaluation,” explained Maniar. He also championed the importance of explainable AI. “It’s very important that AI is explainable to regulators… It’s important to make sure we’re able to explain to authorities which training dataset was used to train the model, what are the test data that were used, and how the model did the ‘thinking’ needed to be captured and explained to agencies.”
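The combination Maniar describes, with rule-based checks kept alongside a trained model and a human reviewing whatever is flagged, might be sketched roughly as follows. Everything here is hypothetical: the `DataPoint` type, the blood-pressure range, the 0.8 threshold, and `toy_model` (a stand-in for a trained model) are assumptions for illustration, not part of SDQ.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class DataPoint:
    record_id: str
    field: str
    value: Optional[float]


def rule_flags(point: DataPoint) -> List[str]:
    """Deterministic edit checks (illustrative thresholds only)."""
    flags = []
    if point.value is None:
        flags.append("missing value")
    elif point.field == "systolic_bp" and not (60 <= point.value <= 250):
        flags.append("out of range")
    return flags


def review_queue(
    points: List[DataPoint],
    model_score: Callable[[DataPoint], float],
    threshold: float = 0.8,
) -> List[Tuple[str, List[str]]]:
    """Collect every record a rule or the model flags, with the reasons,
    so that a human reviewer makes the final call."""
    queue = []
    for point in points:
        reasons = rule_flags(point)
        score = model_score(point)
        if score >= threshold:
            reasons.append(f"model score {score:.2f}")
        if reasons:
            queue.append((point.record_id, reasons))
    return queue


def toy_model(point: DataPoint) -> float:
    # Stand-in for a trained model: returns a fixed, made-up score.
    return 0.9 if point.record_id == "rec-03" else 0.1


points = [
    DataPoint("rec-01", "systolic_bp", 120.0),
    DataPoint("rec-02", "systolic_bp", None),
    DataPoint("rec-03", "systolic_bp", 135.0),
    DataPoint("rec-04", "systolic_bp", 400.0),
]
queue = review_queue(points, toy_model)
for record_id, reasons in queue:
    print(record_id, reasons)
```

Recording the reason for each flag next to the record is one simple way to keep the output explainable to a reviewer, whichever source raised it.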

Partnership Practicalities

Maniar and Lamplugh both emphasized the value of working closely with Saama (or any vendor) as they tested and trained SDQ. “We rely on the technology companies that we’re working with to really push us out of our paradigm and out of our comfort zone,” Lamplugh said.

Maniar highlighted the role of internal subject matter experts working closely with the vendor to establish ground truths and categories specific to the use case. He also pointed out the value of close vendor relationships in building institutional knowledge for Merck. “It’s very important when you partner with any of your providers or solution developers that we partner so that we can build the institutional knowledge along the way.”

Increasing Merck’s internal understanding of AI’s capabilities and processes will likely expand the number of use cases within the company. That requires a great deal of change management for staff, but a close relationship with the vendor can help.

“We are our own worst nightmare sometimes,” Lamplugh said. “We can’t have people thinking, how do we replicate what we do today with AI… We have to think about what can machine learning and AI do for us, and then look at other ways we may have to validate the data.”