Part of our Health, Wellness, & Life Science Series
In this article we cover the transformational potential of AI technology for drug discovery. After an introduction to the challenges facing the pharmaceutical industry, we discuss three aspects of how AI technology is being used to address them.
The first topic is an overview of some of the partnerships between the pharmaceutical sector and providers of AI solutions, with insight into which aspect of the technology underpins each partnership.
The second aspect is how Cloud solutions and Cloud providers have profoundly facilitated the adoption of AI technology.
Finally, we describe how some barriers to a full endorsement of AI are being overcome with a new AI paradigm called “Explainable AI”.
The aim of drug discovery is to identify novel medicines that can help prevent or treat a particular disease. Although there are many different types of drugs, many are small chemically synthesised molecules that can specifically bind to a target molecule—usually a protein involved in a disease.
Traditionally, researchers screened large libraries of molecules to identify candidates that could potentially become a drug. Although rational, structure-based drug design has become more common over time, it still requires multiple rounds of design, synthesis, and testing. Because it is generally difficult to predict which chemical structure will have both the desired biological effects and the properties needed to become an effective drug, the process of drug discovery remains expensive and time-consuming.
Once a new drug candidate shows potential in a laboratory, it may still fail in clinical trials. In fact, fewer than 10% of drug candidates make it to market following Phase I trials [1]. Considering this, it is not surprising that researchers are looking to the unparalleled data processing power of AI as a way to accelerate and reduce the cost of discovering new drugs. AI technologies have the potential to speed up drug design, drive innovation, improve the efficiency of clinical trials, or help control drug administration.
Three areas where AI is already a powerful tool are target identification, target deconvolution, and de novo molecular design. In target identification, AI facilitates the holistic identification of new targets by harnessing multidimensional data sources including omics, text and images, or public databases. In target deconvolution, AI improves the speed and efficiency of searching libraries of pre-existing compounds for hits that could be viable active ingredients in new drugs.
In de novo molecular design, AI can analyse and tailor chemical properties potentially more thoroughly and quickly than teams of scientists using traditional methods can. Besides the design of novel chemical compounds, synthetic feasibility—the ability to synthesise the compound—is one of the challenges of AI-driven de novo drug design.
Fact Box 1
Finally, protein-ligand interactions are quantum systems, that is, systems governed by quantum physics. Exact methods to predict the behaviour of these systems are currently computationally intractable for standard computers, while approximate methods are often not accurate enough when interactions at the atomic level are critical. A full simulation of these systems will perhaps be possible with future quantum computing technology.
AI and pharma partnerships
In a survey conducted with pharmaceutical industry professionals in late 2021, 40% of respondents highlighted AI as the technology expected to have the greatest impact in the industry in 2022. A similar percentage also believed that R&D would benefit the most from digitisation in the pharmaceutical sector [2].
In recent years, biopharma companies have adopted strategies to integrate AI into the discovery process, such as establishing teams of AI experts and data analysts, investing in startups, or creating collaborations with tech giants and/or research centres. According to experts, the main motivation behind these collaborations is either to gain access to critical data or to use a useful and accessible digital product that the AI partner provides.
Fact Box 2
Top pharmaceutical companies, including Roche, Pfizer, Merck, and AstraZeneca, among others, started to collaborate with companies in the AI space several years ago. In one example from 2018, the Massachusetts Institute of Technology (MIT) formed the Machine Learning for Pharmaceutical Discovery and Synthesis consortium with Novartis and Pfizer. The purpose of this consortium was to remove the barriers between AI and drug discovery research, and to focus efforts on relevant problems in the field.
Today, there are multiple examples of partnerships between pharma and big tech or AI-oriented companies. These include Pfizer-IBM Watson, Novartis-Microsoft, Sanofi-Exscientia and others (Figure 1). Beyond partnerships, not much merger and acquisition activity has been recorded to date. In some cases, pharmaceutical companies have taken a partial stake in order to sit on the board of the AI company and help steer its direction.
AI is driven by big data. The core strength of AI is its ability to consume vast amounts of data and learn subtle and complex patterns. In chemical processes, AI provides the ability to search a larger space of structures and interactions. This is the basis of the collaborations between the AI company Iktos and Pfizer and Merck KGaA.
Chemical and small molecule screening data are not the only type of data to which AI technology can contribute significantly. Data sets critical for clinical trial design are also benefiting from AI. Electronic health records, patient demographics, the results of previous clinical trials and information from omics fields are all types of data that can be used as input to AI models for trial design. This is, for example, the basis of the collaboration of Janssen with companies such as Komodo Health.
Existing data may have limitations related to quantity, quality or suitability to specific applications. For this reason, the generation of data specifically with AI applications in mind has gained steam.
At least two companies were founded with this aim: Insitro and Recursion. The latter uses images of millions of cells treated with genetic and chemical perturbations to explore the relationship between perturbations and the morphological features of cells. In 2020, Bayer found this approach promising enough to sign an agreement with Recursion to work on fibrotic disease.
Figure 1. Overview of some partnerships between pharmaceutical and AI companies.
Drug discovery and the Cloud
In the past 25 years, virtual screening for candidate small molecules has been the dominant technology in drug discovery. It has drawn on the computational capabilities of Computer-Aided Drug Discovery (CADD) and the conceptual framework of High Throughput Screening technology.
The principle of virtual screening is to build a 3D model of the target site and dock as many candidate small molecules as possible to estimate how well they might bind. Modelling 3D structures is a complex task, and the software is normally optimised for the performance of the systems available at a given time. This means that a system designed just 10 years ago was built to deliver accurate results on hardware with just 3% of the power of today’s equivalent.
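The dock-score-rank loop described above can be sketched in a few lines of Python. This is a minimal illustration, not a real docking pipeline: the function `virtual_screen`, the ligand names and their scores are all hypothetical, and the scoring step stands in for whatever docking engine is actually used. It follows the common convention that a lower (more negative) docking score means stronger predicted binding.

```python
import heapq
from typing import Callable, Iterable, List, Tuple


def virtual_screen(
    ligands: Iterable[str],
    score: Callable[[str], float],
    top_k: int = 2,
) -> List[Tuple[float, str]]:
    """Score every candidate ligand and keep the top_k best (lowest) scores.

    `score` is a placeholder for a docking engine call; it is an assumption
    of this sketch, not an API of any particular package.
    """
    scored = ((score(ligand), ligand) for ligand in ligands)
    # nsmallest avoids sorting the whole library when only a few hits are kept.
    return heapq.nsmallest(top_k, scored)


# Hypothetical, made-up scores used purely to exercise the function.
hypothetical_scores = {
    "ligand_A": -7.2,
    "ligand_B": -9.1,
    "ligand_C": -5.4,
    "ligand_D": -8.3,
}

hits = virtual_screen(hypothetical_scores, hypothetical_scores.__getitem__)
print(hits)  # [(-9.1, 'ligand_B'), (-8.3, 'ligand_D')]
```

In practice the expensive part is the scoring call itself, which is why the size of the library that can be screened depends so directly on the compute available.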
The business model for virtual screening typically requires the purchase of the most powerful compute servers and graphical systems affordable, as well as licences for a wide range of expensive modelling software platforms. Researchers use this asset as needed, so it sometimes runs at full power and sometimes sits idle. This is not a cost-effective solution.
Cloud computing acts to lift these constraints. Cloud computing resources are sold on demand, typically by the hour, are elastic and are fully managed by the provider. This offers a powerful alternative to the huge internal resources that would be required to make virtual screening a reality.
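The cost argument can be made concrete with a simple break-even calculation. The sketch below is illustrative only: the function names are our own and every figure (purchase price, lifetime, running cost, hourly rate) is a made-up placeholder, not real pricing from any vendor or Cloud provider. It compares a fixed yearly cost for owned hardware with a pay-per-hour Cloud cost and finds the utilisation below which renting is cheaper.

```python
HOURS_PER_YEAR = 24 * 365  # 8760


def owned_cost_per_year(
    purchase_price: float, lifetime_years: float, running_cost_per_year: float
) -> float:
    """Yearly cost of owned hardware; paid whether it is busy or idle."""
    return purchase_price / lifetime_years + running_cost_per_year


def cloud_cost_per_year(hourly_rate: float, hours_used: float) -> float:
    """Yearly cost of on-demand resources; paid only for the hours used."""
    return hourly_rate * hours_used


def break_even_utilisation(owned_per_year: float, hourly_rate: float) -> float:
    """Fraction of the year the hardware must be busy for owning to pay off."""
    return owned_per_year / (hourly_rate * HOURS_PER_YEAR)


# Placeholder figures, chosen only to make the arithmetic easy to follow.
owned = owned_cost_per_year(
    purchase_price=120_000, lifetime_years=4, running_cost_per_year=13_800
)
print(owned)                                # 43800.0
print(break_even_utilisation(owned, 10.0))  # 0.5
print(cloud_cost_per_year(10.0, 2_000))     # 20000.0
```

With these placeholder numbers, a cluster that is busy less than half of the time costs more per year than renting equivalent capacity by the hour. Real comparisons would also need to account for software licences, data transfer and storage.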
Because computational power is no longer a constraint with Cloud resources, CADD and virtual screening tools can be developed without compromising on the quality and rigour of the modelling.
Today, several pharma companies use prominent Cloud providers to drive their research programmes. AWS, for example, has become a significant element in research, development, and production processes at Moderna. Moderna’s mRNA platform uses the computational capacity of AWS to run a variety of algorithms that design individual mRNA molecules. With AWS, the company is able to shorten the time required to bring new molecules to market, something that was impossible a few years ago.
Alphabet has also combined Cloud with AI as a means to accelerate research in Life Sciences. In 2014, Google, now an Alphabet subsidiary, acquired the London-based company DeepMind. Four years later, DeepMind’s AlphaFold entered the CASP13 protein-folding competition, where it beat 97 other participants in accurately predicting protein structures, a major step towards solving this difficult challenge.
In addition to its work on protein structures, Alphabet has recently introduced a new company, Isomorphic Labs, that promises to revolutionise drug discovery by tapping into the technology developed by its sister company DeepMind, including AlphaFold.
Many organisations want to leverage AI but are not comfortable with letting an AI model make more impactful decisions because they do not trust the model. To build that trust, stakeholders need to understand how the AI came to a specific result and be able to judge the quality of the result from an expert perspective. This is why AI ‘explainability’ matters.
How do we build trust in AI models? We show how the model works. In particular, we describe the overall structure of the model, but also quantify how much the different features contribute to a specific prediction. These methods are typical of a new type of AI methodology called Explainable AI (XAI).
Regarding the assessment of a specific prediction, two algorithmic solutions are worth mentioning. These are Local Interpretable Model-agnostic Explanations (LIME) and SHapley Additive exPlanations (SHAP) [3]. Both algorithms provide a post-hoc explanation of the system’s logic that helps evaluate the performance even of black-box AI systems where the inner operations are not known (Fact Box 3).
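To illustrate what “quantifying how much each feature contributes to a prediction” means, the sketch below computes exact Shapley values, the game-theoretic quantity that SHAP builds on, for a toy model. It does not use the SHAP library or its API. The model, the two descriptor names (`logp`, `h_donors`), the baseline and the helper `shapley_values` are all hypothetical, and replacing absent features with a single baseline value is just one of several possible choices.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Dict


def shapley_values(
    predict: Callable[[Dict[str, float]], float],
    instance: Dict[str, float],
    baseline: Dict[str, float],
) -> Dict[str, float]:
    """Exact Shapley attribution of predict(instance) - predict(baseline).

    Features outside a coalition take their baseline value. The cost grows
    as 2**n, so this is only practical for a handful of features.
    """
    names = list(instance)
    n = len(names)

    def value(coalition: set) -> float:
        x = {f: instance[f] if f in coalition else baseline[f] for f in names}
        return predict(x)

    phi = {}
    for f in names:
        others = [g for g in names if g != f]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_model(x: Dict[str, float]) -> float:
    """Hypothetical black box with an interaction between two descriptors."""
    return 0.5 * x["logp"] + 1.0 * x["h_donors"] + 0.25 * x["logp"] * x["h_donors"]


instance = {"logp": 2.0, "h_donors": 4.0}
baseline = {"logp": 0.0, "h_donors": 0.0}
contributions = shapley_values(toy_model, instance, baseline)
print(contributions)  # {'logp': 2.0, 'h_donors': 5.0}
```

The two contributions add up to 7.0, the difference between the prediction for the instance and the prediction for the baseline, and the interaction term is split equally between the two features. Practical tools approximate these values because exact enumeration becomes infeasible as the number of features grows.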
XAI is actively researched in both the public and private sectors. Major Cloud providers such as GCP and IBM have XAI tools integrated into their platforms [4 & 5]. A group of researchers at TU Berlin is developing a collection of metrics for XAI under the name Quantus [6]. The US agency DARPA also has an Explainable AI program [7] that aims to produce “glass box” models explainable to a “human in the loop”.
One of the limitations of using XAI in drug discovery is the chemical language employed to represent the decision space of the model. Significant progress has been made in developing ‘low level’ molecular representations of compounds that are both suitable for AI and understandable by chemists. These representations are commonly built on string representations of molecules. A popular example of this is the Simplified Molecular Input Line Entry System (SMILES) [8].
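As a small illustration of why SMILES strings suit both chemists and models, the sketch below splits a few well-known SMILES strings into tokens of the kind a sequence model could take as input. The regular expression and the `tokenise` helper are our own simplification covering a common subset of SMILES syntax, not the full specification, and the check performed is purely lexical: it does not verify that a string describes a chemically valid molecule.

```python
import re
from typing import List

# Simplified token pattern (an assumption of this sketch, not the full grammar).
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]"           # bracket atoms such as [Na+] or [C@@H]
    r"|Br|Cl"               # two-letter atoms of the organic subset
    r"|[BCNOSPFI]"          # one-letter atoms of the organic subset
    r"|[bcnosp]"            # aromatic atoms
    r"|%\d{2}|\d"           # ring-closure labels
    r"|[()=#\-+/\\.:~@*$]"  # branches, bonds and other symbols
)


def tokenise(smiles: str) -> List[str]:
    """Split a SMILES string into tokens, rejecting unrecognised characters."""
    tokens = SMILES_TOKEN.findall(smiles)
    # findall silently skips unmatched text, so confirm nothing was dropped.
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognised characters in SMILES: {smiles!r}")
    return tokens


print(tokenise("CCO"))          # ethanol: ['C', 'C', 'O']
print(tokenise("CCl"))          # chloroethane: ['C', 'Cl']
print(tokenise("[Na+].[Cl-]"))  # sodium chloride: ['[Na+]', '.', '[Cl-]']
print(len(tokenise("CC(=O)Oc1ccccc1C(=O)O")))  # aspirin: 21 tokens
```

Because every token maps to an atom, bond, branch or ring closure that a chemist can read directly, an explanation expressed over these tokens stays close to the chemistry it describes.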
XAI for drug discovery currently lacks an open-community platform where researchers can work synergistically and share software and tools for model interpretation. EU projects such as MELLODDY, which enable data sharing and collaborative model development without exposing proprietary information, constitute an important first step.
Fact Box 3
The healthcare sector currently faces several complex challenges, such as the rising cost of drugs and therapies, that call for transformational change. AI can make major contributions that go beyond shortening the time it takes new products to reach the market. These could include, for example, understanding how patients use and respond to a drug.
Other contributions of AI to the pharma sector can include developing drugs in their correct dosage form; aiding quick decision-making in drug manufacturing, which could lead to better-quality products and batch-to-batch consistency; helping to design clinical trials and recruit patients for them; and contributing to the safety and efficacy of the product during clinical trials (Figure 2).
Figure 2. AI can help with complex tasks at different stages of drug discovery.
The success of AI depends entirely on the availability of a substantial amount of data. Other challenges that could prevent a full-fledged adoption of AI in the pharmaceutical industry include the lack of skilled personnel to operate AI-based platforms, limited budgets in small organisations, apprehension about job losses, and scepticism about the conclusions reached by AI.
Although there are no drugs on the market yet that were developed with AI-based approaches, a projected growth of 30% from 2017 to 2025 indicates that AI will likely revolutionise the pharmaceutical and medical sectors in the near future.
[1] “Clinical Development Success Rates 2006-2015 – Biotechnology ….”
[2] GlobalData Healthcare, “AI will trend as the most disruptive technology in the pharmaceutical sector in 2022”
[3] SHAP, “Welcome to the SHAP documentation”, 2018
[4] Google Cloud Platform, “Explainable AI SDK”, n.d.
[5] IBM Research Trusted AI, “AI Explainability 360”, n.d.
[6] UMILAB, “Understandable Machine Intelligence Lab”, n.d.
[7] DARPA, “Explainable Artificial Intelligence (XAI)”, 2018
[8] Daylight Chemical Information Systems, “SMILES – Daylight Theory”, n.d.