Aggregation prediction with protein surface analyzer

Posted on September 8, 2021 (updated April 7, 2026) by Anonymous

Background

Biotherapeutics differ from small-molecule drugs not only in the size and complexity of the active ingredient but also in product manufacturing, storage, and delivery requirements. Production of antibodies, for instance, is an intricate process that relies on mammalian cell expression. Post-translational modifications occurring in mammalian cells mediate proper protein folding, multimerization, and secretion. Once produced, biologic drugs have to be kept at a high concentration under controlled storage conditions. Aggregation is one of the key risks that have to be managed throughout production and storage in order to maintain product safety, purity, and potency. Substantial time and cost savings can be realized by evaluating the relative aggregation tendencies of biomolecules early and weeding out aggregation-prone candidates during the initial stages of a project.

Unlike many current aggregation prediction methods that depend on primary sequence information, Schrödinger’s AggScore algorithm is based entirely on three-dimensional molecular structure. The structure-based approach of AggScore offers better accuracy and broader applicability than existing methods, particularly when the differences between the compared structures are very subtle. In addition to surface-exposed hydrophobic regions, AggScore factors in the charge propensities of neighboring residues. The Protein Surface Analyzer and the AggScore metric are useful for 1) ranking and triaging proteins by aggregation propensity, 2) visualizing the distribution of aggregation-prone regions on the surface of proteins and other biomolecules, and 3) reliably predicting the impact of residue mutation on aggregation behavior.

AggScore for Predicting Chromatography Retention Times of 137 Clinical-stage Antibodies

Three-dimensional structures of 137 clinical-stage mAbs were constructed using Schrödinger’s antibody modeling protocol. AggScore was then calculated for each of these structures, and the scores were compared with experimental retention times measured by Jain et al¹ using three chromatographic methods: hydrophobic interaction chromatography (HIC), standup monolayer adsorption chromatography (SMAC), and cross-interaction chromatography (CIC). The antibodies’ elution characteristics were classified as “early elution” (retention time < 10 min) or “delayed elution” (retention time ≥ 10 min). The prediction performance results summarized in Table 1 indicate that AggScore achieved a higher AUC than two other commonly used computational approaches, Zyggregator and Aggrescan. The results demonstrate that AggScore’s domain of applicability includes antibodies.

**Table 1.** Performance of the different computational methods in the classification of chromatographic retention times of the 137 clinical-stage antibodies
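As a rough illustration of the evaluation described above, the following sketch labels antibodies as early or delayed eluters using the 10-minute threshold and computes a ROC AUC for an aggregation score with the rank-based (Mann-Whitney) formulation. The retention times and scores are invented placeholders, not data from the study, and the function names are hypothetical.

```python
from typing import List, Sequence

DELAYED_ELUTION_MIN = 10.0  # threshold from the text: >= 10 min is "delayed elution"


def label_delayed(retention_times: Sequence[float]) -> List[bool]:
    """Return True for delayed elution (>= 10 min), False for early elution."""
    return [rt >= DELAYED_ELUTION_MIN for rt in retention_times]


def roc_auc(scores: Sequence[float], labels: Sequence[bool]) -> float:
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical retention times (min) and aggregation scores for six antibodies.
retention = [8.7, 9.4, 9.9, 10.0, 11.2, 12.5]
scores = [12.0, 30.0, 18.0, 25.0, 41.0, 37.0]

labels = label_delayed(retention)
auc = roc_auc(scores, labels)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two elution classes, which is why it is a convenient single number for comparing scoring methods on the same set.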

Mapping the Distribution of Aggregation‐prone Regions on the Surface of 5 Therapeutic Antibody Fv Fragments and Ranking them by Aggregation Propensity

Patch properties were computed from molecular surfaces projected at the water-probe distance (1.4 Å) away from the vdW surface of the protein. The protein surface patch calculation determines three classes of surface patches based on the respective hydrophobic and hydrophilic surface potential values: hydrophobic (green), positive (blue), and negative (red). Input structures were refined prior to protein surface patch calculation. The system pH was set at the appropriate value and atom charges were assigned according to the OPLS3.0 force field. AggScore was calculated for a set of five antibody structures with known liabilities. The score predicted their aggregation propensities in perfect rank order (Figure 1).

**Figure 1.** Calculation of Surface Profiles and ranking of five antibody Fv fragment variants based on AggScore
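One way to make "perfect rank order" concrete is a Spearman rank correlation of exactly 1 between predicted scores and measured aggregation. The sketch below is a minimal, tie-free implementation; the five score and measurement values are placeholders chosen for illustration and are not taken from the study.

```python
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[int]:
    """Rank values from 1 (smallest) upward; assumes no ties for simplicity."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation for tie-free data of equal length."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1.0 - 6.0 * d_squared / (n * (n * n - 1))


# Hypothetical predicted scores and measured aggregation for five Fv variants.
predicted_score = [55.1, 72.4, 38.0, 91.3, 64.9]
measured_aggregation = [2.1, 3.4, 0.9, 7.8, 2.6]

rho = spearman_rho(predicted_score, measured_aggregation)
print(f"Spearman rho = {rho:.2f}")
```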

Predicting the Impact of Residue Mutation on Aggregation Behavior

AggScore was used to predict aggregation propensities of several mutants of β-amyloid (Aβ). The predictions were compared with experimental data as shown in Table 2. The results indicate that:

AggScore prediction matched experiment for eight out of nine single mutants (L17Q, L17E, F19S, I31N, I32S, I32V, L34P, V36E) of Aβ that showed reduced aggregation relative to the wild type (WT) as reported by Wurth et al⁵. The exception, A2S, is located at the N-terminal end of the protein, which does not contribute to AggScore.
AggScore exactly reproduced the decreasing order of aggregation propensities of four mutants relative to wild type: WT > I41L > I41V > I41A > I41G.
The prediction that three familial variants of Aβ, Dutch (E22Q), Iowa (D23N), and Italian (E22K), have higher aggregation propensity than the WT, whereas the Flemish variant (A21G) has lower aggregation propensity than the WT, is in good agreement with experimental studies performed by Van Nostrand et al⁴ and Miravalle et al².

**Table 2.** Comparison of experimental and predicted changes in aggregation rates of different mutants of amyloid-β

Summary

Protein aggregation is a major impediment to the development of lead biomolecules into effective biotherapeutics. Schrödinger’s AggScore predicts aggregation propensities by taking into account residue contributions to charged and hydrophobic patch regions projected onto the surface of three-dimensional input structures. The method is well-suited to assist in identifying and mitigating aggregation issues in a variety of biologic product categories including antibodies, enzymes, and vaccine antigens.

References

1. Biophysical properties of the clinical-stage antibody landscape
   Jain et al. Proc Natl Acad Sci USA. 2017, 114(5), 944-949
2. Substitutions at codon 22 of Alzheimer’s Aβ peptide induce diverse conformational changes and apoptotic effects in human cerebral endothelial cells
   Miravalle et al. J Biol Chem. 2000, 275(35), 27110-27116
3. AggScore: Prediction of aggregation-prone regions in proteins based on the distribution of surface patches
   Sankar et al. Proteins. 2018, 1–10
4. Pathogenic effects of D23N Iowa mutant amyloid β-protein
   Van Nostrand et al. J Biol Chem. 2001, 276(35), 32860-32866
5. Mutations that reduce aggregation of the Alzheimer’s Aβ42 peptide: an unbiased search for the sequence determinants of Aβ amyloidogenesis
   Wurth et al. J Mol Biol. 2002, 319(5), 1279-1290

Stories from drug discovery: Dialing out off-target liabilities with protein residue mutation FEP+ in an active drug discovery project

Posted on September 7, 2021 (updated April 7, 2026) by Anonymous

Summary

Free energy calculations are revolutionizing how early-stage drug-discovery campaigns are undertaken. In this short, 10-minute video, Schrödinger Senior Principal Scientist, Dr. Jennifer Knight, demonstrates the ability of Protein Residue Mutation FEP+ to elucidate ligand binding preferences for large families of off-targets and how, on an active drug discovery project, this strategy rapidly rescued a series by dialing out off-target liabilities.

Jennifer Knight is a Senior Principal Scientist in Schrödinger’s Drug Discovery Group. While working on one of their active drug discovery projects, the team came across some unexpected challenges with selectivity.

The Drug Discovery Group was working on a project where a competitor series for the same target had demonstrated potential liabilities due to several off-target kinases. The team identified their own series and used structure-based modeling with ligand FEP+ to optimize potency and improve selectivity over the known off-targets.

Over the course of seven months, the team profiled over six million design ideas. The designs were triaged using a variety of chemistry filters, Glide docking, and Active Learning strategies. Ten thousand went through full ligand FEP+ calculations, and multiple promising series and subseries were identified. However, these new series also picked up many unanticipated and problematic off-target liabilities.

Instead of abandoning the project (and having months of research go to waste), the team decided to explore further. Looking into the binding pocket, the team observed a particular pattern—compounds in one series tended to hit kinases that had a particular amino acid at a given position, while compounds in another series tended to hit kinases that had a different amino acid at that position.

**Table 1.** Compound liabilities by series, showing which off-targets are hit based on the amino acid at a specific position in the binding pocket

This pattern gave the team an idea: What if they could use Protein Residue Mutation FEP+ as a way of accounting for these overall kinome scan profiles? Could they use the information (the amino acid identity at that one position) to understand how the compound might interact with all kinases that have that particular amino acid at that position?

The team found that, indeed, Protein Residue Mutation FEP+ was able to recapitulate the trends observed in the experimental data. This strategy allowed the team to reliably identify which families of kinases would be problems for their compounds and which ones wouldn’t. So, the team used this modeling approach prospectively to profile new design ideas and identified compounds that were predicted not to hit any of these other kinase families.

In a single round of synthesis, the team was able to drastically reduce the number of kinases hit by their series and substantially reduce the off-target liabilities. This approach resulted in dramatically improved selectivity, as observed in a preliminary panel of 20 kinases (the percentage of kinases with a >100x selectivity window increased from 40% to 90%) and in the full scanMAX panel (the selectivity score, S(35), fell from 0.28 to 0.03). Both on-target ligand FEP+ and protein FEP+ for off-target selectivity were used successfully in this novel strategy to rescue these project series.
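For readers unfamiliar with the two selectivity metrics quoted above, the sketch below shows how they are commonly computed. It assumes the usual kinome-scan convention that S(35) is the fraction of tested kinases with a percent-of-control value below 35 at the screening concentration, and it treats the selectivity window as the ratio of off-target to on-target IC50. All numbers and function names are hypothetical and are not the project's data.

```python
from typing import Sequence


def selectivity_score(percent_control: Sequence[float], threshold: float = 35.0) -> float:
    """Fraction of tested kinases with percent-of-control below the threshold."""
    if not percent_control:
        raise ValueError("panel must contain at least one kinase")
    hits = sum(1 for pc in percent_control if pc < threshold)
    return hits / len(percent_control)


def fraction_with_window(on_target_ic50: float,
                         off_target_ic50: Sequence[float],
                         fold: float = 100.0) -> float:
    """Fraction of off-target kinases whose IC50 exceeds `fold` times the on-target IC50."""
    if on_target_ic50 <= 0 or not off_target_ic50:
        raise ValueError("need a positive on-target IC50 and a non-empty panel")
    selective = sum(1 for ic50 in off_target_ic50 if ic50 / on_target_ic50 > fold)
    return selective / len(off_target_ic50)


# Hypothetical percent-of-control readings for a ten-kinase panel.
panel = [2.0, 10.0, 30.0, 40.0, 55.0, 70.0, 85.0, 90.0, 95.0, 100.0]
s35 = selectivity_score(panel)

# Hypothetical IC50 values in nM: on-target 1 nM, five off-target kinases.
window_fraction = fraction_with_window(1.0, [5.0, 50.0, 200.0, 800.0, 1500.0])

print(f"S(35) = {s35:.2f}, fraction with >100x window = {window_fraction:.0%}")
```

A lower S(35) means fewer kinases are strongly inhibited, so a drop from 0.28 to 0.03 indicates a much cleaner kinome profile.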

By using the combination of these two methods—determining not only the compounds with the best on-target potency with Ligand FEP+, but also eliminating the compounds that had off-target liabilities by using Protein FEP+—the team identified the best compounds to move forward into synthesis in approximately three months.

Stories from drug discovery: Modeling strategies in the pursuit of development candidate in oncology program 1

Posted on September 4, 2021 (updated April 7, 2026) by Anonymous

Summary

In this Stories from Drug Discovery series, Dr. Sayan Mondal, a Research Leader in Schrödinger’s Drug Discovery Group, demonstrates modeling strategies used from start to finish in a two-year oncology project in this short, 10-minute video. His team was tasked to first discover multiple potent, highly ligand-efficient lead series and then optimize them to find the potent compounds with balanced ADME profiles for development candidate nomination.

Dr. Sayan Mondal was the Modeling Lead on a recent drug discovery project at Schrödinger that was tasked with rapidly discovering a potentially best-in-class Type 1 kinase inhibitor in the backdrop of a competitor clinical compound entering Phase 2. The team’s goal was to have the key compounds for development candidate nomination in hand within two years from start to finish.

In this video, Dr. Mondal describes the modeling strategies the team used to efficiently overcome the typical challenges that often slow programs like this down or take them off track. The team’s strategies delivered very potent, optimal compounds in multiple distinct chemical series for development candidate nomination within two years.

Feasibility Stage: Hit Identification

The project began with the feasibility stage and had two goals:

Discover novel and potent chemical series
Confirm that Schrödinger’s FEP+ is performing well for this program prospectively

Using LiveDesign as their centralized platform, the team crowd-sourced ideation in an iterative progression from the drug discovery group at large. Ideas were modeled with FEP+, and the FEP+ data were collated in LiveDesign to inspire new rounds of ideation. If an idea looked promising, the team would expand the model around it to come up with a final list of compounds.

Of 23 predictions, 21 matched the experimental data, with an overall mean unsigned error (MUE) well within one log order. The team identified five novel picomolar (pM) cores that were highly ligand efficient, ultimately reaching 2-20 pM in an SPR binding assay with multiple compounds (among the most potent non-covalent kinase inhibitors known) for both the lead and the backup series.
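The mean unsigned error mentioned above is simply the average absolute difference between predicted and measured values on a log scale. The sketch below computes it for a few invented pIC50 values. Counting a prediction as "matched" when it falls within one log unit is an assumption made here for illustration; the source does not state the criterion the team used.

```python
from typing import Sequence


def mean_unsigned_error(predicted: Sequence[float], experimental: Sequence[float]) -> float:
    """Average absolute difference between paired predictions and measurements."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - e) for p, e in zip(predicted, experimental)) / len(predicted)


def count_within(predicted: Sequence[float],
                 experimental: Sequence[float],
                 tolerance: float = 1.0) -> int:
    """Number of predictions within `tolerance` log units of experiment (assumed criterion)."""
    return sum(1 for p, e in zip(predicted, experimental) if abs(p - e) <= tolerance)


# Hypothetical predicted and experimental pIC50 values (log units).
predicted_pic50 = [9.8, 10.4, 8.9, 11.0]
experimental_pic50 = [10.1, 10.0, 9.6, 10.8]

mue = mean_unsigned_error(predicted_pic50, experimental_pic50)
matched = count_within(predicted_pic50, experimental_pic50)
print(f"MUE = {mue:.2f} log units, {matched}/{len(predicted_pic50)} within one log order")
```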

Overcoming Selectivity, Permeability, and Solubility Challenges

Selectivity

Selectivity challenges are common in kinase projects. Across both series, all tested compounds emerging from the feasibility stage bound to a kinase with known cardiovascular toxicity, in addition to several other off-target kinases. To dial out that liability, the team decided to use the new induced-fit docking engine, IFD-MD, to develop a putative binding pose for use in off-target FEP+ modeling, instead of waiting several months to solve a crystal structure of their series in the off-target kinase.

Within days of receiving the kinase subpanel data, the team had a reliable predictive FEP+ model for the key off-target, which they validated with compounds in the recent synthesis queue: the predicted potency and experimental data were in good agreement. Thus, within two weeks the team had a prospectively predictive model, which they coupled with two design strategies for proof of concept (PoC) compounds to gain kinase selectivity.

With this, the team was able to quickly find a PoC compound that was tolerated and maintained 100 pM potency on-target but lost potency against the off-target kinase. In the DiscoverX kinase panel, the compound was tested at 1 µM (10,000x higher than its on-target potency) and was quite clean, making it an excellent proof of concept for the model and the strategy.

Passive Membrane Permeability

Passive membrane permeability (RRCK) modeling is a fast method that can triage a large list of ideas before heavy modeling, reserving GPU resources for modeling only cell-permeable chemical matter. RRCK modeling can inform project strategy in terms of whether and where more polarity can be tolerated in a series to achieve other ADME goals such as clearance and solubility.

In the beginning months of the program, the team was making compounds that were potent on-target in binding and biochemical assays but were shifted in cell potency by 3-5 log orders, a typical problem for kinase projects. After collecting rounds of data, the team found that the RRCK predictions could explain the experimental data with a thresholding effect, with an atypically aggressive cutoff of around -5.1 (cutoffs can be project and series dependent and tend to be around -5.5 in our experience). The team shifted their modeling strategy to meet the -5.1 cutoff, controlling the cell shift to 2-3 log orders.
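The thresholding described above amounts to a simple filter applied before more expensive modeling. The sketch below assumes the cutoff refers to a predicted log apparent permeability in which less negative values mean better permeability; the source does not state the units. The idea names and values are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DesignIdea:
    name: str
    predicted_log_papp: float  # assumed: predicted log permeability, higher is more permeable


def passes_permeability(idea: DesignIdea, cutoff: float = -5.1) -> bool:
    """Keep ideas whose predicted permeability meets or exceeds the cutoff."""
    return idea.predicted_log_papp >= cutoff


def triage(ideas: List[DesignIdea], cutoff: float) -> List[str]:
    """Names of ideas that would move on to heavier (GPU-based) modeling."""
    return [idea.name for idea in ideas if passes_permeability(idea, cutoff)]


# Hypothetical design ideas and predictions.
ideas = [
    DesignIdea("idea-A", -4.8),
    DesignIdea("idea-B", -5.1),
    DesignIdea("idea-C", -5.3),
    DesignIdea("idea-D", -5.6),
]

strict = triage(ideas, cutoff=-5.1)   # the project-specific, more aggressive cutoff
typical = triage(ideas, cutoff=-5.5)  # the more typical cutoff mentioned in the text
print(strict, typical)
```

The stricter cutoff discards more ideas up front, which is the trade-off the team accepted to bring the cell shift under control.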

Solubility

FEP+ solubility calculations helped transform the backup series from an insoluble to a soluble, potent, permeable regime within one round of synthesis prioritization, with an MUE of 0.5 logS vs. the experimental kinetic solubility in the program. This will be covered in detail in a future case study.

Conclusion

By implementing various modeling strategies in the Schrödinger platform throughout the lifecycle of the drug discovery program, the team was able to overcome challenges that are pervasive in kinase programs and get to pre-DC compounds within two years in distinct chemical series.

Using FEP+ and RRCK permeability modeling, the team was able to obtain two distinct pM series that were both optimized to 2-20 pM potency and had good translation to cell potency
Deploying selectivity FEP+ with an IFD-MD-generated pose in a key off-target kinase enabled the team to get to a rapid proof-of-concept compound without waiting for a crystal structure
Working within LiveDesign allowed multiple contributors from across the globe to push the project ahead by testing, tracking and analyzing all relevant information in a centralized platform

An expedited gene-to-drug approach using Thermo Scientific cryo-EM and the Schrödinger platform

Posted on September 1, 2021 (updated April 7, 2026) by Anonymous

Introduction

Structure-Based Drug Design (SBDD) imparts cost-efficiency, timeliness and superior properties to target-based small molecule drug discovery¹⁻³. Today, computational approaches have evolved to the point where they provide quantitative predictions of ligand affinity⁴ and selectivity⁵. Historically, X-ray crystallography has driven SBDD efforts resulting in the development of potent and selective protease and kinase inhibitors for the treatment of a variety of diseases including AIDS⁶ and cancer². However, many pharmaceutically important drug targets such as large macromolecular assemblies or membrane proteins are less amenable to SBDD due to the lack of 3D structural information. Recent advances in protein production and cryo-electron microscopy (cryo-EM) have expanded the role of SBDD for these critical targets⁷,⁸. Here, we present the approach taken by Thermo Fisher Scientific and Schrödinger research teams that deployed GeneArt Gene-to-Protein, Thermo Scientific iSPA Workflow and the Schrödinger Drug Discovery platform.

Target Selection

Selection of targets for initiation of drug discovery programs is dependent on an understanding of the unmet medical need and multiple lines of evidence from clinical, genetic and preclinical studies that inform therapeutic potential (Figure 1). Biological rationale in the form of biochemical, pharmacological and genetic data that link modulation of target activity to pathophysiology and disease mechanisms are key factors. For example, targets with a clearly defined relationship between genotype and phenotype in preclinical models and human studies are considered higher priority. Confirming the impact of inhibition or activation and an understanding of tractability are also important aspects of target validation that serve as triggers for program initiation.

**Figure 1.** Evaluation of thousands of targets – shortlists of priority targets were identified after assessment of biological rationale and unmet medical need.

All targets must be modeling-enabled for SBDD. After a careful review of current clinical and preclinical programs for a given target, challenges that could be solved by Schrödinger’s computational platform are identified. The Schrödinger platform is used to analyze protein structure quality and binding site druggability, followed by an assessment of the amenability of the structures for use with the technology. Due to the lack of high-quality structures, a large number of very interesting targets that meet the criteria for biological rationale and therapeutic momentum would need to be reprioritized until high-resolution structures are available. This is often because such targets are membrane proteins or proteins that form large multimeric structures, both of which are challenging for X-ray crystallography. Fortunately, these types of targets are often well suited for structure determination by cryo-EM, which allows modeling of previously unreachable targets. A target was selected for the collaboration after careful analysis of the biological rationale, the unmet medical need and the potential for cryo-EM enablement of the target.

Protein Production

Once a target has been selected, the first hurdle to overcome is the production of protein of suitable quality for use with cryo-EM. Cryo-EM is typically applied to proteins that are multimeric and/or membrane protein complexes with a molecular weight of over 50 kDa. This poses challenges for protein production and purification, particularly within the timeline expectations of drug discovery. Thermo Fisher’s GeneArt platform comprises a Gene-to-Protein service that only requires a protein sequence and covers every step from gene synthesis to protein purification (Figure 2).

**Figure 2.** The GeneArt Gene-to-Protein service that only requires a protein sequence.

In this project, the process involved gene optimization, DNA synthesis, and transient protein expression in Gibco Expi293 cells. Purification from the cytoplasmic fraction via a terminal His tag yielded protein that was highly pure and required no further purification. The entire workflow from gene sequence to purified protein was completed within 6 weeks. The protein was obtained at a concentration of 5 mg/ml, sufficient for creating dense micrographs with many particles for subsequent analysis, and with a yield in excess of 10 mg, more than enough protein to supply a cryo-EM based structure determination pipeline for months.

Cryo-EM Structure Determination

For structure determination of our selected target, we used the Thermo Scientific iSPA Workflow, a commercially available single particle analysis (SPA) solution for drug discovery. It includes a Thermo Scientific Vitrobot Mark IV device for the preparation of vitrified cryo-EM specimens, which facilitates rapid plunge-freezing of holey carbon grids in liquid ethane after application and blotting of the protein solution. This achieves embedding of proteins in a thin layer of non-crystalline ice, preserving their native state in solution. The specimens are then subjected to cryo-EM data collection performed on a Thermo Scientific Krios Rx cryo-TEM. The Krios Rx is operated with Thermo Scientific EPU, which is data acquisition software that enables automated screening and data collection across multiple grids (thanks to the recent EPU Multigrid feature). The Krios Rx records movies that represent 2D projection images of the target of interest and are subsequently used for computational 3D image reconstruction. After data collection, 3D reconstruction involves orienting and averaging hundreds of thousands of 2D images of isolated particles to calculate a high-resolution map of the protein. EPU Quality Monitor and EPU Data Management (powered by Thermo Scientific Athena Software) ensure optimal data quality and data flow.

In this study, the iSPA Workflow readily yielded ~2.5 Å resolution reconstructions for both unliganded and liganded complexes (Figure 3). The initial structural enablement of this target required two weeks from receipt of protein, involving two imaging attempts. The first attempt, where protein was frozen at a concentration of 0.5 mg/ml, was not successful. Cryo-EM screening yielded an uneven distribution of particles, which were only found on carbon areas or close to the edge of the carbon film holes. Since high particle density is generally favorable for vitrification, GeneArt delivered a second batch of protein at a higher concentration (5 mg/ml), which resulted in an even and highly dense “monolayer” distribution of particles on the grid. Compounds of interest were dissolved in DMSO and added to the protein solution prior to vitrification, aiming for final compound and DMSO concentrations of ~50 μM and ~0.5%, respectively. For each dataset, we collected roughly five thousand movies, and Relion 3⁹, running on a low-cost quad-GPU workstation, was used for the reconstruction. In total, the process from sample preparation to high-resolution reconstruction of the first complex structure was completed within 3 days, with the ability to solve additional liganded structures on a similar timeframe.

**Figure 3.** Left, the Thermo Scientific Krios Rx Cryo-TEM high-end microscope is the first pharma-dedicated solution for cryo-EM SBDD with a guaranteed high throughput ideal for iterative structure determination. Right, detail of 4 representative residues showing the coulombic potential reconstructed from the EM data collection and the fitted model. As can be seen, the data was sufficient to ascertain the position of most sidechains.
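The compound and DMSO targets quoted for the vitrification samples (~50 μM compound at ~0.5% DMSO) together fix the stock concentration needed if the compound is added in a single step from neat DMSO. The sketch below works through that dilution arithmetic. The single-addition assumption and the 100 μL sample volume are illustrative choices, not details from the study.

```python
def stock_concentration_um(final_compound_um: float, final_dmso_fraction: float) -> float:
    """Stock concentration (uM, in neat DMSO) that reaches both targets in one addition."""
    if not 0.0 < final_dmso_fraction < 1.0:
        raise ValueError("DMSO fraction must be between 0 and 1")
    return final_compound_um / final_dmso_fraction


def stock_volume_ul(final_volume_ul: float, final_dmso_fraction: float) -> float:
    """Volume of DMSO stock (uL) to add for a given final sample volume."""
    return final_volume_ul * final_dmso_fraction


# Targets from the text: ~50 uM compound and ~0.5% (v/v) DMSO.
stock_um = stock_concentration_um(50.0, 0.005)

# Hypothetical final sample volume of 100 uL.
volume_ul = stock_volume_ul(100.0, 0.005)

print(f"stock: {stock_um / 1000:.0f} mM in DMSO, add {volume_ul:.1f} uL per 100 uL sample")
```

Under these assumptions the stock works out to 10 mM, a common concentration for compound libraries stored in DMSO.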

Structure-Based Drug Discovery

Once protein-ligand complex structures were obtained, atomic models were prepared with the Schrödinger platform and passed into the Schrödinger SBDD pipeline. As part of this process, ligands were placed using GlideEM, a tool that combines molecular-mechanics-based docking with real-space cross correlations to place ligands into cryo-EM maps¹⁰. The resulting atomic models were then refined with Phenix/OPLS3e (the prior version of OPLS4) which combines state-of-the-art real-space refinement with advanced Schrödinger force fields and implicit solvent models that can capture the underlying physics of the protein-ligand system. This workflow is able to rapidly create robust atomic models that are consistent both with the cryo-EM data and the physics of the system.

These atomic models can then be leveraged within the Schrödinger SBDD pipeline to virtually screen massive numbers of diverse compounds. In addition, targeted computational screens can be designed to remove known liabilities or exploit opportunities to differentiate a series from competitor compounds. In this case, we used the refined atomic model with FEP+ to create an affinity prediction model that was validated by retrospectively predicting affinity differences for a previously patented 62-compound congeneric series. We were able to capture several major affinity cliffs, thus validating that the refined atomic models were of sufficiently high quality to be used in prospective design as part of an SBDD-led program (Figure 4). We used this validated affinity model to explore modifications to the compounds to address the target product profile goals identified by the project team, such as novelty, improving permeability, and elimination of a potentially reactive group, all while maintaining potency. Pathfinder was used to ideate promising compounds, Glide was used to generate potential poses and FEP+ with our validated affinity model was used to score potency. Within weeks of obtaining the cryo-EM structure of the target-ligand complex, structure-based computational methods were used to prioritize compounds for synthesis.

**Figure 4.** Validation of one structure-based drug discovery (SBDD) technique used in this work. Cryo-EM-enabled FEP+ was used to retrospectively predict binding affinities for a series of 62 previously-patented congeneric molecules. A comparison of the predicted and measured binding affinity for each compound is shown.
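Comparisons of predicted and measured affinity like the one in Figure 4 are usually made on a free energy scale. The sketch below converts a dissociation constant to a binding free energy with ΔG = RT ln(Kd) and computes a root-mean-square error between predicted and measured values. It assumes the measured affinities are available as dissociation (or inhibition) constants at roughly room temperature; the example values are invented, not the 62-compound dataset.

```python
import math
from typing import Sequence

GAS_CONSTANT_KCAL = 0.0019872041  # kcal/(mol*K)


def binding_free_energy(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Binding free energy in kcal/mol from a dissociation constant in mol/L."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return GAS_CONSTANT_KCAL * temperature_k * math.log(kd_molar)


def rmse(predicted: Sequence[float], measured: Sequence[float]) -> float:
    """Root-mean-square error between paired values."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((p - m) ** 2 for p, m in zip(predicted, measured)) / len(predicted))


dg_1nm = binding_free_energy(1e-9)
one_log_unit = binding_free_energy(1e-8) - binding_free_energy(1e-9)

# Hypothetical measured Kd values (mol/L) and predicted free energies (kcal/mol).
measured_dg = [binding_free_energy(kd) for kd in (1e-9, 5e-8, 2e-7)]
predicted_dg = [-11.9, -10.4, -8.6]

print(f"dG(1 nM) = {dg_1nm:.2f} kcal/mol; one log unit = {one_log_unit:.2f} kcal/mol")
print(f"RMSE = {rmse(predicted_dg, measured_dg):.2f} kcal/mol")
```

A tenfold change in affinity corresponds to about 1.4 kcal/mol at room temperature, which gives a sense of scale for what counts as an affinity cliff.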

Conclusion

Using a combination of solutions from Thermo Fisher’s GeneArt Gene-to-Protein, Thermo Scientific iSPA Workflow (Thermo Scientific Cryo-EM) and the Schrödinger Drug Discovery platform, the team was able to facilitate the structural enablement of the drug target within two months and arrive at novel, computationally-designed chemical matter just a few weeks later (Figure 5). This clearly illustrates that the combination of cryo-EM and computational chemistry methods can create a pipeline that can have a major impact on drug discovery projects.

**Figure 5.** Once the target was selected, Thermo Fisher Scientific and Schrödinger solutions enabled the progression of the project from gene to novel, computationally-designed small molecules in approximately three months.

References

1. RCSB Protein Data Bank: Enabling Biomedical Research and Drug Discovery
   Goodsell DS et al. Protein Sci. 2020, 29, 52–65
2. Structural Biology Contributions to Tyrosine Kinase Drug Discovery
   Cowan-Jacob SW et al. Curr Opin Cell Biol. 2009, 21, 280-287
3. Structural Biology Contributions to the Discovery of Drugs to Treat Chronic Myelogenous Leukaemia
   Cowan-Jacob SW et al. Acta Crystallogr. 2007, D63, 80-93
4. Large-Scale Assessment of Binding Free Energy Calculations in Active Drug Discovery Projects
   Schindler CEM et al. J Chem Inf Model. 2020, 60, 5457-5474
5. Is Structure-Based Drug Design Ready for Selectivity Optimization?
   Albanese SK et al. J Chem Inf Model. 2020, 60, 6211-6227
6. Molecular Basis for Drug Resistance in HIV-1 Protease
   Ali A et al. Viruses. 2010, 2, 2509-2535
7. The Rapidly Evolving Role of Cryo-EM in Drug Design
   Wigge C et al. Drug Discovery Today: Technologies. 2020, 38, 91-102
8. Multiparameter RNA and Codon Optimization: A Standardized Tool to Assess and Enhance Autologous Mammalian Gene Expression
   Fath S et al. PLoS ONE. 2011, 6, e17596
9. New Tools for Automated High-resolution Cryo-EM Structure Determination in Relion-3
   Zivanov J et al. eLife. 2018, 7, e42166
10. GemSpot: A Pipeline for Robust Modeling of Ligands Into Cryo-EM Maps
    Robertson MJ et al. Structure. 2020, 28, 707-716

Limited Experimental Data? No Problem: Machine Learning and Physics in Preclinical Drug Discovery

Posted on August 31, 2021 (updated January 17, 2025) by Anonymous

Speakers

Sathesh Bhat, Steven Jerome, and Karl Leswing
Executive Director, Sr. Principal Scientist, Research Leader

Abstract

The rise of machine learning and accurate, physics-based modeling has facilitated breakthroughs in preclinical drug discovery, accelerating discovery of compounds with improved chemical properties at reduced cost relative to traditional methods.

Application of cutting-edge machine learning methods enables accurate exploration of significantly larger regions of chemical space through interpolation. Combined with accurate, extrapolative physics-based methods through active learning, hit discovery from purchasable libraries of billions of compounds becomes cost effective for the first time. Similarly, drastic expansion of the chemical space searched in design-make-test-analyze cycles during lead optimization is enabled by physics-informed machine learning, resulting in speed and cost advantages.

This webcast will provide three practical examples of the application of machine learning in active drug discovery programs – one for property prediction, one for hit identification, and one for maintaining or boosting affinity through design-make-test-analyze cycles in lead optimization.
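The active learning idea in the abstract, a cheap model that decides which compounds deserve an expensive physics-based calculation, can be sketched as a simple loop. Everything below is a toy stand-in: the "library" is a list of numbers, the "physics" score is a made-up function, and the surrogate is a 1-nearest-neighbour lookup rather than a real machine learning model. It is meant only to show the shape of the loop, not any specific Schrödinger workflow.

```python
from typing import Dict, List


def expensive_physics_score(x: float) -> float:
    """Stand-in for a costly physics-based calculation (lower is better)."""
    return (x - 0.62) ** 2


def surrogate_predict(x: float, labeled: Dict[float, float]) -> float:
    """Toy surrogate: reuse the score of the closest already-evaluated point."""
    nearest = min(labeled, key=lambda known: abs(known - x))
    return labeled[nearest]


def active_learning(library: List[float], seed: List[float],
                    rounds: int, batch: int) -> Dict[float, float]:
    """Evaluate the seed, then repeatedly score the batch the surrogate ranks best."""
    labeled = {x: expensive_physics_score(x) for x in seed}
    for _ in range(rounds):
        pool = [x for x in library if x not in labeled]
        if not pool:
            break
        pool.sort(key=lambda x: surrogate_predict(x, labeled))
        for x in pool[:batch]:
            labeled[x] = expensive_physics_score(x)
    return labeled


# Hypothetical library of 101 "compounds" described by a single number.
library = [i / 100 for i in range(101)]
labeled = active_learning(library, seed=[0.0, 0.5, 1.0], rounds=4, batch=5)
best = min(labeled, key=labeled.get)

print(f"evaluated {len(labeled)} of {len(library)} compounds; best so far: {best}")
```

Only a fraction of the library receives the expensive calculation, which is the source of the cost savings the abstract describes. A purely greedy selection like this one can miss the true optimum, so practical schemes typically mix in some exploration.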

Understanding Water Interaction Leads to the Discovery of a New Class of Reversible USP7 Inhibitors that Suppress Tumor Growth

Posted on August 19, 2021 (updated May 9, 2024) by Anonymous

Summary

The Journal of Medicinal Chemistry recently published “Discovery of Potent, Selective, and Orally Bioavailable Inhibitors of USP7 with In Vivo Anti-Tumor Activity.” This paper was authored by researchers at RAPT Therapeutics with contributions from Schrödinger’s Christopher Higgs, a Senior Director in the Application Science group who applied FEP+ to help guide RAPT’s chemistry efforts.

USP7 is a promising target for cancer therapy as its inhibition is expected to decrease function of oncogenes, increase tumor suppressor function, and enhance immune function. The paper used a structure-based drug design strategy to discover a new class of reversible USP7 inhibitors that is highly potent in biochemical and cellular assays and extremely selective for USP7 over other deubiquitinases (DUBs). Their research led to the discovery of compound 41 (Table 1), a highly potent, selective, and orally bioavailable USP7 inhibitor that demonstrated tumor growth inhibition in both p53 wildtype and p53 mutant cancer cell lines, showing that USP7 inhibitors can suppress tumor growth through multiple different pathways.

The ability of compound 41 to suppress tumor growth was assessed in a xenograft model using mice. When given a 50 mg/kg oral dose of compound 41 twice daily, nearly complete tumor growth inhibition was observed (Figure 2).

**Figure 2.** Xenograft study with NOD-SCID mice engrafted with MM.1S tumors

Dr. Higgs explains the paper’s key findings:

Explain what this paper means. Why is it so exciting?

A growing number of proteins involved in different types of cancers are believed to be binding partners or substrates of USP7. This makes USP7 an attractive target for new therapies against several cancers, including ovarian, colon, and neuroblastoma. Hit compounds, identified by high-throughput screening or derived from starting points reported in the literature, were co-crystallized with USP7 and found to bind to the allosteric pocket in the “palm” domain. Binding in this pocket is believed to stabilize an inactive conformation of USP7, preventing USP7 catalytic activity. With these data, our structure-based methods, including WaterMap and FEP+, were used to help guide and prioritize compounds to further optimize these series.

The paper showcases the use of the Schrödinger platform to aid drug discovery. Could you explain how Schrödinger software was used in this research?

The co-crystal structure of compound 7 in complex with USP7 (Figure 3) was prepared for modeling using the Protein Preparation Wizard in Maestro. Schrödinger’s WaterMap provides a 3D visualization of hydration sites in the binding pocket together with each site’s entropy, enthalpy, and free energy, illuminating how different compounds could interact with waters in the binding pocket and helping to predict their relative stability. Using this method in combination with FEP+, Schrödinger scientists were able to provide critical information to help determine which compounds were likely to be strong binders.
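To make the per-site thermodynamics concrete, the sketch below combines an enthalpy term and an entropy term into a free energy using the standard relation ΔG = ΔH − TΔS, then flags the least favorable sites. It is only an illustration: the `HydrationSite` class, the site labels, the numerical values, and the 2.0 kcal/mol threshold are hypothetical, and the layout does not represent WaterMap's actual output format.

```python
from dataclasses import dataclass

@dataclass
class HydrationSite:
    label: str
    delta_h: float          # enthalpy relative to bulk water, kcal/mol (hypothetical)
    minus_t_delta_s: float  # entropic term -T*dS, kcal/mol (hypothetical)

    @property
    def delta_g(self) -> float:
        # Free energy relative to bulk water: dG = dH - T*dS
        return self.delta_h + self.minus_t_delta_s

def unfavorable_sites(sites, threshold=2.0):
    """Return sites with dG above the (assumed) threshold, least favorable first."""
    flagged = [s for s in sites if s.delta_g > threshold]
    return sorted(flagged, key=lambda s: s.delta_g, reverse=True)

# Invented example values, not taken from the paper.
sites = [
    HydrationSite("w1", -1.5, 0.8),
    HydrationSite("w2", 2.1, 1.6),
    HydrationSite("w3", 0.9, 1.4),
]

for site in unfavorable_sites(sites):
    print(f"{site.label}: dG = {site.delta_g:+.1f} kcal/mol (candidate for displacement)")
```

Sites flagged this way are the ones a ligand substituent might displace to gain potency, which is the reasoning applied to the binding pocket in this work.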

**Figure 3.** (Left) Schematic of the catalytic domain of USP7 (orange) bound to ubiquitin (blue), with the palm region (below purple dashed line), tunnel area (red box), and catalytic site (green circle) highlighted. The yellow arrow indicates the compound binding region highlighted in the structures on the right. (Right) Comparison of co-crystal structures of 4-hydroxy piperidine inhibitor 1 and pyridylbenzofuran compound 7. Both inhibitors bind to the same allosteric site of USP7 and are selective inhibitors of USP7. The PDB code for 1 bound to USP7 is 6VN4. The PDB code for 7 bound to USP7 is 6VN5.

How does this process help you design compounds with optimal molecular properties?

Initial studies using WaterMap identified a small cluster of unfavorable hydration sites adjacent to the aminopyridine that, with the appropriate functional group, could be displaced to gain further potency (Figure 4). A talented team of medicinal chemists at RAPT designed compounds to take advantage of this region of the pocket, while FEP+ was used to help prioritize their synthesis queue. This approach requires fewer DMTA (design-make-test-analyze) cycles while allowing more chemical space to be explored. In addition, FEP+ was used to explore other parts of the molecule not covered in this publication, which also enabled improvements in potency and PK.
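As a rough illustration of how predicted free energies can order a synthesis queue, the sketch below converts a relative binding free energy (ΔΔG) into a fold change in dissociation constant using the standard relation ΔΔG = RT ln(Kd,new / Kd,ref), then sorts candidates by ΔΔG. The compound names and ΔΔG values are hypothetical and are not taken from the paper or from FEP+ output.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def fold_change_in_kd(ddg_kcal_per_mol, temperature_k=298.15):
    """Kd(new) / Kd(reference) implied by a relative binding free energy.

    A negative ddG gives a ratio below 1, i.e. tighter predicted binding.
    """
    return math.exp(ddg_kcal_per_mol / (R_KCAL * temperature_k))

# Hypothetical predicted ddG values (kcal/mol) relative to a reference compound.
predicted_ddg = {"analog_A": -1.4, "analog_B": 0.3, "analog_C": -0.6}

# Most favorable (most negative) predictions go to the front of the queue.
queue = sorted(predicted_ddg, key=predicted_ddg.get)

for name in queue:
    ratio = fold_change_in_kd(predicted_ddg[name])
    print(f"{name}: ddG = {predicted_ddg[name]:+.1f} kcal/mol, Kd ratio = {ratio:.2f}")
```

At room temperature, roughly 1.4 kcal/mol corresponds to a tenfold change in affinity, which is why even modest predicted differences can meaningfully reorder which compounds are made first.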

**Figure 4.** WaterMap analysis of USP7 binding pocket. Hydration sites are shown as spheres colored by their predicted free energies (ΔG). Green spheres indicate a favorable free energy ΔG while red spheres indicate an unfavorable ΔG. (A) Hydration sites calculated with the ligand (compound 7) removed. Overlay of compound 7 indicates that the magnified (enlarged) spheres are predicted to be displaced by compound 7 while the other (smaller) spheres are not. (B) Hydration sites with compound 7 present. Three unfavorable water sites (red/orange) near the 2-aminopyridyl headgroup of 7 suggest opportunities to further improve the potency of this inhibitor. WaterMap analysis based on co-crystal structure of compound 7 and USP7, PDB code 6VN5.

This publication is significant for cancer research, is that right?

USP7 is a very promising target for additional cancer treatments due to the large number of interactions it appears to mediate. However, more work needs to be done to fully characterize USP7’s role in these signaling pathways. This paper’s goal is to aid in that effort.

**Figure 5.** X-ray co-crystal structure of ether 23 with USP7, highlighting key interactions with the binding pocket. The PDB code for 23 bound to USP7 is 6VN3.

How did this work between RAPT Therapeutics and Schrödinger come about?

Schrödinger’s Applications Science group works closely with users of our software around the world, providing expert guidance on best practices and training on new functionality in Schrödinger’s computational platform. The group does, on occasion, get actively involved in discovery projects under modeling services agreements, particularly where a new application area is being explored or to show the value our platform can bring in a real-world context.

Data Scarcity? No Problem. Combining Machine Learning and Physics-based Simulations for Smart Design of Materials

Posted on August 17, 2021 (updated January 23, 2025) by Anonymous

JUL 22, 2021


Speaker

Anand Chandrasekaran
Senior Scientist

Summary

The materials innovation R&D cycle has long benefited from physics-based simulation engines such as quantum mechanics and molecular dynamics, which help lower the cost of discovering novel chemistries, structures, morphologies, and compositions of materials for a wide array of applications and industries. In the past few years, the growth of computational power and the interest in building large datasets of materials properties have led to the growing adoption of materials informatics and AI-powered approaches in materials science. However, such approaches are highly data intensive and suffer from an inability to extrapolate beyond the chemical space of the training data. In this webinar we demonstrate, using a number of case studies, that the tradeoffs between accuracy and computational complexity lead to a natural synergy between physics-based modeling and machine learning methods, and we showcase the ability to apply these methods successfully even in the absence of large datasets. By combining the latest physics-based and data-driven approaches, materials design decisions can be assessed quickly over an extensive chemical design space. We will demonstrate this idea with a few recent case studies from the organic electronics, aerospace, automotive, and semiconductor industries. Advanced machine learning techniques such as active learning, genetic optimization, and deep neural networks will be used to showcase how key materials properties like vapor pressure, electronic structure, chemical stability, optical characteristics, and thermomechanical properties can be predicted with little to no user bias.

Moving Beyond Spreadsheets: Rational Design of Materials Using Advanced Informatics and Machine Learning

Posted on August 17, 2021 (updated July 11, 2024) by Anonymous

AUG 17, 2021


Speaker

Yuling An
Product Manager

Summary

In this webinar, Schrödinger’s Dr. Yuling An will demonstrate that machine learning, which often ignores the underlying physics, and physics-based modeling, which may require intensive computing resources, can naturally complement each other to create not only predictive models but also new materials with desired properties over an extensive design space. The growing urgency to digitize and make use of existing data, both from experiments and from simulations, through machine learning also heightens the need for materials informatics platforms, which bring together automated computational workflows with data analysis and collaboration to make materials innovation more efficient and successful. Examples include recent progress in de novo OLED materials design and prediction of molecular volatility using Schrödinger’s materials informatics platform, LiveDesign.

Potency- and selectivity-enhancing mutations of conotoxins for nicotinic acetylcholine receptors can be predicted using accurate free-energy calculations

Posted on July 6, 2021 (updated April 25, 2025) by Anonymous

An Introduction to Automating Workflows with KNIME

Posted on June 16, 2021 (updated January 22, 2025) by Anonymous

NOV 5, 2016


Speaker

Katalin Phimister
Senior Tech Support

Abstract

A brief overview of how to get started with KNIME and use Schrödinger Nodes to create workflows for use within KNIME and Maestro. We will show a couple of practical examples of how workflows can be used to automate tasks and how the Schrödinger extensions can work together.

What is KNIME, Nodes available from Schrödinger
Installation and getting started with Workflows provided on website
Detailed explanation of two example workflows
Educational resources for your specific project needs

Design and Optimization of Biologics Driven by Physics-Based Computational Modeling

Posted on June 15, 2021 (updated January 17, 2025) by Anonymous

JUN 15, 2021


Speakers

Eliud Oloo
Sr Principal Scientist
Lingle Wang
Vice President

Abstract

In this webinar, we will address the important role that computational modeling can play in accelerating the discovery and development of novel biologics. The presentation will highlight recent advances in Schrödinger’s protein modeling capabilities and their application to common protein design problems encountered in real-world projects. Topics covered will include strategies for protein property prediction followed by optimization through engineering. In particular, we will discuss quantitatively accurate prediction of the effects of residue mutations on protein thermal stability and protein-protein binding affinity using FEP+ technology, as well as recent developments that further improve the method’s accuracy and reliability.

OPLS4: Improving Force Field Accuracy on Challenging Regimes of Chemical Space

Posted on June 11, 2021 (updated November 25, 2024) by Anonymous

