Leveraging machine learning applications combined with physics-based modeling for drug discovery

OCT 4, 2023

Abstract:

Machine learning strategies in drug discovery are becoming increasingly popular and can be applied in many areas. In the Schrödinger Suite, DeepAutoQSAR serves as the main tool for training machine learning models to predict activity, ADMET, and other compound properties. To leverage both the proven accuracy and wide applicability domain of physics-based computational models, such as QM and FEP, together with the speed and scale of machine learning, we have combined our physics-based modeling technologies with an active learning framework. This framework can be used to speed up virtual screening methods, as in Active Learning-Glide, Active Learning-FEP, and Active Learning-ABFEP, or to improve the accuracy and applicability domain of models such as pKa prediction in Epik and machine-learned force fields such as QRNN. We will also discuss how machine learning protein structure prediction methods can enable new targets for structure-based drug design.
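
As a rough illustration of the active learning idea described in the abstract, the sketch below alternates between a cheap surrogate model and an expensive scoring step. It is not Schrödinger's implementation: the `oracle` function stands in for a physics-based calculation such as docking or FEP, the 1-nearest-neighbor surrogate stands in for a trained ML model, and the integer "compounds" and all names are hypothetical.

```python
def oracle(x):
    """Stand-in for an expensive physics-based score (lower is better)."""
    return (x - 70) ** 2 / 100.0 - 10.0


def surrogate_predict(x, labeled):
    """Cheap 1-nearest-neighbor model: (predicted score, distance to neighbor)."""
    nearest = min(labeled, key=lambda known: abs(known - x))
    return labeled[nearest], abs(nearest - x)


def active_learning(library, seed, rounds=3, batch=4):
    """Score only a small, iteratively chosen subset of the library."""
    labeled = {x: oracle(x) for x in seed}  # initial expensive evaluations
    for _ in range(rounds):
        pool = [x for x in library if x not in labeled]
        # Greedy acquisition: best predicted score first,
        # with distance to the labeled data as a tie-break.
        ranked = sorted(pool, key=lambda x: surrogate_predict(x, labeled))
        for x in ranked[:batch]:
            labeled[x] = oracle(x)  # spend the expensive budget here only
    return labeled


library = list(range(100))            # toy "compound library"
labeled = active_learning(library, seed=[0, 33, 66, 99])
best = min(labeled, key=labeled.get)  # best compound among those scored
print(best, labeled[best], len(labeled))
```

In this toy setting the loop locates the best-scoring item after evaluating only a fraction of the library, which is the kind of saving an active learning workflow aims for.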

Speaker:

Dr. Marton Vass, Principal Scientist II, Schrödinger

ACS Fall 2025

Conference

Date & Time
  • August 17th-21st, 2025
Location
  • Washington, D.C.

Schrödinger is excited to be participating in the ACS Fall 2025 conference taking place on August 17th – 21st in Washington, D.C. Join us for presentations by Schrödinger scientists.

Time: AUG 20 | 11:20AM
Location: Hall E – Room 25
Dissipative particle dynamics simulations of mRNA containing lipid nanoparticles

Speaker:
John Shelley, Fellow, Schrödinger

Abstract:
We describe and apply an automated structure-based bottom-up workflow to produce a new dissipative particle dynamics (DPD) force field for simulating lipid nanoparticles (LNPs). This new force field is then applied to study the self-assembly of mRNA-containing LNPs and the critical process of endosomal escape.

Time: AUG 21 | 9:50AM
Location: Room 103A
Coarse-grained modeling of membrane permeation for drug discovery

Speaker:
Martin Vögele, Senior Scientist II, Schrödinger

Abstract:
Drug molecules must cross biological membranes to reach their intended targets, so predicting their permeability is a critical aspect of drug discovery. While computational models that rely on implicit membrane representations are fast and offer reasonable predictive power, they fail to capture the structural dynamics of lipid bilayers and their impact on permeant molecules. Explicit modeling of lipid bilayers via atomistic simulations, on the other hand, can in principle provide detailed physical insights and more accurate predictions, but it is computationally expensive. Here, we evaluate the Martini coarse-grained (CG) model as an alternative approach for simulating membrane permeation of small molecules. Using umbrella sampling simulations, we calculated profiles of the potential of mean force of rigid and flexible drug-like compounds across membranes with and without cholesterol. Our results demonstrate that the Martini CG model can provide permeability predictions that correlate well with experiments while offering computational efficiency gains of up to two orders of magnitude compared to atomistic simulations. Additionally, this approach can reveal mechanistic insights inaccessible to implicit models, such as the influence of membrane composition on molecular orientation and conformation during permeation. We observed that cholesterol content significantly alters permeation barriers and transition pathways. For large and flexible molecules with many accessible conformations, sampling all the translational, rotational, and conformational states across the membrane is a major challenge for accurate and efficient permeability predictions, and the preferred conformational states may be coupled with the membrane composition and the compound’s position within the membrane. Despite these challenges, we find that coarse-grained modeling is a promising approach for high-throughput permeability prediction in drug discovery workflows.
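
A potential of mean force (PMF) profile is commonly turned into a permeability estimate with the inhomogeneous solubility-diffusion model, 1/P = ∫ exp(W(z)/k_BT) / D(z) dz. The abstract does not state which model the authors used, so the model choice here is an assumption, and the function name, the Gaussian toy barrier, and the constant diffusion coefficient are hypothetical values chosen only for illustration.

```python
import math

KB_KCAL = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def permeability_isd(z_nm, pmf_kcal, diff_cm2_s, temperature=310.0):
    """Permeability (cm/s) from the inhomogeneous solubility-diffusion model.

    z_nm:        positions along the membrane normal (nm), increasing
    pmf_kcal:    PMF W(z) in kcal/mol, referenced to 0 in bulk water
    diff_cm2_s:  local diffusion coefficient D(z) in cm^2/s
    """
    beta = 1.0 / (KB_KCAL * temperature)
    # Local resistance density exp(beta * W) / D at each grid point
    integrand = [math.exp(beta * w) / d for w, d in zip(pmf_kcal, diff_cm2_s)]
    resistance = 0.0
    for i in range(len(z_nm) - 1):
        dz_cm = (z_nm[i + 1] - z_nm[i]) * 1e-7  # nm -> cm
        resistance += 0.5 * (integrand[i] + integrand[i + 1]) * dz_cm  # trapezoid
    return 1.0 / resistance


# Toy inputs (made up): a 4 nm slab, flat PMF vs. a 4 kcal/mol central barrier
z = [-2.0 + 0.1 * i for i in range(41)]
d = [1e-6] * len(z)
flat = [0.0] * len(z)
barrier = [4.0 * math.exp(-((x / 0.5) ** 2)) for x in z]

p_flat = permeability_isd(z, flat, d)
p_barrier = permeability_isd(z, barrier, d)
print(p_flat, p_barrier)
```

Because the barrier enters through an exponential, even modest changes in the PMF height, such as those the abstract attributes to cholesterol content, shift the estimated permeability substantially.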

IMID 2025

Conference

Date & Time
  • August 19th-22nd, 2025
Location
  • Busan, Korea

Schrödinger is excited to be participating in the 25th International Meeting on Information Display conference taking place on August 19th – 22nd in Busan, Korea. Mathew D. Halls, Senior Vice President of Materials at Schrödinger, will be chairing the session “AI for Efficient Display Design” on Wednesday, Aug 20.

Join us for a presentation by Mathew D. Halls, titled “From Molecules to Displays: A Digital Chemistry Platform Uniting Physics-Based Simulation with Machine Learning for Optoelectronic Design”. Stop by our booth to speak with Schrödinger scientists.

Time: AUG 20 | 10:50AM
Location: Room F (313)
From Molecules to Displays: A Digital Chemistry Platform Uniting Physics-Based Simulation with Machine Learning for Optoelectronic Design

Speaker:
Mathew D. Halls, Senior Vice President of Materials, Schrödinger

Abstract:
Experimental exploration of innovative architectures and material compositions for OLED devices requires substantial time, labor, and resources, due to the complexity and cost of device fabrication, characterization, and analysis. Predictive modeling offers a powerful alternative, enabling efficient and targeted evaluation of devices across broad design spaces by integrating informatics and large-scale property predictions. In this presentation, Schrödinger will showcase recent advances in its digital platform combining machine learning (ML) technologies with quantum mechanics and molecular dynamics. The first advancement extends our automated ML algorithm for chemical formulations [1] to predict performance parameters of multicomponent layered devices, e.g., OLEDs. The second advancement involves overcoming limitations of classical force fields by using a message-passing neural network potential with iterative charge equilibration to achieve quantum mechanical accuracy at minimal computational cost. These ML models encode device components (i.e., material structures, layer architectures, physicochemical properties, and operating conditions) as features to predict OLED device performance metrics for operational output, stability, and efficiency (Fig. 1). This approach moves beyond traditional chemical modeling strategies, capturing complex relationships between device architecture, composition, and function. Complementing this development are advances in Schrödinger’s physics-based simulation software, which computes determinative properties of OLED materials. MPNICE, the latest version of our ML potential, delivers accurate DFT-quality predictions at reduced computational cost, enabling simulation of increasingly complex films and processes. For example, systems combining metal and organic chemistries, typically outside the coverage of traditional force fields, can now be explored more efficiently.
Schrödinger’s new solutions for optoelectronic materials development and device optimization provide unprecedented capabilities for accelerated development of innovative display technologies.

In silico cryptic binding site detection and prioritization

JUL 30, 2025

Targeting cryptic binding sites is becoming an increasingly powerful strategy for tackling challenging drug targets, especially where traditional orthosteric approaches fall short due to issues like selectivity, resistance, or poor developability. However, identifying and evaluating cryptic binding sites—especially cryptic sites not visible in apo structures—remains a key challenge in early drug discovery.

In this webinar, we will introduce a novel computational workflow that integrates mixed solvent molecular dynamics (MxMD) with SiteMap to reveal and identify cryptic binding sites. This new combined workflow achieved a remarkable 83% success rate in detecting the cryptic binding sites within a retrospective benchmark set of 61 targets.

Join us to learn how this new workflow can support the identification of cryptic binding sites and enable more structure-based drug discovery campaigns for novel targets.

Webinar Highlights

  • Overview of the MxMD method
  • Introduction of new MxMD+SiteMap workflow to identify cryptic binding sites
  • Benchmarking the new workflow against popular machine learning methods and SiteMap in its default pocket detection mode

Our Speakers

Da Shi

Principal Scientist I, Life Science Software, Schrödinger

Da Shi is a Principal Scientist in the Hit Discovery team at Schrödinger. He obtained his Ph.D. at the University of California San Diego under the supervision of Prof. Ruben Abagyan. After graduation, he worked at the Frederick National Laboratory for Cancer Research as a Data Scientist, developing machine learning platforms for drug discovery. In 2021, he joined Schrödinger and worked as an All Access Applications Scientist. He later transitioned to the Hit Discovery team, where he develops workflows for cryptic binding site identification and FEP ligand pose generation.

Dima Lupyan

Senior Principal Scientist, Life Science Software, Schrödinger

Dr. Dmitry Lupyan, a product manager, spearheads the development of Desmond and FEP analysis tools, showcasing his expertise in the realm of molecular dynamics. Notably, he’s behind the Python API for simulation analysis, a cornerstone utilized across Schrödinger’s MD, MxMD, and FEP+ products. Driven by a passion for scientific advancement, he actively promotes the utilization of simulation analysis tools, fostering a community of exploration. His research interests delve into the intricate domains of protein engineering, membrane-bound systems, and the fascinating dynamics of unbinding kinetics.

Educator’s Month: Visualizing atomic and molecular orbitals and calculations exploring protonation of nitrogen versus oxygen

JUN 24, 2025

Molecular computation and visualization capabilities are more accessible than ever. Activities involving the visualization and interpretation of atomic and molecular orbitals used in General Chemistry at Commonwealth University – Lock Haven will be discussed. A computational activity exploring the thermodynamics of protonation of nitrogen versus oxygen will also be presented.

Our Speaker

Kevin Range

Professor, Commonwealth University – Lock Haven

Dr. Kevin Range is a Professor of Chemistry in the Department of Physical and Environmental Sciences at Commonwealth University of Pennsylvania (formerly Lock Haven University). He earned his Ph.D. in Physical Chemistry from the University of Minnesota and B.S. in Chemistry from Moravian College (now Moravian University). Dr. Range has taught across a wide area of the chemistry curriculum, including General, Organic, Inorganic, Analytical, and Physical Chemistry. One of his professional goals, as a long-time member of MoleCVUE, is to help enhance the teaching of chemistry through the use of molecular computation and visualization.

Extending the Functionality of the Excel-to-SBOL Converter for Broader Synthetic Biology Applications

JUN 12, 2025

Abstract:

Background & Research Question
Synthetic biology relies on data standardization to support collaboration and reuse. SBOL (Synthetic Biology Open Language) is a key standard but has a steep learning curve, especially for biologists with limited programming experience. This project extends an Excel-to-SBOL converter tool, enabling users to generate SBOL-compliant files from familiar spreadsheet formats to improve SBOL adoption and streamline data sharing.

Methods
Using a Python script and, in particular, the pySBOL2 library, this project translates biological design elements from spreadsheet rows into structured SBOL objects. The tool supports key design patterns such as transcriptional regulation, protein production, and complex formation by generating corresponding ComponentDefinition and ModuleDefinition objects with appropriate interactions (e.g., inhibition, stimulation, binding). Interactions, functional components, and ontological annotations are added to an SBOL document. The output is validated and uploaded to SynBioHub to ensure structural integrity and enable visual exploration.
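
The row-to-object translation described above can be pictured with a small standard-library sketch. This is not the Excel-to-SBOL code and does not call pySBOL2: the `Component`, `Interaction`, and `Design` classes are hypothetical stand-ins for SBOL objects, and the column names and example rows are invented for illustration.

```python
import csv
import io
from dataclasses import dataclass, field


# Hypothetical stand-ins for SBOL objects; the real tool builds pySBOL2 objects.
@dataclass
class Component:
    name: str
    kind: str  # e.g. "DNA", "Protein"


@dataclass
class Interaction:
    kind: str            # "inhibition", "stimulation", or "binding"
    participants: tuple  # (source name, target name)


@dataclass
class Design:
    components: dict = field(default_factory=dict)
    interactions: list = field(default_factory=list)


ALLOWED = {"inhibition", "stimulation", "binding"}


def rows_to_design(csv_text):
    """Translate spreadsheet-like rows into components and interactions."""
    design = Design()
    for row in csv.DictReader(io.StringIO(csv_text)):
        for role in ("source", "target"):
            name, kind = row[role], row[role + "_type"]
            # Reuse an existing component so repeated names map to one object
            design.components.setdefault(name, Component(name, kind))
        if row["interaction"] not in ALLOWED:
            raise ValueError(f"unsupported interaction: {row['interaction']}")
        design.interactions.append(
            Interaction(row["interaction"], (row["source"], row["target"]))
        )
    return design


# Invented example sheet (assumed column layout)
SHEET = """source,source_type,interaction,target,target_type
LacI,Protein,inhibition,pLac,DNA
AraC,Protein,stimulation,pBAD,DNA
LacI,Protein,binding,IPTG,SmallMolecule
"""
design = rows_to_design(SHEET)
print(len(design.components), [i.kind for i in design.interactions])
```

Keying components by name is one simple way to collapse repeated part names into a single object, in the spirit of the duplicate detection the abstract mentions.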

Results
As a result of this project, I successfully implemented new features for modeling genetic production, repression, activation, biochemical reaction, DNA and protein sequencing, and complex component formation. In addition, I deployed sequence and part name validation to detect duplicates and ensure structural data integrity. These contributions expanded the tool’s ability to capture complex biological interactions from spreadsheets and improved the reliability of SBOL output, supporting more accurate and reusable designs for synthetic biology workflows.

Conclusion & Implications
By lowering the barrier to SBOL generation, Excel-to-SBOL supports broader adoption of standardized design practices in synthetic biology. Available as a Python package, it is already used by research labs and organizations nationwide, facilitating reproducibility and accelerating collaborative design efforts. Future work will ensure compatibility and conversion with SBOL3 data format and expand functionality to support a broader range of biological interactions as user needs evolve.

Speaker:

Taisiia Sherstiukova, University of Colorado Boulder

Taisiia Sherstiukova is a recent graduate from the University of Colorado Boulder with a Bachelor’s degree in Computer Science. Her work focuses on simplifying synthetic biology workflows by developing tools that bridge the gap between lab scientists and computational tools.

Comparative Molecular Drug Docking to hERG and CaV1.2 Channels to Understand Drug-Induced Cardiac Risks

JUN 12, 2025

Abstract:

Background and research questions
Voltage-gated ion channel proteins, such as KV11.1 and CaV1.2, are critical in maintaining regular cardiac rhythm. The hERG potassium channel, KV11.1, is responsible for cardiac repolarization, and its inhibition by drug molecules may lead to prolonged QT intervals and arrhythmias. Additionally, CaV1.2 is the main Ca2+ channel expressed in cardiac muscle cells, and its drug interactions can also alter heart rhythms. Since drug-ion channel interactions may pose serious clinical concerns, this study aims to investigate drug-binding affinities with cardiac ion channels using in silico molecular docking models and to provide accurate arrhythmia risk predictions.

Methods
In this study, approximately 60 drugs were selected based on their known cardiac risks. The small molecules were sourced from PubChem and prepared in their dominant protonation states at physiological pH. Molecular docking was performed using Schrödinger Glide and OpenEye FRED software; we also employed programs such as PyMOL, VMD, and ChimeraX for structural analysis. Computed drug binding affinities were further used in logistic regression models to assess arrhythmia risks.
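
To make the last step concrete, here is a minimal plain-Python logistic regression that classifies risk from two docking scores. It is only a sketch of the general technique, not the study's model: the scores and labels are made up, and the feature choice (one hERG score and one CaV1.2 score per drug), function names, and training settings are assumptions.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def standardize(X):
    """Z-score each feature column so gradient descent is well conditioned."""
    cols = list(zip(*X))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) for c, m in zip(cols, means)]
    return [tuple((v - m) / s for v, m, s in zip(row, means, stds)) for row in X]


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Batch gradient descent on the mean log-loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, xi):
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) >= 0.5 else 0


# Made-up docking scores (more negative = stronger predicted binding).
# Columns: hERG score, CaV1.2 score. Label 1 = high TdP risk, 0 = low.
scores = [(-9.5, -5.0), (-9.0, -5.5), (-8.8, -4.8), (-6.0, -7.5), (-5.5, -8.0), (-6.2, -7.0)]
labels = [1, 1, 1, 0, 0, 0]

X = standardize(scores)
w, b = fit_logistic(X, labels)
accuracy = sum(predict(w, b, xi) == yi for xi, yi in zip(X, labels)) / len(labels)
print(w, b, accuracy)
```

On real data the accuracy would be estimated on held-out drugs rather than on the training set as done here.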

Results
The study demonstrated reasonable correlation between binding scores and experimental IC50 values, although variation was observed depending on the protein structure and conformational state. Binding poses aligned well with the available cryo-EM structures. Logistic regression models achieved up to 77% accuracy in classifying high and low Torsades de pointes (TdP) arrhythmia risks, suggesting potential clinical applications.

Conclusions
This molecular docking study has provided a fairly accurate description of ion channel protein–ligand interactions, contributing to our understanding of drug-induced cardiac risks and aiding in drug safety screening. To improve the prediction accuracy, future work can incorporate additional docking platforms and molecular dynamics simulation approaches.

Speaker:

Ensley Jang, University of California, Davis

Ensley is an undergraduate student majoring in Pharmaceutical Chemistry at University of California, Davis. Her interests lie in molecular mechanisms underlying disease and drug interactions.

Discovery of FabG Inhibitors for Yersinia pestis Using Computational and Biochemical Approaches

JUN 12, 2025

Abstract:

Due to the limited number of approved antibiotics for Yersinia pestis, the bacterium responsible for the plague epidemic known as the Black Death, the discovery of small-molecule inhibitors targeting essential bacterial enzymes is critical. One such enzyme is ketoacyl-acyl carrier protein reductase (FabG), which plays a key role in the biosynthesis of fatty acids responsible for maintaining the integrity of the bacterial cell envelope. This study integrates computational and biochemical methods to identify potential FabG inhibitors. An X-ray crystal structure of the target protein (PDB ID: 5CEJ) was used alongside a computationally predicted model generated via ChimeraX and AlphaFold. Using both structures allowed us to evaluate the utility and accuracy of computational models in drug discovery, especially in future cases where no experimental structure is available. The 5CEJ structure, which lacks a native substrate, was aligned using the ligand acetoacetyl-coenzyme A (CAA) from Bacillus sp. FabG (PDB ID: 4NBU). In a validation docking (GOLD, CCDC), a control library of energy-minimized ligands from LigPrep (Schrödinger) was used. Following validation, compound libraries were screened for binding affinity against both the experimental and predicted structures. GOLD scores ranged from 60–102 for 5CEJ and 40–80 for the AlphaFold model. Docking results were visualized in PyMOL (Schrödinger). FabG was successfully cloned, expressed, and purified using standard biochemical techniques. Future work will focus on enzymatic assays to evaluate whether the top-scoring compounds inhibit FabG activity in vitro.

Speaker:

Catalina Colling, University of Texas at Austin

Catalina Colling is a 2025 graduate of the University of Texas at Austin, where she earned a Bachelor of Science and Arts in Biology. Her independent project focused on early-stage drug discovery, combining computational screening with biochemical validation to more efficiently identify potential inhibitors of target proteins.

Protein and Solvent Dynamics Simulations to Understand Cancer Mutations

JUN 12, 2025

Abstract:

Mutations in a class of protein-based enzymes called kinases cause many cancers, prominently the epidermal growth factor receptor (EGFR) in lung cancer. Drugs that block binding of a key substrate–ATP–can block these cancers, but their effects vary greatly depending on how the kinase is altered by the mutation. However, no clear molecular mechanism has been identified to explain how different mutations affect drug sensitivity. Based on our previous experimental and computational work, we hypothesize that molecular dynamics (MD) simulations paired with various statistical models can classify mutations in terms of drug sensitivity by analyzing coupled protein and solvent dynamics. Using AlphaFold3 to obtain 3D conformations of mutated kinases, I executed MD simulations of known drug-sensitive and drug-resistant EGFR insertions. I constructed a pair-wise dynamical cross-correlation matrix (DCCM) with MDTraj to measure paired secondary-structure fluctuations, fit a logistic regression model to these correlations, and identified features associated with drug sensitivity. My analysis revealed that drug-sensitive mutants display significantly unstable MD trajectories, especially in areas key for ATP binding. Visual analysis of DCCM plots revealed specific clusters of extreme correlations specific to drug-sensitive mutants. Preliminary statistical analysis of correlations revealed ‘breathing’ motions (moving together and apart) between two key parts of the kinase structure in resistant mutations. This association is likely due to a disruption in the integrity of the ATP binding site, allowing drugs to block ATP binding more effectively. Continuing to enhance understanding of the biochemical underpinnings behind drug sensitivity of cancer-causing kinases should lead to streamlined treatment options for lung cancer patients, and deduced structural implications should aid development of future ATP-competitive and allosteric kinase-targeted cancer drugs. 
Future directions involve analyses of molecular docking experiments paired with principal component analysis (PCA) of trajectories to identify new drug opportunities.
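
For readers unfamiliar with the dynamical cross-correlation matrix, its entries are C_ij = ⟨Δr_i · Δr_j⟩ / sqrt(⟨Δr_i²⟩⟨Δr_j²⟩), where Δr_i is the displacement of atom i from its mean position. The plain-Python sketch below computes this for a tiny made-up trajectory; it is not the author's MDTraj-based code, and it assumes the frames are already superposed on a common reference.

```python
import math


def dccm(frames):
    """Dynamical cross-correlation matrix.

    frames: list of frames, each a list of (x, y, z) per atom,
            assumed to be already aligned to a common reference.
    """
    n_frames, n_atoms = len(frames), len(frames[0])
    mean = [[sum(f[a][k] for f in frames) / n_frames for k in range(3)]
            for a in range(n_atoms)]
    # Displacement of every atom from its mean position in every frame
    disp = [[[f[a][k] - mean[a][k] for k in range(3)] for a in range(n_atoms)]
            for f in frames]

    def avg_dot(i, j):
        """Time average of the dot product of displacements of atoms i and j."""
        return sum(sum(d[i][k] * d[j][k] for k in range(3)) for d in disp) / n_frames

    return [[avg_dot(i, j) / math.sqrt(avg_dot(i, i) * avg_dot(j, j))
             for j in range(n_atoms)] for i in range(n_atoms)]


# Toy trajectory (made up): atoms 0 and 1 move together, atom 2 moves opposite
offsets = [-1.0, 1.0, -1.0, 1.0]
traj = [[(0.0 + s, 0.0, 0.0), (5.0 + s, 0.0, 0.0), (10.0 - s, 0.0, 0.0)] for s in offsets]

c = dccm(traj)
print(c[0][1], c[0][2])
```

Values near +1 indicate atoms moving in the same direction and values near -1 indicate opposed motion, which is the kind of pattern the "breathing" description refers to.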

Speaker:

Michael Sarullo, Yale University

Michael Sarullo is a rising third-year undergraduate at Yale University, where he is pursuing a double major in Molecular Biophysics and Biochemistry alongside Statistics and Data Science. His research focuses on computational approaches to understanding biological systems, particularly protein dynamics and their therapeutic applications.

Dynamic Docking: A Scalable Computational Framework for Conformational Profiling of Small Molecule/RNA Binding

JUN 12, 2025

Abstract:

Structured RNA elements, such as CUG repeat expansions characterized by UU internal loop motifs, are high-value but challenging targets for small molecule recognition. A significant barrier in RNA-targeted drug discovery is the efficient identification of ligands that can bind to dynamic RNA loop motifs. To address this, we developed two dynamic docking approaches, DynaD and DynaD/Auto, computational tools designed to rapidly characterize small-molecule binding to RNA targets. These methods predict global-minimum and local-minimum bound states along with their binding energies, elucidating a binding energy landscape not accessible by current experimental techniques.

DynaD is a physics-based computational method that predicts initial ligand-bound states to RNA loops where binding pockets are known but structural data is missing. It simulates the binding process using molecular dynamics (MD), guided by a distance-based reaction coordinate and force-field interactions, rather than empirical scoring functions. This enables exploration of complex RNA-ligand energy landscapes and identification of energetically favorable binding modes. Binding free energies are calculated using MM/PBSA and MM/3D-RISM to compare modes and determine the global minimum.

To address conformational sampling limitations in flexible systems, we developed DynaD/Auto, a hybrid approach incorporating umbrella sampling data from UU internal loops and AutoDock predictions. This workflow improves sampling and prediction accuracy by integrating prior structural insights. Umbrella sampling trajectories enable sampling of RNA loop conformational ensembles, identifying biologically relevant global and local minima. Simulation predictions were validated against experimental data, showing strong positive correlations. Dendrogram analysis highlighted distinct binding modes based on RMSD clustering.

DynaD offers forced targeting capabilities when structural data is lacking, while DynaD/Auto leverages known free energy landscapes to enhance prediction accuracy. Together, these approaches provide a powerful framework for reliable identification of RNA-ligand interactions, particularly in dynamic and structurally complex RNA targets like CUG repeats.

Speaker:

Nakul Balaji, Florida Atlantic University

Nakul Balaji is an undergraduate researcher in computational biophysics at Florida Atlantic University’s Wilkes Honors College, where he is pursuing a concentration in Data Analytics. His research focuses on molecular docking and small-molecule drug discovery.

Deep Learning–Based Structural Modeling of YscF Mutants Reveals Determinants of Type III Secretion System Architecture in Yersinia pestis

JUN 12, 2025

Abstract:

The global rise of antimicrobial resistance has reawakened concern over re-emerging pathogens such as Yersinia pestis, the bacterium responsible for plague. A critical virulence factor in Y. pestis is the Type III Secretion System (T3SS), which enables the injection of effector proteins directly into host cells. The needle-like structure responsible for translocation is formed by polymerization of the YscF protein, yet its complete oligomeric structure and the effects of function-disrupting mutations remain poorly understood.

In this study, we used AlphaFold 3 to model oligomeric assemblies of key YscF mutants. The mutants analyzed were found to have a large effect on needle formation and regulation in previous in vitro studies. These include a single mutant (R73A), a double mutant (D28A/D46A), and two multi-site mutants: the constant secretion (CS) mutant (I13A, D17A, D28A, D46A) and the non-secretion (NS) mutant (N31A, V34A, D77A, I82A). Models were constructed for 21- and 24-subunit oligomers and evaluated using PyMOL and quantitative structural analysis. Comparative assessments using helical alignment, inter-subunit distances, and interface contact areas revealed that R73A and D28A/D46A mutants preserved elongation and symmetry, consistent with functional secretion. In contrast, CS and NS mutants produced structurally compacted or distorted assemblies, correlating with aberrant secretion phenotypes.

These data offer insight into the structural basis of T3SS assembly and underscore how specific amino acid substitutions can disrupt higher-order oligomerization. This work not only enhances our mechanistic understanding of secretion system architecture but also demonstrates the utility of deep learning models in predicting complex macromolecular behavior, supporting future structural studies.

Speaker:

Stephanie Bellido, Nova Southeastern University

Stephanie Bellido, Lukah Varghese, & Dev Patel are undergraduate students from Nova Southeastern University and will be presenting on Deep Learning–Based Structural Modeling of YscF Mutants.

Discovery of Aza-stilbene as a Scaffold for a Histamine Receptor H2 Antagonist for the Treatment of Gastroesophageal Reflux Disease

JUN 12, 2025

Abstract:

Gastroesophageal reflux disease (GERD) affects millions worldwide, causing chronic acid reflux, and can lead to frequent regurgitation and esophageal cancers. While dietary changes are a first-line treatment, current medications often face challenges such as limited efficacy, side effects, recalls, carcinogenicity, or complex synthesis. This project aims to computationally design a drug targeting the Histamine Receptor H2 (HRH2) in the gut as an alternative GERD treatment. Using Schrödinger Maestro and PDB 7UL3, a docking grid based on HRH2 was created to evaluate potential drug candidates. Four existing HRH2 antagonists were docked to establish benchmarks for docking score and pharmacokinetic properties, including low CNS activity, reduced oral absorption, and minimal gut-to-blood permeability. A library of 3,057 ligands from the DUD-E database was screened for binding to key receptor amino acids, Asp98 and Asp186, and for pharmacokinetic properties. Compounds with two or three six-membered rings showed the most promising profiles. A scaffold-based approach identified quinolines and aza-stilbenes as strong candidates. The optimal aza-stilbene demonstrated stronger docking scores, easier synthesis, and better pharmacokinetic properties than the quinolines and, in many respects, than known HRH2 antagonists. This compound was synthesized using a simple, one-step reaction with benzaldehyde and aniline in water at room temperature. The product was characterized by FTIR. This project demonstrates a new aza-stilbene scaffold that could be employed for new drugs to treat GERD.

Speaker:

Nihar Kummetha, North Carolina School of Science and Mathematics

Nihar Kummetha is an incoming undergraduate student at the University of Pennsylvania planning to major in Biochemistry and minor in Statistics and Data Science as a Vagelos Molecular Life Sciences Scholar.