Trends in modern hit discovery: How your ultra-large screens can benefit from machine learning
FEB 2, 2022
Speaker:
Matt Repasky
Senior Vice President
Abstract:
While traditional structure-based virtual screening has been successful in finding diverse hits to advance projects, there is significant room for improvement in hit rates, diversity of hit chemotypes, available IP space explored, and the potency of unoptimized hits. Ultra-large, on-demand synthesizable libraries from vendors have enabled a ~100x expansion of purchasable compound space, now billions of compounds, while DNA encoded libraries (DEL) can be even larger. We describe results from two machine learning enabled approaches that make it easy and cost effective to find novel hits through virtual and DEL screens of libraries containing a billion or more compounds.
DEL enable screening of billions of synthesized compounds but are limited by high rates of experimental false negatives and false positives. Employing machine learning trained to experimental DEL results, we demonstrate significantly reduced false negative rates while identifying byproducts in a more favorable property space.
To enable efficient, extrapolative chemical space exploration with an accurate docking scoring function, we have developed an active learning-based method employing Glide SP docking for scoring and AutoQSAR/DC machine learning as the learner. Results from Active Learning Glide screens of 100-million- to billion-compound libraries show increased chemical diversity and improved GlideScore of hits relative to brute force screening of subsets of the libraries. Results and costs from these two new methods suggest that billion-compound library screens could replace the smaller, traditional screens commonly employed today.
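The active learning idea in the abstract can be illustrated with a generic loop: dock a small batch, train a cheap surrogate model on the docked scores, and use its predictions to choose the next batch. The following is only a minimal sketch under stated assumptions and is not the Active Learning Glide implementation. The function `mock_docking_score`, the ridge-regression surrogate, the random feature vectors, and all parameter values are hypothetical stand-ins (for Glide SP docking, AutoQSAR/DC, and real compound descriptors, respectively).

```python
import numpy as np


def mock_docking_score(features):
    """Hypothetical stand-in for an expensive docking run (lower is better)."""
    weights = np.linspace(-1.0, 1.0, features.shape[1])
    return features @ weights + 0.3 * np.sin(features[:, 0] * features[:, 1])


def fit_ridge(x, y, alpha=1.0):
    """Fit a ridge-regression surrogate; the last weight is a bias term."""
    xb = np.hstack([x, np.ones((x.shape[0], 1))])
    gram = xb.T @ xb + alpha * np.eye(xb.shape[1])
    return np.linalg.solve(gram, xb.T @ y)


def predict_ridge(weights, x):
    """Predict scores with the fitted surrogate."""
    xb = np.hstack([x, np.ones((x.shape[0], 1))])
    return xb @ weights


def active_learning_screen(library, batch_size=200, rounds=3, seed=0):
    """Return an array of docking scores, NaN for compounds never docked."""
    rng = np.random.default_rng(seed)
    n_compounds = library.shape[0]
    scores = np.full(n_compounds, np.nan)

    # Initial round: dock a random batch to seed the surrogate model.
    batch = rng.choice(n_compounds, size=batch_size, replace=False)
    scores[batch] = mock_docking_score(library[batch])

    for _ in range(rounds):
        known = ~np.isnan(scores)
        model = fit_ridge(library[known], scores[known])
        # Rank undocked compounds by predicted score and dock the best ones.
        unknown_idx = np.flatnonzero(~known)
        predicted = predict_ridge(model, library[unknown_idx])
        batch = unknown_idx[np.argsort(predicted)[:batch_size]]
        scores[batch] = mock_docking_score(library[batch])

    return scores


# Toy usage on a synthetic "library" of random feature vectors.
library = np.random.default_rng(42).normal(size=(5000, 8))
scores = active_learning_screen(library)
docked = np.flatnonzero(~np.isnan(scores))

# Brute-force reference, affordable only because the toy oracle is cheap.
true_top = set(np.argsort(mock_docking_score(library))[:100])
recall = len(true_top & set(docked)) / 100
print(f"docked {len(docked)} of {len(library)}; top-100 recall = {recall:.2f}")
```

In this toy setting only a fraction of the library is ever passed to the expensive scoring function, which is the cost argument the abstract makes; the greedy selection rule used here is one simple choice among many possible acquisition strategies.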
Taking experimentation digital: Materials innovation using atomistic simulation and machine learning at-scale

MAY 29, 2024
Our world is evolving rapidly, and with that change comes a wide range of challenges, including the need for sustainable and energy-efficient solutions, advanced electronic devices, and durable, lightweight materials for transportation, aerospace, and construction. Traditional methods for materials discovery and selection can no longer keep pace with these demands.
In this talk, we will introduce a modern approach to materials R&D using a digital chemistry platform for in silico analysis, optimization, and discovery. The platform enables materials design at-scale across a wide range of applications, including organic electronics, catalysis, energy capture and storage, polymeric materials, consumer packaged goods, pharmaceutical formulation and delivery, and thin film processing.
By combining both physics-based modeling approaches (e.g. DFT, molecular dynamics, coarse-graining) and machine learning, researchers can easily incorporate in silico methods into their day-to-day workflows to expedite R&D timelines. Moreover, automated solutions enable scaling from simple molecular property predictions on a local device to high-throughput calculations on the cloud.
We will present real-world case studies performed by both experienced modelers and novice experimentalists who are new to digital chemistry approaches.
Key Learning Objectives:
- Learn to leverage data from physics-based simulations and machine learning to accelerate materials R&D
- Hear practical case studies and customer stories across materials industries including organic electronics, catalysis, energy capture and storage, polymeric materials, consumer packaged goods, pharmaceutical formulation and delivery, and thin film processing
- Identify key areas in your R&D where physics-based simulation and machine learning can provide value

Michael Rauch, Ph.D.
Associate Director
Michael Rauch is an Associate Director at Schrödinger specializing in materials science and education. Michael earned his Ph.D. from Columbia University in synthetic organometallic chemistry as an NSF Graduate Research Fellow before pursuing a postdoctoral role in organic chemistry at the Weizmann Institute of Science as a Zuckerman Postdoctoral Scholar. Michael is particularly interested in green, sustainable chemistry and transforming the way that synthetic chemists utilize molecular modeling via practical education.
FEP augmentation as a means to solve data paucity problems for machine learning in chemical biology
Machine Learning for Formulations
Polymer Descriptors for Machine Learning
Molecular Dynamics Descriptors for Machine Learning
Machine Learning for Ionic Conductivity
Machine Learning for Sweetness
Cheminformatics Machine Learning for Homogeneous Catalysis
Machine Learning Property Prediction
Data-driven materials innovation: Where machine learning meets physics

OCT 10, 2023
Abstract:
The surge of machine learning (ML) in materials science and chemistry has been driven by advancements in deep learning methodologies. While many industrial scientists aspire to transition to a data-centric and AI-guided design paradigm, companies often deal with limited datasets and complex materials that require customised featurisation techniques. Moreover, commonly used ML techniques often grapple with issues of explainability and extrapolation into unexplored chemical spaces.
In this webinar, we demonstrate how Schrödinger’s tools can help overcome these common challenges by using a combination of physics-based simulation data, enterprise informatics, and chemistry-informed ML. We show how this synergistic approach can transform materials innovation across a wide range of technology verticals. Specifically, we will highlight case studies in the following areas:
- Using molecular dynamics simulations to generate descriptors that enhance the accuracy of ML models for viscosity predictions
- Developing explainable ML models to predict the ionic conductivity of Li-ion battery electrolytes
- Augmenting the performance of ML models for predicting properties such as absorption and emission wavelengths, fluorescence lifetime, and extinction coefficients of organic electronics using descriptors rooted in density functional theory
This integrated approach signifies a new frontier in materials science and chemistry, combining the strengths of ML and physics-informed methods.

Anand Chandrasekaran
Senior Principal Scientist