Webinar Archives

Apr 21

Taking Hit Identification to the Next Level by Screening Billions of Compounds Efficiently and Cost Effectively with Machine Learning Enabled DNA Encoded Libraries and Virtual Screening

Dr. Steven Jerome

Product Manager, Hit Discovery

The chemical space available to drug discovery is vast, estimated conservatively at 10^20 to 10^24 compounds, yet traditional structure-based experimental and virtual methods such as high-throughput screening and docking have been limited to around ten million compounds per screening campaign. Examining only a tiny fraction of available chemical space limits chemotype diversity and decoration, available IP space, and the scores and affinities obtained from virtual and experimental screening, respectively. To screen much larger chemical spaces, in the billions of compounds, cost effectively, two machine learning-enabled approaches have been developed. First, DNA encoded libraries (DEL) enable screening of billions of synthesized compounds but are limited by high rates of experimental false positives and false negatives. We describe a machine learning approach that uses experimental DEL results to identify false negatives and byproducts in a more favorable property space. Second, the advent of on-demand, synthesizable libraries has made multi-billion compound chemical spaces experimentally and virtually accessible. However, brute-force examination of such libraries incurs significant experimental and computational costs, which encourages the use of less accurate virtual screening techniques. To enable efficient chemical space exploration with an accurate scoring function, we have developed an active learning-based method employing AutoQSAR/DC machine learning and Glide SP docking as the learner. Results from Active Learning Glide screens of 100-million to billion-compound libraries show increased chemical diversity and improved GlideScore of hits relative to brute-force screening of subsets of the libraries. Results and costs from these two new methods suggest that billion-compound library screens should replace the smaller (1-10 million compound), traditional screens commonly employed today.
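The active learning idea in the abstract can be illustrated with a toy loop: dock a small batch, fit a cheap surrogate model to the resulting scores, use the surrogate to pick the next batch, and repeat. The sketch below is only an illustration under stated assumptions and is not the Active Learning Glide implementation. The function mock_dock_score is a hypothetical stand-in for an expensive docking calculation such as Glide SP, a simple k-nearest-neighbor average stands in for the AutoQSAR/DC model, compounds are represented by made-up three-number feature vectors, and the greedy "dock the best predicted compounds next" selection rule is an assumption.

```python
import random


def mock_dock_score(features):
    """Hypothetical stand-in for an expensive docking run (lower is better)."""
    weights = (3.0, 2.0, 1.0)
    return -sum(w * x for w, x in zip(weights, features))


def knn_predict(labeled, features, k=5):
    """Toy surrogate model: mean docked score of the k nearest labeled compounds."""
    def sq_dist(item):
        return sum((a - b) ** 2 for a, b in zip(item[0], features))

    nearest = sorted(labeled, key=sq_dist)[:k]
    return sum(score for _, score in nearest) / len(nearest)


def active_learning_screen(library, batch_size=50, rounds=4, seed=0):
    """Dock only batch_size * rounds compounds, chosen iteratively.

    Returns a dict mapping library index -> docked score.
    """
    rng = random.Random(seed)
    unlabeled = set(range(len(library)))
    scores = {}

    # Round 1 starts from a random batch; later batches come from the surrogate.
    batch = rng.sample(sorted(unlabeled), batch_size)
    for round_number in range(rounds):
        # "Dock" the current batch (the expensive step in a real campaign).
        for i in batch:
            scores[i] = mock_dock_score(library[i])
        unlabeled.difference_update(batch)

        if round_number == rounds - 1 or not unlabeled:
            break

        # Retrain the surrogate on everything docked so far, then greedily
        # select the compounds it predicts to score best.
        labeled = [(library[i], s) for i, s in scores.items()]
        ranked = sorted(
            sorted(unlabeled),
            key=lambda i: knn_predict(labeled, library[i]),
        )
        batch = ranked[:batch_size]
    return scores


if __name__ == "__main__":
    rng = random.Random(1)
    library = [tuple(rng.random() for _ in range(3)) for _ in range(1000)]
    docked = active_learning_screen(library)
    library_mean = sum(mock_dock_score(f) for f in library) / len(library)
    docked_mean = sum(docked.values()) / len(docked)
    print(f"docked {len(docked)} of {len(library)} compounds")
    print(f"mean score, whole library: {library_mean:.2f}")
    print(f"mean score, docked subset: {docked_mean:.2f}")
```

In this toy setting only a fraction of the library is ever docked, and the docked subset is enriched in well-scoring compounds because the surrogate steers each new batch. A production workflow would differ in the model, the molecular representation, and the selection strategy.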
