In this article, Canvas Product Manager Steve Dixon introduces two highly anticipated Canvas features, now available in Suite 2012. More information on Canvas can be found on the Canvas product page and in the related web seminar. Related video tutorials are available for both KPLS and Hole-Filling.
Background
Cheminformatics comprises a broad assortment of 2D concepts and methodologies that play critical roles in the early and middle stages of drug discovery. Library design, lead hopping, structure-activity analysis, and lead optimization are just a few of the areas that have been revolutionized by the advent of cheminformatics. While Canvas continues to offer a full spectrum of exceptional tools in this field, Suite 2012 is a breakthrough release that features a number of transformative technologies to address the most important needs of modelers and chemists. Two of those technologies are discussed here.
Hole-Filling and Library Optimization
Drug discovery programs usually commence with one or more primary screens that are designed to furnish high-potency leads. The ability to find such leads, and their ultimate viability, depends on how well screening libraries cover the relevant areas of chemical space. Methods that improve library coverage are therefore in high demand, and Canvas offers a new, highly effective strategy [1] to fill holes in a library by adding diverse compounds with desirable properties.
Figure 1 contains a schematic representation of hole-filling in a hypothetical chemical space. A reference library of compounds that fails to populate certain regions of that space is augmented with new compounds that are dissimilar to each other, dissimilar to the compounds in the reference library, and located in the regions that the reference library leaves empty. Canvas Hole-Filling achieves this by selecting a set of compounds at random from a large pool, and then iteratively replacing members of that set with other compounds from the pool so as to minimize the average nearest-neighbor fingerprint similarity in the combined collection. A greedy simulated annealing approach is used, which rapidly improves diversity without getting trapped in local minima.

Figure 1: Illustration of hole-filling. A reference library of compounds (red circles) is augmented with diverse compounds (blue squares), which occupy regions of chemical space not covered by the reference library.
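
The selection procedure described above can be illustrated with a short Python sketch. This is not the Canvas implementation: it assumes fingerprints are represented as sets of on-bits, uses Tanimoto similarity as the fingerprint metric (the article does not name one), and applies a plain Metropolis-style simulated annealing with a geometric cooling schedule in place of the greedy variant used in Canvas. The function names and all parameter defaults are hypothetical.

```python
import math
import random


def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def mean_nn_similarity(collection):
    """Average, over all compounds, of the similarity to the nearest neighbor."""
    total = 0.0
    for i, fp in enumerate(collection):
        total += max(tanimoto(fp, other)
                     for j, other in enumerate(collection) if j != i)
    return total / len(collection)


def fill_holes(reference, pool, n_select, n_steps=2000,
               t_start=0.05, t_end=0.001, seed=0):
    """Pick n_select pool compounds that minimize the average nearest-neighbor
    similarity of reference + selection (illustrative sketch only)."""
    if not 1 <= n_select < len(pool):
        raise ValueError("n_select must be at least 1 and smaller than the pool")
    if len(reference) + n_select < 2:
        raise ValueError("the combined collection needs at least two compounds")

    rng = random.Random(seed)
    indices = list(range(len(pool)))
    rng.shuffle(indices)  # random initial selection from the pool
    selected, unused = indices[:n_select], indices[n_select:]

    def score(sel):
        return mean_nn_similarity(list(reference) + [pool[i] for i in sel])

    current = score(selected)
    best, best_sel = current, list(selected)
    for step in range(n_steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        temp = t_start * (t_end / t_start) ** (step / max(1, n_steps - 1))
        s = rng.randrange(len(selected))
        u = rng.randrange(len(unused))
        selected[s], unused[u] = unused[u], selected[s]  # propose a swap
        trial = score(selected)
        delta = trial - current
        # Always accept improvements; accept uphill moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current = trial
            if current < best:
                best, best_sel = current, list(selected)
        else:
            selected[s], unused[u] = unused[u], selected[s]  # reject: undo the swap
    return best_sel, best
```

For clarity, the sketch recomputes the objective from scratch after every proposed swap, which scales quadratically with the size of the combined collection. A practical implementation for large libraries would more likely update only the nearest-neighbor values affected by each swap.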
