schrodinger.application.combinatorial_explorer.route_screener module

This module contains classes for performing fingerprint similarity screens within the combinatorial space of one or more synthetic routes.

Copyright Schrodinger LLC, All Rights Reserved.

class schrodinger.application.combinatorial_explorer.route_screener.RouteScreener(db_path, query_smiles, min_products=1000, max_combos=1000000, rand_seed=0)

Bases: object

Uses an empirical algorithm to select the most promising reactant combinations for a fingerprint similarity screen in the combinatorial space of one or more synthetic reaction routes. The basic approach sorts each set of reactants by decreasing Tversky similarity to a query structure, chooses a relatively small number of high-ranking reactants from each set, and performs systematic enumeration until the desired number of products is obtained. Tversky similarities are weighted to favor reactants that are substructures (or near substructures) of the query, which tends to yield products that resemble the query to a much higher degree than occurs with random enumeration.
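The ranking step described above can be illustrated with a minimal sketch. It is not the module's implementation: fingerprints are modeled as plain Python sets of on-bits, the names tversky and rank_reactants are hypothetical, and the weights alpha=0.0, beta=1.0 are an assumption chosen only to show how asymmetric weighting gives a reactant whose bits are all contained in the query a score of 1.0.

```python
def tversky(query_bits, reactant_bits, alpha, beta):
    """Tversky similarity between two fingerprints given as sets of on-bits.

    alpha weights bits found only in the query; beta weights bits found
    only in the reactant.
    """
    q, r = set(query_bits), set(reactant_bits)
    common = len(q & r)
    denom = common + alpha * len(q - r) + beta * len(r - q)
    return common / denom if denom else 0.0


def rank_reactants(query_bits, reactants, alpha=0.0, beta=1.0):
    """Return reactant names sorted by decreasing Tversky similarity.

    reactants maps a name to its set of on-bits. The default weights are
    an assumption for illustration: with alpha=0 the query-only bits are
    not penalized, so a reactant that is a bit-wise subset of the query
    scores 1.0.
    """
    return sorted(
        reactants,
        key=lambda name: tversky(query_bits, reactants[name], alpha, beta),
        reverse=True,
    )


query = {1, 2, 3, 4, 5, 6}
reactants = {
    "partial": {1, 2, 9, 10},   # shares half of its bits with the query
    "unrelated": {20, 21},      # shares nothing with the query
    "sub": {1, 2, 3},           # every bit is also set in the query
}
ranked = rank_reactants(query, reactants)
```

With symmetric weights (alpha = beta = 1) the same function reduces to Tanimoto similarity, under which the "sub" reactant would be penalized for the query bits it lacks.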

__init__(db_path, query_smiles, min_products=1000, max_combos=1000000, rand_seed=0)
Parameters
  • db_path (str) – Reactant fingerprint database file created by RfpDatabase (.rfpdb)

  • query_smiles (str) – SMILES string for query

  • min_products (int) – Minimum number of products per reaction route that must be successfully enumerated

  • max_combos (int) – Maximum number of reactant combinations to consider when attempting to enumerate the minimum number of products

  • rand_seed (int) – If a non-zero value is provided, reactants are selected randomly, rather than according to the empirical algorithm. This provides a means of comparing the algorithm to random enumeration.
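
The way min_products and max_combos bound the enumeration can be pictured as a budgeted loop. The following is only a sketch under stated assumptions, not the module's code: enumerate_products and the react callable are hypothetical, products are represented by hashable keys, and the reactant lists are assumed to be already sorted and truncated to the high-ranking entries.

```python
from itertools import islice, product


def enumerate_products(ranked_reactant_sets, react,
                       min_products=1000, max_combos=1000000):
    """Yield unique products from a systematic enumeration.

    ranked_reactant_sets: one list of reactants per reaction component,
        each assumed to be pre-sorted by decreasing similarity.
    react: hypothetical callable that maps a reactant combination to a
        hashable product key, or returns None if the reaction fails.
    Stops once min_products unique products have been yielded, or once
    max_combos combinations have been considered.
    """
    seen = set()
    # islice caps the number of combinations that are ever considered.
    for combo in islice(product(*ranked_reactant_sets), max_combos):
        prod = react(combo)
        if prod is None or prod in seen:
            continue  # failed reaction or duplicate product
        seen.add(prod)
        yield prod
        if len(seen) >= min_products:
            return


sets = [["A", "B"], ["x", "y", "z"]]
products = list(enumerate_products(sets, "".join, min_products=4))
```

Failed reactions and duplicate products still count against max_combos in this sketch, which is why the requested minimum is not guaranteed to be reached.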

screen(route_file, logger=None)

Given a JSON route file, this function yields unique products that tend to exhibit higher than average fingerprint similarities to the query. Yields until the minimum requested number of products has been generated, or until the maximum number of combinations has been considered.

Parameters
  • route_file (str) – Synthetic route file with reagent sources of the form <class>.pfx, where <class> is a reactant class within the reactant fingerprint database supplied to the constructor

  • logger (logging.Logger) – Logger to which progress messages should be written

Yield

The next enumerated product, with the similarity to the query stored in SIM_PROP

Type

rdkit.Chem.rdchem.Mol

Raise

RuntimeError if the route contains any reactant classes that are not present in the database
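
Because screen is a generator, a caller will typically consume it and keep only the best-scoring products. The helper below is a hypothetical sketch that works on any iterable of products. The commented usage shows how it might be combined with the class documented above; the file names and query SMILES are placeholders, and it is an assumption that SIM_PROP is a module-level constant of route_screener and that the similarity is stored as a floating-point property.

```python
import heapq


def top_hits(products, get_similarity, n=10):
    """Return the n products with the highest similarity, best first.

    products may be any iterable, including a generator; get_similarity
    is a callable that extracts the similarity score from one product.
    """
    return heapq.nlargest(n, products, key=get_similarity)


# Hypothetical usage (not executed here; names and paths are placeholders):
#
#   from schrodinger.application.combinatorial_explorer import route_screener
#
#   screener = route_screener.RouteScreener("reactants.rfpdb", "c1ccccc1C(=O)N")
#   try:
#       hits = top_hits(
#           screener.screen("route.json"),
#           # Assumes SIM_PROP lives in route_screener and holds a double.
#           lambda mol: mol.GetDoubleProp(route_screener.SIM_PROP),
#           n=25,
#       )
#   except RuntimeError as err:
#       # Raised when the route references a reactant class that is
#       # missing from the database.
#       print(f"Route cannot be screened: {err}")

# Stand-in data: (name, similarity) pairs instead of RDKit molecules.
demo = [("a", 0.2), ("b", 0.9), ("c", 0.5)]
best_two = top_hits(iter(demo), lambda p: p[1], n=2)
```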