schrodinger.application.combinatorial_explorer.driver_utils module

Provides supporting functionality for combinatorial_explorer_driver.py.

Copyright Schrodinger LLC, All Rights Reserved.

class schrodinger.application.combinatorial_explorer.driver_utils.RouteSourceType(value)

Bases: enum.Enum

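An enumeration of the kinds of route file source recognized by get_route_source_type: a comma-separated list of JSON files, a Zip archive, a directory, or an invalid source.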

list = 1
zip = 2
dir = 3
invalid = 4

schrodinger.application.combinatorial_explorer.driver_utils.add_build_args(parser)

Adds arguments for TASK_BUILD.

Parameters

parser (argparse.ArgumentParser) – Argument parser object.

schrodinger.application.combinatorial_explorer.driver_utils.add_describe_args(parser)

Adds arguments for TASK_DESCRIBE.

Parameters

parser (argparse.ArgumentParser) – Argument parser object.

schrodinger.application.combinatorial_explorer.driver_utils.add_reactants_to_database(reactants, reactant_class, smiles_name, fp_generator, db)

Helper function that computes fingerprints/properties for the subset of the supplied reactants that aren’t already in the database, and then adds the resulting reactants to the database.

Parameters
  • reactants (list[phase.RfpReactant]) – The reactants to add. Each reactant is expected to contain a dummy fingerprint and dummy properties, which will be replaced by an actual fingerprint and, optionally, actual properties if the reactant is not already in the database.

  • reactant_class (str) – The class to which all supplied reactants belong

  • smiles_name (list[tuple[str, str]]) – (SMILES, name) tuples used to identify reactants that are already in the database

  • fp_generator (canvas.ChmProfiler) – Fingerprint and property generator, configured to generate properties only if properties are being stored in the database

  • db (phase.RfpDatabase) – The database to which reactants are to be added

schrodinger.application.combinatorial_explorer.driver_utils.add_screen_args(parser)

Adds arguments for TASK_SCREEN.

Parameters

parser (argparse.ArgumentParser) – Argument parser object.

schrodinger.application.combinatorial_explorer.driver_utils.build_database(args, logger=None)

Adds records to a reactant fingerprint database. Creates a new database if args.new is True.

Parameters
  • args (argparse.Namespace) – Command line arguments

  • logger (logging.Logger) – Logger to which messages are to be written

Raises

RuntimeError if problems are encountered while building the database

schrodinger.application.combinatorial_explorer.driver_utils.describe_database(args)

Returns a string containing a description of the reactant fingerprint database.

Parameters

args (argparse.Namespace) – Command line arguments

Returns

Database description

Return type

str

schrodinger.application.combinatorial_explorer.driver_utils.extract_route_files(zip_archive, dest_dir=None, look_dir=None)

Extracts JSON route files from a Zip archive and returns the sorted names of the route files.

Parameters
  • zip_archive (str) – Zip archive with JSON files at any level

  • dest_dir (str) – The directory to which route files are to be extracted. The default is CWD.

  • look_dir (str) – Look in this directory for extracted route files. This parameter would normally be supplied if extracting to the CWD and one knows that the subdirectory look_dir will be created.

Returns

The sorted names of the extracted route files

Return type

list[str]

schrodinger.application.combinatorial_explorer.driver_utils.filter_routes(route_files, min_depth, max_depth, db_path)

Given a list of route files, this function returns the subset whose reaction depths lie within the specified range and for which all required reactants are found in the reactant fingerprint database.

Parameters
  • route_files (list[str]) – The route files to check

  • min_depth (int) – Minimum reaction depth

  • max_depth (int) – Maximum reaction depth

  • db_path (str) – Reactant fingerprint database file

Returns

Subset of route_files satisfying aforementioned conditions

Return type

list[str]

schrodinger.application.combinatorial_explorer.driver_utils.get_depth_limits(depth)

Given a depth specification of the form <min>:<max>, this function returns the minimum and maximum depth.

Parameters

depth (str) – min and max reaction depths separated by ‘:’

Returns

tuple of min and max values

Return type

tuple[int, int]

Raises

ValueError if specification is invalid
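
The parsing described above can be sketched as follows. This is a minimal illustration, not the Schrodinger implementation: the function name parse_depth_limits is hypothetical, and the range checks (non-negative depths, min not greater than max) are assumptions, since the documentation states only that an invalid specification raises ValueError.

```python
def parse_depth_limits(depth: str) -> tuple[int, int]:
    """Hypothetical stand-in for get_depth_limits; parses '<min>:<max>'."""
    parts = depth.split(":")
    if len(parts) != 2:
        raise ValueError(f"Expected <min>:<max>, got {depth!r}")
    try:
        min_depth, max_depth = int(parts[0]), int(parts[1])
    except ValueError:
        raise ValueError(f"Depths must be integers: {depth!r}") from None
    # Assumption: depths are non-negative and min may not exceed max.
    if min_depth < 0 or max_depth < min_depth:
        raise ValueError(f"Invalid depth range: {depth!r}")
    return min_depth, max_depth
```

Under these assumptions, parse_depth_limits("1:3") yields the tuple (1, 3), while "3:1" or "abc" raises ValueError.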

schrodinger.application.combinatorial_explorer.driver_utils.get_distributed_screen_commands(args, nsub)

Returns subjob commands for running a distributed screen.

Parameters
  • args (argparse.Namespace) – Command line arguments

  • nsub (int) – Number of subjobs

Returns

List of subjob commands

Return type

list[list[str]]

schrodinger.application.combinatorial_explorer.driver_utils.get_filter_properties(filter_file)

Returns a sorted, unique list of properties utilized in the provided property filter file.

Parameters

filter_file (str) – Name of property filter file

Returns

Sorted unique property names

Return type

list[str]

schrodinger.application.combinatorial_explorer.driver_utils.get_jobname(db_path, task)

Determines the job name from SCHRODINGER_JOBNAME or from the base name of the reactant fingerprint database file.

Parameters
  • db_path (str) – Reactant fingerprint database file

  • task (str) – The task being performed

Returns

Job name

Return type

str
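
A rough sketch of this lookup order is shown below. The name resolve_jobname is hypothetical, and the way the task is combined with the database base name (appended as a suffix) is an assumption; the documentation says only that the job name comes from SCHRODINGER_JOBNAME or from the base name of the database file.

```python
import os


def resolve_jobname(db_path: str, task: str) -> str:
    """Hypothetical stand-in for get_jobname."""
    # The environment variable, when set and non-empty, takes precedence.
    env_name = os.environ.get("SCHRODINGER_JOBNAME")
    if env_name:
        return env_name
    # Otherwise fall back to the database file name without its extension.
    base = os.path.splitext(os.path.basename(db_path))[0]
    # Assumption: the task is appended as a suffix; the real format may differ.
    return f"{base}_{task}"
```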

schrodinger.application.combinatorial_explorer.driver_utils.get_parser()

Creates argparse.ArgumentParser with supported command line options.

Returns

Argument parser object

Return type

argparse.ArgumentParser

schrodinger.application.combinatorial_explorer.driver_utils.get_pfx_files(args)

Returns the names of PathFinder reactant files if they are expected to exist on the current host. Does not actually check for the presence of the files.

Parameters

args (argparse.Namespace) – Command line arguments

Returns

List of file names or empty list

Return type

list[str]

schrodinger.application.combinatorial_explorer.driver_utils.get_reactant_classes_from_routes(db_path, source)

Given a reactant fingerprint database and a valid source of route files, this function returns a dictionary that maps each route file basename to the reactant classes utilized in that route. Ignores routes that contain any reactant classes not present in the database.

Parameters
  • db_path (str) – Reactant fingerprint database file

  • source (str) – Valid source of route files

Returns

Dictionary of route file basename to list of reactant classes

Return type

dict

schrodinger.application.combinatorial_explorer.driver_utils.get_route_files_from_zip(zip_archive)

Given a Zip archive purported to contain synthetic route files, this function returns a sorted list of the names of any JSON files in the archive.

Parameters

zip_archive (str) – Name of Zip archive

Returns

Sorted list of any JSON files in the archive

Return type

list[str]
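
For illustration, listing the JSON members of an archive can be done with the standard zipfile module. The sketch below is not the library code; list_json_members is a hypothetical name, and recognizing JSON files by a case-insensitive ".json" suffix is an assumption.

```python
import zipfile


def list_json_members(zip_archive: str) -> list[str]:
    """Return the sorted names of archive members that look like JSON files."""
    with zipfile.ZipFile(zip_archive) as archive:
        # namelist() includes members at any directory level in the archive.
        return sorted(
            name for name in archive.namelist()
            if name.lower().endswith(".json")
        )
```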

schrodinger.application.combinatorial_explorer.driver_utils.get_route_input_files(source)

Given a valid source of synthetic route files, this function returns the sorted names of the associated input files for the job. For a source that consists of a comma-separated list or a directory path, a list of JSON files is returned; for a Zip file, a list containing the name of the Zip file is returned.

Parameters

source (str) – Valid source of route files

Returns

Sorted list of JSON files or Zip file

Return type

list[str]

schrodinger.application.combinatorial_explorer.driver_utils.get_route_source_type(source)

Given the source of synthetic route files, this function determines whether it’s a comma-separated list of JSON files, a Zip archive, a directory or invalid.

Parameters

source (str) – Source of route files

Returns

Route source type

Return type

RouteSourceType
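
One way such a classification could work is sketched below. The enum mirrors the members documented for RouteSourceType, but classify_route_source is a hypothetical name and the specific checks (existing directory, valid Zip file, comma-separated names that all end in ".json" and exist on disk) are assumptions rather than the documented behavior.

```python
import enum
import os
import zipfile


class RouteSourceType(enum.Enum):
    """Local copy of the documented enum members, for illustration only."""
    list = 1
    zip = 2
    dir = 3
    invalid = 4


def classify_route_source(source: str) -> RouteSourceType:
    """Hypothetical stand-in for get_route_source_type."""
    if os.path.isdir(source):
        return RouteSourceType.dir
    if os.path.isfile(source) and zipfile.is_zipfile(source):
        return RouteSourceType.zip
    # Assumption: a list source names only existing files ending in .json.
    names = [name.strip() for name in source.split(",")]
    if all(n.lower().endswith(".json") and os.path.isfile(n) for n in names):
        return RouteSourceType.list
    return RouteSourceType.invalid
```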

schrodinger.application.combinatorial_explorer.driver_utils.locate_pfx_files(args)

Determines whether PathFinder reactant files must exist and, if so, the directory in which they should be located. Does not actually check for the presence of the files.

Parameters

args (argparse.Namespace) – Command line arguments

Returns

tuple of whether files must exist and path to files

Return type

tuple[bool, str]

schrodinger.application.combinatorial_explorer.driver_utils.log_build_progress(logger, nproc, db)

Writes a build database progress message to the supplied logger if it exists.

Parameters
  • logger (logging.Logger or NoneType) – Logger to which progress message is to be written

  • nproc (int) – The total number of reactants processed thus far

  • db (phase.RfpDatabase) – Reactant database to which reactants are being added

schrodinger.application.combinatorial_explorer.driver_utils.split_route_files(args, cleanup=False)

Splits input route files over a series of Zip files named <jobname>_sub_<1>_routes.zip, <jobname>_sub_<2>_routes.zip, etc. The file <jobname>_sub_<i>_routes.zip holds a directory named <jobname>_sub_<i>_routes, which, in turn, holds one or more JSON route files. The number of Zip files created is the smaller of the number of subjobs requested and the total number of usable input route files. The names of the Zip files are returned.

Parameters
  • args (argparse.Namespace) – Command line arguments

  • cleanup (bool) – Whether to remove directories created to hold route files being zipped or unzipped

Returns

The names of the Zip files created

Return type

list[str]

Raises

RuntimeError if no route files remain after filtering on reaction depth and database reactant classes
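
The naming scheme and the limit on the number of archives can be illustrated with the sketch below. It is not the library implementation: split_routes_into_zips and its parameters are hypothetical, filtering on depth and reactant classes is assumed to have happened already, and the round-robin assignment of routes to archives is an assumption, since the documentation does not say how routes are partitioned.

```python
import os
import zipfile


def split_routes_into_zips(route_files, jobname, nsub, out_dir="."):
    """Hypothetical sketch: spread route files over at most nsub Zip files."""
    if not route_files:
        raise RuntimeError("No route files to split")
    # Never create more archives than there are route files.
    nzip = min(nsub, len(route_files))
    zip_names = []
    for i in range(nzip):
        stem = f"{jobname}_sub_{i + 1}_routes"
        zip_name = os.path.join(out_dir, stem + ".zip")
        with zipfile.ZipFile(zip_name, "w") as archive:
            # Assumption: round-robin assignment of routes to subjobs.
            for route_file in route_files[i::nzip]:
                # Each archive holds one directory named after the archive.
                arcname = f"{stem}/{os.path.basename(route_file)}"
                archive.write(route_file, arcname)
        zip_names.append(zip_name)
    return zip_names
```

With three route files and five requested subjobs, this sketch creates three archives, one route file in each.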

schrodinger.application.combinatorial_explorer.driver_utils.validate_build_args(args)

Checks the validity of command line arguments for TASK_BUILD.

Parameters

args (argparse.Namespace) – Command line arguments

Returns

tuple of validity and non-empty error message if not valid

Return type

tuple[bool, str]

schrodinger.application.combinatorial_explorer.driver_utils.validate_database_args(args)

Checks the validity of the reactant fingerprint database.

Parameters

args (argparse.Namespace) – Command line arguments

Returns

tuple of validity and non-empty error message if not valid

Return type

tuple[bool, str]

schrodinger.application.combinatorial_explorer.driver_utils.validate_describe_args(args)

Checks the validity of command line arguments for TASK_DESCRIBE.

Parameters

args (argparse.Namespace) – Command line arguments

Returns

tuple of validity and non-empty error message if not valid

Return type

tuple[bool, str]

schrodinger.application.combinatorial_explorer.driver_utils.validate_pfx_files(args)

Checks for the presence of PathFinder reactant files if they are expected to exist.

Parameters

args (argparse.Namespace) – Command line arguments

Returns

tuple of validity and non-empty error message if not valid

Return type

tuple[bool, str]

schrodinger.application.combinatorial_explorer.driver_utils.validate_property_filters(args)

Verifies that the property filter file exists and, if not at startup time, that all specified properties exist in the database.

Parameters

args (argparse.Namespace) – Command line arguments

Returns

tuple of validity and non-empty error message if not valid

Return type

tuple[bool, str]

schrodinger.application.combinatorial_explorer.driver_utils.validate_routes(source)

Verifies that source is a legal source of synthetic route files.

Parameters

source (str) – A comma-separated list of JSON route files, a Zip archive containing one or more route files or a path to a directory containing route files

Returns

tuple of validity and non-empty error message if not valid

Return type

tuple[bool, str]

schrodinger.application.combinatorial_explorer.driver_utils.validate_screen_args(args)

Checks the validity of command line arguments for TASK_SCREEN.

Parameters

args (argparse.Namespace) – Command line arguments

Returns

tuple of validity and non-empty error message if not valid

Return type

tuple[bool, str]

schrodinger.application.combinatorial_explorer.driver_utils.validate_args(args)

Checks the validity of command line arguments.

Parameters

args (argparse.Namespace) – Command line arguments

Returns

tuple of validity and non-empty error message if not valid

Return type

tuple[bool, str]
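
The validate_* functions above share a tuple[bool, str] return convention. The sketch below shows how a caller might chain checks that follow it, stopping at the first failure. All names here (check_db, check_depth, run_validators) and the attributes args.db and args.depth are hypothetical; the actual attributes of the parsed arguments are not documented in this section.

```python
import argparse


def check_db(args):
    """Hypothetical validator: require a non-empty database path."""
    if not getattr(args, "db", None):
        return False, "No reactant fingerprint database specified"
    return True, ""


def check_depth(args):
    """Hypothetical validator: require a '<min>:<max>' depth string."""
    if ":" not in getattr(args, "depth", ""):
        return False, "Depth must have the form <min>:<max>"
    return True, ""


def run_validators(args, validators):
    """Run each validator in order and return the first failure, if any."""
    for validator in validators:
        valid, message = validator(args)
        if not valid:
            return False, message
    return True, ""


args = argparse.Namespace(db="reactants.db", depth="1:3")
valid, message = run_validators(args, [check_db, check_depth])
```

Returning the message alongside the flag lets the driver decide how to report the error, for example through an argument parser's error method or a logger.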