schrodinger.utils.ligfilter module

Support module for Ligfilter applications, including parsing functions, filtering criteria, constants, and setting up of the default composite SMARTS patterns.

The basic idea is to provide a set of criteria for filtering structures based on properties, function evaluation, or collections of SMARTS patterns. These criteria can be easily specified in an external file.

Examples of criteria definitions:

    Molecular_weight < 300    A predefined criterion type
    i_qp_#amide >= 1          A property-based criterion
    Alcohols == 0             A SMARTS definition matching criterion
    s_sd_Asinex               A check for the existence of a property

General terminology used in the documentation of this module:
  • SMARTS expression - a SMARTS string

  • DEFINITION - a named definition, which can be simple (i.e., just a SMARTS expression) or composite (including/excluding multiple definitions, whether simple or composite).

  • KEY - a definition name or predefined function (e.g., Num_atoms)

  • CRITERION - a filtering condition

Copyright Schrodinger, LLC. All rights reserved.

schrodinger.utils.ligfilter.mysplit(thestr)

Special version of thestr.split()

The following string: “criteria<value” will be split into: [“criteria”, “<”, “value”]

Implemented so that spaces are no longer required in criteria.
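The splitting behavior described above can be illustrated with a short regular-expression sketch. This is not the module's implementation: the function name split_criterion and the set of recognized operators are assumptions made for illustration only.

```python
import re

# Assumed operator set; the operators ligfilter actually accepts may differ.
# Two-character operators are listed first so "<=" is not split as "<" and "=".
_OPERATOR_RE = re.compile(r"(<=|>=|==|!=|<|>)")


def split_criterion(text):
    """Split a criterion string into tokens, with or without spaces (hypothetical helper)."""
    tokens = []
    for chunk in text.split():
        # re.split with a capturing group keeps the operators in the result.
        tokens.extend(part for part in _OPERATOR_RE.split(chunk) if part)
    return tokens


print(split_criterion("criteria<value"))
print(split_criterion("Molecular_weight <= 300"))
```

Splitting on whitespace first and on operators second lets both the spaced and the unspaced spelling of a criterion produce the same token list.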

class schrodinger.utils.ligfilter.Criterion(name=None, compstr=None)

Bases: object

A base class for a structure matching criterion. Each instance will test a structure for some property and indicate whether it passes or not.

Attributes

type - The classification of the Criterion. Can be PREDEFINED, PROPERTY, or SMARTS.

__init__(name=None, compstr=None)

Parameters

name - the name of the Criterion. See subclasses for meaning, as it depends on the implementation.

compstr - a comparison string for evaluating the value of the named property. Examples are ‘VALUE < 300’ or ‘VALUE >= 1’.

If name or compstr is not specified, the parseLine() method should be used.

If a PROPERTY Criterion has no operator or value, the Criterion only tests for the existence of the property in the tested structure.

compstr is a single string rather than two values (operator and number) in order to support Ev:50600 (the ability to create criteria with multiple boolean conditions).

setCompStr(compstr)

Set the compstr attributes according to the specified line. Raises RuntimeError if the string is invalid.

parseLine(line)
Parse a line of the form:

<name> (Property criterion only)

or:

<name> <oper> <value>

or:

<name> <oper> <value> AND/OR <oper> <value>

Set the name and compstr attributes from the parsed line. Raises RuntimeError if the string is invalid.

match_compstr(value)

Return True if the value matches self.compstr, False if not.
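As an illustration of how a comparison string might be evaluated against a value, the following sketch handles strings such as ‘VALUE < 300’ and simple AND/OR combinations. It is not the module's code: the function name value_matches, the operator table, the numeric-only values, and the strict left-to-right evaluation of AND/OR are all assumptions.

```python
import operator

# Assumed operator table, for illustration only.
_OPS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
    "!=": operator.ne,
}


def value_matches(value, compstr):
    """Evaluate e.g. 'VALUE < 300' or 'VALUE >= 1 AND VALUE < 5' (hypothetical helper)."""
    # Drop the VALUE placeholder so the tokens read:
    # <oper> <number> [AND|OR <oper> <number>] ...
    tokens = [tok for tok in compstr.split() if tok != "VALUE"]
    result = _OPS[tokens[0]](value, float(tokens[1]))
    pos = 2
    while pos < len(tokens):
        joiner, oper, number = tokens[pos:pos + 3]
        outcome = _OPS[oper](value, float(number))
        # Conditions are combined strictly left to right (no precedence).
        if joiner == "AND":
            result = result and outcome
        elif joiner == "OR":
            result = result or outcome
        else:
            raise RuntimeError("Invalid comparison string: %r" % compstr)
        pos += 3
    return result


print(value_matches(250, "VALUE < 300"))
print(value_matches(3, "VALUE >= 1 AND VALUE < 5"))
```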

matches(st, addprops=False)

Return True if the structure ‘st’ matches the criterion, False if not. Override this method in the subclass.

st (Structure) - Structure object

addprops (bool) - whether to add properties for each description

getvalue(st)

Return the value of this criterion in the structure ‘st’. Override this method in the subclass.

schrodinger.utils.ligfilter.Num_rings(st)

Return the number of rings in the structure ‘st’.

schrodinger.utils.ligfilter.Num_aromatic_rings(st)

Return the number of aromatic rings in the structure ‘st’.

schrodinger.utils.ligfilter.Num_aliphatic_rings(st)

Return the number of aliphatic rings in the structure ‘st’.

schrodinger.utils.ligfilter.Num_heteroaromatic_rings(st)

Return the number of aromatic rings containing heteroatoms (N, O, S) in the structure ‘st’.

schrodinger.utils.ligfilter.Num_rotatable_bonds(st)

Return the number of rotatable bonds in the structure ‘st’, as determined by structutils.analyze.get_num_rotatable_bonds().

schrodinger.utils.ligfilter.Num_atoms(st)

Return the number of atoms in the structure ‘st’.

schrodinger.utils.ligfilter.Num_heavy_atoms(st)

Return the number of non-hydrogen atoms in the structure ‘st’.

schrodinger.utils.ligfilter.Num_molecules(st)

Return the number of molecules in the structure ‘st’.

schrodinger.utils.ligfilter.Num_residues(st)

Return the number of residues in the structure ‘st’.

schrodinger.utils.ligfilter.Molecular_weight(st)

Return the total molecular weight of the structure ‘st’.

schrodinger.utils.ligfilter.Num_chiral_centers(st)

Return the number of chiral centers in the structure ‘st’, as determined by structutils.analyze.get_chiral_atoms().

schrodinger.utils.ligfilter.Total_charge(st)

Return the total formal charge of the structure ‘st’.

schrodinger.utils.ligfilter.Num_positive_atoms(st)

Return the number of positive atoms in the structure ‘st’.

schrodinger.utils.ligfilter.Num_negative_atoms(st)

Return the number of negative atoms in the structure ‘st’.

schrodinger.utils.ligfilter.Molecular_formula(st)
schrodinger.utils.ligfilter.Percent_helix(st)
schrodinger.utils.ligfilter.Percent_strand(st)
schrodinger.utils.ligfilter.Percent_loop(st)
class schrodinger.utils.ligfilter.PropertyCriterion(name=None, compstr=None)

Bases: schrodinger.utils.ligfilter.Criterion

A structure matching criterion that acts on the presence or value of a specific structure property.

If no comparison string is provided, the criterion will check for the presence of property ‘name’. Otherwise it will compare the value against the comparison string definition.
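The two modes described above (existence check versus value comparison) can be sketched with a plain dictionary standing in for a structure's properties. The helper name property_matches, the made-up property values, and the choice to treat a missing property as a failed match even when a comparison is supplied are assumptions, not documented behavior.

```python
def property_matches(props, name, compare=None):
    """Sketch of a property criterion (hypothetical helper).

    props   - mapping of property names to values; a stand-in for the
              properties stored on a structure
    name    - the property name to test
    compare - optional one-argument predicate playing the role of the
              comparison string
    """
    if name not in props:
        # Assumption: a missing property never matches.
        return False
    if compare is None:
        # No comparison given: the criterion is just "property exists".
        return True
    return compare(props[name])


# Made-up property values for illustration.
props = {"i_qp_#amide": 2, "s_sd_Asinex": "example-id"}

print(property_matches(props, "s_sd_Asinex"))
print(property_matches(props, "i_qp_#amide", lambda v: v >= 1))
```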

__init__(name=None, compstr=None)

Parameters

name - the name of the property being evaluated

compstr - the property comparison string to be used if present, currently in the format “<operator> <value>”

If name or compstr is not specified, the parseLine() method should be used.

matches(st, addprops=False)

Return True if structure ‘st’ matches this criterion, False if not.

st (Structure) - Structure object

addprops (bool) - ignored for property criteria

getvalue(st)

Return the value of the property for this structure. Returns None if the property does not exist.

match_compstr(value)

Return True if the value matches self.compstr, False if not.

parseLine(line)
Parse a line of the form:

<name> (Property criterion only)

or:

<name> <oper> <value>

or:

<name> <oper> <value> AND/OR <oper> <value>

Set the name and compstr attributes from the parsed line. Raises RuntimeError if the string is invalid.

setCompStr(compstr)

Set the compstr attributes according to the specified line. Raises RuntimeError if the string is invalid.

class schrodinger.utils.ligfilter.SmartsCriterion(definition, compstr=None)

Bases: schrodinger.utils.ligfilter.Criterion

A structure matching criterion that looks for a match to a Definition instance, which is composed of a collection of SMARTS patterns.

For example, for the Definition ‘TwoCarbons’ that matches against the SMARTS pattern [#6][#6], the comparison string

TwoCarbons < 40

will match if there are fewer than 40 carbon-carbon bonds in the structure.

__init__(definition, compstr=None)

Parameters

definition - a Definition instance, specifying the SMARTS pattern(s) to be included and excluded

compstr - the comparison string to be used. Currently equal to “<operator> <value>”. Used by the expand() method.

If compstr is not specified, the parseLine() method should be used.

matches(st, addprops=False)

Return True if structure ‘st’ matches this criterion, False if not.

Current matching behavior is to count the number of matches in the definition.includes() list that do not have any overlapping atoms with matches in the definition.excludes() list.

st (Structure) - Structure object

addprops (bool) - whether to add properties for each description

getvalue(st)

Return the number of times that definition.includes() patterns match the structure but do not overlap with any definition.excludes() patterns.
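The counting rule can be sketched independently of any SMARTS engine by representing each match as a tuple of atom indices. The function name count_non_overlapping and the match representation are assumptions; the real method obtains its matches by evaluating the SMARTS patterns against the structure.

```python
def count_non_overlapping(include_matches, exclude_matches):
    """Count include matches that share no atom with any exclude match.

    Each match is a sequence of atom indices, as a SMARTS matcher might
    return it (hypothetical representation).
    """
    excluded_atoms = set()
    for match in exclude_matches:
        excluded_atoms.update(match)
    return sum(
        1 for match in include_matches if excluded_atoms.isdisjoint(match)
    )


# Three include matches; the second shares atom 3 with the exclude match.
includes = [(1, 2), (2, 3), (5, 6)]
excludes = [(3, 4)]
print(count_non_overlapping(includes, excludes))
```

Collecting the excluded atoms into one set up front keeps the overlap test to a single isdisjoint() call per include match.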

expand(definitions)

Generate a new SmartsCriterion from the current one in which the definition.includes() and definition.excludes() are expanded from the definitions list.

match_compstr(value)

Return True if the value matches self.compstr, False if not.

parseLine(line)
Parse a line of the form:

<name> (Property criterion only)

or:

<name> <oper> <value>

or:

<name> <oper> <value> AND/OR <oper> <value>

Set the name and compstr attributes from the parsed line. Raises RuntimeError if the string is invalid.

setCompStr(compstr)

Set the compstr attributes according to the specified line. Raises RuntimeError if the string is invalid.

class schrodinger.utils.ligfilter.PredefinedCriterion(name=None, compstr=None)

Bases: schrodinger.utils.ligfilter.Criterion

A structure matching criterion that acts on the value of a predefined function applied to the structure.

Currently available functions are:

    Num_rings
    Num_aromatic_rings
    Num_aliphatic_rings
    Num_heteroaromatic_rings
    Num_rotatable_bonds
    Num_atoms
    Molecular_weight
    Num_chiral_centers
    Total_charge
    Num_positive_atoms
    Num_negative_atoms

For example, one definition parseable from the external file is:

Num_rings == 0

__init__(name=None, compstr=None)

Parameters

name - the name of the function to use. Allowed values are those in ligfilter.PREDEFINED_KEYS.

compstr - the comparison string to evaluate the result of the predefined function against

matches(st, addprops=False)

Return True if structure ‘st’ matches this criterion, False if not.

st (Structure) - Structure object

addprops (bool) - whether to add properties for each description

getvalue(st)

Return the value of the predefined function applied to ‘st’.

For example, return the number of rings, or number of rotatable bonds.

match_compstr(value)

Return True if the value matches self.compstr, False if not.

parseLine(line)
Parse a line of the form:

<name> (Property criterion only)

or:

<name> <oper> <value>

or:

<name> <oper> <value> AND/OR <oper> <value>

Set the name and compstr attributes from the parsed line. Raises RuntimeError if the string is invalid.

setCompStr(compstr)

Set the compstr attributes according to the specified line. Raises RuntimeError if the string is invalid.

class schrodinger.utils.ligfilter.AslCriterion(asl)

Bases: schrodinger.utils.ligfilter.Criterion

This criterion considers a Structure as matching if the stored ASL expression matches at least one atom.

__init__(asl)
Parameters

asl - the ASL expression string.

matches(st, addprops=False)

Return True if structure ‘st’ matches this ASL criterion, False if not.

st (Structure) - Structure object

addprops (bool) - whether to add properties for each description

getvalue(st)

Return True if the structure ‘st’ matches this ASL, False otherwise.

match_compstr(value)

Return True if the value matches self.compstr, False if not.

parseLine(line)
Parse a line of the form:

<name> (Property criterion only)

or:

<name> <oper> <value>

or:

<name> <oper> <value> AND/OR <oper> <value>

Set the name and compstr attributes from the parsed line. Raises RuntimeError if the string is invalid.

setCompStr(compstr)

Set the compstr attributes according to the specified line. Raises RuntimeError if the string is invalid.

class schrodinger.utils.ligfilter.Definition(name, includes=[], excludes=[], group=None)

Bases: object

A class that defines a collection of SMARTS patterns for matching against. The includes() method returns a list of those patterns that should be matched, and the excludes() method returns those that shouldn’t.

__init__(name, includes=[], excludes=[], group=None)

Parameters

includes - a list of SMARTS patterns to count

excludes - a list of SMARTS patterns that can be used to exclude matches in the includes list

group - name of the group that this definition is part of (optional). See Ev:50599

addKey(key, positive=True)

Add the SMARTS pattern ‘key’ to the list of desired matches (includes) if ‘positive’ is True, and to the list of unwanted matches (excludes) if ‘positive’ is False.

removeKey(key)

Remove the SMARTS pattern ‘key’ from the wanted or unwanted matches list.

includes()

Return a list of wanted matches.

excludes()

Return a list of unwanted matches.

expand(definitions)

Generate a new Definition from the current one in which the includes and excludes are expanded from the provided ‘definitions’ dictionary.
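One way to picture the expansion of composite definitions is a recursive lookup that replaces definition names with their SMARTS patterns. The sketch below is an assumption-heavy illustration rather than the module's algorithm: definitions are modeled as a dictionary of (includes, excludes) tuples, the definition names and SMARTS strings are made up, cyclic definitions are not handled, and only the included patterns of an excluded definition are carried over.

```python
def expand(name, definitions):
    """Return (includes, excludes) as flat lists of SMARTS strings.

    definitions maps a name to a tuple (includes, excludes) whose entries
    are either SMARTS strings or names of other definitions
    (hypothetical data model; assumes no cyclic references).
    """
    raw_includes, raw_excludes = definitions[name]
    includes, excludes = [], []
    for key in raw_includes:
        if key in definitions:
            sub_includes, sub_excludes = expand(key, definitions)
            includes.extend(sub_includes)
            excludes.extend(sub_excludes)
        else:
            includes.append(key)  # a plain SMARTS expression
    for key in raw_excludes:
        if key in definitions:
            # Assumption: only the included patterns of an excluded
            # definition are used for exclusion.
            excludes.extend(expand(key, definitions)[0])
        else:
            excludes.append(key)
    return includes, excludes


# Hypothetical definitions for illustration.
definitions = {
    "Hydroxyl": (["[OX2H]"], []),
    "Carboxylic_acid": (["[CX3](=O)[OX2H1]"], []),
    "Alcohols": (["Hydroxyl"], ["Carboxylic_acid"]),
}
print(expand("Alcohols", definitions))
```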

class schrodinger.utils.ligfilter.CriterionParser(definitions_dict=None)

Bases: object

A class for parsing a general property or predefined matching criterion.

__init__(definitions_dict=None)
error(err)

Print the error and exit.

expression_error(msg)

Print an error about an invalid expression and exit.

parse(line, line_num=None)

Create a Criterion object from a string. The method expects an input line of the form

<name>

…or…

<name> <operator> <value>

The first form is valid only for property criteria.

If the instance has a ‘definitions_dict’, definition criteria will be checked against it for validity.

Returns a Criterion.
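The documentation does not state how the parser decides which kind of criterion a name refers to. The sketch below shows one plausible approach, assuming that property names follow the usual type_owner_name pattern (for example i_qp_#amide or s_sd_Asinex) and that predefined names come from a fixed set. The function name classify_key, the regular expression, and the abbreviated set of predefined names are assumptions.

```python
import re

# A subset of the predefined function names listed in this documentation.
PREDEFINED = {"Num_rings", "Num_atoms", "Molecular_weight", "Total_charge"}

# Assumed shape of a property name: a type letter, an owner, then the name.
_PROPERTY_RE = re.compile(r"^[birs]_[^_\s]+_\S+$")


def classify_key(name, definitions=None):
    """Return 'PREDEFINED', 'PROPERTY', or 'SMARTS' for a criterion name."""
    if name in PREDEFINED:
        return "PREDEFINED"
    if _PROPERTY_RE.match(name):
        return "PROPERTY"
    # Only validate definition names when a dictionary was supplied.
    if definitions is not None and name not in definitions:
        raise ValueError("Unknown definition: %s" % name)
    return "SMARTS"


print(classify_key("Molecular_weight"))
print(classify_key("i_qp_#amide"))
print(classify_key("Alcohols", {"Alcohols": None}))
```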

class schrodinger.utils.ligfilter.DefinitionParser

Bases: object

A class for parsing a (possibly multi-line) specification of a Definition.

__init__()
error(err)

Print an error and exit.

parse(lines, line_num=None, group=None)

Return a Definition from a list of lines. No expansion of definitions is done.

General pattern of the specification is

DEFINE <name> <SMARTS pattern>

or

DEFINE <name>
(+ include_definition)*
(- exclude_definition)*

Where the asterisk indicates zero or more of each of the include and exclude definitions.

Options:

line_num - current line of the file being parsed (for error handling)

group - name of the definition group (or None, if there is no group)
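The specification pattern above can be sketched as a small parser that returns plain data instead of a Definition object. This is an illustration only: the function name parse_definition, the tuple it returns, the ValueError it raises, and the assumption that each include or exclude entry sits on its own line are not taken from the module.

```python
def parse_definition(lines):
    """Parse one DEFINE block into (name, includes, excludes).

    Hypothetical helper; no expansion of nested definitions is done.
    """
    header = lines[0].split(None, 2)
    if len(header) < 2 or header[0] != "DEFINE":
        raise ValueError("Expected 'DEFINE <name>': %r" % lines[0])
    name = header[1]
    includes, excludes = [], []
    if len(header) == 3:
        # Simple definition: the SMARTS pattern follows the name.
        includes.append(header[2].strip())
    for raw in lines[1:]:
        line = raw.strip()
        if not line:
            continue
        sign, key = line[0], line[1:].strip()
        if sign == "+" and key:
            includes.append(key)
        elif sign == "-" and key:
            excludes.append(key)
        else:
            raise ValueError("Expected '+ <name>' or '- <name>': %r" % raw)
    return name, includes, excludes


print(parse_definition(["DEFINE Hydroxyl [OX2H]"]))
print(parse_definition(["DEFINE Alcohols", "+ Hydroxyl", "- Carboxylic_acid"]))
```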

schrodinger.utils.ligfilter.read_keys(fh, validate=False, validdefinitions=None)

Generate lists of Definitions and Criteria from an iterator ‘fh’ that returns a line at a time of the Definition and Criteria specification. For example, this iterator can be an open file or a list of strings.

If ‘validate’ is True, definition names in criteria will be checked against known Definitions, including those previously read from ‘fh’ and passed in via ‘validdefinitions’. No expansion of Definitions is done.

Return a tuple of (Definition list, Criterion list).
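To show how a line iterator could be separated into definition blocks and criterion lines before parsing, here is a minimal sketch. It is not the read_keys implementation: the function name split_specification, the rule that lines starting with + or - continue the preceding DEFINE block, and the skipping of blank lines are assumptions.

```python
def split_specification(lines):
    """Group specification lines into definition blocks and criterion lines.

    Hypothetical helper: returns (definition_blocks, criterion_lines),
    where each definition block is the list of lines for one DEFINE.
    """
    definition_blocks = []
    criterion_lines = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.split()[0] == "DEFINE":
            definition_blocks.append([line])
        elif line[0] in "+-" and definition_blocks:
            # Assumed continuation of the most recent DEFINE block.
            definition_blocks[-1].append(line)
        else:
            criterion_lines.append(line)
    return definition_blocks, criterion_lines


# Any iterable of lines works, for example an open file or a list.
spec = [
    "DEFINE Hydroxyl [OX2H]",
    "DEFINE Alcohols",
    "+ Hydroxyl",
    "",
    "Alcohols == 0",
    "Molecular_weight < 300",
]
blocks, criteria = split_specification(spec)
print(blocks)
print(criteria)
```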

schrodinger.utils.ligfilter.get_default_criterion_parser()

Returns a CriterionParser with the default definitions.

schrodinger.utils.ligfilter.generate_criterion(condition, cp=None)

Returns a Criterion object for the specified condition string, for example “Num_atoms < 100” (Ev:55805).

The returned criterion can be then used as follows:

    if criterion.matches(st):
        <do>

Optionally a CriterionParser (cp) may be specified; otherwise default definitions will be used.

schrodinger.utils.ligfilter.st_matches_criteria(st, criteria_list, match_any=False, addprops=False)

If the specified structure matches the criteria, returns None. If it does not match, a string explaining the reason is returned.

match_any - if True, st is considered to match if it matches at least one criterion; otherwise all criteria must be matched.

addprops - if True, properties for each descriptor are added to st.