schrodinger.application.matsci.automated_cg_mapping module

Map an all-atom structure to a DPD coarse-grained one.

Copyright Schrodinger, LLC. All rights reserved.

schrodinger.application.matsci.automated_cg_mapping.find_new_group_id(groupids)

Find a new key that is not already contained in the provided list of group ids

Parameters

groupids (list) – the list of groupids

Return type

int

Returns

id for a new key
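
A minimal sketch of how an unused id could be chosen, assuming the ids are non-negative integers. The helper name sketch_new_group_id and the "largest id plus one" strategy are illustrative assumptions, not the module's actual implementation.

```python
def sketch_new_group_id(groupids):
    """Return an integer id that is not present in groupids (hypothetical helper)."""
    # Assumption: one past the largest existing id is always unused;
    # an empty collection starts at 0.
    return max(groupids, default=-1) + 1


existing = [0, 1, 4]
new_id = sketch_new_group_id(existing)
```

Any strategy that returns an integer absent from the input would satisfy the documented contract equally well.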

exception schrodinger.application.matsci.automated_cg_mapping.MappingError

Bases: Exception

Exception raised when mapping fails.

__init__(*args, **kwargs)
args
with_traceback()

Exception.with_traceback(tb) – set self.__traceback__ to tb and return self.

class schrodinger.application.matsci.automated_cg_mapping.CGGraphHelper

Bases: object

Includes methods to generate and analyze graphs

static classifyNodeClusters(node_clusters, st_graph)

Sort the given node clusters. Clusters that produce isomorphic subgraphs are considered equivalent, and any nodes that seed equivalent clusters are assigned to the same category together.

Parameters
  • node_clusters (dict) – dictionary of node clusters where the key is the node index and the value is the list of cluster nodes

  • st_graph (networkx.Graph) – The structure graph

Return type

Mapping

Returns

Mapping of symmetry ids and the nodes that belong to the same category

static getPathLength(bead_nodes, st_graph)

Get the longest path length between nodes in a bead

Parameters
  • bead_nodes (list) – list of nodes in the bead

  • st_graph (networkx.Graph) – input graph

Return type

float

Returns

The longest path length between the nodes

static getCGGraph(mapping_cg, st_graph)

Get coarse-grained graph from mapping

Parameters
  • mapping_cg (Mapping) – mapping object

  • st_graph (networkx.Graph) – input full graph

Return type

networkx.Graph

Returns

the coarse grained graph

static getDualGraph(st_graph, node_clusters=None)

In a dual graph, nodes i and j are connected only if their corresponding node clusters have no overlapping nodes. If node_clusters is None, then the dual graph is constructed by connecting nodes i and j if they are at least 3 bonds away from each other.

Parameters
  • st_graph (networkx.Graph) – input graph

  • node_clusters (dict) – dictionary of node clusters for each node in the graph

Return type

networkx.Graph

Returns

the dual graph
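
The two connection rules described for getDualGraph can be sketched with plain dictionaries. This is only an illustration under stated assumptions: the adjacency-dict layout, the helper names bond_distances and sketch_dual_edges, and the treatment of unreachable nodes are hypothetical, and the real method works on networkx.Graph objects.

```python
from collections import deque
from itertools import combinations


def bond_distances(adjacency, source):
    """Breadth-first search: number of bonds from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adjacency[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def sketch_dual_edges(adjacency, node_clusters=None):
    """Return the set of dual-graph edges (i, j) with i < j."""
    edges = set()
    for i, j in combinations(sorted(adjacency), 2):
        if node_clusters is not None:
            # Connect nodes whose clusters share no node.
            connected = not set(node_clusters[i]) & set(node_clusters[j])
        else:
            # Connect nodes that are at least 3 bonds apart.
            # Assumption: unreachable nodes count as infinitely far apart.
            connected = bond_distances(adjacency, i).get(j, float('inf')) >= 3
        if connected:
            edges.add((i, j))
    return edges


# Linear chain 1-2-3-4-5
chain = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
by_distance = sketch_dual_edges(chain)
clusters = {1: [1, 2], 2: [1, 2, 3], 3: [2, 3, 4], 4: [3, 4, 5], 5: [4, 5]}
by_cluster = sketch_dual_edges(chain, clusters)
```

For this chain, with each cluster holding a node and its nearest neighbors, both rules link the same pairs of nodes.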

static getSwappingGraph(acceptors, donors, cg_graph)

Construct a graph used to determine which nodes should be swapped. Donors, the large beads, will each donate a node to an acceptor bead, which is a small bead. The graph is constructed by removing all edges but those between donors and acceptors. Connected components will be used to find bead pairs that will exchange nodes.

Parameters
  • acceptors (list) – list of acceptor beads

  • donors (list) – list of donor beads

  • cg_graph (networkx.Graph) – coarse-grained graph

Return type

networkx.Graph

Returns

swapping graph
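
The edge filtering and connected-component step described for getSwappingGraph might look like the following sketch. The edge-list input and the helper name sketch_swapping_components are assumptions made for illustration; the real method takes and returns networkx.Graph objects.

```python
def sketch_swapping_components(acceptors, donors, cg_edges):
    """Keep donor-acceptor edges only and return the connected components."""
    acceptors, donors = set(acceptors), set(donors)
    adjacency = {}
    for a, b in cg_edges:
        # Keep only edges that join a donor to an acceptor.
        if (a in donors and b in acceptors) or (a in acceptors and b in donors):
            adjacency.setdefault(a, set()).add(b)
            adjacency.setdefault(b, set()).add(a)
    # Collect connected components with a depth-first traversal.
    components, seen = [], set()
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(sorted(component))
    return components


# Beads 0-1-2-3-4 in a chain; beads 0 and 3 are small, beads 1 and 4 are large.
cg_edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
components = sketch_swapping_components(acceptors=[0, 3], donors=[1, 4], cg_edges=cg_edges)
```

Each component groups beads that can exchange nodes with each other; bead 2 is neither a donor nor an acceptor, so it drops out.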

static getEdgeList(mapping, st_graph)

Get list of edges for the coarse-grained network.

Parameters
  • mapping (Mapping) – CG mapping for a section of (or an entire) molecule

  • st_graph (networkx.Graph) – Structure graph for the underlying atomistic (not CG) representation of the molecule

Return type

tuple(list, dict)

Returns

List of all edges, and a dict containing edge attributes (bond orders for the CG bonds)

static matchIsomorphicGraphs(graphs)

Get a graph matcher object for a pair of graphs.

Parameters

graphs (tuple(Graph, Graph)) – Two graphs to be matched

Return type

iso.GraphMatcher

Returns

the object matching two graphs

class schrodinger.application.matsci.automated_cg_mapping.Mapping

Bases: object

Data structure to store CG mappings of atoms into beads.

__init__()

Create an instance.

createNewGroup(bead, bead_idx=None)

Create a new bead. If the provided index is None or already exists, a new one is generated.

Parameters
  • bead (list) – The new bead (list of nodes).

  • bead_idx (int) – The index of the new group

addNodeToGroup(node, bead)

Add a node to a bead if it is not already present, and update both the group_node and node_group dictionaries.

Parameters
  • node (int) – The node to be added to the bead

  • bead (int) – The index of the bead that node is being added to

removeNodeFromGroup(node, bead)

Remove a node from a bead, if present.

Parameters
  • node (int) – The node to remove

  • bead (int) – The bead that node is being removed from

deleteGroup(bead)

Delete a bead from group_node and node_group dictionaries.

Parameters

bead (int) – The bead being deleted

deleteNode(node)

Delete a node from node_group and group_node dictionaries.

Parameters

node (int) – The node being deleted

setGroup(bead_idx, bead)

Set an existing bead group.

Parameters
  • bead_idx – The bead index

  • bead (list(int)) – The bead

static getBeadSize(mapping, bead_idx)

Get the size of a bead in the provided mapping.

Parameters
  • mapping (Mapping) – mapping object

  • bead_idx (int) – index of the bead

Return type

float

Returns

size of the bead
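
The two-dictionary bookkeeping that the Mapping methods above refer to (group_node and node_group) can be illustrated with a toy class. ToyMapping and its snake_case method names are hypothetical stand-ins, and the dictionary layouts and the returned bead index are assumptions; this is not the module's Mapping class.

```python
class ToyMapping:
    """Hypothetical stand-in for a bead mapping that allows shared nodes."""

    def __init__(self):
        self.group_node = {}  # bead index -> list of nodes
        self.node_group = {}  # node -> list of bead indexes

    def create_new_group(self, bead, bead_idx=None):
        # Generate an index when none is given or the given one is taken.
        if bead_idx is None or bead_idx in self.group_node:
            bead_idx = max(self.group_node, default=-1) + 1
        self.group_node[bead_idx] = []
        for node in bead:
            self.add_node_to_group(node, bead_idx)
        return bead_idx  # returned here for convenience only

    def add_node_to_group(self, node, bead):
        # Update both dictionaries so they stay consistent.
        if node not in self.group_node[bead]:
            self.group_node[bead].append(node)
            self.node_group.setdefault(node, []).append(bead)

    def remove_node_from_group(self, node, bead):
        if node in self.group_node.get(bead, []):
            self.group_node[bead].remove(node)
            self.node_group[node].remove(bead)
            if not self.node_group[node]:
                del self.node_group[node]


mapping = ToyMapping()
first = mapping.create_new_group([1, 2, 3])
# The requested index is already taken, so a new one is generated.
second = mapping.create_new_group([3, 4], bead_idx=first)
mapping.remove_node_from_group(3, first)
```

Keeping the reverse lookup (node to beads) next to the forward one makes it cheap to ask which beads share a given node.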

class schrodinger.application.matsci.automated_cg_mapping.StructureAnalyzer(struct, scale)

Bases: object

Analyzing input structure.

MAPPED_ATOM_INDEX = 'i_matsci_mapped_atom_index'
ORIGINAL_ATOM_INDEX = 'i_matsci_atom_index_before_extraction'
__init__(struct, scale)

Create an instance.

Parameters
  • struct (structure.Structure) – input structure

  • scale (int) – Coarse graining scale, number of AA atoms per CG bead

getMolInfo()

For each unique species in the structure, create a MoleculeData object holding the species information, and store them in a list.

getUniqueSpecies()

Get unique species in the structure. Species are identified by their stereochemistry-unaware SMILES.

Return type

list(SpeciesData)

Returns

list of all species found in the structure

getSpeciesMolNum(mol_species)

Get a sample molecule index for the given species.

Parameters

mol_species (SpeciesData) – The species object

Return type

int

Returns

The molecule index of the species

getMappedStruct()

Map the structure by adding grouping properties to the atoms.

Return type

tuple(structure.Structure, list(str))

Returns

mapped structure and the list of CG bead names

mapTaggedAtoms()

Map the atoms that are used for mapping in sample molecules to the corresponding atoms in the rest of the structure.

addGroupingProps()

Add grouping properties to the mapped structure. The grouping properties are retrieved from sample molecules which are mapped and passed on to the rest of the structure.

removeStaleProps()

Remove properties that are no longer needed.

class schrodinger.application.matsci.automated_cg_mapping.StructData(molecule_st, scale)

Bases: object

A dataclass to hold information related to the structure of a species.

NOT_HETERO_ATOM_NUMS = (6, 14)
PREDEFINED_PATTERNS = None
ALL_SMARTS = {'Acid_Chloride': 'C(=O)Cl', 'Aldehyde': '[CH;D2;!$(C-[!#6;!#1])]=O', 'Amine': '[N;$(N-[#6]);!$(N-[!#6;!#1]);!$(N-C=[O,N,S])]', 'Azide': '[N;H0;$(N-[#6]);D2]=[N;D2]=[N;D1]', 'Boronic_Acid': '[$(B-!@[#6])](O)(O)', 'Carboxylic_acid': 'C(=O)[O;H,-]', 'Halogen': '[$([F,Cl,Br,I]-!@[#6]);!$([F,Cl,Br,I]-!@C-!@[F,Cl,Br,I]);!$([F,Cl,Br,I]-[C,S](=[O,S,N]))]', 'Isocyanate': '[$(N-!@[#6])](=!@C=!@O)', 'Nitro': '[N;H0;$(N-[#6]);D3](=[O;D1])~[O;D1]', 'Sulfonyl_Chloride': '[$(S-!@[#6])](=O)(=O)(Cl)', 'Terminal_Alkyne': '[C;$(C#[CH])]', 'dangling_atoms': '[!#1;X4&H3,X3&H2,X2&H1,X1&H0]-[!#1]', 'higher_order_bonds': '[!#1]=,#[!#1]'}
__init__(molecule_st, scale)

Create an instance.

Parameters
  • molecule_st (structure.Structure) – molecule structure

  • scale (int) – Coarse graining scale, number of AA atoms per CG bead

findExclusiveAtoms()

Find unsharable atoms in a structure.

Return type

list(int)

Returns

list of atom indexes

property sorted_func_groups

Sort functional groups based on complexity.

Return type

list(tuple(str, str))

Returns

list of functional groups sorted by complexity

createGroupMapping(struct_groups, node_to_atom_idx)

Create a mapping for all atoms that are found to belong to a defined group.

Parameters
  • struct_groups (list(tuple(str, str))) – A list of atom indexes that belong to the defined groups.

  • node_to_atom_idx (dict) – A dictionary of node number to atom index conversion

Return type

Mapping

Returns

A mapping of functional groups

findGroupsMapping(node_to_atom_idx)

Find functional groups and pre-defined groups (if requested) in the structure.

Parameters

node_to_atom_idx (dict) – A dictionary of node number to atom index conversion

Return type

tuple(Mapping, Mapping)

Returns

A mapping of functional groups and a mapping of predefined groups. Each can be an empty mapping.

updateFunctionalGroups(func_groups_indexes, predefined_indexes)

Update functional groups by removing groups that include atoms that already belong to a predefined group, and add all predefined groups as functional groups.

Parameters
  • func_groups_indexes (list(list(int))) – list of atom indexes that match the functional groups.

  • predefined_indexes (list(list(int))) – List of atom indexes that match the predefined groups.

Return type

list(list(int))

Returns

List of atom indexes forming functional groups.
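
The update rule described for updateFunctionalGroups can be sketched as a filter followed by a merge. The helper name sketch_update_functional_groups and the ordering of the result are assumptions made for illustration.

```python
def sketch_update_functional_groups(func_groups_indexes, predefined_indexes):
    """Drop functional groups that overlap predefined ones, then add the latter."""
    predefined_atoms = {atom for group in predefined_indexes for atom in group}
    # Drop functional groups that touch an atom already in a predefined group.
    kept = [group for group in func_groups_indexes
            if not predefined_atoms.intersection(group)]
    # Predefined groups are then treated as functional groups as well.
    return kept + [list(group) for group in predefined_indexes]


func_groups = [[1, 2], [4, 5, 6], [8, 9]]
predefined = [[5, 6, 7]]
updated = sketch_update_functional_groups(func_groups, predefined)
```

Here the group [4, 5, 6] is removed because atoms 5 and 6 already belong to the predefined group.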

findSMARTSMatchingIndexes(smarts_info, allow_partial_overlap=True, match_scale=False)

Find indexes of atoms that match the provided smarts pattern.

Parameters
  • smarts_info (list) – SMARTS patterns and their names to be matched.

  • allow_partial_overlap (bool) – If True, allow atoms to belong to more than one group. If False, atoms can only belong to one group.

  • match_scale (bool) – If True, compare the size of the groups against the CG scale, and filter out groups that are larger than the scale.

Return type

list(list(int))

Returns

List of lists of atom indexes that match the smarts pattern.

class schrodinger.application.matsci.automated_cg_mapping.MoleculeData(molecule_st, scale, struct, mol_idx, species, mol_count)

Bases: schrodinger.application.matsci.automated_cg_mapping.StructData

A dataclass to hold Molecule information.

MIN_HEAVY_ATOMS = 3
UNK = 'UNK'
RESIDUE

alias of schrodinger.application.matsci.automated_cg_mapping.RESIDUES

class BEAD_TYPE(bead_graph, name, beads)

Bases: tuple

__contains__(key, /)

Return key in self.

__len__()

Return len(self).

bead_graph

Alias for field number 0

beads

Alias for field number 2

count(value, /)

Return number of occurrences of value.

index(value, start=0, stop=9223372036854775807, /)

Return first index of value.

Raises ValueError if the value is not present.

name

Alias for field number 1

class NODE_GROUP_TYPE(atom_idx, bead_names, bead_nums, atom_nums)

Bases: tuple

__contains__(key, /)

Return key in self.

__len__()

Return len(self).

atom_idx

Alias for field number 0

atom_nums

Alias for field number 3

bead_names

Alias for field number 1

bead_nums

Alias for field number 2

count(value, /)

Return number of occurrences of value.

index(value, start=0, stop=9223372036854775807, /)

Return first index of value.

Raises ValueError if the value is not present.

class MAPPED_BEAD_INFO(name, smarts, charge)

Bases: tuple

__contains__(key, /)

Return key in self.

__len__()

Return len(self).

charge

Alias for field number 2

count(value, /)

Return number of occurrences of value.

index(value, start=0, stop=9223372036854775807, /)

Return first index of value.

Raises ValueError if the value is not present.

name

Alias for field number 0

smarts

Alias for field number 1

__init__(molecule_st, scale, struct, mol_idx, species, mol_count)

Create an instance.

Parameters
  • molecule_st (structure.Structure) – molecule structure

  • scale (int) – Coarse graining scale, number of AA atoms per CG bead

  • struct (structure.Structure) – The input structure

  • mol_idx (int) – molecule index in the input structure

  • species (str) – species display name

  • mol_count (int) – molecule count in the unique species

checkRingsAndBonds()

Identify rings and aromatic bonds in the molecule structure. Create a mapping object of the identified rings and their nodes, and a list of node pairs that have an aromatic bond.

getAugmentedStructGraph()

Generate a networkx.Graph object of the molecule structure. Heavy atoms are used as nodes in the graph. The graph is augmented using the chemical information of the structure: atomic numbers are added as node attributes and bond orders are used as edge attributes.

getBondsOrderAttr(atom)

Get bond pairs and the bond orders for the given atom, to be added to networkx.Graph in dict((int, int): {'bond_order': int}) format

Parameters

atom (structure.Structure.atom) – the atom

Return type

dict

Returns

dictionary of bond pairs and their orders

getAtomicNumAttr(atom)

Get the atomic number of the atom, to be added to networkx.Graph in dict(int: {'atomic_number'}) format

Return type

dict

Returns

a dictionary of the atom index and its atomic number

getAtomicChargeAttr(atom)

Get the formal charge of the atom, to be added to networkx.Graph in dict(int: {'formal_charge'}) format

Return type

dict

Returns

a dictionary of the atom index and its formal charge

getResidues()

Find residues/polymers/repeating units in the molecule. Create dictionaries of all residues by their name and index and all the graph nodes that belong to them.

Return type

dict

Returns

dictionary of all residues by their name and index and list of their graph nodes

getSampleResidues()

Get the unique residues in the molecule and use them as samples to be mapped. If residues of the same type differ in size, e.g. N-terminal or C-terminal residues in a protein sequence, only the largest residue is kept as the sample.

Return type

dict

Returns

a dictionary of the sample residues and their nodes
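
The "keep the largest residue of each type" rule in getSampleResidues can be sketched as below. The input layout (a dict keyed by (name, index) tuples) and the helper name sketch_sample_residues are assumptions; the real method works on the molecule's own residue data.

```python
def sketch_sample_residues(residues):
    """residues: dict of (name, index) -> list of graph nodes (assumed layout)."""
    samples = {}
    for (name, _index), nodes in residues.items():
        # Keep the largest residue seen so far for each residue name.
        if name not in samples or len(nodes) > len(samples[name]):
            samples[name] = nodes
    return samples


residues = {
    ('ALA', 1): [1, 2, 3, 4, 5, 6],  # terminal residue with an extra heavy atom
    ('ALA', 2): [7, 8, 9, 10, 11],
    ('GLY', 3): [12, 13, 14, 15],
}
samples = sketch_sample_residues(residues)
```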

getIsomorphicFragments()

Create an isomorphic mapping for the molecule, to be used to translate the sample residue mapping to all other residues.

Return type

dict

Returns

a dictionary of the isomorphic mapping between the sample residues and all other residues

getNodeToAtomConversion()

Find the mapping between self.st_graph node indexes and self.struct atom indexes.

Return type

dict

Returns

a dictionary of the atom index and the graph node index

translateMappingToProps()

Translate the mapping solution to atom properties on the mapped structure. That is, each atom in this molecule will have properties for the name and number of the bead it belongs to, and for its order in the bead. These properties are used to create the CG structure.

addAtomProps(shared_node_types)

Add the grouping atom properties to the mapped structure, for each atom based on its node type.

getBeadsType()

Define the bead type for each mapped bead in the mapping. This includes the bead name and the count of the bead in the molecule. Beads are compared based on their networkx graphs.

getMappedBeadInfo(bead_smarts, bead_charge)

Get the information of a mapped bead as requested.

Parameters
  • bead_smarts (bool) – if True, the smarts for each bead is found.

  • bead_charge (bool) – if True, the total charge of the beads is calculated.

Return type

list(MAPPED_BEAD_INFO)

Returns

List of namedtuples of the beads name, smarts, and charge. If smarts and charge are not requested, they are None.

ALL_SMARTS = {'Acid_Chloride': 'C(=O)Cl', 'Aldehyde': '[CH;D2;!$(C-[!#6;!#1])]=O', 'Amine': '[N;$(N-[#6]);!$(N-[!#6;!#1]);!$(N-C=[O,N,S])]', 'Azide': '[N;H0;$(N-[#6]);D2]=[N;D2]=[N;D1]', 'Boronic_Acid': '[$(B-!@[#6])](O)(O)', 'Carboxylic_acid': 'C(=O)[O;H,-]', 'Halogen': '[$([F,Cl,Br,I]-!@[#6]);!$([F,Cl,Br,I]-!@C-!@[F,Cl,Br,I]);!$([F,Cl,Br,I]-[C,S](=[O,S,N]))]', 'Isocyanate': '[$(N-!@[#6])](=!@C=!@O)', 'Nitro': '[N;H0;$(N-[#6]);D3](=[O;D1])~[O;D1]', 'Sulfonyl_Chloride': '[$(S-!@[#6])](=O)(=O)(Cl)', 'Terminal_Alkyne': '[C;$(C#[CH])]', 'dangling_atoms': '[!#1;X4&H3,X3&H2,X2&H1,X1&H0]-[!#1]', 'higher_order_bonds': '[!#1]=,#[!#1]'}
NOT_HETERO_ATOM_NUMS = (6, 14)
PREDEFINED_PATTERNS = None
createGroupMapping(struct_groups, node_to_atom_idx)

Create a mapping for all atoms that are found to belong to a defined group.

Parameters
  • struct_groups (list(tuple(str, str))) – A list of atom indexes that belong to the defined groups.

  • node_to_atom_idx (dict) – A dictionary of node number to atom index conversion

Return type

Mapping

Returns

A mapping of functional groups

findExclusiveAtoms()

Find unsharable atoms in a structure.

Return type

list(int)

Returns

list of atom indexes

findGroupsMapping(node_to_atom_idx)

Find functional groups and pre-defined groups (if requested) in the structure.

Parameters

node_to_atom_idx (dict) – A dictionary of node number to atom index conversion

Return type

tuple(Mapping, Mapping)

Returns

A mapping of functional groups and a mapping of predefined groups. Each can be an empty mapping.

findSMARTSMatchingIndexes(smarts_info, allow_partial_overlap=True, match_scale=False)

Find indexes of atoms that match the provided smarts pattern.

Parameters
  • smarts_info (list) – SMARTS patterns and their names to be matched.

  • allow_partial_overlap (bool) – If True, allow atoms to belong to more than one group. If False, atoms can only belong to one group.

  • match_scale (bool) – If True, compare the size of the groups against the CG scale, and filter out groups that are larger than the scale.

Return type

list(list(int))

Returns

List of lists of atom indexes that match the smarts pattern.

property sorted_func_groups

Sort functional groups based on complexity.

Return type

list(tuple(str, str))

Returns

list of functional groups sorted by complexity

updateFunctionalGroups(func_groups_indexes, predefined_indexes)

Update functional groups by removing groups that include atoms that already belong to a predefined group, and add all predefined groups as functional groups.

Parameters
  • func_groups_indexes (list(list(int))) – list of atom indexes that match the functional groups.

  • predefined_indexes (list(list(int))) – List of atom indexes that match the predefined groups.

Return type

list(list(int))

Returns

List of atom indexes forming functional groups.

class schrodinger.application.matsci.automated_cg_mapping.ResidueData(name, graph, mol_st, ring_mapping, scale, mol_func_groups, exclusive_atoms, node_to_atom_idx, predefined_mapping)

Bases: schrodinger.application.matsci.automated_cg_mapping.StructData

A class to store data for a residue (repeating unit) in the molecule

__init__(name, graph, mol_st, ring_mapping, scale, mol_func_groups, exclusive_atoms, node_to_atom_idx, predefined_mapping)

Create an instance.

Parameters
  • name (str) – the name of the residue

  • graph (networkx.Graph) – the residue graph

  • mol_st (structure.Structure) – Molecule structure

  • ring_mapping (Mapping) – The mapping of nodes that belong to a ring

  • scale (int) – Coarse graining scale, number of AA atoms per CG bead

  • mol_func_groups (Mapping) – The mapping of all functional groups in the molecule

  • exclusive_atoms (list(int)) – list of all unsharable atoms in the molecule

findGroupsFromMolecule(mol_mapping)

Find groups that are mapped in the molecule and match the residue graph.

Parameters

mol_mapping (Mapping) – The mapping of groups in the molecule

Return type

Mapping

Returns

A mapping of groups in the residue

findFuncGroupsFromMolecule()

Find matches of functional groups and predefined groups in the residue.

getContractedGraphAndMapping()

Generate the contracted graph, in which each functional group is contracted into a single node. A contracted mapping is also generated so that the contracted graph can be mapped back to the full graph.

getContractedMapping()

Contract any functional groups to a single node, and record the mapping for the contraction

Return type

Mapping

Returns

The contracted mapping for the contraction

getNodeClusterSets()

Cluster nodes with their neighbors at different scales to account for each node's topological and chemical environment. Each cluster starts from a seed node and is grown using multiple scales. The set and the scale that achieve the best mapping will be used in the driver.

Return type

list(dict)

Returns

A list of dictionaries, one dict for each scale. Each dict contains the seed node as the key and a list of nodes that are similar to the seed as the value.

growNodeCluster(seed_node, scale, graph, mapping, use_degree=False)

Growing a cluster of nodes starting from a seed node and adding its neighbors until the cluster’s size reaches the input scale.

Here the contracted graph should be used to ensure that functional groups are not split. Contracted mapping is needed to find the actual size of the cluster.

Parameters
  • seed_node (int) – The seed node to start growing the cluster from.

  • scale (int) – The scale of the node cluster.

  • graph (networkx.Graph) – The contracted graph.

  • mapping (Mapping) – The contracted mapping.

  • use_degree (bool) – Whether to use the degree of the nodes to calculate the cluster size

Return type

list(int)

Returns

The node cluster as a list of nodes
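
A sketch of growing a cluster over a contracted graph, as described for growNodeCluster. The breadth-first growth order, the adjacency-dict input, and the helper name sketch_grow_cluster are assumptions, and the use_degree option is left out; the real method takes a networkx.Graph and a Mapping.

```python
from collections import deque


def sketch_grow_cluster(seed_node, scale, adjacency, contracted):
    """Grow a cluster from seed_node until its size reaches scale.

    adjacency: dict of contracted node -> neighbours (assumed layout).
    contracted: dict of contracted node -> underlying full-graph nodes.
    """
    def size(nodes):
        # Assumption: a contracted node counts as all the nodes it stands for.
        return sum(len(contracted.get(node, [node])) for node in nodes)

    cluster = [seed_node]
    queue = deque([seed_node])
    while queue and size(cluster) < scale:
        node = queue.popleft()
        for nbr in adjacency[node]:
            if nbr not in cluster and size(cluster) < scale:
                cluster.append(nbr)
                queue.append(nbr)
    return cluster


# Contracted chain 1-2-5-6 where node 2 stands for a 3-atom functional group.
adjacency = {1: [2], 2: [1, 5], 5: [2, 6], 6: [5]}
contracted = {2: [2, 3, 4]}
cluster_scale_4 = sketch_grow_cluster(1, 4, adjacency, contracted)
cluster_scale_5 = sketch_grow_cluster(1, 5, adjacency, contracted)
```

Because the functional group is a single node in the contracted graph, it is added to the cluster whole or not at all, which is why the contracted mapping is needed to measure the true size.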

getClusterSize(cluster, graph, mapping, use_degree)

Calculate the size of a node cluster. By default, the size is the number of nodes in the cluster. If use_degree is True, the size is the sum of the degrees of the nodes minus 2.

Parameters
  • cluster (list(int)) – The cluster.

  • graph (networkx.Graph) – The graph.

  • mapping (Mapping) – The mapping.

  • use_degree (bool) – Whether to use the degree of the nodes

Return type

int

Returns

The size of the cluster

expandNodeClusters(mapping, clusters)

For the calculated node clusters, build the full clusters by expanding functional group beads into their constituent nodes.

Parameters
  • mapping (Mapping) – The contracted mapping of functional groups to the full structure nodes

  • clusters (dict) – Dictionary of node clusters

Return type

dict

Returns

Dictionary of node clusters with the functional groups expanded
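
The expansion step in expandNodeClusters can be sketched as a lookup that replaces each contracted node with its members. The plain-dict form of the contracted mapping and the helper name sketch_expand_clusters are assumptions made for illustration.

```python
def sketch_expand_clusters(contracted, clusters):
    """contracted: contracted node -> full-graph nodes; clusters: seed -> nodes."""
    expanded = {}
    for seed, nodes in clusters.items():
        full = []
        for node in nodes:
            # A contracted node expands to its members; others stay as they are.
            for member in contracted.get(node, [node]):
                if member not in full:
                    full.append(member)
        expanded[seed] = full
    return expanded


contracted = {2: [2, 3, 4]}
clusters = {1: [1, 2], 5: [5, 2]}
expanded = sketch_expand_clusters(contracted, clusters)
```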

getColoredMapping(node_clusters)

Get a mapping, based on an initial guess from the vertex coloring solution to the dual graph of the input graph of CG functional groups.

Parameters

node_clusters (dict) – dictionary of nodes in the graph and their clusters

Return type

Mapping

Returns

The new mapping

ALL_SMARTS = {'Acid_Chloride': 'C(=O)Cl', 'Aldehyde': '[CH;D2;!$(C-[!#6;!#1])]=O', 'Amine': '[N;$(N-[#6]);!$(N-[!#6;!#1]);!$(N-C=[O,N,S])]', 'Azide': '[N;H0;$(N-[#6]);D2]=[N;D2]=[N;D1]', 'Boronic_Acid': '[$(B-!@[#6])](O)(O)', 'Carboxylic_acid': 'C(=O)[O;H,-]', 'Halogen': '[$([F,Cl,Br,I]-!@[#6]);!$([F,Cl,Br,I]-!@C-!@[F,Cl,Br,I]);!$([F,Cl,Br,I]-[C,S](=[O,S,N]))]', 'Isocyanate': '[$(N-!@[#6])](=!@C=!@O)', 'Nitro': '[N;H0;$(N-[#6]);D3](=[O;D1])~[O;D1]', 'Sulfonyl_Chloride': '[$(S-!@[#6])](=O)(=O)(Cl)', 'Terminal_Alkyne': '[C;$(C#[CH])]', 'dangling_atoms': '[!#1;X4&H3,X3&H2,X2&H1,X1&H0]-[!#1]', 'higher_order_bonds': '[!#1]=,#[!#1]'}
NOT_HETERO_ATOM_NUMS = (6, 14)
PREDEFINED_PATTERNS = None
createGroupMapping(struct_groups, node_to_atom_idx)

Create a mapping for all atoms that are found to belong to a defined group.

Parameters
  • struct_groups (list(tuple(str, str))) – A list of atom indexes that belong to the defined groups.

  • node_to_atom_idx (dict) – A dictionary of node number to atom index conversion

Return type

Mapping

Returns

A mapping of functional groups

findExclusiveAtoms()

Find unsharable atoms in a structure.

Return type

list(int)

Returns

list of atom indexes

findGroupsMapping(node_to_atom_idx)

Find functional groups and pre-defined groups (if requested) in the structure.

Parameters

node_to_atom_idx (dict) – A dictionary of node number to atom index conversion

Return type

tuple(Mapping, Mapping)

Returns

A mapping of functional groups and a mapping of predefined groups. Each can be an empty mapping.

findSMARTSMatchingIndexes(smarts_info, allow_partial_overlap=True, match_scale=False)

Find indexes of atoms that match the provided smarts pattern.

Parameters
  • smarts_info (list) – SMARTS patterns and their names to be matched.

  • allow_partial_overlap (bool) – If True, allow atoms to belong to more than one group. If False, atoms can only belong to one group.

  • match_scale (bool) – If True, compare the size of the groups against the CG scale, and filter out groups that are larger than the scale.

Return type

list(list(int))

Returns

List of lists of atom indexes that match the smarts pattern.

property sorted_func_groups

Sort functional groups based on complexity.

Return type

list(tuple(str, str))

Returns

list of functional groups sorted by complexity

updateFunctionalGroups(func_groups_indexes, predefined_indexes)

Update functional groups by removing groups that include atoms that already belong to a predefined group, and add all predefined groups as functional groups.

Parameters
  • func_groups_indexes (list(list(int))) – list of atom indexes that match the functional groups.

  • predefined_indexes (list(list(int))) – List of atom indexes that match the predefined groups.

Return type

list(list(int))

Returns

List of atom indexes forming functional groups.

class schrodinger.application.matsci.automated_cg_mapping.MappingValidator(fragment, scale)

Bases: object

A validator object used to score CG mappings.

__init__(fragment, scale)

Create an instance.

Parameters
  • fragment (MoleculeData or ResidueData) – The fragment that is being mapped, can be a whole molecule or a residue

  • scale (int) – Coarse graining scale, number of AA atoms per CG bead

validateMapping(mapping)

Validate a mapping.

Parameters

mapping (Mapping) – the mapping to be validated

Return type

bool

Returns

whether the mapping is valid

validateNodes(mapping)

Validate nodes in the mapping

Parameters

mapping (Mapping) – the mapping to be validated

Return type

bool

Returns

whether the mapping is valid

validateBeads(mapping)

Validate beads in the mapping

Parameters

mapping (Mapping) – the mapping to be validated

Return type

bool

Returns

whether the mapping is valid

validateFuncGroups(mapping)

Validate functional groups are intact

Parameters

mapping (Mapping) – The mapping to be validated

Return type

bool

Returns

whether functional groups are intact

validatePredefinedMapping(mapping)

Validate that pre-defined groups are not expanded.

isScoreImproved(new_score, best_score, new_mapping, best_mapping)

Check whether the new score and mapping are an improvement over the best mapping from previous iterations.

Parameters
  • new_score (float) – the new score

  • best_score (float) – the best score from previous iterations

  • new_mapping (Mapping) – the new mapping

  • best_mapping (Mapping) – the best mapping from previous iterations

Return type

bool

Returns

whether the score and mapping have improved

numsOfSharedNodes(mapping, graph)

Find number of shared nodes in the mapping

Parameters
  • mapping (Mapping) – The mapping

  • graph (networkx.Graph) – The graph

Return type

int

Returns

number of shared nodes in the mapping
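
Counting shared nodes, as in numsOfSharedNodes, amounts to counting nodes that appear in more than one bead. The sketch below assumes a plain dict of bead index to node list and ignores the graph argument; the helper name sketch_num_shared_nodes is hypothetical.

```python
def sketch_num_shared_nodes(group_node):
    """group_node: dict of bead index -> list of nodes (assumed layout)."""
    bead_count = {}
    for nodes in group_node.values():
        for node in set(nodes):
            bead_count[node] = bead_count.get(node, 0) + 1
    # A node is shared when it appears in more than one bead.
    return sum(1 for count in bead_count.values() if count > 1)


group_node = {0: [1, 2, 3], 1: [3, 4, 5], 2: [5, 6]}
shared = sketch_num_shared_nodes(group_node)
```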

validateSharesWithNeighbors(node, neighbors, mapping, residue_graph)

Validate number of shared nodes between two beads against node degrees in the graph.

Parameters
  • node (int) – the shared node

  • neighbors (list(int)) – the neighbors of the shared node

  • mapping (Mapping) – the mapping to be validated

  • residue_graph (networkx.Graph) – the graph of the residue (or the whole molecule)

Return type

bool

Returns

whether the mapping is valid

validateUnsharedNeighbors(node, neighbors_list, mapping)

Validate whether shared nodes have at least one unshared neighbor in each bead that they belong to.

Parameters
  • node (int) – the shared node

  • neighbors_list (list(int)) – the neighbors of the shared node

  • mapping (Mapping) – the mapping to be validated

Return type

bool

Returns

whether the mapping is valid

computeCostFunction(mapping)

Compute the objective function for a given group mapping.

Inspired by T. Bereau and K. Kremer, JCTC 2015. See eq. 1 and the surrounding discussion.

Parameters

mapping (Mapping) – CG mapping dict for a section of (or an entire) molecule

Return type

float

Returns

Objective function score

getNumBrokenRings(mapping)

Calculate the number of rings whose CG representation involves more than one group. Attempts to favor CG mappings that center groups on rings (if rings are present)

Parameters

mapping (Mapping) – CG mapping for a section of (or an entire) molecule

Return type

int

Returns

number of broken rings
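
One possible reading of a "broken" ring is a ring that no single bead contains in full. The sketch below counts rings under that assumption; the input layouts and the helper name sketch_num_broken_rings are hypothetical, and the real method may treat shared nodes differently.

```python
def sketch_num_broken_rings(rings, node_group):
    """rings: list of node lists; node_group: node -> list of bead indexes."""
    broken = 0
    for ring in rings:
        # Beads that contain every node of the ring.
        common = set(node_group[ring[0]])
        for node in ring[1:]:
            common &= set(node_group[node])
        # Assumption: the ring is broken when no bead holds all of its nodes.
        if not common:
            broken += 1
    return broken


rings = [[1, 2, 3, 4, 5, 6], [7, 8, 9, 10, 11]]
node_group = {
    1: [0], 2: [0], 3: [0], 4: [0], 5: [0], 6: [0, 1],  # node 6 is shared
    7: [1], 8: [1], 9: [1], 10: [2], 11: [2],
}
num_broken = sketch_num_broken_rings(rings, node_group)
```

The six-membered ring sits entirely inside bead 0, so only the five-membered ring, split between beads 1 and 2, is counted.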

class schrodinger.application.matsci.automated_cg_mapping.DpdMapper(molecule, scale)

Bases: object

Class for mapping a molecule to a DPD coarse-grained model

__init__(molecule, scale)

Create an instance.

Parameters
  • scale (int) – Coarse graining scale, number of AA atoms per CG bead

  • molecule (MoleculeData) – the molecule to be mapped

map()

Map the molecule to a DPD coarse-grained model

mapResidue(residue)

Map a residue to a DPD coarse-grained model

Parameters

residue (ResidueData) – the residue to be mapped

Return type

tuple(Mapping, dict)

Returns

the best mapping and the best expanded node clusters for the residue

expandResidualMapping(residue_mapping, residue_node_clusters)

Expand the mapping of the sample residue to all other residues it represents.

Parameters
  • residue_mapping (dict) – Mapping for each sample residue

  • residue_node_clusters (dict) – Best node cluster found for each residue

Return type

Mapping

Returns

Mapping for all residues

homogenizeBeadSizes(residue, mapping, symmetry_mapping, validator)

Homogenize the bead sizes of a residue in iterations until the score stops improving. The homogenization is done by joining small beads with their neighbors, or swapping nodes from the largest neighbors to the smallest beads.

Parameters
  • residue (ResidueData) – Residue object that holds the residue data

  • mapping (Mapping) – the mapping to be homogenized

  • symmetry_mapping (dict) – a dictionary of classified node clusters

  • validator (MappingValidator) – a validator object to validate the mapping

Return type

Mapping

Returns

Homogenized mapping

moveNodesInIteration(mapping, score, residue, symmetry_mapping, validator, join=True)

Move nodes between beads in iteration in order to homogenize bead sizes. Finds the smallest groups (acceptors) in the current mapping, sorts the set of acceptors according to their symmetry groups, and then for each symmetric subset iterates over symmetric sets and eligible neighboring groups and moves nodes between beads by either joining them or swapping one of their nodes to another bead.

Parameters
  • mapping (Mapping) – the mapping to be homogenized

  • score (float) – the score of the mapping

  • residue (ResidueData) – Residue object that holds the residue data

  • symmetry_mapping (dict) – a dictionary of classified node clusters

  • validator (MappingValidator) – a validator object to validate the mapping

  • join (bool) – whether to join beads or swap nodes between beads

Return type

tuple(Mapping or NoneType, float)

Returns

The best mapping and score for the iteration over acceptor groups. If score is not improved, returned mapping will be None.

swapBeads(temp_mapping, acceptors, donors, residue, cg_graph, validator, best_score)

Performs one iteration of the homogenizeBeadSizes node-swap stage. Finds the smallest groups (acceptors) in the current mapping, sorts the set of acceptors according to their symmetry, and then for each symmetric subset iterates over symmetric sets of the largest neighboring groups (donors) to find the best choice for the node-swap.

Parameters
  • temp_mapping (Mapping) – temporary mapping at the current iteration

  • acceptors (list) – list of acceptor beads

  • donors (list) – list of donor beads

  • residue (ResidueData) – The residue object

  • cg_graph (networkx.Graph) – The coarse-grained graph

  • validator (MappingValidator) – a validator object to validate the mapping

  • best_score (float) – the best score so far

Return type

tuple(Mapping or NoneType, float)

Returns

The best swap mapping and swap score for the iteration over donor groups. If score is not improved, returned mapping will be None.

joinBeads(temp_mapping, joiners, joinees, cg_graph, validator, best_score)

Join the smallest beads in the mapping with their smallest neighbors.

Parameters
  • temp_mapping (Mapping) – the temporary mapping at this iteration

  • joiners (list) – list of joiner beads

  • joinees (list) – list of joinee beads

  • cg_graph (networkx.Graph) – the coarse-grained graph

  • validator (MappingValidator) – a validator object to validate the mapping

  • best_score (float) – the best score achieved so far

Return type

tuple(Mapping or NoneType, float)

Returns

The new mapping with small beads joined together and the score for this mapping. If score is not improved, returned mapping will be None.
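
The idea of joining the smallest bead with its smallest neighbor can be sketched outside the module. The snippet below is an illustration under stated assumptions: join_smallest_bead is a hypothetical helper, the mapping is a plain dict of bead id to node set, and the coarse-grained graph is represented by a plain adjacency dict rather than a networkx.Graph. Validation and scoring are left out.

```python
def join_smallest_bead(mapping, neighbors):
    """Merge the smallest bead into its smallest neighbor (hypothetical helper).

    mapping: dict of bead id -> set of node indices (assumed layout)
    neighbors: dict of bead id -> list of adjacent bead ids
    """
    # Work on a copy so the input mapping is left untouched
    new_mapping = {bead: set(nodes) for bead, nodes in mapping.items()}
    # Smallest bead; ties are broken by the lowest bead id
    joiner = min(mapping, key=lambda bead: (len(mapping[bead]), bead))
    candidates = neighbors.get(joiner, ())
    if not candidates:
        return new_mapping  # isolated bead, nothing to join with
    joinee = min(candidates, key=lambda bead: (len(mapping[bead]), bead))
    new_mapping[joinee] |= new_mapping.pop(joiner)
    return new_mapping


beads = {0: {1, 2, 3}, 1: {4}, 2: {5, 6}}
cg_neighbors = {0: [1], 1: [0, 2], 2: [1]}  # bead 0 - bead 1 - bead 2
joined = join_smallest_bead(beads, cg_neighbors)
```

Here bead 1 is the smallest bead and bead 2 is its smallest neighbor, so the two are merged under bead id 2.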

getSwaps(graph)

For a given graph, decompose the nodes into a list of bead pairs that will undergo swaps.

Parameters

graph (networkx.Graph) – The subgraph

performSwapping(acceptors, swap, mapping, residue)

Carry out the given swap move. If any nodes are already shared between the two beads of the swapping pair, the shared nodes are donated to the acceptor. Otherwise, a node becomes shared between the two.

Parameters
  • acceptors (list(int)) – List of acceptor beads

  • swap (list) – The pair of group ids that will undergo node swap

  • mapping (Mapping) – The cg mapping at the current iteration

  • residue (ResidueData) – The residue object that is being mapped

Return type

Mapping

Returns

The new mapping

fixDisconnectedDonors(acceptor_idx, donor_idx, mapping, residue)

Check whether swapping nodes caused the donor group to break (become disconnected). If so, the largest connected component of the resulting donor group becomes the new donor group, and any remaining connected components of the broken donor group become part of the acceptor group.

Parameters
  • acceptor_idx (int) – Group id for the acceptor

  • donor_idx (int) – Group id for the donor

  • mapping (Mapping) – The current mapping

  • residue (ResidueData) – The residue object that is being mapped

Return type

Mapping

Returns

The new mapping
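
The repair rule described above (keep the largest connected component as the donor, hand the rest to the acceptor) can be sketched with plain Python containers. This is an illustration only: fix_disconnected_donor and connected_components are hypothetical helpers, the mapping is assumed to be a dict of bead id to node set, and bonds is an adjacency dict standing in for the residue's atom graph.

```python
def connected_components(nodes, bonds):
    """Split `nodes` into connected components, following only bonds inside `nodes`."""
    remaining = set(nodes)
    components = []
    while remaining:
        start = remaining.pop()
        component = {start}
        stack = [start]
        while stack:
            current = stack.pop()
            for neighbor in bonds.get(current, ()):
                if neighbor in remaining:
                    remaining.remove(neighbor)
                    component.add(neighbor)
                    stack.append(neighbor)
        components.append(component)
    return components


def fix_disconnected_donor(acceptor, donor, mapping, bonds):
    """Keep the donor's largest fragment; move other fragments to the acceptor."""
    new_mapping = {bead: set(nodes) for bead, nodes in mapping.items()}
    components = connected_components(new_mapping[donor], bonds)
    if len(components) <= 1:
        return new_mapping  # donor is still connected
    largest = max(components, key=len)
    new_mapping[donor] = largest
    for component in components:
        if component is not largest:
            new_mapping[acceptor] |= component
    return new_mapping


# Linear chain 1-2-3-4-5-6; node 3 was moved to the acceptor, splitting the donor
bonds = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4, 6], 6: [5]}
broken = {0: {3}, 1: {1, 2, 4, 5, 6}}
fixed = fix_disconnected_donor(0, 1, broken, bonds)
```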

shareNodes(accept_idx, donor_idx, swap, mapping, residue)

Any nodes that are to be swapped and are not already shared become shared here, unless they are marked as ‘exclusive’ (i.e. not to be shared). If a node is exclusive, it is instead completely donated from the donor to the acceptor group. In this case, the method must also check whether the node is part of a functional group, in which case the entire functional group is donated from the donor to the acceptor.

Parameters
  • accept_idx (int) – The acceptor bead index

  • donor_idx (int) – The donor bead index

  • swap (list(int)) – The list of nodes that are already shared between donor and acceptor

  • mapping (Mapping) – The cg mapping at the current iteration

  • residue (ResidueData) – The residue object that is being mapped

Return type

Mapping

Returns

The new mapping
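
The difference between sharing a node and donating an exclusive one can be sketched as below. This is a simplified illustration, not the module's code: share_nodes_sketch is a hypothetical function, the mapping is assumed to be a dict of bead id to node set, the exclusive nodes are passed in as a plain set, and functional-group handling is omitted.

```python
def share_nodes_sketch(acceptor, donor, swap_nodes, mapping, exclusive):
    """Share swap nodes with the acceptor; donate them fully if exclusive.

    exclusive: set of nodes that may belong to one bead only (assumed input)
    """
    new_mapping = {bead: set(nodes) for bead, nodes in mapping.items()}
    for node in swap_nodes:
        if node not in new_mapping[donor]:
            continue  # the donor can only give away nodes it holds
        new_mapping[acceptor].add(node)
        if node in exclusive:
            # Exclusive nodes cannot be shared, so the donor loses them
            new_mapping[donor].discard(node)
    return new_mapping


start = {0: {1, 2}, 1: {3, 4, 5, 6}}
shared = share_nodes_sketch(0, 1, [3], start, exclusive=set())
donated = share_nodes_sketch(0, 1, [3], start, exclusive={3})
```

In the first call node 3 ends up in both beads; in the second it is removed from the donor.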

donateNodes(accept_idx, donor_idx, shared_nodes, mapping)

Any nodes that are already shared between the acceptor and the donor groups become exclusive to the acceptor group. If a given shared node belongs to a functional group, the entire functional group must be transferred from the donor to the acceptor.

Parameters
  • accept_idx (int) – The acceptor bead index

  • donor_idx (int) – The donor bead index

  • shared_nodes (list(int)) – The list of nodes that are already shared between donor and acceptor

  • mapping (Mapping) – The cg mapping at the current iteration

Return type

Mapping

Returns

The new mapping
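
A minimal sketch of the donation step, assuming the mapping is a dict of bead id to node set. donate_shared_nodes is a hypothetical name, and the transfer of whole functional groups is deliberately left out.

```python
def donate_shared_nodes(acceptor, donor, shared_nodes, mapping):
    """Make nodes shared by both beads exclusive to the acceptor."""
    new_mapping = {bead: set(nodes) for bead, nodes in mapping.items()}
    for node in shared_nodes:
        # Only act on nodes that really are in both beads
        if node in new_mapping[acceptor] and node in new_mapping[donor]:
            new_mapping[donor].discard(node)
    return new_mapping


overlapping = {0: {1, 2, 3}, 1: {3, 4, 5, 6}}
exclusive_to_acceptor = donate_shared_nodes(0, 1, [3], overlapping)
```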

moveNodesBetweenBeads(bead, acceptor, donor, mapping)

Move all nodes in the bead (functional group) from donor to the acceptor.

Parameters
  • bead (int) – the functional group bead

  • acceptor (int) – The acceptor bead index

  • donor (int) – The donor bead index

  • mapping (Mapping) – The mapping that should be modified

Return type

Mapping

Returns

The new mapping

categorizeSymmetryMappings(mapping, symmetry_mapping)

Categorize CG beads into groups based on the symmetry mapping. This ensures that symmetrical beads are all homogenized in the same way.

Parameters
  • mapping (Mapping) – bead mapping

  • symmetry_mapping (Mapping) – Mapping of the category ids of individual nodes

Return type

dict

Returns

dictionary of beads and the sorted symmetry ids of the nodes that are grouped in that bead.
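
The returned structure can be pictured with a small sketch. It assumes, for illustration only, that the bead mapping is a dict of bead id to node set and that the symmetry information is available as a dict of node to category id; categorize_beads is a hypothetical name.

```python
def categorize_beads(mapping, node_symmetry):
    """Describe each bead by the sorted symmetry ids of its nodes."""
    return {
        bead: tuple(sorted(node_symmetry[node] for node in nodes))
        for bead, nodes in mapping.items()
    }


bead_nodes = {0: {1, 2}, 1: {3, 4}, 2: {5}}
node_symmetry = {1: 'a', 2: 'b', 3: 'b', 4: 'a', 5: 'c'}
bead_signatures = categorize_beads(bead_nodes, node_symmetry)
```

Beads 0 and 1 contain nodes of the same categories, so they receive the same signature and would be treated alike.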

groupSymmetricBeads(beads, cg_symmetry_dict)

Sort and group beads based on their symmetry ids.

Parameters
  • beads (list) – beads to be sorted

  • cg_symmetry_dict (dict) – dictionary of beads and the symmetry ids of the nodes that are grouped in that bead.

Return type

dict

Returns

dictionary of symmetry ids and the beads that belong to the symmetry group
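
Grouping beads by their symmetry signature can be sketched as follows. group_by_symmetry is a hypothetical name, and cg_symmetry is assumed to map each bead to a hashable tuple of symmetry ids.

```python
from collections import defaultdict


def group_by_symmetry(beads, cg_symmetry):
    """Group beads that share the same symmetry signature."""
    groups = defaultdict(list)
    for bead in sorted(beads):
        groups[cg_symmetry[bead]].append(bead)
    return dict(groups)


cg_symmetry = {0: ('a', 'b'), 1: ('a', 'b'), 2: ('c',)}
symmetric_groups = group_by_symmetry([2, 0, 1], cg_symmetry)
```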

findGroupsInSet(mapping, beads=None, smallest=True)

Find the beads in the mapping that have the smallest (or largest) size.

Parameters
  • mapping (Mapping) – bead mapping

  • beads (list) – list of beads to search within. If None, search all beads

  • smallest (bool) – if True, find the smallest beads, else find the largest

Return type

tuple(list, int)

Returns

the eligible beads and their size
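
The size-based selection can be sketched with a plain dict of bead id to node set. find_groups_by_size is a hypothetical stand-in, and returning an empty list with size 0 for an empty search set is an assumption of this sketch.

```python
def find_groups_by_size(mapping, beads=None, smallest=True):
    """Return the beads of extreme size and that size."""
    candidates = list(mapping) if beads is None else list(beads)
    if not candidates:
        return [], 0  # assumed behavior for an empty search set
    pick = min if smallest else max
    size = pick(len(mapping[bead]) for bead in candidates)
    eligible = [bead for bead in candidates if len(mapping[bead]) == size]
    return eligible, size


sized_beads = {0: {1, 2, 3}, 1: {4}, 2: {5}, 3: {6, 7, 8}}
smallest_beads = find_groups_by_size(sized_beads)
largest_beads = find_groups_by_size(sized_beads, smallest=False)
largest_of_subset = find_groups_by_size(sized_beads, beads=[0, 1], smallest=False)
```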

class schrodinger.application.matsci.automated_cg_mapping.AutomatedCGMapping(struct, scale, bead_smarts=False, bead_charge=False, predefined_patterns=None)

Bases: object

Main class to map an atomistic structure to a coarse-grained one

__init__(struct, scale, bead_smarts=False, bead_charge=False, predefined_patterns=None)

Create an instance.

Parameters
  • struct (structure.Structure) – All-atom structure to be mapped to CG

  • scale (int) – Coarse graining scale, number of AA atoms per CG bead.

  • bead_smarts (bool) – If True, will get SMARTS for found beads.

  • bead_charge (bool) – If True, will get charges for found beads.

  • predefined_patterns (dict) – dictionary of names and pre-defined SMARTS patterns that need to be mapped in one bead

map()

Run the workflow.