schrodinger.application.desmond.kinetics.utils module

class schrodinger.application.desmond.kinetics.utils.RmsdSelection(aligned_weights, displaced_weights, atom_idcs)

Bases: tuple

aligned_weights: List[float]

Alias for field number 0

displaced_weights: List[float]

Alias for field number 1

atom_idcs: List[int]

Alias for field number 2

__contains__(key, /)

Return key in self.

__len__()

Return len(self).

count(value, /)

Return number of occurrences of value.

index(value, start=0, stop=9223372036854775807, /)

Return first index of value.

Raises ValueError if the value is not present.

class schrodinger.application.desmond.kinetics.utils.MtdInfo(value)

Bases: enum.Enum

An enumeration.

KILL_NOW = 'kill_now'
LAST_INDEX = 'last_index'
MINDIST = 'mindist'
TAU_ESTIMATE_SEC = 'tau_estimate_sec'
SIM_TIME = 'sim_time'
TOO_SLOW = 'too_slow'
STATUS = 'status'
class schrodinger.application.desmond.kinetics.utils.MtdStatus(value)

Bases: enum.Enum

The Status of each MtD unbinding job

OK = 'OK'
TOO_SLOW = 'Too Slow'
BOUND = 'Bound'
schrodinger.application.desmond.kinetics.utils.find_ligand_sites(st: schrodinger.structure._structure.Structure, asl: str, nsites: int, sampling_size: int = 5000) List[int]

Find a nsites atoms which are furthest away from each other on a provided small molecule. These atoms (sites) are then used in distance calculations.

schrodinger.application.desmond.kinetics.utils.find_protein_sites(st: schrodinger.structure._structure.Structure, asl: str, area_cutoff: Optional[float] = None) List[int]

Find solvent exposed protein residues which will be used to check if the ligand is still bound to the protein.

Parameters
  • st – Input system structure.

  • asl – ASL to identify protein residues eg. ‘protein’.

  • area_cutoff – SASA threshold where residues with SASA above this value have their C-alpha added to the set of protein sites. If no cutoff is given, return all C-alpha atom indices.

Returns

A list of C-alpha atoms indices passing the area_cutoff.

schrodinger.application.desmond.kinetics.utils.plot_descriptors(rmsd: numpy.array, mindist: numpy.array, endpoints: Tuple[Optional[int], Optional[int]], rmsd_limits: Tuple[Optional[float], Optional[float]], mindist_limits: Tuple[Optional[float], Optional[float]])

Plot descriptors from a RAMD trajectory

schrodinger.application.desmond.kinetics.utils.extract_reactive_path(cms_filename: str, asl_receptor: str, asl_ligand: str, rmsd_on: float, mindist_on: float, rmsd_off: Optional[float] = None, mindist_off: Optional[float] = None, strict: Optional[bool] = None, pad_traj_start: int = 0, max_traj_frames: int = 2000, debug: bool = False) List[str]

Reactive path is equivelent is a the unbinding path of the ligand

Parameters
  • rmsd_on – value to start recording ligand detachment cutoff

  • rmsd_off – value to stop recording ligand detachment cutoff, this value should be larger than rmsd_on

  • mindist_on – minimal distance between receptor and ligand to start recording ligand detachment cutoff

  • mindist_off – minimal distance between receptor and ligand to stop recording ligand detachment cutoff. This value should be larger than rmsd_on

  • strict – mindist_off and rmsd_off must be both satisfied for reactive trajectory extraction

  • pad_traj_start – number of initial trajectory frames to prepend to the reactive trajectory frames

  • max_traj_frames – maximum number of frames to load. If trajectory contains more frames than this value, then we will slice the trajectory such that the specified number of frames is used in the downstream analysis

Returns

a list of filenames to be saved in the analsysis stage of the workflow

schrodinger.application.desmond.kinetics.utils.calc_tension(rmsd_matrix: numpy.array, ivec: Optional[List[int]] = None) float
schrodinger.application.desmond.kinetics.utils.path_rmsd_matrix(sts: List[schrodinger.structure._structure.Structure], rmsd_atoms: List[int]) numpy.array

calculate 2d RMSD of the ligands on prealigned structures

schrodinger.application.desmond.kinetics.utils.path_optimize(current_guess: List[int], rmsd_matrix: numpy.array, step: float) List[int]

Provided that you already have a rough path (init.mae) you can now optimize it, whiere the best set of available frames can be used. :return: a list with frame IDs, which will be used to extract the path

schrodinger.application.desmond.kinetics.utils.path_satisfies_mindist(sts: List[schrodinger.structure._structure.Structure], asl_receptor: str, asl_ligand: str, mindist_off: float) bool

Evaluate if the provided path satisfies the mindist criteria

schrodinger.application.desmond.kinetics.utils.trj2sts(cms_model: cms.Cms, tr: List[traj.Frame], path_frames: Optional[List[int]] = None) List[schrodinger.structure._structure.Structure]

Convert trajectory frames to to structure objects. path_frames is a list of frames to convert to structures.

schrodinger.application.desmond.kinetics.utils.path_prune(sts: List[schrodinger.structure._structure.Structure], rmsd_matrix: numpy.array, threshold: float = 1.5) List[schrodinger.structure._structure.Structure]

Remove waypoints that are too close to non-adjacent waypoints. This is treated as a shortest path problem using Dijkstra’s algorithm.

Parameters
  • sts – Waypoint structures

  • rmsd_matrix – Matrix of RMSDs between input waypoints

  • threshold – Factor used to set how far adjacent waypoints can be. This is multiplied by the median of adjacent waypoint RMSDs to set the maximum for when waypoints may be connected in the shortest path

Returns

Subset of input waypoint structures composing the shortest path

schrodinger.application.desmond.kinetics.utils.path_smooth(sts: List[schrodinger.structure._structure.Structure], nwaypoints: int, receptor_atom_idcs: List[int], ligand_atom_idcs: List[int]) List[schrodinger.structure._structure.Structure]

As final path optimization step, we will shuffle each waypoint towards or away from its neightbors such that RMSD between all the waypoints is equal. If nwaypoints > len(sts), additional waypoints will be interpolated. This approach is using a string-like method to optimize the distances between each waypoint.

Parameters
  • sts – Waypoint structures, may have length < nwaypoints

  • nwaypoints – Desired number of waypoints

  • receptor_atom_idcs – List of receptor atom indices

  • ligand_atom_idcs – List of ligand atom inices

Returns

a new optimized path

schrodinger.application.desmond.kinetics.utils.plot_path_matrices(title_to_matrix_map: Dict[str, numpy.array], fname: str) None

Write a png showing the RMSD matrix for each path processing step

schrodinger.application.desmond.kinetics.utils.reduce_and_optimize_path(asl_receptor: str, asl_ligand: str, nwaypoints: int, mindist_off: float, debug: bool = False) List[str]

Reduce the reactive path to a given number of waypoints, optimize which frames are selected, and smooth the optimized path so waypoints are evenly spaced. Returns an empty list when no reactive path is found.

Parameters

nwaypoints – the number of snapshots/waypoints in the unbinding path to generate

Returns

a list of filenames to be saved in the analysis stage

schrodinger.application.desmond.kinetics.utils.summarize_traj_with_com_st(cms_filename: str, asl_receptor: str, asl_ligand: str, tau: Optional[float] = None) Optional[schrodinger.structure._structure.Structure]

Read in a trajectory and create a summary structure with dummy atoms representing the ligand at its center of mass.

Parameters
  • cms_filename – CMS file to read linked trajectory and convert.

  • asl_receptor – Atom selection language for the receptor

  • asl_ligand – Atom selection language for the ligand

  • tau – Predicted residence time to save as structure property

schrodinger.application.desmond.kinetics.utils.setup_ramd_system(asl_receptor: str, asl_ligand: str, cms_filename: str, ligand_rmsd_cutoff: float, debug: bool = False) Optional[Cms]

Measure Ligand RMSD and determine if its suitable for downstream RAMD Calculation. Setup subsequent RAMD systems.

Returns

CMS file if the ligand does not unbinds during the relaxation stage (ligand RMSD is less than ligand_rmsd_cutoff). If it does unbinds then we return None.

schrodinger.application.desmond.kinetics.utils.cluster_unbinding_paths(paths: List[List[schrodinger.structure._structure.Structure]], asl_receptor: str, asl_ligand: str, use_weights: bool, max_iter: int) Tuple[List[Tuple[int, int, int]], List[int]]

Given a list of paths, cluster them using AffinityPropogation methon. Return information about the clusters as well as their centroids.

Returns

for each cluster return a (size, centroid, label) tuple and a “paths label” path label is a cluster all paths belong in.

schrodinger.application.desmond.kinetics.utils.get_unbinding_pathway(paths_labels: List[int], cluster_info: List[Tuple[int, int, int]], unbound_labels: Optional[List[bool]] = None) List[List[schrodinger.structure._structure.Structure]]

Using the clustering information, write the centroid unbinding paths. If the first N clusters have the the same number of members then we will return N centroids for these clusters.

Parameters
  • cluster_info – a list that contatins (size, center, label) of clusters

  • unbound_labels – use these labels to overwrite the centroids to to return. Here we will suppliment centroid information to prioritize unbound paths that meet MinDist criteria (MINDIST_GOOD_PROP in the first waypoint of the path should be set to True)

schrodinger.application.desmond.kinetics.utils.calculate_path_msd(path_sts: List[schrodinger.structure._structure.Structure], align_asl: str, displace_asl: str) float

Function to calculate ‘lambda’ variable as described in the reference below. This variable is comparable to the inverse of mean square displacement between neigboring waypoints. See for details: -Branduardi D, et. al., JChemPhys 126(5), 054103; doi:10.1063/1.2432340

Parameters

path_sts – structrures defining unbinding path

schrodinger.application.desmond.kinetics.utils.process_atom_selection(path_sts: List[schrodinger.structure._structure.Structure], align_asl: str, displace_asl: str) schrodinger.application.desmond.kinetics.utils.RmsdSelection

This function generates three atom lists used with required weights for M-expression RMSD calculation. The length of each list should be the same. Here we perform the bookkeeping to ensure the the length of these list are the same.

Parameters
  • path_sts – A list of structures that define unbinding path. These structures should not have any hydrogen atoms.

  • align_asl – Selection to align the system on

  • displace_asl – Selection to measure RMSD on

Returns

A tuple of three lists, all the same sizes, with weights for RMSD alignment/measurement for the first to variables, and atom indices that correspond to atoms used in RMSD calculation

schrodinger.application.desmond.kinetics.utils.insert_path_into_cms(cms_fname: str, path_sts_fn: str, combined_cms_fname: str)

This function reads in cms and path_structures and combines the two by inserting the coordinates for all the path waypoint (wp) structures as atom properties in full-system CT of the cms file. The path’s waypoint coordinates are stored in the properties like this: * r_tau_path_waypt00_{x,y,z} Which corresponds to the xyz coordinates in the first waypoint for that atom.

This is done due to a limitation that MSJ workflow can pass only one structure as an input.

Note: Hydrogen coordinates are not stored.

schrodinger.application.desmond.kinetics.utils.extract_path_from_cms(cms_fname: str) Tuple[List[schrodinger.structure._structure.Structure], cms.Cms]

This function reads a cms file with stored path waypoints (see insert_path_in_cms()) in atom properties and extracts its coordinates into separate structures.

Parameters

cms_fname – filename to the CMS file with the waypoint coordinates

schrodinger.application.desmond.kinetics.utils.get_mtd_results(cvseq_filename: str) pandas.core.frame.DataFrame

Read in cvseq data from a filename and then convert it to a DataFrame. Since the results of mtd can get quite large we are going to change precision and/or change data types of several fields, for more efficient storage.

schrodinger.application.desmond.kinetics.utils.get_mtd_info(df: pandas.core.frame.DataFrame, residence_time_cutoff: Optional[float]) dict

Return a dict with key info about the MtD data in the provided DataFrame.

Parameters
  • df – dataframe with all the unbinding descriptors

  • residence_time_cutoff – a cutoff value to determine if the is too slow to unbind, in which case the workflow is terminated early

Returns

a dictionary with various descriptors calculated from the job. The

keys and their values of this dict are the following.

MtdInfo.TAU_ESTIMATE_SEC: Predicted resedence time (tau) in Seconds

MtdInfo.SIM_TIME: Total simulation time (in nanoseconds).

MtdInfo.MINDIST: Distance between the receptor and the ligand when the

simulation was terminated

MtdInfo.TOO_SLOW: A bool indicating if the unbinding event had

occured or not. A value of True suggests that no unbinding happend.

MtdInfo.STATUS: A string of the final status: ‘OK’, ‘BOUND’ ‘TOO SLOW’

schrodinger.application.desmond.kinetics.utils.calc_mean_tau_and_pval(res_times: List[float], ntrials: int = 100) Tuple[schrodinger.application.desmond.measurement.Measurement, schrodinger.application.desmond.measurement.Measurement]

Return mean Tau and P-val values as a Measurement object

schrodinger.application.desmond.kinetics.utils.plot_tau_fitting(res_times: List[float], tau: schrodinger.application.desmond.measurement.Measurement, pvalue: schrodinger.application.desmond.measurement.Measurement, jobname: str)

Plot predicted residence time (tau) from the replicas and the fitted value. The plot will be written to the current directory with the jobname as a filename.

Parameters
  • res_times – Residence times from previous subjobs

  • tau – Fitted residence time for the ligand

  • pvalue – P-value for the fitted residence time

  • jobname – Master jobname, will be used to save plot in PNG format

schrodinger.application.desmond.kinetics.utils.plot_mtd_report(results: pandas.core.frame.DataFrame, jobname: str)

Generate a series of plots to expore MtD unbinding simulations.

schrodinger.application.desmond.kinetics.utils.read_structs_from_tgz(file_str: str, tar_path: pathlib.Path) List[schrodinger.structure._structure.Structure]

Extract files from a tgz archive. Warning: This will find and extract the first file matching file_str inside any tarred subdirectory. Additional matches will be ignored.

Parameters
  • file_str – File name to seach for within the tar file

  • tar_path – Path of the tar file to extract structures from

schrodinger.application.desmond.kinetics.utils.get_indices_from_unbinding_subjob(subjob_name: str)

Retrieve the indices for the path CV and the replica from a subjob name

Parameters

subjob_name – The subjob name to extract indices from Some examples are cdk8_4f6s_full_mtd_replica-0-01 or cdk8_4f6s_full_mtd_replica-0-01_3 where the path cv index is 0 and the replica index is 01.

schrodinger.application.desmond.kinetics.utils.write_mtd_unbinding_path_summary(subjob_to_unbinding_path_map: Dict[str, List[schrodinger.structure._structure.Structure]], path_cv_idx_to_structure_map: Dict[str, List[schrodinger.structure._structure.Structure]], asl_receptor: str, asl_ligand: str)

Write a summary structure showing the metadynamics paths sampled for each path CV represented using their ligand center of mass. The structures will contain the receptor, the ligand, each path CV used, and the metadynamics paths grouped with their associated path CV. The mtd paths are also annotated with their simulation time and predicted residence time.

Parameters
  • subjob_to_unbinding_path_map – Dict mapping unbinding path subjob names to path structures

  • path_cv_idx_to_structure_map – Dict mapping path CV (equivalently RAMD cluster) number to path CV structures

  • asl_receptor – Atom selection language for the receptor

  • asl_ligand – Atom selection language for the ligand

schrodinger.application.desmond.kinetics.utils.pad_asl_if_invalid(asl: str) str

Pad ASL strings with spaces to ensure they are recognized as valid. This addresses the issue in DESMOND-13855 where ASL strings containing SMARTS will not be recognized if they are within parentheses unless they are padded.

Parameters

asl – Input ASL string to pad with spaces if needed