Core Concepts

Structures

The Structure class is the fundamental class in our modules, and will probably be used in all of the code you write. Structure objects can be single molecules or groups of molecules. They provide access to atoms, bonds, properties, and a number of substructure elements.

Like any other Python object, Structure objects can be stored in lists or dictionaries, assigned to variables, and passed between functions. (However, they cannot be pickled because they wrap an underlying C library.)

In principle, Structure objects can be created programmatically, by creating a zero-atom structure, adding the desired atoms and connecting them with bonds. However, this usage pattern is atypical. In most cases a structure will be loaded from a file or retrieved from the Maestro Workspace or the Maestro Project Table.

Most Schrödinger calculations will produce a Maestro-format output file (with either a mae or maegz file extension). Creating a Structure object from one of these files will allow you to investigate the properties and structure of the resulting molecule or molecules.

Structure Class Organization

Structure objects expose many attributes as iterators, including atoms, bonds, and substructure elements. In addition to attributes that are part of the class definition of these objects, structures, atoms, and bonds each have general dynamic dictionary-like property attributes that can store properties associated with the specific object.

See the API documentation for more details on the properties and methods of the Structure class.

First we’ll set up a structure for what follows. “st” for “structure” is commonly used to name Structure instances.

>>> from schrodinger import structure
>>> from schrodinger.test import mmshare_data_file
>>> st = structure.StructureReader.read(mmshare_data_file('r_group_enumeration_library/Diverse_R-groups.maegz'))

Note

The >>> prefix in the examples that follow is the interactive prompt. Examples without the prompt are snippets of scripts.

Atoms

All Structure objects have a list-like atom attribute that can be used to iterate over all atoms or to access them by index. For example:

>>> for atom in st.atom:
...     name = atom.name
...     atomic_number = atom.atomic_number
...     # do something with these attributes

It is also possible to index into the atom container (slicing is not currently supported). Indexing starts at 1.

>>> # Print the atom type name and atomic number of the fourth atom in the structure.
>>> atom = st.atom[4]
>>> name = atom.atom_type_name
>>> atomic_number = atom.atomic_number
>>> print(f"{name}: {atomic_number}")
H1: 1

Each atom is represented by an instance of the _StructureAtom class.

Some attributes (actually Python properties) of the _StructureAtom objects include name, atomic_number, formal_charge, and the Cartesian coordinates in x, y, and z. See the _StructureAtom properties for a full list.

Note that atom indices can change if the structure is modified and so can’t be safely relied on in many contexts. If you need to reidentify atoms after performing an operation that modifies the structure, you can use the _StructureAtom instance to ensure that you continue to refer to the correct atom. The _StructureAtom instance has an index attribute that will remain up-to-date through any such changes.

Bonds

Each atom also has a list-like bond attribute:

for atom in st.atom:
    print(f"{atom} is bonded to:")
    for bond in atom.bond:
        print(f"  atom {bond.atom2}")

Bonds are represented by the _StructureBond class. Important attributes of the bond class include order, atom1, and atom2. See the _StructureBond properties for full documentation.

Bonds within the structure are also accessible from a list-like attribute of a Structure object called bond. This access is useful for cases where you want to iterate over all bonds in a structure exactly once.

# It's possible to iterate over all bonds in a structure:
for bond in st.bond:
    print(f"Bonded atoms: {bond.atom1} and {bond.atom2}")
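To see why the structure-level bond container is convenient, note that walking every atom's bond list visits each bond twice, once from each end. The following is a minimal sketch using plain Python data rather than the Schrödinger API: the adjacency dictionary and its 1-based indices are hypothetical stand-ins for a five-atom structure.

```python
# Hypothetical stand-in for a structure: 1-based atom index -> bonded atom indices.
# This is plain Python for illustration, not the schrodinger.structure API.
adjacency = {
    1: [2, 3, 4, 5],  # a central atom bonded to four others
    2: [1],
    3: [1],
    4: [1],
    5: [1],
}

# Walking each atom's bonds sees every bond from both of its ends.
per_atom_visits = [(a, b) for a, neighbors in adjacency.items() for b in neighbors]

# Keying each bond on its sorted atom pair keeps exactly one entry per bond.
unique_bonds = sorted({tuple(sorted(pair)) for pair in per_atom_visits})

print(len(per_atom_visits), "visits for", len(unique_bonds), "bonds")
```

Iterating over the bond attribute of the Structure object avoids the need for this kind of de-duplication.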

Substructures

A number of “substructure iterators” are available from each Structure object. Each of these iterators returns an instance of a non-public class that is a view on the substructure contained within the Structure object. Each substructure class has an extractStructure method that creates a new, independent Structure object containing the atoms in the substructure. Each also has a getAtomIndices method that returns a list of the atom indices in the substructure, and an atom iterator.

molecule

A Structure may contain multiple unconnected molecules, which can be iterated over using the molecule attribute. Yields _Molecule instances.

chain

Iterates over protein chains in the Structure object. Yields _Chain instances.

residue

Iterates over protein residues in the Structure object. Yields _Residue instances.

ring

Iterates over all rings in the Structure object, as found by SSSR (smallest set of smallest rings). Yields _Ring instances.

Some example usages:

for res in st.residue:
    resname = res.pdbres.strip()
    print(f"{res.chain}:{resname}{res.resnum}")

# A molecule is just a connected graph of atoms
for mol in st.molecule:
    num_residues = len(list(mol.residue))
    print(f"Mol {mol.number} has {num_residues} residues")

for chain in st.chain:
    print(f"Chain: {chain.name}")

print(f"The structure has {len(st.ring)} rings")
for ring in st.ring:
    if ring.isAromatic():
        print(list(ring.atom))

print(f"The structure has {len(st.molecule)} molecules.")
for mol in st.molecule:
    print(f"Molecule {mol.number} has {len(mol.atom)} atoms.")
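The idea that a molecule is a connected graph of atoms can be sketched in plain Python. The example below is only an illustration under stated assumptions: connected_components is a hypothetical helper, the bond list is made-up data with 1-based atom indices, and nothing here reflects how the molecule iterator is implemented internally.

```python
from collections import deque


def connected_components(num_atoms, bonds):
    """Group 1-based atom indices into connected components (hypothetical helper)."""
    neighbors = {i: set() for i in range(1, num_atoms + 1)}
    for a, b in bonds:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen = set()
    components = []
    for start in range(1, num_atoms + 1):
        if start in seen:
            continue
        # A breadth-first search from an unvisited atom collects one molecule.
        queue = deque([start])
        seen.add(start)
        component = []
        while queue:
            atom = queue.popleft()
            component.append(atom)
            for nbr in sorted(neighbors[atom]):
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append(nbr)
        components.append(sorted(component))
    return components


# Made-up data: two three-atom molecules (atoms 1-3 and atoms 4-6).
bonds = [(1, 2), (1, 3), (4, 5), (4, 6)]
molecules = connected_components(6, bonds)
print(molecules)
```

Each inner list plays the role of the atom indices that a substructure view would report for one molecule.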

The _Molecule and _Chain instances also support their own residue iterators. For example:

for chain in st.chain:
    residues = "".join([res.getCode() for res in chain.residue])
    print(f"chain {chain.name}: {residues}")

A few things are worth noting. First, you can’t index into a _Residue container in the way that you can an atom or molecule container. If you’d like to do this, convert the residue container to a Python list and index into that list, remembering that Python lists are 0-based:

first_res = list(st.residue)[0]

Second, you should not add or delete atoms or bonds while you’re iterating over a structure.

Interface

The Structure class has a rich interface for performing common tasks, such as getting and setting atomic coordinates, searching for substructures, and measuring distances and angles. Many of these will be covered in the Cookbook section.

Properties

Structures and atoms can store properties in a dictionary-like attribute named property. Structure properties can be viewed in the Maestro Project Table, and are used by product backends to store results and intermediate data.

The property names in this property object must follow a pattern that is required for storage in Maestro-format files. The required naming scheme is type_author_property_name, where type is a data type prefix, author is a source specification, and property_name is the actual name of the data. The type prefix must be b for boolean, i for integer, r for real, or s for string. The source specification is typically a Schrödinger program abbreviation (e.g. m for Maestro or j for Jaguar); for properties that you create yourself, the appropriate source specification is user. (In Maestro-format files, the Structure object property names correspond to the properties listed under the f_m_ct { line.)
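The naming scheme can be checked with a few lines of plain Python. This is a minimal sketch, not part of the schrodinger package: parse_property_name and TYPE_PREFIXES are hypothetical names, and the mapping of prefixes to Python types is an assumption made for illustration.

```python
# Type prefixes from the naming scheme, mapped to Python types for illustration.
TYPE_PREFIXES = {"b": bool, "i": int, "r": float, "s": str}


def parse_property_name(name):
    """Split 'type_author_property_name' into its three parts.

    Hypothetical helper, not part of the schrodinger package.
    """
    # Split on the first two underscores only; the data name may contain more.
    parts = name.split("_", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected type_author_property_name, got {name!r}")
    type_prefix, author, property_name = parts
    if type_prefix not in TYPE_PREFIXES:
        raise ValueError(f"unknown type prefix {type_prefix!r} in {name!r}")
    return type_prefix, author, property_name


print(parse_property_name("r_j_Gas_Phase_Energy"))
print(parse_property_name("b_user_is_carbon"))
```

Splitting on only the first two underscores matters because the data name itself, such as Gas_Phase_Energy, may contain underscores.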

This example shows how to access, set, and delete Structure object properties:

# 'r_j_Gas_Phase_Energy' is a real property set by Jaguar.
gas_phase_energy = st.property['r_j_Gas_Phase_Energy']

# Properties stored by the user should use an "author" of 'user'.
st.property['r_user_Energy_Plus_Two'] = gas_phase_energy + 2.0

# Delete the new 'r_user_Energy_Plus_Two' property.
del st.property['r_user_Energy_Plus_Two']

Because the property objects are dictionary subclasses, the standard dictionary methods like keys and items also work.

Properties of atoms work the same way. For example, you could assign a property to all carbon atoms:

for atom in st.atom:
    if atom.atomic_number == 6:
        atom.property['b_user_is_carbon'] = True

Structure I/O

Reading a Structure from a File

The schrodinger.structure.StructureReader class creates Structure objects from molecular data stored in a number of standard file formats. Supported file types are Maestro, MDL SD, PDB, and Sybyl Mol2. Because these files may contain multiple molecules, the StructureReader is an iterator, and molecule files are presented as a sequence of Structure objects.

from schrodinger import structure

# Input can be a .mae, .sdf, .sd, .pdb, or .mol2 file.
input_file = "input.mae"

for st in structure.StructureReader(input_file):
    # Do something with the Structure...
    result = process_structure(st)

# To read only the first structure from a file, pass the handle to next.
reader = structure.StructureReader(input_file)
st = next(reader)

If you’re interested in a specific structure in the file and know its index, the StructureReader class also has a read classmethod for convenience:

# Select the first structure.
st = structure.StructureReader.read(input_file)
# Select the nth structure (here the third), counting from 1.
st = structure.StructureReader.read(input_file, index=3)
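Because a reader behaves like any other Python iterator, the general idea of skipping ahead to the nth record can be sketched with the standard itertools module. The example below is an illustration only: nth_item is a hypothetical helper, and a generator of strings stands in for a reader so that the snippet runs without any input file.

```python
from itertools import islice


def nth_item(iterable, n):
    """Return the n-th item (counting from 1) without building a list."""
    if n < 1:
        raise ValueError("n counts from 1")
    try:
        # islice skips the first n - 1 items lazily, then next() takes one.
        return next(islice(iterable, n - 1, None))
    except StopIteration:
        raise IndexError(f"fewer than {n} items") from None


# A generator stands in for a reader that yields structures one at a time.
fake_reader = (f"structure {i}" for i in range(1, 6))
print(nth_item(fake_reader, 3))
```

Only the items up to the requested one are consumed, which is useful when a file holds many structures.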

SMILES format files and CSV files with SMILES data are also supported, but because these have no structural data, resulting structures are SmilesStructures, which have less functionality than standard Structures. See the SmilesReader and SmilesCsvReader documentation.

Saving a Structure to a File

The StructureWriter class is the counterpart to the schrodinger.structure.StructureReader. It can write the same file formats as the StructureReader but mae is recommended as the least lossy.

This is an example of a typical read, process, and write script:

from schrodinger import structure

with structure.StructureReader("input.mae") as reader:
    with structure.StructureWriter("output.mae") as writer:
        for st in reader:
            # Do the required processing
            result_structure = do_processing(st)
            # Save the result to the output file
            writer.append(result_structure)

# Using the reader and writer as context managers ensures that the files are
# closed after we're done with them.

Alternatively, if only a single structure is being written to a file, you can use the write staticmethod of StructureWriter:

from schrodinger import structure
# select the first structure
st = structure.StructureReader.read(input_file)
# do something here . . .
# . . . and then write it to a separate file using the staticmethod
structure.StructureWriter.write(st, output_file)

Structure Operations

In addition to the functionality provided in the schrodinger.structure module itself, much is provided in the schrodinger.structutils package.

This section lists some additional Structure features and a few highlights of the structutils package.

Structure Minimization

Structures can be minimized with either the OPLS_2005 or the OPLS3e force field using the minimize_structure function. This operation requires a valid product license for MacroModel, GLIDE, Impact, or PLOP. Note that minimization does not hold on to a license: a license is checked out to ensure that one is available, then immediately checked back in.

For example, to compare the energy of a molecule before and after minimization:

from schrodinger.structutils.minimize import minimize_structure

# Do a 0-step "minimization" to get the initial energy.
min_res = minimize_structure(st, max_steps=0)
original_energy = min_res.potential_energy

min_res = minimize_structure(st)
minimized_energy = min_res.potential_energy
energy_diff = original_energy - minimized_energy

print(f"The minimized energy is {energy_diff} kcal/mol lower than the original.")

Substructure Searching or Specification

Generate SMILES, SMARTS, or ASL strings based on a set of atom indices via the generate_smiles, generate_smarts, and generate_asl functions. Documentation on ASL can be found in the Maestro Command Reference Manual.

Evaluate SMARTS or ASL strings and return a list of matching atom indices via the evaluate_smarts and evaluate_asl functions.

This example finds the set of unique SMILES strings in a structure file:

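from schrodinger import structure
from schrodinger.structutils.analyze import generate_smiles

unique_smiles = set()
with structure.StructureReader("input.mae") as reader:
    for st in reader:
        # A set keeps only one copy of each SMILES string.
        pattern = generate_smiles(st)
        unique_smiles.add(pattern)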

Structure Measurement

The schrodinger.structutils.measure module provides functions for measuring distances, angles, dihedral angles, and plane angles. It also offers the get_close_atoms function, which finds all pairs of atoms within a specified distance in less than O(N²) time.
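One common way to avoid the quadratic all-pairs scan is a cell list: bin the atoms into cubic cells whose edge equals the cutoff, then compare each atom only with atoms in the same or adjacent cells. The sketch below illustrates that technique in plain Python. It is not necessarily how get_close_atoms is implemented, and close_pairs and its arguments are hypothetical names.

```python
from collections import defaultdict
from itertools import product
from math import dist, floor


def close_pairs(coords, cutoff):
    """Return sorted 1-based index pairs closer than cutoff (cell-list sketch)."""
    # Bin each point into a cubic cell whose edge length equals the cutoff.
    cells = defaultdict(list)
    for index, xyz in enumerate(coords, start=1):
        cell = tuple(floor(c / cutoff) for c in xyz)
        cells[cell].append(index)

    pairs = set()
    for cell, members in cells.items():
        # Only the 27 cells around (and including) this one can hold partners.
        for offset in product((-1, 0, 1), repeat=3):
            other = tuple(c + o for c, o in zip(cell, offset))
            # .get avoids creating empty cells while iterating over the dict.
            for i in members:
                for j in cells.get(other, ()):
                    if i < j and dist(coords[i - 1], coords[j - 1]) < cutoff:
                        pairs.add((i, j))
    return sorted(pairs)


# Made-up coordinates: two close pairs that are far from each other.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 5.0, 5.0), (5.5, 5.0, 5.0)]
print(close_pairs(coords, cutoff=1.5))
```

For roughly uniform atom densities each cell holds a bounded number of atoms, so the work grows approximately linearly with the number of atoms.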

Structure Superimposition or Comparison

The in-place RMSD of two structures can be determined via the calculate_in_place_rmsd function. The ConformerRmsd class offers more complete RMSD comparison tools for conformers.
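For reference, an in-place RMSD is the root-mean-square distance between paired atoms using the coordinates as they are, with no superposition applied first. The following is a minimal sketch of that definition in plain Python; in_place_rmsd here is a hypothetical function operating on coordinate tuples, not the library's calculate_in_place_rmsd.

```python
from math import sqrt


def in_place_rmsd(coords_a, coords_b):
    """RMSD between paired coordinates, with no superposition applied."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Accumulate the squared distance between each pair of atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return sqrt(total / len(coords_a))


# Made-up data: the second set is the first shifted by 2 along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]
print(in_place_rmsd(reference, shifted))
```

A rigid shift contributes fully to the in-place value, which is why this measure is sensitive to how the two structures are positioned relative to each other.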

Two structures can be superimposed based on all atoms or a subset of atoms with the superimpose function.

Conversion Between 1D/2D and 3D Structures

To convert a 3D structure to a 1D structure (SMILES or SMARTS), use the appropriate function from schrodinger.structutils.analyze:

from schrodinger import structure
from schrodinger.structutils import analyze

smiles_list = []
smarts_list = []
with structure.StructureReader("input.mae") as reader:
    for st in reader:
        smiles_list.append(analyze.generate_smiles(st))
        smarts_list.append(analyze.generate_smarts(st))

It is possible to convert a file of 1D SMILES strings to 3D structures:

from schrodinger import structure

sts_3d = []
# SmilesReader yields SmilesStructure objects, one per SMILES string.
for st_1d in structure.SmilesReader('smiles_input'):
    # get3dStructure builds a 3D Structure from the SMILES pattern.
    sts_3d.append(st_1d.get3dStructure())

To convert a 3D structure to a 2D structure, use the canvasConvert utility from the command line:

$SCHRODINGER/utilities/canvasConvert -imae input.mae -2D -osd output.sd

The resulting SD file can then be read back in with the StructureReader class.

Modifying a Structure

Atoms can be added via the Structure.addAtoms method.

Individual atoms can be deleted with standard Python list syntax:

>>> st_copy = st.copy()
>>> len(st.atom)
5
>>> del st.atom[2]
>>> len(st.atom)
4

Note

Deleting atoms changes the indices of the atoms remaining in the Structure object.

Because deleting atoms renumbers the remaining atoms, multiple atoms should be deleted via the Structure.deleteAtoms method.

>>> len(st.atom)
4
>>> st.deleteAtoms([1, 2])
>>> len(st.atom)
2
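The renumbering problem can be illustrated with a plain Python list standing in for the atom container. This is only a sketch under stated assumptions: delete_one_based is a hypothetical helper, the atom names are made up, and the library's deleteAtoms method may handle the bookkeeping differently.

```python
def delete_one_based(items, indices):
    """Delete several 1-based positions from a list in a renumbering-safe way."""
    # Removing the highest index first leaves the lower positions untouched.
    for index in sorted(set(indices), reverse=True):
        del items[index - 1]


atoms = ["C1", "H1", "H2", "H3", "H4"]

# Naive approach: delete positions 1 and 2 one at a time in ascending order.
naive = list(atoms)
for index in [1, 2]:
    del naive[index - 1]

# Renumbering-safe approach: both indices refer to the original numbering.
safe = list(atoms)
delete_one_based(safe, [1, 2])

print(naive)  # the second deletion removed the wrong element
print(safe)
```

In the naive loop the first deletion shifts every remaining element down by one, so the second index no longer points at the element that was originally intended.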

Charges and atom identity can be modified by making assignments to the proper _StructureAtom attributes:

>>> st = structure.StructureReader.read(mmshare_data_file('r_group_enumeration_library/Diverse_R-groups.maegz'))
>>> at = st.atom[1]
>>> at.element
'C'
>>> at.atomic_number
6
>>> at.formal_charge
0
>>> at.element = 'N'
>>> at.formal_charge = 1
>>> at.formal_charge
1
>>> at.atomic_number
7
>>> at.atomic_number = 6
>>> at.element
'C'

>>> other_atom = st.atom[5]
>>> other_atom.index
5
>>> del st.atom[2]
>>> other_atom.index # This index is updated
4

As can be seen from the above examples, changing either the atomic_number or the element attribute automatically updates the other.

Bonds can be broken or created. For example:

# To avoid modifying the original structure, make a copy.
st = st_orig.copy()

# Break and re-join the first bond on the first atom.
bond = st.atom[1].bond[1]

atom1 = bond.atom1.index
atom2 = bond.atom2.index
order = bond.order

st.deleteBond(atom1, atom2)     # Delete the bond.
st.addBond(atom1, atom2, order) # Recreate bond with same bond order.

Hydrogens can be added via the schrodinger.structutils.build.add_hydrogens function, or deleted via the delete_hydrogens function.