schrodinger.livedesign.biologics.registration module

This file serves as the primary entry point for Live Design biologics. In particular, it provides an API for: (1) Converting user input data into a canonical data structure for storage and database manipulation. (2) Computing a default set of properties and descriptors.

class schrodinger.livedesign.biologics.registration.BioPolymer(polymer_id: str, monomer_string: str, bio_class: schrodinger.livedesign.entity_type.EntityClass, requested_hash: int, connections: Set[ForwardRef('BioPolymer')] = <factory>, neighbors: Set[ForwardRef('BioPolymer')] = <factory>)

Bases: object

polymer_id: str
monomer_string: str
bio_class: schrodinger.livedesign.entity_type.EntityClass
requested_hash: int
connections: Set[schrodinger.livedesign.biologics.registration.BioPolymer]
neighbors: Set[schrodinger.livedesign.biologics.registration.BioPolymer]
addConnection(neighbor: schrodinger.livedesign.biologics.registration.BioPolymer)

Instead of tracking inter-polymer connections directly, maintain a connection list to find non-self neighbors.

__init__(polymer_id: str, monomer_string: str, bio_class: schrodinger.livedesign.entity_type.EntityClass, requested_hash: int, connections: Set[schrodinger.livedesign.biologics.registration.BioPolymer] = <factory>, neighbors: Set[schrodinger.livedesign.biologics.registration.BioPolymer] = <factory>) None
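
The class definition itself is not shown here. As a minimal sketch of how a record with these fields might be declared, the dataclass below uses per-instance set factories. The name PolymerSketch, the str stand-in for EntityClass, the snake_case method name, and the use of eq=False (so that instances hash by identity and can be stored in sets) are all assumptions for illustration, not the actual source.

```python
from dataclasses import dataclass, field
from typing import Set


# Hypothetical stand-in for BioPolymer. eq=False keeps identity-based hashing
# so instances can be placed in sets (an assumption, not the real class).
@dataclass(eq=False)
class PolymerSketch:
    polymer_id: str
    monomer_string: str
    bio_class: str  # stand-in for EntityClass
    requested_hash: int
    connections: Set["PolymerSketch"] = field(default_factory=set)
    neighbors: Set["PolymerSketch"] = field(default_factory=set)

    def add_connection(self, neighbor: "PolymerSketch") -> None:
        # Record every connection, but only non-self partners are neighbors.
        self.connections.add(neighbor)
        if neighbor is not self:
            self.neighbors.add(neighbor)


heavy = PolymerSketch("PEPTIDE1", "E.V.Q.L", "heavy_chain", 1)
light = PolymerSketch("PEPTIDE2", "D.I.Q.M", "light_chain", 2)
heavy.add_connection(light)
heavy.add_connection(heavy)  # e.g. an intra-chain bond
```

In this sketch the self-connection is kept in connections but never reaches neighbors, which is one way to read "find non-self neighbors".
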
schrodinger.livedesign.biologics.registration.fasta_registration_process(fasta_str: str) Iterator[schrodinger.livedesign.registration.RegistrationData]

Given input in the form of FASTA text, yields either one RegistrationData per sequence or, if given a specially formatted single-entity FASTA, processes the sequences hierarchically and yields data for the subunits and their larger construct.

Parameters

fasta_str – the text of a plain fasta file or a special single-entity fasta file

Returns

an iterator of RegistrationData for the entries, either one per sequence for standard fasta formats, or one for each hierarchical biologic entity in the single-entity fasta format in increasing complexity.
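
To illustrate the "one item per sequence" behavior for plain FASTA input, here is a small standard-library generator that yields (header, sequence) pairs. It is only a sketch of the parsing step: the function name iter_fasta_records is hypothetical, it does not build RegistrationData, and it does not handle the special single-entity format.

```python
from typing import Iterator, List, Optional, Tuple


def iter_fasta_records(fasta_str: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from plain FASTA text."""
    header: Optional[str] = None
    chunks: List[str] = []
    for line in fasta_str.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


example = ">heavy\nEVQLVESGG\nGLVQPGGS\n>light\nDIQMTQSPS\n"
records = list(iter_fasta_records(example))
```
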

schrodinger.livedesign.biologics.registration.helm_registration_process(helm_str: str) Iterator[schrodinger.livedesign.registration.RegistrationData]

Given an input HELM string, returns the following:

  (1) The RegistrationData of the full HELM model defined by the string
  (2) RegistrationData for each large oligomeric subunit (e.g. the antibody in an antibody-drug conjugate)
  (3) RegistrationData for each simple polymer

Notably, this function uses the BioLuminate antibody detection modules to detect antibody sequences and classify all HELM protein chains (“peptides”) into their respective antibody class, if any, and then combines the antibody chains into well-defined larger constructs.

Parameters

helm_str – input HELM string. The model it defines is assumed to be a connected collection of simple polymers, with no disconnected subunits.

Returns

an iterator over the hierarchy of biologic entities in increasing complexity. For example, an antibody-drug conjugate would return:

  1. antibody heavy and light chains
  2. a small molecule
  3. the arms of the antibody
  4. the antibody
  5. the antibody-drug conjugate
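
One way to picture "increasing complexity" is to sort a hierarchy by how many nesting levels sit beneath each entity. The sketch below does that for the antibody-drug conjugate example. EntitySketch, height, walk, and iter_by_complexity are hypothetical names, and the real function may order entities by a different rule.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class EntitySketch:
    """Hypothetical stand-in for one registered entity and its children."""
    name: str
    children: List["EntitySketch"] = field(default_factory=list)


def height(entity: EntitySketch) -> int:
    """Nesting levels below this entity; 0 for a simple polymer."""
    if not entity.children:
        return 0
    return 1 + max(height(child) for child in entity.children)


def walk(entity: EntitySketch) -> Iterator[EntitySketch]:
    """Yield the entity and everything beneath it (pre-order)."""
    yield entity
    for child in entity.children:
        yield from walk(child)


def iter_by_complexity(root: EntitySketch) -> List[EntitySketch]:
    """Return all entities under root, simplest first (stable within a tier)."""
    return sorted(walk(root), key=height)


chains = [EntitySketch(n) for n in ("heavy_1", "light_1", "heavy_2", "light_2")]
arms = [EntitySketch("arm_1", chains[:2]), EntitySketch("arm_2", chains[2:])]
antibody = EntitySketch("antibody", arms)
adc = EntitySketch("adc", [antibody, EntitySketch("small_molecule")])
ordered_names = [entity.name for entity in iter_by_complexity(adc)]
```

Because the sort is stable, entities in the same tier keep their traversal order, so the chains and the small molecule come first, then the arms, the antibody, and finally the conjugate.
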

schrodinger.livedesign.biologics.registration.create_registration_data(helm_model: schrodinger.protein.helm._helm_parser.HelmModel, bio_class: schrodinger.livedesign.entity_type.EntityClass, child_hashes: List[str]) schrodinger.livedesign.registration.RegistrationData

Package the relevant information from the input helm model and return a RegistrationData object.

Parameters
  • helm_model – input helm model from which all properties are derived

  • bio_class – the entity class of the input model

  • child_hashes – the list of child entities, by hash

Returns

RegistrationData with the relevant fields populated.

schrodinger.livedesign.biologics.registration.count_residues_in_model(helm_model: schrodinger.protein.helm._helm_parser.HelmModel) int

Simple function to count residues in a helm model.
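
The real function operates on a HelmModel object. Purely as an illustration of what counting involves at the string level, the sketch below counts dot-separated monomers in the body of a single simple polymer, skipping dots that appear inside square brackets (where HELM allows multi-character monomer names and inline structures). The function name count_monomers is hypothetical, and whether the library counts residues the same way is an assumption.

```python
def count_monomers(monomer_string: str) -> int:
    """Count top-level, dot-separated monomers in one simple polymer body."""
    if not monomer_string:
        return 0
    depth = 0  # nesting depth inside [...]
    count = 1
    for char in monomer_string:
        if char == "[":
            depth += 1
        elif char == "]":
            depth -= 1
        elif char == "." and depth == 0:
            count += 1
    return count


peptide_count = count_monomers("A.C.[dA].G")
bracketed_count = count_monomers("A.[Na+.Cl-].G")
```
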

schrodinger.livedesign.biologics.registration.extract_registration_data(polymer: schrodinger.protein.helm._helm_parser.HelmPolymer, helm_model: schrodinger.protein.helm._helm_parser.HelmModel) schrodinger.livedesign.registration.RegistrationData

Helper function to extract the connectivity metadata and the RegistrationData for a particular polymer from the given helm model.

Parameters
  • polymer – the helm polymer to extract, in RegistrationData format

  • helm_model – the helm model to extract the RegistrationData from

Returns

the RegistrationData instance for the polymer

schrodinger.livedesign.biologics.registration.build_polymer_graph(connections: Iterable[schrodinger.protein.helm._helm_parser.HelmConnection], polymer_dict: Dict[str, schrodinger.livedesign.biologics.registration.BioPolymer]) None

Simple function to store each BioPolymer's neighbors on the BioPolymer itself, instead of storing connections, so that polymer neighborhoods can be explored efficiently.

Parameters
  • connections – list of HelmConnections defining the BioPolymer graph

  • polymer_dict – the dictionary of BioPolymers for easy BioPolymer management
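
A minimal sketch of the idea, assuming each connection can be reduced to a (source ID, target ID) pair: look both polymers up in the dictionary and record each as the other's neighbor, in place. Node and link_neighbors are hypothetical stand-ins. The real function takes HelmConnection objects, whose attributes are not shown here.

```python
from typing import Dict, Iterable, Set, Tuple


class Node:
    """Hypothetical, minimal stand-in for BioPolymer."""

    def __init__(self, polymer_id: str) -> None:
        self.polymer_id = polymer_id
        self.neighbors: Set["Node"] = set()


def link_neighbors(connections: Iterable[Tuple[str, str]],
                   polymer_dict: Dict[str, Node]) -> None:
    """Populate each node's neighbors in place from (source, target) ID pairs."""
    for source_id, target_id in connections:
        if source_id == target_id:
            continue  # intra-polymer bond: not a neighbor relationship
        source, target = polymer_dict[source_id], polymer_dict[target_id]
        source.neighbors.add(target)
        target.neighbors.add(source)


nodes = {pid: Node(pid) for pid in ("PEPTIDE1", "PEPTIDE2", "CHEM1")}
link_neighbors(
    [("PEPTIDE1", "PEPTIDE2"), ("PEPTIDE1", "PEPTIDE1"), ("PEPTIDE2", "CHEM1")],
    nodes,
)
neighbor_ids = {pid: sorted(n.polymer_id for n in node.neighbors)
                for pid, node in nodes.items()}
```
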

schrodinger.livedesign.biologics.registration.find_antibodies(polymers: List[schrodinger.livedesign.biologics.registration.BioPolymer], helm_model: schrodinger.protein.helm._helm_parser.HelmModel) List[schrodinger.livedesign.registration.RegistrationData]

Find and classify all contiguous antibody polymer combinations if connectivity is provided, otherwise just assemble them into one large antibody.

Parameters
  • polymers – list of BioPolymers to search for antibodies

  • helm_model – source HelmModel, used to generate new HelmModels

Returns

a list of RegistrationData for each found antibody construct

schrodinger.livedesign.biologics.registration.is_antibody_subunit(polymer: schrodinger.livedesign.biologics.registration.BioPolymer) bool

Utility function to increase readability.

schrodinger.livedesign.biologics.registration.find_abs_by_connectivity(polymer_graph: List[schrodinger.livedesign.biologics.registration.BioPolymer], helm_model: schrodinger.protein.helm._helm_parser.HelmModel) List[schrodinger.livedesign.registration.RegistrationData]

Depth-first search approach to finding all directly connected antibody components in a helm polymer network.

Parameters
  • helm_model – source helm model to query for connectivity and extract HelmPolymers from for output subunits

  • polymer_graph – the list of all BioPolymers

Returns

a list of RegistrationData for each found (and recognized) antibody construct

schrodinger.livedesign.biologics.registration.find_ab_chain(biopolymer: schrodinger.livedesign.biologics.registration.BioPolymer, used_polymers: Set[schrodinger.livedesign.biologics.registration.BioPolymer]) List[schrodinger.livedesign.biologics.registration.BioPolymer]

Recursive function to grow an antibody chain from one polymer to reach all reachable antibody fragments.

Parameters

used_polymers – the set of “seen” polymers to avoid infinite recursion
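
The recursive growth and the outer search that drives it can be sketched on a toy graph of polymer IDs. Everything named here (NEIGHBORS, ANTIBODY_IDS, grow_chain, find_chains) is hypothetical. The real functions work on BioPolymer objects and return RegistrationData.

```python
from typing import Dict, List, Set

# Hypothetical toy graph: polymer ID -> IDs of directly connected polymers.
NEIGHBORS: Dict[str, Set[str]] = {
    "H1": {"L1", "H2"},
    "L1": {"H1"},
    "H2": {"H1", "L2", "DRUG"},
    "L2": {"H2"},
    "DRUG": {"H2"},
    "H3": set(),
}
# Assumed result of an earlier classification step.
ANTIBODY_IDS = {"H1", "L1", "H2", "L2", "H3"}


def grow_chain(polymer_id: str, used: Set[str]) -> List[str]:
    """Recursively collect every antibody polymer reachable from polymer_id."""
    used.add(polymer_id)  # mark as seen before recursing to avoid cycles
    chain = [polymer_id]
    for neighbor in sorted(NEIGHBORS[polymer_id]):
        if neighbor in ANTIBODY_IDS and neighbor not in used:
            chain.extend(grow_chain(neighbor, used))
    return chain


def find_chains() -> List[List[str]]:
    """Depth-first search for each connected group of antibody polymers."""
    used: Set[str] = set()
    found = []
    for polymer_id in sorted(ANTIBODY_IDS):
        if polymer_id not in used:
            found.append(grow_chain(polymer_id, used))
    return found


chains = find_chains()
```

The non-antibody DRUG node is never entered, so it neither joins a chain nor bridges two antibody groups in this sketch.
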

schrodinger.livedesign.biologics.registration.create_combined_ab_data(ab_polymer_chain: List[schrodinger.livedesign.biologics.registration.BioPolymer], helm_model: schrodinger.protein.helm._helm_parser.HelmModel) schrodinger.livedesign.registration.RegistrationData

Given a set of BioPolymers, extract them from the given helm_model and create a new RegistrationData object consisting only of the set.

Parameters
  • ab_polymer_chain – the collection of antibody Biopolymers to combine

  • helm_model – the source helm_model containing connections and HelmPolymer objects to extract

Returns

the RegistrationData of the final combined object

schrodinger.livedesign.biologics.registration.classify_ab_assembly(ab_polymer_chain: List[schrodinger.livedesign.biologics.registration.BioPolymer]) schrodinger.livedesign.entity_type.EntityClass

Given a chain of connected antibody subunits, find the matching recognized antibody construct and return its class.

Parameters

ab_polymer_chain – the chain of connected antibody subunits

Returns

the overall antibody class the collective subunits create, if any

schrodinger.livedesign.biologics.registration.get_arm_data(ab_polymer_chain: List[schrodinger.livedesign.biologics.registration.BioPolymer], helm_model: schrodinger.protein.helm._helm_parser.HelmModel) Iterable[schrodinger.livedesign.registration.RegistrationData]

Given a list of connected BioPolymers, find the arms (light/heavy pairs) and return the registration data for each arm. Each arm must be from a F(ab’)2, monospecific antibody, or bispecific antibody. An arm thus consists of a light Fab domain, connected to a heavy full or heavy Fab’ chain. The heavy chain can connect to other heavy chains, but two light chains sharing a heavy chain would not fall into any of the currently supported classes.

Parameters
  • ab_polymer_chain – the full set of connected antibody chains

  • helm_model – source helm model to extract helm strings and connections from.

Returns

generator over the available arms in the antibody polymer chain.
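
The pairing rule described above (each light chain attaches to exactly one heavy chain, and no heavy chain carries two light chains) can be sketched as a small generator. KIND, LINKS, and iter_arms are hypothetical, and raising ValueError for an unsupported topology is a choice made for the sketch. How the real function reacts is not documented here.

```python
from typing import Dict, Iterator, Set, Tuple

# Hypothetical inputs: chain kind per polymer ID and an undirected link map.
KIND: Dict[str, str] = {
    "H1": "heavy", "H2": "heavy", "L1": "light", "L2": "light",
}
LINKS: Dict[str, Set[str]] = {
    "H1": {"L1", "H2"}, "H2": {"L2", "H1"}, "L1": {"H1"}, "L2": {"H2"},
}


def iter_arms(kind: Dict[str, str],
              links: Dict[str, Set[str]]) -> Iterator[Tuple[str, str]]:
    """Yield one (light, heavy) ID pair per arm."""
    claimed: Set[str] = set()
    for light in sorted(p for p, k in kind.items() if k == "light"):
        heavy_partners = [p for p in links[light] if kind[p] == "heavy"]
        # Exactly one heavy partner, and it must not already carry a light chain.
        if len(heavy_partners) != 1 or heavy_partners[0] in claimed:
            raise ValueError(f"unsupported arm topology around {light}")
        claimed.add(heavy_partners[0])
        yield light, heavy_partners[0]


arms = list(iter_arms(KIND, LINKS))
```
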

schrodinger.livedesign.biologics.registration.replace_base_chains(subunit: schrodinger.livedesign.registration.RegistrationData, child_hashes: List[str], hash_dict: Dict[str, schrodinger.livedesign.registration.RegistrationData]) None

Helper function to recursively traverse the set of returned RegistrationData and replace base chain data with their combined subunit in the top-level child hash list.

Parameters
  • child_hashes – top-level hash list

  • hash_dict – all returned child hashes so far

  • subunit – the current subunit to insert into child_hashes
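
One possible reading of this replacement step, sketched on plain strings: remove from the top-level list every hash that sits anywhere beneath the subunit, then list the subunit itself. The function replace_with_subunit and the children_of mapping are hypothetical. The real function works on RegistrationData objects and may differ in detail.

```python
from typing import Dict, List


def replace_with_subunit(subunit_hash: str,
                         child_hashes: List[str],
                         children_of: Dict[str, List[str]]) -> None:
    """Swap the base chains covered by subunit_hash for the subunit, in place."""

    def remove_descendants(parent: str) -> None:
        for child in children_of.get(parent, []):
            if child in child_hashes:
                child_hashes.remove(child)
            remove_descendants(child)  # descend into nested subunits

    remove_descendants(subunit_hash)
    if subunit_hash not in child_hashes:
        child_hashes.append(subunit_hash)


top_level = ["h1", "l1", "h2", "l2", "drug"]
hierarchy = {"arm1": ["h1", "l1"], "arm2": ["h2", "l2"], "ab": ["arm1", "arm2"]}
replace_with_subunit("ab", top_level, hierarchy)
```
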

schrodinger.livedesign.biologics.registration.get_display_string(canonical_helm_model: schrodinger.protein.helm._helm_parser.HelmModel) str

Externally available serialization API for HELM serialization.

Parameters

canonical_helm_model – canonicalized helm model

Returns

a HELM string

schrodinger.livedesign.biologics.registration.get_requested_hash(canonical_helm_model: schrodinger.protein.helm._helm_parser.HelmModel) int

Unified HELM model hasher for biologics registration

Parameters

canonical_helm_model – a canonicalized HelmModel to hash

Returns

the SHA-1 hash of the HELM string, as an int.
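
As a sketch of how a SHA-1 digest of a string can be turned into an int with the standard library: the function name sha1_to_int is hypothetical, and both the UTF-8 encoding and the use of the full 160-bit digest are assumptions. The library may truncate or otherwise post-process the value.

```python
import hashlib


def sha1_to_int(helm_string: str) -> int:
    """Interpret the SHA-1 digest of a string as a big-endian integer."""
    digest = hashlib.sha1(helm_string.encode("utf-8")).digest()
    return int.from_bytes(digest, byteorder="big")


first = sha1_to_int("PEPTIDE1{A.C.D}$$$$")
second = sha1_to_int("PEPTIDE1{A.C.E}$$$$")
```

Since the hash is computed from a string, two models only hash alike if they serialize to the same string, which is why the input is expected to be canonicalized first.
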

schrodinger.livedesign.biologics.registration.combine_helm(biopolymers: List[schrodinger.livedesign.biologics.registration.BioPolymer], helm_model: schrodinger.protein.helm._helm_parser.HelmModel) schrodinger.protein.helm._helm_parser.HelmModel

Function to ensure recombined HELM strings always canonicalize the same way. Simple polymers that are indistinguishable except for their connection partners can otherwise get swapped, changing the connection section and the resulting final hash.

Parameters

biopolymers – BioPolymer instances whose corresponding polymers will be extracted

Returns

a HelmModel consisting only of the specified biopolymers

schrodinger.livedesign.biologics.registration.light_heavy_heavy_light(light_chains: List[schrodinger.livedesign.biologics.registration.BioPolymer], heavy_chains: List[schrodinger.livedesign.biologics.registration.BioPolymer]) bool

Ensure that the connectivity is light-heavy-heavy-light, which corresponds to a full antibody.

Parameters
  • light_chains – the list of connected light chains

  • heavy_chains – the list of connected heavy chains
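
A minimal sketch of such a check on polymer IDs, assuming an undirected link map is available: there must be two light and two heavy chains, the heavy chains must be linked to each other, and each light chain must bind exactly one heavy chain, with the two light chains on different heavy chains. The function name and the links argument are hypothetical. The real function takes lists of BioPolymer objects.

```python
from typing import Dict, List, Set


def is_light_heavy_heavy_light(light: List[str], heavy: List[str],
                               links: Dict[str, Set[str]]) -> bool:
    """Return True if the chains form a light-heavy-heavy-light arrangement."""
    if len(light) != 2 or len(heavy) != 2:
        return False
    first_heavy, second_heavy = heavy
    if second_heavy not in links[first_heavy]:
        return False  # the heavy chains must be joined to each other
    partners = []
    for chain in light:
        bound = links[chain] & set(heavy)
        if len(bound) != 1:
            return False  # each light chain binds exactly one heavy chain
        partners.append(next(iter(bound)))
    # The two light chains must sit on different heavy chains.
    return set(partners) == set(heavy)


full = {"H1": {"L1", "H2"}, "H2": {"L2", "H1"}, "L1": {"H1"}, "L2": {"H2"}}
lopsided = {"H1": {"L1", "L2", "H2"}, "H2": {"H1"}, "L1": {"H1"}, "L2": {"H1"}}
full_ok = is_light_heavy_heavy_light(["L1", "L2"], ["H1", "H2"], full)
lopsided_ok = is_light_heavy_heavy_light(["L1", "L2"], ["H1", "H2"], lopsided)
```
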

schrodinger.livedesign.biologics.registration.is_deoxyribose(helm_monomer: schrodinger.protein.helm._helm_parser.HelmMonomer) bool
schrodinger.livedesign.biologics.registration.is_standard_ribose(helm_monomer: schrodinger.protein.helm._helm_parser.HelmMonomer) bool
schrodinger.livedesign.biologics.registration.classify_polymer(helm_polymer: schrodinger.protein.helm._helm_parser.HelmPolymer) schrodinger.livedesign.entity_type.EntityClass

Given a HelmPolymer (a polymer consisting of only one type), infer its type.

Parameters

helm_polymer – helm_polymer to determine class of
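
For intuition, here is a string-level sketch of how a simple polymer's type might be inferred from HELM notation alone: the polymer ID prefix separates peptides, chemical groups, and nucleotides, and within an RNA-type polymer the sugar monomers (R for ribose, dR for deoxyribose) separate RNA from DNA. The function name and the returned labels are hypothetical and are not EntityClass members. The real classifier works on HelmPolymer objects and recognizes more classes, including antibody chains.

```python
import re


def classify_simple_polymer(polymer_id: str, monomer_string: str) -> str:
    """Guess a coarse class label from a HELM polymer ID and monomer body."""
    match = re.match(r"[A-Z]+", polymer_id)
    prefix = match.group(0) if match else ""
    if prefix == "PEPTIDE":
        return "protein"
    if prefix == "CHEM":
        return "small molecule"
    if prefix == "RNA":
        # In a unit such as R(A)P or [dR](T)P the sugar precedes the "(".
        sugars = {unit.split("(")[0].strip("[]")
                  for unit in monomer_string.split(".") if "(" in unit}
        if sugars == {"R"}:
            return "RNA"
        if sugars == {"dR"}:
            return "DNA"
        return "mixed or modified nucleotide"
    return "unknown"


labels = [
    classify_simple_polymer("PEPTIDE1", "A.C.D"),
    classify_simple_polymer("RNA1", "R(A)P.R(C)P.R(G)"),
    classify_simple_polymer("RNA2", "[dR](A)P.[dR](T)P"),
    classify_simple_polymer("RNA3", "R(A)P.[dR](T)P"),
    classify_simple_polymer("CHEM1", "[SMCC]"),
]
```
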

schrodinger.livedesign.biologics.registration.classify_nucleotide_polymer(helm_polymer) schrodinger.livedesign.entity_type.EntityClass
schrodinger.livedesign.biologics.registration.classify_protein(helm_polymer) schrodinger.livedesign.entity_type.EntityClass
schrodinger.livedesign.biologics.registration.classify_antibody(helm_polymer, ab_classification) schrodinger.livedesign.entity_type.EntityClass
schrodinger.livedesign.biologics.registration.classify_overall_molecule(polymer_chains: List[schrodinger.livedesign.biologics.registration.BioPolymer]) schrodinger.livedesign.entity_type.EntityClass

Given a collection of polymers, return the corresponding EntityClass.

Parameters

polymer_chains – the set of polymers to classify

schrodinger.livedesign.biologics.registration.helm_mol_to_binary(helm_model: schrodinger.protein.helm._helm_parser.HelmModel) bytes

Converts the given helm model to rdmol binary and adds sequence annotation data for any antibodies detected.

Parameters

helm_model – input helm model to convert to rdmol binary

Returns

serialized binary with sequence annotations embedded as serialized json in the “antibody_regions” property.

schrodinger.livedesign.biologics.registration.extract_sequence_data_from_binary(serialized_mol: bytes) Dict[str, Union[Dict[str, Union[Tuple[int, int], List[str]]], Dict[str, str]]]

Inflates serialized rdmol binaries, extracts the serialized sequence annotation dictionary, adds in the FASTA sequences, and returns the resulting dictionary.

Parameters

serialized_mol – input rdmol binary

Returns

json-serialized SequenceAnnotations
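
One practical consequence of storing annotations as serialized JSON is that tuples come back as lists, so (start, end) ranges need converting if tuples are expected. The sketch below shows the round trip with the standard library only. The annotation keys, the chain and region names, and the plain dict standing in for molecule properties are all hypothetical.

```python
import json

# Hypothetical annotation shape: chain ID -> region name -> (start, end).
annotations = {
    "PEPTIDE1": {"CDR1": (26, 33), "CDR2": (51, 58)},
}

# A plain dict stands in for the molecule's property storage.
properties = {"antibody_regions": json.dumps(annotations)}

# JSON has no tuple type, so ranges are restored explicitly.
decoded = json.loads(properties["antibody_regions"])
restored = {
    chain: {region: tuple(span) for region, span in regions.items()}
    for chain, regions in decoded.items()
}
```
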