How can I remove duplicates from a set of ligand-like structures?
To remove duplicates, you can use one of the following methods:
- In Maestro, use the Merge Duplicates panel (Tools → Merge Duplicate Ligands). This produces a list of SMILES strings for the unique structures and writes them to a new file.
- In Canvas, you can remove duplicate structures from a file when you import it, by selecting Skip duplicate structures. Any structure that duplicates a structure that has already been read is discarded. If the structure already exists in the Canvas project, you can update it with the structure (and properties) from the file by selecting Update existing records. You can then export the structures in the project to obtain a file without duplicates.
- In Canvas, you can identify duplicate structures by choosing Structure → Detect Duplicates. Two properties, "Has duplicate" and "Duplicate of", are added to the project. In each set of duplicates, the "main" structure is the one with the lowest Canvas UID. "Has duplicate" is set only for the main structure, and its value is that structure's Canvas UID. "Duplicate of" is set only for the other structures in the set, and its value is the Canvas UID of the main structure. To obtain the unique structures, select the rows for which "Duplicate of" is not set (the main structures and the structures that have no duplicates) and export them to a file.
- Run $SCHRODINGER/utilities/uniquesmiles. This utility has three options for handling duplicates.
- Use the Generate Unique Smiles node in KNIME, which is under Schrödinger → Tools. The node has two output ports, one of which is for the duplicates. There is also a duplicates option in the node configuration.
- Use the Python script "Compare ligands in different files" on the Scripts menu in Maestro. This script has four options for structure retention.
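The way Canvas marks duplicates with the "Has duplicate" and "Duplicate of" properties can be illustrated with a short, standalone Python sketch. This is not the Canvas implementation and does not use any Schrödinger API: the function names annotate_duplicates and unique_uids are hypothetical, and the sketch assumes that each structure already has an integer UID and a precomputed canonical key (for example, a unique SMILES string), so that two structures are duplicates exactly when their keys are equal.

```python
from collections import defaultdict


def annotate_duplicates(records):
    """Mimic the "Has duplicate" / "Duplicate of" bookkeeping.

    records: iterable of (uid, key) pairs. The key is assumed to be a
    canonical structure identifier computed elsewhere; this sketch only
    compares keys for equality and does no chemistry itself.
    """
    # Group UIDs by their structure key.
    groups = defaultdict(list)
    for uid, key in records:
        groups[key].append(uid)

    annotations = {}
    for uids in groups.values():
        main = min(uids)  # the "main" structure has the lowest UID
        for uid in uids:
            annotations[uid] = {"Has duplicate": None, "Duplicate of": None}
        if len(uids) > 1:
            # Only the main structure gets "Has duplicate" (its own UID).
            annotations[main]["Has duplicate"] = main
            # Every other member of the set points back to the main one.
            for uid in uids:
                if uid != main:
                    annotations[uid]["Duplicate of"] = main
    return annotations


def unique_uids(annotations):
    """Return the UIDs to keep: those with no "Duplicate of" value."""
    return sorted(
        uid for uid, props in annotations.items()
        if props["Duplicate of"] is None
    )


# Hypothetical input: UIDs 1, 3, and 4 share the same key.
records = [(1, "CCO"), (2, "c1ccccc1"), (3, "CCO"), (4, "CCO"), (5, "CC(=O)O")]
annotations = annotate_duplicates(records)
print(unique_uids(annotations))  # [1, 2, 5]
```

Because the sketch compares keys as plain strings, it finds duplicates only if the keys are canonical. Two different SMILES strings for the same molecule, such as "CCO" and "OCC", would be treated as distinct structures unless they are first converted to a unique form.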