Article ID: 955 - Last Modified: August 27, 2013
How can I remove duplicates from a set of ligand-like structures?
To remove duplicates, you can use one of the following methods:
- Run $SCHRODINGER/utilities/uniquesmiles. This utility has three options for handling duplicates.
- Use the Unique Smiles node in KNIME, which is under Schrödinger → Tools. The node has two output ports, one of which is for the duplicates. There is also a duplicates option in the node configuration.
- Use the Python script "Compare ligands in different files" from the Script Center. This script has four options for structure retention.
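The idea common to these tools, separating first occurrences from later repeats, can be sketched in plain Python. This is only an illustration, not the implementation of uniquesmiles or of the KNIME node. It assumes each structure already carries a canonical SMILES string, so that identical structures yield identical strings. The function name `split_duplicates` and the sample data are hypothetical.

```python
def split_duplicates(structures):
    """Split (title, smiles) pairs into first occurrences and later repeats.

    Assumption: the SMILES strings are already canonical, so plain string
    equality is enough to decide that two structures are the same.
    """
    seen = set()
    unique, duplicates = [], []
    for title, smiles in structures:
        if smiles in seen:
            # Already encountered: route to the "duplicates" output.
            duplicates.append((title, smiles))
        else:
            seen.add(smiles)
            unique.append((title, smiles))
    return unique, duplicates


# Hypothetical input: lig1 and lig3 are the same structure.
ligands = [
    ("lig1", "CCO"),
    ("lig2", "c1ccccc1"),
    ("lig3", "CCO"),
]
unique, duplicates = split_duplicates(ligands)
```

Keeping the repeats in a separate list, rather than discarding them, mirrors the two-output-port layout described for the KNIME node and makes it easy to inspect what was removed.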
Several other methods are available as of Suite 2012:
- In Maestro, use the Merge Duplicates panel (Tools → Merge Duplicate Ligands). This produces a list of SMILES strings for the unique structures and writes them to a new file.
- In Canvas, you can remove duplicate structures from a file when you import it, by selecting Skip duplicate structures. Any structure that duplicates a structure that has already been read is discarded. If the structure already exists in the Canvas project, you can update it with the structure (and properties) from the file by selecting Update existing records. You can then export the structures in the project to obtain a file without duplicates.
- In Canvas, you can identify duplicate structures by choosing Structure → Detect Duplicates. Two properties, "Has duplicate" and "Duplicate of", are added to the project. The first property shows the Canvas UID value for the "main" structure in each set of duplicates (the one that has the lowest Canvas UID), and is otherwise not set. The second property is set for all the other structures in a set of duplicates, where its value is the Canvas UID of the main structure, and is otherwise not set. You can then select the rows that are not duplicates (those for which "Duplicate of" is not set) and export the unique structures to a file.
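The way the "Has duplicate" and "Duplicate of" properties relate to each other can be sketched in plain Python. This is a minimal illustration of the behavior described above, not Canvas code. It assumes each row is identified by an integer UID and a precomputed canonical structure key; the function name `annotate_duplicates` and the sample rows are hypothetical.

```python
from collections import defaultdict


def annotate_duplicates(rows):
    """Compute the two duplicate properties for a {uid: structure_key} mapping.

    Assumption: equal keys mean equal structures (for example, canonical SMILES).
    """
    # Group UIDs by structure key.
    groups = defaultdict(list)
    for uid, key in rows.items():
        groups[key].append(uid)

    # Both properties start out "not set" (None) for every row.
    props = {uid: {"Has duplicate": None, "Duplicate of": None} for uid in rows}

    for uids in groups.values():
        if len(uids) < 2:
            continue  # no duplicates for this structure
        main = min(uids)  # the main structure has the lowest UID
        props[main]["Has duplicate"] = main
        for uid in uids:
            if uid != main:
                props[uid]["Duplicate of"] = main
    return props


# Hypothetical project: rows 1, 3, and 4 hold the same structure.
rows = {1: "CCO", 2: "c1ccccc1", 3: "CCO", 4: "CCO"}
props = annotate_duplicates(rows)

# Rows to export: everything whose "Duplicate of" is not set.
unique_uids = sorted(uid for uid, p in props.items() if p["Duplicate of"] is None)
```

Filtering on "Duplicate of" alone keeps both the main structure of each duplicate set and every structure that has no duplicates at all.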