Knowledge Base

Article ID: 1620 - Last Modified:

Starting with a set of query molecules, I would like to find their nearest neighbors (the most similar compounds, based on 2D fingerprints) in a larger database of compounds. How should I approach this task?

This feature is available from the Canvas user interface at Applications → Similarity/Distance Screen.

For a larger database of compounds, you can also run the screen from the command line, using the commands below. It is a good idea to test the workflow on a small example set of database compounds before scaling up to the full set.

  1. Generate fingerprints for your database of compounds. For example, with an SD file for the database, use the following command:

    $SCHRODINGER/utilities/canvasFPGen -isd database.sdf -o database.fp -fptype molprint2D -atomtype 5 -minpath 2 -path 2 -scaling 0

  2. Generate fingerprints for your query molecules, using the same fingerprint settings as for the database so that the two sets are comparable. For example, with the query molecules in an SD file, use the following command:

    $SCHRODINGER/utilities/canvasFPGen -isd query.sdf -o query.fp -fptype molprint2D -atomtype 5 -minpath 2 -path 2 -scaling 0

  3. Create the similarity matrix:

    $SCHRODINGER/utilities/canvasFPMatrix -ifp database.fp -ifp2 query.fp -ocsv output_fpscreen.csv -metric tanimoto -filter cutoff > output.log

    In this command, cutoff is a placeholder: replace it with a numeric Tanimoto similarity threshold between 0 and 1, which filters out the least similar structures. You may need to experiment with the value: for example, start with a higher value and decrease it if you need more structures.
  4. The output is written to the CSV file output_fpscreen.csv. You can import this file into a spreadsheet and sort it by similarity to identify the compounds most similar to each query molecule.
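
As an alternative to sorting in a spreadsheet, the ranking in step 4 can be scripted. The following is a minimal Python sketch, not part of the Canvas tools. It assumes a matrix-style CSV layout (a header row of query names, then one row per database compound, with blank cells for pairs removed by the filter). That layout is an assumption; inspect your own output_fpscreen.csv and adjust the parsing if it differs. The function name rank_neighbors and the sample data are hypothetical.

```python
import csv
import io


def rank_neighbors(csv_file, top_n=5):
    """Return {query_name: [(database_name, similarity), ...]},
    with each list sorted by decreasing similarity and cut to top_n.

    ASSUMED layout (verify against your own file):
      row 1:      <label>, query_1, query_2, ...
      later rows: <database name>, sim_to_query_1, sim_to_query_2, ...
    Blank cells are treated as pairs removed by the cutoff filter.
    """
    reader = csv.reader(csv_file)
    header = next(reader)
    query_names = header[1:]
    hits = {name: [] for name in query_names}

    for row in reader:
        if not row:
            continue  # skip empty lines
        db_name = row[0]
        for query_name, cell in zip(query_names, row[1:]):
            cell = cell.strip()
            if not cell:
                continue  # assumed: filtered-out pair
            hits[query_name].append((db_name, float(cell)))

    # Highest similarity first, keep only the top_n neighbors per query.
    return {
        query: sorted(pairs, key=lambda pair: pair[1], reverse=True)[:top_n]
        for query, pairs in hits.items()
    }


# Hypothetical stand-in data in the assumed layout (not real Canvas output).
sample = io.StringIO(
    ",query_1,query_2\n"
    "db_A,0.82,0.10\n"
    "db_B,0.45,\n"
    "db_C,0.91,0.67\n"
)
ranked = rank_neighbors(sample, top_n=2)
for query, pairs in ranked.items():
    print(query, pairs)
```

To apply the sketch to a real run, open the file with open("output_fpscreen.csv", newline="") and pass the file object to rank_neighbors in place of the sample data.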

To ask a question or get help, please submit a support ticket or email us at help@schrodinger.com.