Knowledge Base

Article ID: 209 - Last Modified:

When using Canvas Diversity Analysis, how large a subset can I select, and how do I choose the exclusion sphere size?

When performing Diversity Analysis on a large data set, we recommend sphere exclusion ("sphere") or directed sphere exclusion ("dise") as the diversity selection method. These methods do not store the matrix of similarities or distances, so their memory use scales linearly with the size of the full data set.

On the other hand, the computational expense is quadratic with respect to the subset size, so the subset should be kept reasonably small. We recommend keeping the subset size below the square root of the number of compounds in the full data set.
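
As a quick illustration of the square-root guideline, the following Python sketch computes the ceiling for a few data set sizes. The function name is hypothetical and is not part of Canvas; it only performs the arithmetic described above.

```python
import math


def max_recommended_subset_size(n_compounds: int) -> int:
    """Return the integer part of sqrt(n_compounds).

    Hypothetical helper for illustration only; it is not a Canvas function.
    """
    if n_compounds < 0:
        raise ValueError("n_compounds must be non-negative")
    return math.isqrt(n_compounds)


if __name__ == "__main__":
    # Example data set sizes, chosen arbitrarily for illustration.
    for n in (10_000, 1_000_000, 25_000_000):
        print(f"{n} compounds: keep the subset below about "
              f"{max_recommended_subset_size(n)}")
```

For example, a pool of one million compounds suggests keeping the selected subset below roughly 1,000 compounds.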

To determine the exclusion sphere size that will result in such a subset, we suggest that you take a subset of the total pool and experiment on it to find a suitable value of the sphere size. A subset of size sqrt(N), where N is the number of compounds in the full data set, is good for this exercise.
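
To show how the sphere size controls the size of the selected subset, here is a minimal Python sketch of a generic sphere exclusion pass on toy data. It is not the Canvas implementation: the fingerprints (sets of on-bit indices), the Tanimoto distance, the function names, and the trial sphere sizes are all assumptions made for illustration. The sketch keeps a compound only if it lies outside the exclusion sphere of every compound already selected, and it never builds a full distance matrix.

```python
from typing import FrozenSet, List, Sequence


def tanimoto_distance(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Return 1 - Tanimoto similarity of two sets of on-bit indices."""
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(a & b) / union


def sphere_exclusion(fps: Sequence[FrozenSet[int]], sphere_size: float) -> List[int]:
    """Greedy sphere exclusion (illustrative sketch, not the Canvas code).

    Visits compounds in input order and selects one only if its distance to
    every previously selected compound is at least sphere_size.
    """
    selected: List[int] = []
    for i, fp in enumerate(fps):
        if all(tanimoto_distance(fp, fps[j]) >= sphere_size for j in selected):
            selected.append(i)
    return selected


# Toy fingerprints, made up for illustration.
TOY_FPS = [
    frozenset({1, 2, 3, 4}),
    frozenset({1, 2, 3, 5}),
    frozenset({6, 7, 8, 9}),
    frozenset({6, 7, 8, 10}),
    frozenset({1, 2, 6, 7}),
]

if __name__ == "__main__":
    # Try several sphere sizes and report how many compounds are selected.
    for size in (0.3, 0.5, 0.7):
        picked = sphere_exclusion(TOY_FPS, size)
        print(f"sphere size {size}: {len(picked)} selected {picked}")
```

On this toy data, a larger sphere size selects fewer compounds. In practice you would run the real diversity selection on the trial subset with a few sphere sizes and keep the value that yields a subset of the desired size.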

Keywords: Canvas, DBCS, diversity analysis, exclusion sphere
