A long-term view of the docking and scoring problem
Professor Friesner
is a founder of Schrödinger and Professor of Chemistry and Director of
the Center for Biomolecular Simulations at Columbia University. As
chairman of Schrödinger's Scientific Advisory Board, Professor Friesner
provides strategic vision and guidance for Schrödinger's scientific
advancements. In this installment of Rich's column, he describes
ongoing and future research aimed at creating a viable approach for
rank-ordering diverse sets of active compounds.
In a previous newsletter article,
I discussed Schrödinger's efforts to develop the improved docking and
scoring methods currently embodied in the XP Glide
methodology. While XP Glide contains important advances compared to
earlier scoring functions, it is clearly not, in and of itself, a
complete solution to the docking and scoring problem. In particular,
the parameterization of XP Glide has been focused on complexes for
which the ligand is judged to “fit” into the receptor structure. When
there are large-scale induced fit effects, this assumption is typically
not valid, and ligands that are incompatible with the particular
receptor conformation used in a given docking study will often receive
very poor scores. For flexible receptors, we have observed that as many
as 50% of randomly chosen active ligands will fail to correctly fit
into a single given receptor conformation. Clearly, a solution to this
problem is needed before docking and scoring methods can be deployed
robustly to investigate the large numbers of diverse active compounds
encountered in a major drug discovery project.
Another
problem not yet fully addressed by Glide XP is the ability to
rank-order compounds. To date, XP has been focused on separating “active”
from “inactive” compounds. However, relative binding affinities of
active compounds are not always rendered with high accuracy. For
compounds that form congeneric series, MM-GBSA and FEP methods
provide a useful approach to rank ordering. However, these approaches
are poorly suited for highly diverse sets of compounds (e.g., those
with significant structural differences, or those where the net charge
on the ligand varies significantly throughout the data set). Again,
progress on this problem would have a major impact on the value of
docking and scoring approaches in a drug discovery context.
In
this article, I will briefly outline the approach that Schrödinger is
taking over the next several years to address these problems. In the
initial phase of the project, we are performing studies primarily using
publicly available data from the Protein Data Bank. As solutions that
prove effective for these data sets are developed, objective
testing can be performed by pharmaceutical and biotechnology companies
using proprietary in-house data sets. Note that in this approach,
intellectual property concerns can generally be avoided as all that is
necessary is a report of the accuracy of structural and binding
affinity prediction for the proprietary data sets.
As
a first step, we have chosen to study self-docking of co-crystallized
complexes from the PDB. The development version of XP, augmented when
necessary by quantum-polarized ligand docking (QPLD),
in which QM/MM methods are used to generate charges for the ligand in
the protein environment, is capable of achieving greater than 90%
accuracy (RMSD under 2.0 Å) for self-docking. Our initial focus has
been on data sets where there are 20 or more co-crystallized structures
for a given receptor. For these cases, comparisons can be made for
relative binding energies of the ligands docked into their cognate
receptor conformations, and the accuracy of the docking can be
rigorously evaluated. Hence, errors in the scoring function can
typically be well separated from errors in the docked structures.
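To make the 2.0 Å self-docking criterion concrete, here is a minimal Python sketch. It is illustrative only and is not Schrödinger's implementation. It assumes each docked pose and its crystallographic reference are given as equal-length lists of (x, y, z) atom coordinates, in the same receptor frame and the same atom order, and it ignores symmetry-equivalent atoms. The function names `rmsd` and `success_rate` are hypothetical.

```python
import math


def rmsd(pose_a, pose_b):
    """In-place RMSD (no superposition) between two poses.

    Each pose is a list of (x, y, z) tuples in Angstroms, with atoms
    listed in the same order in both poses (an assumption of this sketch).
    """
    if len(pose_a) != len(pose_b) or len(pose_a) == 0:
        raise ValueError("poses must be non-empty and contain the same number of atoms")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(pose_a, pose_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(pose_a))


def success_rate(pose_pairs, cutoff=2.0):
    """Fraction of (docked, crystal) pose pairs whose RMSD is under the cutoff."""
    if not pose_pairs:
        raise ValueError("at least one pose pair is required")
    hits = sum(1 for docked, crystal in pose_pairs if rmsd(docked, crystal) < cutoff)
    return hits / len(pose_pairs)
```

No superposition is applied because a docked pose is already expressed in the receptor's coordinate frame; aligning the ligand first would hide placement errors.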
We
are still in the process of completing these studies; however, the
initial results look very promising. We have found that rank ordering
of ligands is significantly improved by incorporating terms that take
into account the strain energy of the protein-ligand complex, a
quantity that has historically been difficult to model.
Our new model is performing well in preliminary tests, achieving r²
values for correlations of theoretical and experimental binding
affinities in the range of 0.45 to 0.65, and average errors on the order
of 1 kcal/mol. While this is far from perfect, it does represent
significant progress in modeling binding affinities for diverse data
sets, containing a substantial number of different chemotypes – as is
the case for most of the data sets derivable from the PDB as discussed
above. It should be remembered, however, that these are in essence
training set results, and that independent test set evaluations will be
required for validation.
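The two figures of merit quoted above can be computed as in the following Python sketch. It is a hedged illustration, not the procedure used in these studies: the article does not say which error measure is meant by "average errors", so the mean unsigned error used here is an assumption, and the function names are hypothetical. Inputs are parallel lists of predicted and experimental binding affinities in kcal/mol.

```python
def pearson_r2(predicted, experimental):
    """Squared Pearson correlation between two equal-length lists of values."""
    n = len(predicted)
    if n != len(experimental) or n < 2:
        raise ValueError("need two lists of equal length with at least two values")
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    cov = sum((p - mean_p) * (e - mean_e) for p, e in zip(predicted, experimental))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_e = sum((e - mean_e) ** 2 for e in experimental)
    if var_p == 0 or var_e == 0:
        raise ValueError("correlation is undefined when either list is constant")
    return (cov * cov) / (var_p * var_e)


def mean_unsigned_error(predicted, experimental):
    """Average absolute difference between predicted and experimental values."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(p - e) for p, e in zip(predicted, experimental)) / len(predicted)
```

The two measures are complementary: r² is insensitive to a constant offset or scale factor in the predictions, whereas the mean unsigned error penalizes both.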
Our
studies also indicate that the accuracy of binding affinity prediction
is highly sensitive to the quality of the structures produced by the
docking. This means that robust and accurate incorporation of induced
fit effects is going to be necessary if rank ordering of compounds is
to be extended to realistic situations involving cross-docking (as
opposed to the self-docking studies carried out to date). The current
Schrödinger induced fit methodology can deliver significant improvements for the structure of complexes where induced fit is important [1],
but the present version is not fast enough for extensive virtual
screening, and also needs to be tested on a much larger data set.
We
are in the process of performing such tests, using the co-crystallized
PDB complexes discussed above. If there are N complexes in a data set,
there are in principle N²/2 cross-docking test cases, and of
these, a substantial fraction are likely to exhibit induced fit
effects. By studying this much larger data set, improving the
efficiency of the algorithm, and assessing how well the induced fit
complex reproduces the binding affinity score of the self-docking
studies discussed previously, we believe it will be possible to create
a qualitatively enhanced version of our induced fit approach which, in
conjunction with improved scoring functions, will yield important
progress in both rank ordering compounds for lead optimization, and in
virtual screening.
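The quadratic growth in cross-docking test cases can be illustrated with a short Python sketch. It is a sketch under stated assumptions: it treats docking the ligand of complex A into the receptor of complex B as distinct from the reverse, which gives N(N-1) ordered cases; counting each pair of complexes only once gives half that number. The exact count therefore depends on the convention chosen, and the function name and identifiers are hypothetical.

```python
from itertools import permutations


def cross_docking_cases(complex_ids):
    """List every (ligand_source, receptor_source) pair, excluding self-docking.

    complex_ids is a sequence of unique identifiers, one per co-crystallized
    complex. The ligand from the first complex in each pair is docked into
    the receptor conformation from the second.
    """
    return [(ligand, receptor) for ligand, receptor in permutations(complex_ids, 2)]
```

For a data set of 20 complexes, this enumeration yields 380 ordered cases (190 unordered pairs), against only 20 self-docking cases.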
Crystal
structures remain the best types of data sets for testing the
methodology, with regard to both the ability to accurately perform
cross-docking and for evaluating the accuracy of scoring functions. As
mentioned above, many biotechnology and pharmaceutical companies have
proprietary data sets available, which should facilitate testing of
this type. Once the reliability of cross-docking is established, there
are large numbers of known active compounds in the open literature that
can be profitably investigated.
We
view the improvement of docking and scoring methods as a long-range
basic research project that is necessary for computational methods to
become a full partner with experiment in drug discovery projects. In
addition to the approaches described above, we are also working on
incorporating the results of alternative calculations, such as MM-GBSA,
linear response type methods, free energy perturbation, or molecular
dynamics simulations, into the sampling and scoring processes.
Ultimately, a hierarchy of methods of correspondingly increasing
accuracy, robustness, and computational cost is required to assemble a
compelling computational platform for drug discovery. Schrödinger is
committed to investing substantial resources in all of these areas and
we expect significant progress to occur over the next several years as
the full panoply of methods comes on line and is refined against an
increasingly large and sophisticated repository of experimental data.
[1] Sherman, W.; Day, T.; Jacobson, M. P.; Friesner, R. A.; Farid, R., “Novel
Procedure for Modeling Ligand/Receptor Induced Fit Effects”, J. Med. Chem., 2006, 49, 534-553.
Comments and questions on Dr. Friesner's column are welcome. Please send these via email to ask-rich@schrodinger.com, and we'll address particularly interesting topics in future newsletters.