The Role of Conformations in DFT-based pKa Prediction

Background

In contrast to empirical pKa prediction, where the information about the molecule is normally conveyed by the types of atoms it contains and connectivities between those atoms (the so-called 2D structure), DFT-based pKa prediction deals with actual 3D structures. This means that the pKa values computed by a program that conducts DFT calculations might depend on the conformation of the input structure. How strong is the effect and how important is it to use low energy conformations as the input? In order to answer these questions let us consider the workflow of Schrödinger’s DFT-based pKa predictor, Jaguar pKa, which should have the same essential dependence on conformational effects as other DFT-based pKa predictors.

An important part of the Jaguar pKa workflow1 is the computation of the so-called “raw” pKa value which is assembled directly from DFT energies. This raw pKa is not the pKa that can be directly compared to the experimental value. In order to be on the same footing with the experimentally measured pKa the raw pKa still needs to be adjusted by empirical parameters that compensate for various errors of DFT and the associated solvation model. Because these empirical parameters can hardly account for conformational phenomena it is important to absorb conformational effects in raw pKa.

The formula for the raw pKa is:

where G is the free energy of the deprotonation of the molecule in solution, R is the universal gas constant, and T is the temperature. G is approximated by essentially a combination of the gas phase and solution phase DFT energies of the protonated and the deprotonated forms of the molecule:

Here E(A-)g, E(AH)g, E(AH)solv, and E(A-)solv are the gas phase DFT energy of the deprotonated form, the gas phase energy of the protonated form, the solvation energy of the protonated form, and the solvation energy of the deprotonated form, respectively. The non-essential for this discussion Constis unchanged from one pKa calculation to another and absorbs the effects related to the solvation of the proton.

The denominator in Equation 1 at room temperature (T = 298K) equals approximately 1.36 kcal/mol. This implies that an error in G equal to 1.36 kcal/mol will introduce an error of one pKa unit to the raw pKa value. Errors coming from inherent inaccuracies in DFT energies and the solvation model are responsible for the raw pKa being typically several pKa units larger than the final, corrected pKa value. The errors of DFT and the solvation model which do not fortuitously cancel in Equation 2 are more or less systematic and depend mostly on the local environment of the atom involved in the protonation/deprotonation process. This is why a parameterization model based on the linear empirical parameters a and b

tied to a specific functional group is successful in predicting pKa’s with high accuracy on the basis of Equations 1-2. Jaguar pKa contains sets of parameters a and b for over a hundred different functional groups, ranging from very specific to generic.

The errors related to conformational changes are much harder to correct. The nature of the conformational effects is global, not local, as the change in the global shape of the molecule might have a profound, unbalanced, and hardly predictable effect on the energies from Equation 2. As such, conformational errors cannot be accounted for by the simple empirical parameters associated with the functional group which undergoes a protonation event. Working erroneously with high-energy conformations that do not sufficiently characterize the protonated and the deprotonated forms of the molecule, as they exist in solution, might introduce unexpected errors in some or all of the energy terms of Equation 2. As we have seen from the previous paragraph, a G error on the order of several kcal/mol will result in an error of several pKa units in the final, empirically adjusted, pKa value. In practice, an erroneous choice of a conformation that turns out to be several kcal/mol higher in energy than the lowest conformation in the solution phase will often result in a outlier pKa value. Therefore, it is essential that DFT-based pKa calculations should be conducted with lowest energy conformations.

Conclusion

In order to evaluate the impact of the improvements described in the previous section, we chose two sets of flexible molecules which had previously posed problems in DFT and empirical pKa predictions, apparently due to an inadequate treatment of the conformational effects.

The first test set contained flexible amidines which were considered as BACE-1 inhibitors with the common chemical formula.5

This set of molecules used to be a challenge for the old Jaguar pKa protocol which was suffering from its inability to account for errors (i)-(iii). Even though the overall performance of the old Jaguar pKa protocol was quite good (with the MUE on the order of 0.6 pKa units; see the table below) two types of problems were still evident. The first one was that the predicted pKa might depend strongly on the initial input conformation (see the values for the conformations A and B shown in the table; conformations A and B differ by a 180o flip of the amide bond in the chemical formula shown above). For example, the input conformation A of structure 67 yielded pKa = 6.0 whereas the input conformation B of the same structure produced pKa = 7.0 even when preceded with the built-in MacroModel conformational search. Obviously, this effect is highly undesirable as it creates a significant uncertainty in the final results and poses barriers for the reproducibility of the results by different groups of scientists. The second problem was that there were still several outliers (values with an error of more than 1 pKa unit; shown in red in the table). We believe that these problems were mostly the consequence of deficiencies (i) and (ii). Computational experiments with the new protocol suggested that deficiency (iii) had played a less prominent role.
 

Structure

Experimental pKa

Old Jaguar pKa, conf A

Old Jaguar pKa, conf B

New Jaguar pKa, conf A

New Jaguar pKa, conf B

1b

9.7

9.0

8.9

9.7

9.6

66

8.1

7.7

7.8

8.2

8.2

67

7.4

6.0

7.0

7.4

7.5

69

7.3

7.3

7.8

7.7

7.7

109

7.3

5.4

5.4

7.0

7.0

89

7.0

6.8

6.9

7.3

7.3

68

6.7

5.0

7.0

7.3

7.3

88

6.3

6.1

5.9

6.3

6.3

111

5.9

5.8

5.7

6.0

6.4

110

5.8

5.2

5.3

6.0

6.1

14d

5.8

5.8

6.1

6.3

6.3

70

5.1

4.3

3.9

4.7

4.5

1a

9.8

9.4

9.1

9.8

9.8

14a

6.3

6.2

6.3

6.4

6.4

112

9.0

9.5

8.1

9.0

9.1

113

6.1

6.1

6.0

6.3

6.4

MUE

 

0.56

0.54

0.20

0.27

 

The last two columns of the table show the performance of the new, conformationally-aware protocol which was described in the previous section. For these calculations, we used a pool of ten protonated and ten deprotonated conformations. From the table, we can see that both problems plaguing the previous results have now been essentially eliminated. The dependence of the final pKa results on the initial conformation is negligible (with the biggest difference being 0.4 pKa units for structure 111), and the number of outliers is now zero. Not surprisingly, the overall performance improved dramatically: the mean unsigned errors (MUE) have now dropped to the impressive low error of 0.2 - 0.3 pKa units, which was impossible to achieve with any of the previous protocols, empirical or non-empirical, that we have tried before.

The second test of the new protocol set a goal of being able to describe the subtle effects of stereochemistry on the pKa. Although quite rare, the effect is nevertheless observable in the experiment and is created by the different electrostatic environment in the vicinity of the dissociating proton, due to different multipole moments and even hydrogen bond patterns of the stereoconformations that are not mirror images of one another. The better performance of the new Jaguar pKa protocol with respect to describing these subtle effects could already be visible from the BACE-1 results above, as structures 66 and 67 are stereoisomers:

 

Structure

67

67

Experimental pKa

8.1

7.4

Old Jaguar pKa

7.7, 7.8

6.0, 7.0

New Jaguar pKa

8.2, 8.2

7.4, 7.5

 

The following four molecules are two pairs of stereoisomers, where each pair has a different experimental pKa, doubtless owing to the neighboring OH group that can interact with the protonatable secondary or tertiary amine group via a hydrogen bond. One stereoconfiguration can form such a bond directly whereas the other one, in which the amino-group and the hydroxy-group are separated by a greater distance, cannot.

 

Experimental pKa

8.0

8.8

8.1

8.6

Old Jaguar pKa

8.2

8.4

8.5

7.9

New Jaguar pKa

7.7

9.4

8.7

9.2


Comparing the results produced by the old and the new Jaguar pKa protocols we can conclude that the old protocol, although quite accurate in predicting the absolute values (MUE = 0.43), fails to reproduce the trends in the pKa dependence on the stereoconfiguration. The new protocol, making use of ten protonated and ten deprotonated conformations, reproduces the pKa dependence correctly and yields accurate absolute pKa values. The MUE of the new protocol on these four molecules is 0.53, a little larger than that observed for the set of the BACE inhibitors, is probably due to the non-trivial electrostatic and solvation effects associated with the hydrogen bonds. Nevertheless, the performance of the new, conformationally-aware workflow is impressive. More testing of the new protocol would be desirable and will be reported in an upcoming publication.

 

References

  1. Klicić, J. J.; Friesner, R. A.; Liu, S.-Y.; Guida, W. C., Accurate prediction of acidity constants in aqueous solution via density functional theory and self-consistent reaction field methods. J. Phys. Chem. A, 2002, 106, 1327–1335.

  2. Liao, C.; Nicklaus M. C. Comparison of Nine Programs Predicting pKa Values of Pharmaceutical Substances. J. Chem. Inf. Model. 2009, 49, 2801–2812.

  3. Marten, B.; Kim, K.; Cortis, C.; Friesner, R. A.; Murphy, R. B.; Ringnalda, M. N.; Sitkoff, D.; Honig, B., New Model for Calculation of Solvation Free Energies: Correction of Self-Consistent Reaction Field Continuum Dielectric Theory for Short-Range Hydrogen-Bonding Effects. J. Phys. Chem. 1996, 100, 11775-11778.

  4. Bochevarov, A. D.; Watson, M. A.; Greenwood, J. R.; Philipp, D. M., Multiconformation, Density Functional Theory-Based pKa Prediction in Application to Large, Flexible Organic Molecules with Diverse Functional Groups, J. Chem. Theory Comput., 2016, 12 (12), 6001–6019.

  5. Hilpert, H. et al., β‑Secretase (BACE1) Inhibitors with High in Vivo Efficacy Suitable for Clinical Evaluation in Alzheimer’s Disease. J. Med. Chem. 2013, 56, 3980−3995.

Back To Top