A Transferable H-Bonding Correction for Semiempirical, study notes for Electrical Engineering

Type: Study notes

2010

Shared on 11/01/2010

igor-donini-9

Partial preview of the text

A Transferable H-Bonding Correction for Semiempirical Quantum-Chemical Methods

Martin Korth,† Michal Pitoňák,†,‡ Jan Řezáč,† and Pavel Hobza*,†,§

Institute of Organic Chemistry and Biochemistry, Academy of Sciences of the Czech Republic and Center for Biomolecules and Complex Systems, 16610 Prague 6, Czech Republic, Department of Physical and Theoretical Chemistry, Faculty of Natural Sciences, Comenius University, 84215 Bratislava 4, Slovak Republic, and Department of Physical Chemistry, Palacky University, 771 46 Olomouc, Czech Republic

Received October 13, 2009

* To whom correspondence should be addressed. E-mail: pavel.hobza@marge.uochb.cas.cz.
† Academy of Sciences of the Czech Republic and Center for Biomolecules and Complex Systems.
‡ Comenius University.
§ Palacky University.

J. Chem. Theory Comput. XXXX, xxx, 000. 10.1021/ct900541n. © XXXX American Chemical Society

Abstract: Semiempirical methods could offer a feasible compromise between ab initio and empirical approaches for the calculation of large molecules with biological relevance. A key problem for attempts in this direction is the rather poor performance of current semiempirical methods for noncovalent interactions, especially hydrogen bonding. On the basis of the recently introduced PM6-DH method, which includes empirical corrections for dispersion (D) and hydrogen-bond (H) interactions, we have developed an improved and transferable H-bonding correction for semiempirical quantum chemical methods. The performance of the improved correction is evaluated for PM6, AM1, OM3, and SCC-DFTB (enhanced by standard empirical dispersion corrections) with several test sets for noncovalent interactions and is shown to reach the quality of current DFT-D approaches for these types of problems.

1. Introduction

The ability to perform fast and accurate computer simulations of biomolecular systems has the potential to bring new insight and application opportunities in several scientific fields, for example, the development of selective receptors, catalysts, and enzyme inhibitors in computational drug design. Complementary computational methods for de novo drug design and virtual screening have already made striking successes possible, for example, through computer-aided drug lead generation and optimization.1,2 Although these approaches can support and complement drug design, they cannot be seen as fully mature, because both the modeling tools used and our understanding of protein-ligand recognition principles are still limited, especially regarding the effects of protein flexibility and solvation.3

Even though many advanced and accurate computational methods exist, their application to large-scale simulations of biomolecules is not possible, because these methods are computationally too demanding. As a result, the method of choice for these applications is molecular mechanics (MM). Although MM performs well in many cases, it has several drawbacks: By design, it cannot describe quantum effects such as changes in electronic structure (for example, chemical reactions or charge transfer), and most MM models also neglect polarization effects, which were shown to be important, for example, for the solvation of biomolecules.4 Promising tools to overcome these limitations while maintaining efficiency (allowing extensive sampling of biologically relevant molecular systems) are semiempirical (SE) quantum mechanical methods.

The application of current SE methods to biochemical problems is unfortunately not straightforward, because the structure and function of biomacromolecules are dominantly influenced by noncovalent interactions like dispersion and hydrogen bonding,5 which generally need very high-level quantum chemical methods to be modeled with sufficient accuracy.6 Despite this, the past few years have seen great success with the incorporation of dispersion effects via empirical corrections for a wide range of DFT7,8 and also SE (e.g., PM3-D, AM1-D9) methods. But because substrate recognition and binding is most often dominated by electrostatics, the accurate description of these effects, and especially of hydrogen-bond interactions, is also of fundamental importance for any biomolecular modeling approach. Examples of the importance of hydrogen bonding for molecular recognition include DNA base pairing, protein folding, enzyme activity, crystal structures, properties of liquids, and pharmaceutical drug solubility and activity.

While electrostatics in general are not a problem for SE methods, current SE methods are known to be deficient in the description of hydrogen bonding (hydrogen core-core terms, missing polarization functions on hydrogen, missing orthogonalization corrections, and the parametrization in general have been discussed as reasons; see refs 10 and 11 and references therein). We see this to be the major obstacle limiting the accuracy of SE methods when applied to biomolecules.
As classical modeling approaches are pushed further to their limits, and more and more pitfalls are coming to light,12 the interest in improving SE methods for biomolecular modeling purposes has grown substantially over the past few years and has led to a number of related publications: As a result of the first biomolecular application attempts with OMn13 and SCC-DFTB14 and explorative approaches to describe protein-ligand docking with PM315,16 and AM1,17 it became clear that the (earlier known) deficiencies of SE methods for the description of hydrogen bonding18,19 are of crucial importance in these applications.16,20 On the other hand, the first large-scale SE modeling of protein structures gave promising results21-23 and showed that the capability of SE methods to detect native structures from collections of decoys is quite remarkable.12

In order to surpass the accuracy of the description of noncovalent interactions by MM force fields, improving the description of hydrogen-bonding interactions in SE methods is clearly necessary. A number of approaches offering improvement in this direction have been suggested in the literature so far, for example, on the basis of additional or modified core-core terms (like PM3-PIF24,25 and PDDG/PM326), third-order terms and modified parameters for SCC-DFTB,27 and also reformulated QM/MM interaction terms (to improve hydrogen bonding at the QM/MM interface28). An overview of the problem and the proposed solutions can be found in refs 10 and 11. While a significantly better performance is observed when applying these techniques, the results still leave considerable room for further improvement. (It is nevertheless hard to understand why a recent comparison of the performance of semiempirical QM/MM approaches with force fields29 ignores all developments except the PDDG approaches.)
Concerning force field and ab initio results, the following has to be kept in mind: A recent study that evaluated the performance of a set of widely used force fields by calculating the geometries and stabilization energies for a large collection of intermolecular complexes showed that the magnitude of hydrogen-bonding interactions is severely underestimated by all of the force fields tested.30 Although much better, the performance of DFT methods for the calculation of (especially the relative) strength of hydrogen-bond interactions is also not always satisfactorily accurate (see ref 31 and references therein).

Recently, our group managed to open up a new path to improve SE methods for hydrogen-bonding interactions: We augmented the new PM6 method32 with empirical corrections for dispersion and hydrogen-bonding interactions (referred to as PM6-DH1 in the following)33 and were able to achieve large improvements in accuracy for interaction energies of biologically relevant, noncovalently bound systems. PM6 was chosen because this model is parametrized for 80 elements and was shown to be one of the most accurate SE approaches for a wide range of problems.32 Furthermore, PM6 is also implemented as a linear-scaling, localized molecular orbital algorithm (termed MOZYME34) in Mopac200935 and VAMP 10.0, which allows the modeling of most of the proteins in the PDB (with less than about 5000 atoms) on standard desktop computers.34 While our first-generation H-bonding correction was already a major step forward in accuracy, we have found further improvement possible, to be presented in the following.

2. Empirical H-Bonding Corrections for Semiempirical Quantum Chemical Methods

The First-Generation Correction.
To incorporate the major characteristics of hydrogen-bond interactions, the first-generation correction made use of the charges q on the acceptor (A) and hydrogen (H) atoms, the H-bond distance r between these atoms, and a cosine term that promotes a 180° bonding situation for the A···H-D angle θ (with the donor atom D):

$E_{\text{H-bond}} = a\left[\frac{q_A \times q_H}{r^{2}} \times \cos(\theta) + b \times c^{r}\right] \quad (1)$

The parameters a, b, and c were optimized for eight different bond types, leading to overall 24 parameters for the description of common H bonds involving nitrogen and oxygen acceptor and donor atoms. As the discussion of the results in section 4 will illustrate, this approach leads to a significantly improved performance of PM6 for the description of H-bond interactions.

An in-depth analysis of our correction revealed the following improvement opportunities: While the H-bond distance and the 180° condition for the A···H-D angle are the most important geometrical features of hydrogen bonding, two additional internal coordinates are needed to complete the sterical description by taking care of the "orientation of the lone pair" at the acceptor atom. We will show later that the full description of all important geometrical features of hydrogen bonding in the second-generation correction is the major reason for its improved accuracy and reliability.

It turns out that the change to a physically more sound description of hydrogen bonding allows us to fix two other problems of the first-generation correction: First, the second term in eq 1 is only dependent on the H-bond distance coordinate. This leads to discontinuous potentials around values of 90° for the A···H-D angle. Second, for some H-bond types, the second [...]

[...] systems investigated here but become significant, for example, for large, saturated hydrocarbon chains.
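As a rough illustration of eq 1, the following minimal Python sketch evaluates the first-generation correction for a single A···H-D contact. It assumes that eq 1 has the form a[q_A q_H / r² · cos(θ) + b · c^r]. The function name, the argument names, and all numerical values are hypothetical placeholders, not fitted parameters from the paper, and the units are simply whatever convention the parameters were fitted in, which this excerpt does not state.

```python
import math


def h_bond_correction_gen1(q_acceptor, q_hydrogen, r, theta_deg, a, b, c):
    """Evaluate eq 1 as read here: a * [qA*qH / r**2 * cos(theta) + b * c**r].

    All names are illustrative; a, b, c are bond-type parameters that the
    caller must supply (the fitted values are not given in this excerpt).
    """
    theta = math.radians(theta_deg)
    # First term: charge product scaled by 1/r^2 and the A...H-D angle cosine.
    charge_term = q_acceptor * q_hydrogen / r**2 * math.cos(theta)
    # Second term: depends on the H-bond distance only, not on the angle.
    distance_term = b * c**r
    return a * (charge_term + distance_term)


# Placeholder inputs, chosen only to show the angle dependence.
linear = h_bond_correction_gen1(-0.5, 0.3, 2.0, 180.0, a=1.0, b=1.0, c=0.5)
bent = h_bond_correction_gen1(-0.5, 0.3, 2.0, 90.0, a=1.0, b=1.0, c=0.5)
print(f"theta = 180 deg: {linear:.4f}")
print(f"theta =  90 deg: {bent:.4f}")
```

At θ = 90° the cosine vanishes, so only the distance-dependent second term survives. This is the angle-independent contribution that the text identifies as a weakness of the first-generation form.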
We nevertheless decided to include these changes here to avoid spreading the details of PM6-DH2 over multiple publications.

As a training set for the stepwise H-bonding correction parameter optimization procedure, the equilibrium structures of the original H-correction training set33 (105 hydrogen-bond interaction energies) were extended with the 37 (noncharged) hydrogen-bonded DNA base pairs and 13 peptide interaction energies from the JSCH2005 set.6 Another optimization run on the much smaller S26 test set led to essentially the same parameter values, showing on one hand the stability of our approach with respect to the chosen parameters and on the other hand the usefulness of this test set for the parametrization of new methodological developments for H-bonding interactions. (Due to the implications of the limited size of the S26 complexes, we do not assume this to be equally true for empirical dispersion corrections.) Because the fitting of correction terms only to equilibrium structure data is prone to result in problems in real-life applications, we extended in the next step the S26 set with the S22 × 4 set,37 which contains four nonequilibrium structures with high-level reference data for each of the S22 complexes.

While not necessary at all for the description of the equilibrium structures, the second (constrained to be repulsive) term in eq 2 was included to further improve the accuracy at short distances, especially in the case of very strong hydrogen bonds like those found in the formic acid dimer. The inclusion of more parameters for a (reasonably) larger number of acceptor atom types or additional parameters for donor atom types led to no significant improvements. In addition, it was found that method-independent values can be chosen for the parameters b, c, and d in eq 2, because slight differences are absorbed by the method-dependent a parameters.
For the final parametrization of the new correction for the PM6-D, AM1-D*, OM3-D, and SCC-DFTB-D methods on the combined S26/S22 × 4 set, fixed values of b, c, and d and five acceptor atom types with different a parameters were chosen. Although a significant improvement is already found with only one a parameter, the additional increase in accuracy (especially for water and peptides in the case of PM6-D) outweighed our concerns about using five different acceptor-atom-type-based a parameters. This view was further supported by the rather well-behaved nature of the parameters (which nicely reflect the capabilities of the underlying SE methods to describe hydrogen bonding) and the overall number of parameters in SE methods.

As a result, the outlined procedure led to three global and five method-dependent parameters for the description of common H bonds involving nitrogen and oxygen acceptor and donor atoms. One additional method-dependent parameter for the description of H bonds involving sulfur acceptor atoms was generated accordingly for every method (except OM3, where no sulfur SE parameters were available to us), using the sulfur hydrogen-bonded DNA base pairs from the JSCH2005 set.6 The final parameters are shown in Table 2. We believe that the differences between the individual parameter values are more likely to reflect advantages and deficiencies of the parametrization of the underlying SE methods than physical issues of different H-bonding interactions. As noted before, the qualitative parameter differences between methods nicely reflect the initial capability of the underlying SE methods to describe hydrogen bonding (with a rather poor performance of AM1 and a quite good performance of OM3 at the two ends of the scale).
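The parameter layout described above (three global values plus one a value per method and acceptor atom type) can be sketched as a small lookup table. The numbers below are copied from Table 2. The dictionary layout, the key spellings such as `O_acid`, and the helper `a_parameter` are illustrative assumptions and not part of any released code.

```python
# Global, method-independent parameters from Table 2.
GLOBAL_PARAMS = {"b": 3.0, "c": 0.65, "d": 5.0}

# Method-dependent a parameters from Table 2, keyed by acceptor atom type.
# None marks the OM3 sulfur entry, which Table 2 lists as unavailable.
A_PARAMS = {
    "PM6": {"N": 1.48, "O": 1.56, "O_acid": 1.55,
            "O_peptide": 0.96, "O_water": 0.76, "S": 0.85},
    "AM1": {"N": 4.54, "O": 3.75, "O_acid": 5.55,
            "O_peptide": 3.46, "O_water": 3.52, "S": 1.05},
    "OM3": {"N": 0.86, "O": 0.75, "O_acid": 1.51,
            "O_peptide": 0.78, "O_water": 0.49, "S": None},
    "DFTB": {"N": 4.41, "O": 1.84, "O_acid": 1.15,
             "O_peptide": 1.56, "O_water": 1.57, "S": 0.53},
}


def a_parameter(method, acceptor_type):
    """Return the a parameter for a method and acceptor type (hypothetical helper)."""
    value = A_PARAMS[method][acceptor_type]
    if value is None:
        raise ValueError(f"no {acceptor_type} parameter available for {method}")
    return value


def mean_a_without_sulfur(method):
    """Average a over the five N/O acceptor types, for a rough comparison."""
    values = [v for k, v in A_PARAMS[method].items() if k != "S"]
    return sum(values) / len(values)


for method in sorted(A_PARAMS, key=mean_a_without_sulfur):
    print(f"{method}: mean a = {mean_a_without_sulfur(method):.2f}")
```

Sorting by the mean a value places OM3 at the low end and AM1 at the high end, which matches the qualitative remark in the text that the parameters reflect how well each uncorrected method already describes hydrogen bonding.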
We also tested our correction with third-order SCC-DFTB with and without a modified γ parameter for an improved description of hydrogen bonding27 but ended up with the same accuracy as with SCC-DFTB-DH2 (which is significantly higher than third-order SCC-DFTB with a modified γ parameter, giving an MAD of 1.0 and an error span of 6.0 kcal/mol for the S26 test set).

As the last step, an analytical gradient for the proposed correction was implemented. This was done analogously to the first-generation correction, that is, without derivatives of the atom charges. This approximation was found to have only a minor impact for the cases investigated here and allows us to keep our approach simple and fast but surely needs deeper investigation in the future.

Table 2. Final Parameters

| parameter | element | PM6 | AM1 | OM3 | DFTB |
|---|---|---|---|---|---|
| Global | | | | | |
| b | all | 3.0 | 3.0 | 3.0 | 3.0 |
| c | all | 0.65 | 0.65 | 0.65 | 0.65 |
| d | all | 5.0 | 5.0 | 5.0 | 5.0 |
| Method-Dependent | | | | | |
| a | N | 1.48 | 4.54 | 0.86 | 4.41 |
| a | O | 1.56 | 3.75 | 0.75 | 1.84 |
| a | O(acid) | 1.55 | 5.55 | 1.51 | 1.15 |
| a | O(peptide) | 0.96 | 3.46 | 0.78 | 1.56 |
| a | O(water) | 0.76 | 3.52 | 0.49 | 1.57 |
| a | S | 0.85 | 1.05 | - (a) | 0.53 |

(a) OM3 sulfur parameters unavailable.

3. Computational Details

Semiempirical PM6 and AM1 calculations applying the MOZYME algorithm were done with MOPAC2009,35 OM3 calculations with MNDO2005, and SCC-DFTB calculations with DFTB+.38 TPSS39 and B3-LYP40,41 DFT calculations with empirical dispersion corrections of the Jurecka type8 were done with Turbomole 5.1042 using TZVPP43 Gaussian AO basis sets and the RI approximation44,45 for two-electron integrals. The second-generation H-bonding correction is implemented as an add-on correction to MOPAC2009, mndo99, and DFTB+ in our own development code (the latest version of this software can be obtained from the authors upon request) and will be included in a future release of MOPAC2009.

4. Results and Discussion

Tables 3-8 show results of PM6, AM1, OM3, and SCC-DFTB calculations with dispersion correction and first- and second-generation H-bonding corrections for the S26 (Table 3), S22 (Table 4, in additional comparison to literature data), and S26+S22 × 4 benchmark sets (Table 5), the PM6-DH1 training set of 105 small hydrogen-bonded complexes (Table 6), the 37 noncharged, H-bonded DNA base pair complexes from the JSCH2005 set (Table 7), and the 13 noncharged, H-bonded peptide structures from the JSCH2005 test set (Table 8) with corresponding TPSS-D/TZVP and B3-LYP/TZVP data for comparison. Each table shows the reference interaction energy at the CCSD(T)/CBS level and the errors relative to these values for the investigated methods, followed by statistical measures over these errors: the mean signed error (MSE), mean unsigned error (MUE), root-mean-square error (RMSE), and the error span (ΔMax-Min). The errors are calculated so that a positive error means that the investigated method underestimates the binding energy and vice versa.

The general trends for the different benchmark sets are very similar, so that the observations can be summarized altogether in the following way: The standard semiempirical methods perform quite poorly for both dispersion and hydrogen-bonding interactions, but PM6, OM3, and SCC-DFTB (S26 MUEs around 3 kcal/mol) are significantly more accurate than AM1 (S26 MUE around 6 kcal/mol). The inclusion of empirical dispersion corrections is a great improvement for all tested semiempirical methods. With these corrections, the semiempirical methods are able to model dispersion-bound complexes with comparably high accuracy (MUEs between 1 and 3 kcal/mol), so that the largest remaining errors are found for hydrogen-bond interactions.

Table 3.
Results for the S26 Set (a)

| S26 entry | CCSD(T)/CBS | PM6 | PM6-D | PM6-DH1 | PM6-DH2 | TPSS-D (b) | B3LYP-D (b) | AM1 | AM1-D* | AM1-DH2 | DFTB | DFTB-D | DFTB-DH2 | OM3 | OM3-D | OM3-DH2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ammonia dimer | -3.17 | 0.86 | 0.57 | -0.57 | -0.04 | 0.57 | 0.71 | 2.38 | 1.45 | 0.67 | 2.82 | 2.57 | 1.12 | 1.24 | 0.49 | 0.06 |
| water dimer | -5.02 | 1.08 | 0.91 | 0.35 | 0.12 | 1.31 | 1.49 | 2.13 | 1.16 | -1.43 | 1.82 | 1.80 | -0.63 | 0.93 | 0.28 | 0.09 |
| formic acid dimer | -18.61 | 7.47 | 6.76 | 1.23 | -0.03 | 0.85 | 0.70 | 20.14 | 18.26 | 1.09 | 2.74 | 2.49 | -1.11 | 7.06 | 5.44 | 0.26 |
| formamide dimer | -15.96 | 3.40 | 2.49 | 0.56 | 0.09 | 0.17 | 0.47 | 10.23 | 7.74 | -0.97 | 3.40 | 2.67 | -0.32 | 4.28 | 2.53 | 1.07 |
| uracil dimer C2h | -20.65 | 7.32 | 5.88 | 1.82 | -0.56 | -0.14 | 0.37 | 14.85 | 11.76 | -0.30 | 4.98 | 3.74 | -1.64 | 4.47 | 2.12 | 0.28 |
| 2-pyridoxine/2-aminopyridine | -16.71 | 6.72 | 4.97 | -0.64 | 0.37 | 1.16 | 1.06 | 12.25 | 8.09 | 0.15 | 6.45 | 4.84 | -0.91 | 5.36 | 2.47 | 1.51 |
| adenine/thymine Watson/Crick | -16.37 | 7.30 | 5.37 | -1.46 | -0.09 | 0.71 | 0.78 | 12.08 | 7.65 | -2.27 | 7.58 | 5.80 | -1.54 | 5.03 | 1.84 | 0.39 |
| methane dimer | -0.53 | 0.56 | 0.18 | -0.10 | 0.18 | 0.17 | -0.03 | 0.85 | -0.18 | -0.18 | 0.54 | 0.06 | 0.06 | 0.67 | -0.08 | -0.08 |
| ethene dimer | -1.51 | 1.11 | 0.45 | -0.01 | 0.45 | 0.17 | 0.11 | 1.38 | -1.25 | -1.25 | 1.32 | 0.70 | 0.70 | 1.65 | -0.17 | -0.17 |
| benzene/methane | -1.50 | 1.02 | 0.11 | -0.25 | 0.11 | -0.39 | -0.41 | 1.90 | -0.70 | -0.70 | 1.32 | 0.29 | 0.29 | 1.63 | -0.01 | -0.01 |
| benzene dimer stacked | -2.73 | 2.84 | -0.85 | -0.90 | -0.85 | -0.31 | -0.88 | 6.24 | -0.38 | -0.38 | 3.10 | -0.37 | -0.37 | 3.86 | -1.14 | -1.14 |
| pyrazine dimer | -4.42 | 2.60 | -0.93 | -1.00 | -0.93 | -0.75 | -0.98 | 6.91 | -0.16 | -0.16 | 4.11 | 0.79 | 0.79 | 3.74 | -1.47 | -1.47 |
| uracil dimer C2 | -10.12 | 5.66 | 0.70 | 0.42 | 0.67 | -1.04 | -0.52 | 10.23 | -0.02 | -0.06 | 6.17 | 1.94 | 1.92 | 6.16 | -1.29 | -1.30 |
| indole/benzene stacked | -5.22 | 5.28 | 0.16 | 0.01 | 0.16 | -1.00 | -1.64 | 10.60 | 0.77 | 0.77 | 5.46 | 0.63 | 0.63 | 6.60 | -0.76 | -0.76 |
| adenine/thymine stacked | -12.23 | 7.29 | 0.57 | -0.55 | 0.54 | -1.07 | -0.84 | 15.14 | 0.08 | 0.04 | 8.34 | 2.10 | 2.08 | 8.93 | -1.87 | -1.89 |
| ethene/ethine | -1.53 | 0.98 | 0.58 | 0.42 | 0.58 | 0.02 | -0.09 | 1.18 | 0.13 | 0.13 | 0.99 | 0.54 | 0.54 | 0.85 | 0.10 | 0.10 |
| benzene/water | -3.28 | 1.00 | 0.10 | -0.13 | 0.10 | 0.49 | 0.43 | 2.59 | 0.54 | 0.54 | 2.00 | 1.62 | 1.62 | 1.53 | -0.01 | -0.01 |
| benzene/ammonia | -2.35 | 0.82 | -0.20 | -0.42 | -0.20 | -0.10 | -0.13 | 2.02 | -0.41 | -0.41 | 1.84 | 0.77 | 0.77 | 1.58 | -0.04 | -0.04 |
| benzene/HCN | -4.46 | 2.48 | 1.48 | 1.27 | 1.48 | -0.64 | -0.77 | 3.65 | 1.53 | 1.53 | 2.73 | 1.64 | 1.64 | 2.78 | 0.77 | 0.77 |
| benzene dimer T-shaped | -2.74 | 1.98 | 0.15 | -0.10 | 0.15 | -0.81 | -0.95 | 3.10 | -0.53 | -0.53 | 2.42 | 0.69 | 0.69 | 2.85 | 0.09 | 0.09 |
| indole/benzene T-shaped | -5.73 | 3.32 | 0.79 | 0.42 | 0.79 | -0.84 | -1.25 | 4.67 | 0.44 | 0.44 | 4.05 | 1.71 | 1.71 | 4.56 | 0.79 | 0.79 |
| phenol dimer | -7.05 | 3.67 | 1.84 | 0.31 | -0.01 | 0.11 | 0.52 | 5.69 | 0.96 | -0.85 | 4.25 | 3.02 | 1.19 | 3.89 | 0.60 | 0.24 |
| methanol dimer | -5.70 | 2.20 | 1.72 | 0.26 | -0.55 | 1.28 | 1.51 | 4.00 | 2.45 | -0.10 | 2.66 | 2.45 | 0.07 | 2.58 | 1.42 | 1.11 |
| methanol/formaldehyde | -5.31 | 1.89 | 1.51 | 0.47 | 0.10 | 0.12 | 0.61 | 3.39 | 1.78 | 0.16 | 2.77 | 2.46 | 1.29 | 3.28 | 2.27 | 2.38 |
| methylamide dimer (α) | -6.69 | 1.76 | 0.74 | -0.34 | 0.31 | 0.08 | 0.34 | 3.81 | 1.59 | 0.49 | 1.98 | 0.92 | 0.39 | 2.44 | 0.85 | 0.53 |
| methylamide dimer (β) | -7.65 | 1.78 | 0.99 | -0.05 | -0.02 | -0.15 | 0.36 | 5.85 | 3.98 | 0.36 | 2.38 | 1.59 | 0.40 | 1.93 | 0.68 | 0.03 |
| Complete S26 Set | | | | | | | | | | | | | | | | |
| MSE | | 3.17 | 1.42 | 0.04 | 0.11 | -0.00 | 0.04 | 6.43 | 2.57 | -0.12 | 3.39 | 1.83 | 0.44 | 3.46 | 0.61 | 0.11 |
| MUE | | 3.17 | 1.58 | 0.54 | 0.36 | 0.56 | 0.69 | 6.43 | 2.85 | 0.61 | 3.39 | 1.85 | 0.94 | 3.46 | 1.14 | 0.64 |
| RMSE | | 3.94 | 2.46 | 0.71 | 0.51 | 0.69 | 0.82 | 8.18 | 5.17 | 0.81 | 3.93 | 2.31 | 1.10 | 4.04 | 1.63 | 0.91 |
| ΔMax-Min | | 6.91 | 7.69 | 3.28 | 2.41 | 2.38 | 3.15 | 19.29 | 19.51 | 3.80 | 7.80 | 6.17 | 3.72 | 8.26 | 7.31 | 4.27 |
| Hydrogen-Bonded Systems | | | | | | | | | | | | | | | | |
| MSE | | 3.79 | 2.81 | 0.16 | -0.03 | 0.51 | 0.74 | 8.07 | 5.57 | -0.25 | 3.65 | 2.86 | -0.14 | 3.54 | 1.75 | 0.66 |
| MUE | | 3.79 | 2.81 | 0.67 | 0.19 | 0.55 | 0.74 | 8.07 | 5.57 | 0.74 | 3.65 | 2.86 | 0.88 | 3.54 | 1.75 | 0.66 |
| RMSE | | 4.56 | 3.55 | 0.85 | 0.27 | 0.73 | 0.84 | 9.77 | 7.57 | 0.96 | 4.05 | 3.15 | 1.01 | 3.95 | 2.22 | 0.96 |
| ΔMax-Min | | 6.61 | 6.19 | 3.28 | 0.93 | 1.46 | 1.17 | 18.01 | 17.30 | 3.36 | 5.76 | 4.88 | 2.93 | 6.13 | 5.16 | 2.35 |

(a) Errors, mean signed error (MSE), mean unsigned error (MUE), root mean square error (RMSE), and the error span ΔMax-Min with respect to the benchmark CCSD(T)/CBS interaction energies are presented. All values in kcal/mol. (b) TZVP basis set.

Table 4.
Results for the S22 Set (a)

| method | MUE |
|---|---|
| MP2/CBS | 0.8 (c) |
| B3LYP-D/TZVP | 0.7 |
| TPSS-D/TZVP | 0.6 |
| M08-HX/6-311+G(3df,2p)/CP | 0.5 (c) |
| M06-2X/6-311+G(3df,2p)/CP | 0.4 (c) |
| TPSS-D/6-311++G(3df,3pdf) | 0.3 (d) |
| B2-PLYP-D/TZVPP/0.5CP | 0.3 (e) |
| PM3BP | 5.2 (f) |
| SCC-DFTB-D | 1.9 |
| OM3-D | 1.1 |
| PM3-D (b) | 0.9 (f) |
| AM1-D (b) | 0.9 (f) |
| PM6-DH1 | 0.6 |
| SCC-DFTB-DH2 | 1.0 |
| AM1-DH2 | 0.7 |
| OM3-DH2 | 0.6 |
| PM6-DH2 | 0.4 |

(a) Comparison of mean unsigned errors (MUE) with respect to the benchmark CCSD(T)/CBS interaction energies for various wave function theory, density functional theory, and enhanced semiempirical quantum chemical methods. All values in kcal/mol. (0.5)CP stands for (half) counterpoise-corrected values. (b) With 18 adjusted AM1/PM3 parameters, see ref 9. (c) From ref 46. (d) From ref 8. (e) From ref 47. (f) From ref 9.

As mentioned before, the inclusion of our first-generation H-bonding correction in PM6-DH1 is already a major step toward a higher accuracy for these interactions (with a MUE of 0.7 kcal/mol for the hydrogen-bonding interactions in the S26 set). The largest errors are found for double hydrogen bonds, because of the parametrization to single hydrogen bonds and the higher likelihood of nonphysical contributions to the H-bonding correction in these cases. The new correction manages to reach even higher accuracy (with a corresponding MUE of 0.2 kcal/mol) and a greatly reduced error span (from 3.3 to 0.8 kcal/mol). Furthermore, it can be seen that the new H-bonding correction does not lead to nonphysical interaction energy contributions for purely dispersion-bound complexes (the values for PM6-DH being essentially the same as for PM6-D).

Although a less accurate final performance is found for AM1, OM3, and DFTB when compared to PM6, the large decrease in errors (especially for AM1) is still impressive for these SE methods.
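The statistical measures used throughout Tables 3-8 can be illustrated with a short Python sketch. The errors below are the PM6-DH2 column of Table 3 for twelve complexes. Treating exactly these twelve entries as the "Hydrogen-Bonded Systems" subset is an assumption made here for illustration, and the function name `error_statistics` is hypothetical.

```python
import math

# PM6-DH2 errors (kcal/mol) relative to CCSD(T)/CBS, copied from Table 3.
# Assumption: these twelve complexes form the hydrogen-bonded subset.
PM6_DH2_HBOND_ERRORS = {
    "ammonia dimer": -0.04,
    "water dimer": 0.12,
    "formic acid dimer": -0.03,
    "formamide dimer": 0.09,
    "uracil dimer C2h": -0.56,
    "2-pyridoxine/2-aminopyridine": 0.37,
    "adenine/thymine Watson/Crick": -0.09,
    "phenol dimer": -0.01,
    "methanol dimer": -0.55,
    "methanol/formaldehyde": 0.10,
    "methylamide dimer (alpha)": 0.31,
    "methylamide dimer (beta)": -0.02,
}


def error_statistics(errors):
    """Return MSE, MUE, RMSE, and error span for a list of signed errors."""
    n = len(errors)
    return {
        "MSE": sum(errors) / n,                             # mean signed error
        "MUE": sum(abs(e) for e in errors) / n,             # mean unsigned error
        "RMSE": math.sqrt(sum(e * e for e in errors) / n),  # root-mean-square error
        "span": max(errors) - min(errors),                  # Delta(Max-Min)
    }


stats = error_statistics(list(PM6_DH2_HBOND_ERRORS.values()))
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

Rounded to two decimals, this gives MSE -0.03, MUE 0.19, RMSE 0.27, and an error span of 0.93 kcal/mol, in agreement with the PM6-DH2 entries of the "Hydrogen-Bonded Systems" block of Table 3.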
As we focused on PM6 during the initial development phase, we do not want to exclude the possibility that further improvements are possible, especially for AM1, for which the chosen repulsive term seems to fit least well.

While the results for the hydrogen-bonded complexes of the PM6-DH1 training set and the hydrogen-bonded JSCH2005 peptides (both with smaller interaction energies of -6.2 and -4.4 kcal/mol and quite good values already for the "pure" SE methods) are less impressive, the hydrogen-bonded JSCH2005 DNA base pairs set (with an average interaction energy of -19.5 kcal/mol) shows how large the gain from applying the second-generation H-bonding correction can be if H-bonding interaction energies become larger. We believe that the rather poor performance for the peptide test set stems at least partly from an unbalanced description of dispersion and hydrogen-bond interactions through the combination of the two empirical corrections, which will be addressed in our future work.

It can nevertheless be stated that the obtained quality of the PM6-DH2, AM1-DH2, OM3-DH2, and DFTB-DH2 calculations reaches the accuracy of DFT-D methods (with TPSS-D/TZVP being one of the most accurate for noncovalent interactions) for a large part of the investigated cases, while being several orders of magnitude faster. For the S22 set (included in our fit set, but also used as fit set for DFT corrections), PM6-DH2 (MUE 0.4 kcal/mol) nearly [...]

Table 5.
Results for the Combined S26+S22 × 4 Sets (114 Entries) (a)

| | PM6 | PM6-D | PM6-DH2 | TPSS-D (b) | B3LYP-D (b) | AM1 | AM1-D* | AM1-DH2 | DFTB | DFTB-D | DFTB-DH2 | OM3 | OM3-D | OM3-DH2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MSE | 2.23 | 1.04 | 0.14 | -0.56 | -0.63 | 4.43 | 1.86 | 0.06 | 2.42 | 1.34 | 0.38 | 2.20 | 0.51 | -0.10 |
| MUE | 2.23 | 1.17 | 0.36 | 0.58 | 0.63 | 4.43 | 2.23 | 0.89 | 2.42 | 1.49 | 0.68 | 2.20 | 0.88 | 0.72 |
| RMSE | 3.21 | 2.07 | 0.61 | 0.69 | 0.72 | 7.00 | 4.87 | 1.72 | 3.20 | 2.18 | 0.88 | 2.98 | 1.49 | 1.29 |
| ΔMax-Min | 9.62 | 9.69 | 4.96 | 1.68 | 1.46 | 28.52 | 29.15 | 14.40 | 10.21 | 10.64 | 4.35 | 8.97 | 9.76 | 7.71 |

(a) Mean signed error (MSE), mean unsigned error (MUE), root mean square error (RMSE), and the error span ΔMax-Min with respect to the benchmark CCSD(T)/CBS interaction energies are presented. All values in kcal/mol. (b) TZVP basis set.

Table 6. Results for 105 Small, Hydrogen-Bonded Complexes of the PM6-DH1 Training Set (a)

| | PM6 | PM6-D | PM6-DH2 | TPSS-D (b) | B3LYP-D (b) | AM1 | AM1-D* | AM1-DH2 | DFTB | DFTB-D | DFTB-DH2 | OM3 | OM3-D | OM3-DH2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MSE | -2.63 | -1.66 | -0.43 | 0.90 | 1.01 | -5.06 | -2.55 | -0.12 | -3.19 | -2.33 | -0.40 | -2.67 | -0.88 | -0.51 |
| MUE | 2.64 | 1.76 | 1.15 | 0.91 | 1.01 | 5.06 | 2.70 | 1.59 | 3.21 | 2.36 | 0.85 | 2.67 | 0.91 | 0.66 |
| RMSE | 3.16 | 2.35 | 1.54 | 1.08 | 1.15 | 6.07 | 4.03 | 2.12 | 3.56 | 2.79 | 1.06 | 2.87 | 1.14 | 0.86 |
| ΔMax-Min | 9.22 | 9.61 | 7.37 | 3.66 | 3.52 | 22.75 | 22.64 | 12.13 | 10.32 | 10.47 | 5.15 | 6.73 | 6.52 | 4.48 |

(a) Mean signed error (MSE), mean unsigned error (MUE), root mean square error (RMSE), and the error span ΔMax-Min with respect to the benchmark CCSD(T)/CBS interaction energies are presented. All values in kcal/mol. (b) TZVP basis set.

Table 7.
Results for the Noncharged Hydrogen-Bonded JSCH2005 DNA Base Pairs (37 Entries) (a)

| | PM6 | PM6-D | PM6-DH2 | TPSS-D (b) | B3LYP-D (b) | AM1 | AM1-D* | AM1-DH2 | DFTB (c) | DFTB-D (c) | DFTB-DH2 (c) | OM3 (d) | OM3-D (d) | OM3-DH2 (d) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MSE | -8.07 | -6.06 | -0.83 | 0.42 | 0.55 | -14.08 | -9.56 | -0.01 | -7.21 | -5.34 | 2.09 | -5.67 | -2.41 | -0.80 |
| MUE | 8.07 | 6.06 | 1.85 | 0.72 | 0.70 | 14.08 | 9.56 | 2.20 | 7.21 | 5.34 | 2.78 | 5.67 | 2.49 | 1.21 |
| RMSE | 8.23 | 6.23 | 2.36 | 0.97 | 0.97 | 14.71 | 10.33 | 2.84 | 7.47 | 5.64 | 3.25 | 5.89 | 2.79 | 1.44 |
| ΔMax-Min | 8.73 | 7.67 | 8.84 | 3.97 | 3.86 | 18.85 | 16.43 | 12.43 | 7.88 | 6.85 | 9.58 | 7.01 | 6.81 | 4.53 |

(a) Mean signed error (MSE), mean unsigned error (MUE), root mean square error (RMSE), and the error span ΔMax-Min with respect to the benchmark CCSD(T)/CBS interaction energies are presented. All values in kcal/mol. (b) TZVP basis set. (c) Without adenine/fluorotoluene Watson/Crick complex because of missing fluorine parameters. (d) Without seven thio base pairs because of missing sulfur parameters.

Table 8. Results for the Hydrogen-Bonded JSCH2005 Peptides (13 Entries) (a)

| | PM6 | PM6-D | PM6-DH2 | TPSS-D (b) | B3LYP-D (b) | AM1 | AM1-D* | AM1-DH2 | DFTB | DFTB-D | DFTB-DH2 | OM3 (c) | OM3-D (c) | OM3-DH2 (c) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MSE | -2.97 | -0.19 | -0.13 | -0.35 | -0.45 | -4.49 | 1.40 | 1.47 | -3.77 | -0.80 | -0.72 | -3.96 | 0.33 | 0.36 |
| MUE | 2.97 | 0.68 | 0.71 | 0.67 | 0.60 | 4.49 | 1.49 | 1.56 | 3.77 | 0.87 | 0.79 | 3.96 | 0.60 | 0.62 |
| RMSE | 3.24 | 0.87 | 0.89 | 0.80 | 0.82 | 4.95 | 1.91 | 2.00 | 4.01 | 1.01 | 0.94 | 4.21 | 0.80 | 0.81 |
| ΔMax-Min | 4.56 | 3.44 | 3.41 | 2.53 | 2.59 | 7.31 | 4.30 | 4.27 | 5.34 | 2.91 | 2.84 | 5.20 | 2.81 | 2.79 |

(a) Mean signed error (MSE), mean unsigned error (MUE), root mean square error (RMSE), and the error span ΔMax-Min with respect to the benchmark CCSD(T)/CBS interaction energies are presented. All values in kcal/mol. (b) TZVP basis set. (c) Without seven thio base pairs because of missing sulfur parameters.