Book Chapter: Machine Learning Corrections for DFT Noncovalent Interactions

Title: Machine Learning Corrections for DFT Noncovalent Interactions
Authors: Li, W; Liu, J; Li, L; Hu, LH; Su, ZM; Chen, G
Issue Date: 2021
Publisher: Springer
Citation: Machine Learning Corrections for DFT Noncovalent Interactions. In Shankar, S ... et al (Eds.), Computational Materials, Chemistry, and Biochemistry: From Bold Initiatives to the Last Mile, p. 183-212. Cham: Springer, 2021
Abstract: Noncovalent interactions (NCIs) play crucial roles in supramolecular chemistry; however, they are difficult to measure and compute. Reliable computational methods are being pursued to meet this challenge, but calculations based on low levels of theory are not accurate enough, and calculations based on high levels of theory are often too costly. Accordingly, to reduce the cost and increase the accuracy of low-level theoretical descriptions of NCIs, an efficient approach is proposed to correct NCI calculations based on the benchmark databases S22, S66, and X40. In this approach, two machine learning methods, the general regression neural network (GRNN) and the support vector machine (SVM), are used to correct DFT methods on the basis of DFT calculations. Various DFT methods, including M06-2X, B3LYP, B3LYP-D3, PBE, PBE-D3, and ωB97XD, with two small basis sets (i.e., 6-31G* and 6-31+G*) were investigated. Moreover, the conductor-like polarizable continuum model (C-PCM) with two types of solvents (water and pentylamine) was considered in some DFT calculations. With the correction, the root mean square errors (RMSEs) of all DFT calculations were improved by at least 70%. Relative to CCSD(T)/CBS benchmark values (used as experimental NCI values because of their high accuracy), the mean absolute error (MAE) of the best GRNN result was 0.33 kcal/mol, which is comparable to high-level ab initio methods or DFT methods with fairly large basis sets. Notably, this level of accuracy is achieved within a fraction of the time required by other methods. Additionally, SVM was applied to the gas-phase datasets and gave correction accuracy similar to that of GRNN.
For all of the correction models based on the various DFT approaches, the validation parameters according to OECD principles (i.e., the correlation coefficient R, the predictive squared correlation coefficient q², and q²_cv from cross-validation) were greater than 0.92, which suggests that the correction model has good stability, robustness, and predictive power. The correction can be added following DFT calculations. With the obtained molecular descriptors, the NCIs produced by DFT methods can be improved to achieve high-level accuracy. Moreover, only one parameter is introduced into the GRNN correction model, which makes it easily applicable. Overall, this work demonstrates that the machine learning correction model may be an alternative to the traditional means of correcting for NCIs.
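As a rough illustration of the kind of correction the abstract describes, the sketch below implements the textbook form of a GRNN (a Gaussian-kernel weighted average of training targets) together with the MAE, RMSE, and a leave-one-out q². It is not the chapter's code: the function names, the assumption that the single GRNN parameter is the usual smoothing width `sigma`, and the toy numbers are all hypothetical, and the chapter's actual molecular descriptors are not listed in this record.

```python
import math


def grnn_predict(train_x, train_y, query, sigma):
    """Textbook GRNN: Gaussian-kernel weighted average of training targets.

    sigma is the single smoothing parameter (assumed here to be the
    one parameter the abstract mentions).
    """
    weights = []
    for x in train_x:
        d2 = sum((a - b) ** 2 for a, b in zip(x, query))
        weights.append(math.exp(-d2 / (2.0 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        # Every kernel underflowed; fall back to the plain mean.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total


def mae(pred, ref):
    """Mean absolute error."""
    return sum(abs(p - r) for p, r in zip(pred, ref)) / len(ref)


def rmse(pred, ref):
    """Root mean square error."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref))


def q2_loo(xs, ys, sigma):
    """Leave-one-out q^2 = 1 - PRESS / SS_tot."""
    mean_y = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        press += (ys[i] - grnn_predict(rest_x, rest_y, xs[i], sigma)) ** 2
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot


if __name__ == "__main__":
    # Synthetic toy data, NOT taken from S22/S66/X40: the only descriptor
    # is a made-up low-level interaction energy (kcal/mol), and the
    # "reference" is generated from it by an arbitrary linear rule.
    low_level = [[-1.0], [-2.0], [-3.0], [-4.0], [-5.0], [-6.0]]
    reference = [1.2 * x[0] - 0.3 for x in low_level]

    sigma = 0.5
    corrected = [grnn_predict(low_level, reference, x, sigma) for x in low_level]
    uncorrected = [x[0] for x in low_level]

    print("MAE  uncorrected:", mae(uncorrected, reference))
    print("MAE  corrected:  ", mae(corrected, reference))
    print("RMSE corrected:  ", rmse(corrected, reference))
    print("q2 (leave-one-out):", q2_loo(low_level, reference, sigma))
```

Because the prediction is a weighted average, the only quantity to tune is `sigma`: small values reproduce the nearest training targets, while large values smooth toward the mean of the training set. In practice the value would be chosen by cross-validation, for example by maximising the leave-one-out q².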
Description: Chapter 10
Persistent Identifier: http://hdl.handle.net/10722/306160
ISBN: 9783030187774
Series/Report no.: Springer Series in Materials Science (SSMATERIALS); v. 284

 

DC Field | Value | Language
dc.contributor.author | Li, W | -
dc.contributor.author | Liu, J | -
dc.contributor.author | Li, L | -
dc.contributor.author | Hu, LH | -
dc.contributor.author | Su, ZM | -
dc.contributor.author | Chen, G | -
dc.date.accessioned | 2021-10-20T10:19:38Z | -
dc.date.available | 2021-10-20T10:19:38Z | -
dc.date.issued | 2021 | -
dc.identifier.citation | Machine Learning Corrections for DFT Noncovalent Interactions. In Shankar, S ... et al (Eds.), Computational Materials, Chemistry, and Biochemistry: From Bold Initiatives to the Last Mile, p. 183-212. Cham: Springer, 2021 | -
dc.identifier.isbn | 9783030187774 | -
dc.identifier.uri | http://hdl.handle.net/10722/306160 | -
dc.description | Chapter 10 | -
dc.language | eng | -
dc.publisher | Springer | -
dc.relation.ispartof | Computational Materials, Chemistry, and Biochemistry: From Bold Initiatives to the Last Mile | -
dc.relation.ispartofseries | Springer Series in Materials Science (SSMATERIALS); v. 284 | -
dc.title | Machine Learning Corrections for DFT Noncovalent Interactions | -
dc.type | Book_Chapter | -
dc.identifier.email | Chen, G: ghchen@hku.hk | -
dc.identifier.authority | Chen, G=rp00671 | -
dc.description.nature | link_to_subscribed_fulltext | -
dc.identifier.doi | 10.1007/978-3-030-18778-1_10 | -
dc.identifier.scopus | eid_2-s2.0-85101147152 | -
dc.identifier.hkuros | 327787 | -
dc.identifier.spage | 183 | -
dc.identifier.epage | 212 | -
dc.publisher.place | Cham | -
