
Conference Paper: Benign Overfitting in Two-Layer ReLU Convolutional Neural Networks for XOR Data

Title: Benign Overfitting in Two-Layer ReLU Convolutional Neural Networks for XOR Data
Authors: Meng, Xuran; Zou, Difan; Cao, Yuan
Issue Date: 7-May-2024
Abstract

Modern deep learning models are usually highly over-parameterized so that they can overfit the training data. Surprisingly, such overfitting neural networks can usually still achieve high prediction accuracy. To study this "benign overfitting" phenomenon, a line of recent works has theoretically studied the learning of linear models and two-layer neural networks. However, most of these analyses are still limited to very simple learning problems where the Bayes-optimal classifier is linear. In this work, we investigate a class of XOR-type classification tasks with label-flipping noise. We show that, under a certain condition on the sample complexity and signal-to-noise ratio, an over-parameterized ReLU CNN trained by gradient descent can achieve near Bayes-optimal accuracy. Moreover, we also establish a matching lower bound result showing that when the previous condition is not satisfied, the prediction accuracy of the obtained CNN is an absolute constant away from the Bayes-optimal rate. Our result demonstrates that CNNs have a remarkable capacity to efficiently learn XOR problems, even in the presence of highly correlated features.
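The XOR-type classification task with label-flipping noise described in the abstract can be illustrated with a toy data-generation sketch. This is an assumption-laden simplification, not the paper's exact data model: the function name `make_xor_data`, the choice of coordinate-axis feature directions, and the noise parameters are all hypothetical choices for illustration. The key structural point it captures is that the clean label is the XOR (product) of two sign bits, so no linear classifier can separate the four cluster centers, and a fraction of labels is flipped at random.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_xor_data(n, d=10, noise_std=0.1, flip_prob=0.05):
    """Toy XOR-type dataset (illustrative sketch only).

    Each example sits near one of four cluster centers +/-a +/-b,
    where a and b are orthogonal feature directions. The clean label
    is the product (XOR) of the two sign bits, and each label is
    flipped independently with probability flip_prob.
    """
    a = np.zeros(d); a[0] = 1.0   # feature direction 1 (hypothetical choice)
    b = np.zeros(d); b[1] = 1.0   # feature direction 2 (hypothetical choice)
    s = rng.choice([-1, 1], size=(n, 2))           # two random sign bits per example
    X = s[:, [0]] * a + s[:, [1]] * b              # pick one of the four centers
    X = X + noise_std * rng.standard_normal((n, d))  # add Gaussian feature noise
    y_clean = s[:, 0] * s[:, 1]                    # XOR of sign bits: +1 iff signs agree
    flips = rng.random(n) < flip_prob              # label-flipping noise
    y = np.where(flips, -y_clean, y_clean)
    return X, y

X, y = make_xor_data(200)
```

Because the label depends on the product of the two sign bits, the Bayes-optimal classifier for this task is non-linear, which is what distinguishes this setting from the earlier benign-overfitting analyses the abstract mentions.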


Persistent Identifier: http://hdl.handle.net/10722/347151


DC Field: Value
dc.contributor.author: Meng, Xuran
dc.contributor.author: Zou, Difan
dc.contributor.author: Cao, Yuan
dc.date.accessioned: 2024-09-18T00:30:41Z
dc.date.available: 2024-09-18T00:30:41Z
dc.date.issued: 2024-05-07
dc.identifier.uri: http://hdl.handle.net/10722/347151
dc.description.abstract: Modern deep learning models are usually highly over-parameterized so that they can overfit the training data. Surprisingly, such overfitting neural networks can usually still achieve high prediction accuracy. To study this "benign overfitting" phenomenon, a line of recent works has theoretically studied the learning of linear models and two-layer neural networks. However, most of these analyses are still limited to very simple learning problems where the Bayes-optimal classifier is linear. In this work, we investigate a class of XOR-type classification tasks with label-flipping noise. We show that, under a certain condition on the sample complexity and signal-to-noise ratio, an over-parameterized ReLU CNN trained by gradient descent can achieve near Bayes-optimal accuracy. Moreover, we also establish a matching lower bound result showing that when the previous condition is not satisfied, the prediction accuracy of the obtained CNN is an absolute constant away from the Bayes-optimal rate. Our result demonstrates that CNNs have a remarkable capacity to efficiently learn XOR problems, even in the presence of highly correlated features.
dc.language: eng
dc.relation.ispartof: The Forty-first International Conference on Machine Learning (21/07/2024-27/07/2024, Vienna)
dc.title: Benign Overfitting in Two-Layer ReLU Convolutional Neural Networks for XOR Data
dc.type: Conference_Paper
dc.description.nature: published_or_final_version
