
Conference Paper: Benign Overfitting in Two-Layer ReLU Convolutional Neural Networks for XOR Data

Title: Benign Overfitting in Two-Layer ReLU Convolutional Neural Networks for XOR Data
Authors: Meng, Xuran; Zou, Difan; Cao, Yuan
Issue Date: 7-May-2024
Abstract

Modern deep learning models are usually highly over-parameterized so that they can overfit the training data. Surprisingly, such overfitting neural networks can usually still achieve high prediction accuracy. To study this "benign overfitting" phenomenon, a line of recent works has theoretically studied the learning of linear models and two-layer neural networks. However, most of these analyses are still limited to very simple learning problems where the Bayes-optimal classifier is linear. In this work, we investigate a class of XOR-type classification tasks with label-flipping noise. We show that, under a certain condition on the sample complexity and signal-to-noise ratio, an over-parameterized ReLU CNN trained by gradient descent can achieve near Bayes-optimal accuracy. Moreover, we also establish a matching lower bound result showing that when the previous condition is not satisfied, the prediction accuracy of the obtained CNN is an absolute constant away from the Bayes-optimal rate. Our result demonstrates that CNNs have a remarkable capacity to efficiently learn XOR problems, even in the presence of highly correlated features.
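The XOR-type classification task with label-flipping noise described in the abstract can be illustrated with a toy data-generation sketch. This is an assumption-laden simplification, not the paper's exact data model: the function name `make_xor_data`, the choice of coordinate-axis feature directions, and the noise parameters are all hypothetical choices for illustration. The key structural point it captures is that the clean label is the XOR (product) of two sign bits, so no linear classifier can separate the four cluster centers, and a fraction of labels is flipped at random.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_xor_data(n, d=10, noise_std=0.1, flip_prob=0.05):
    """Toy XOR-type dataset (illustrative sketch only).

    Each example sits near one of four cluster centers +/-a +/-b,
    where a and b are orthogonal feature directions. The clean label
    is the product (XOR) of the two sign bits, and each label is
    flipped independently with probability flip_prob.
    """
    a = np.zeros(d); a[0] = 1.0   # feature direction 1 (hypothetical choice)
    b = np.zeros(d); b[1] = 1.0   # feature direction 2 (hypothetical choice)
    s = rng.choice([-1, 1], size=(n, 2))           # two random sign bits per example
    X = s[:, [0]] * a + s[:, [1]] * b              # pick one of the four centers
    X = X + noise_std * rng.standard_normal((n, d))  # add Gaussian feature noise
    y_clean = s[:, 0] * s[:, 1]                    # XOR of sign bits: +1 iff signs agree
    flips = rng.random(n) < flip_prob              # label-flipping noise
    y = np.where(flips, -y_clean, y_clean)
    return X, y

X, y = make_xor_data(200)
```

Because the label depends on the product of the two sign bits, the Bayes-optimal classifier for this task is non-linear, which is what distinguishes this setting from the earlier benign-overfitting analyses the abstract mentions.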


Persistent Identifier: http://hdl.handle.net/10722/347151


DC Field: Value
dc.contributor.author: Meng, Xuran
dc.contributor.author: Zou, Difan
dc.contributor.author: Cao, Yuan
dc.date.accessioned: 2024-09-18T00:30:41Z
dc.date.available: 2024-09-18T00:30:41Z
dc.date.issued: 2024-05-07
dc.identifier.uri: http://hdl.handle.net/10722/347151
dc.description.abstract: Modern deep learning models are usually highly over-parameterized so that they can overfit the training data. Surprisingly, such overfitting neural networks can usually still achieve high prediction accuracy. To study this "benign overfitting" phenomenon, a line of recent works has theoretically studied the learning of linear models and two-layer neural networks. However, most of these analyses are still limited to very simple learning problems where the Bayes-optimal classifier is linear. In this work, we investigate a class of XOR-type classification tasks with label-flipping noise. We show that, under a certain condition on the sample complexity and signal-to-noise ratio, an over-parameterized ReLU CNN trained by gradient descent can achieve near Bayes-optimal accuracy. Moreover, we also establish a matching lower bound result showing that when the previous condition is not satisfied, the prediction accuracy of the obtained CNN is an absolute constant away from the Bayes-optimal rate. Our result demonstrates that CNNs have a remarkable capacity to efficiently learn XOR problems, even in the presence of highly correlated features.
dc.language: eng
dc.relation.ispartof: The Forty-first International Conference on Machine Learning (21/07/2024-27/07/2024, Vienna)
dc.title: Benign Overfitting in Two-Layer ReLU Convolutional Neural Networks for XOR Data
dc.type: Conference_Paper
dc.description.nature: published_or_final_version
