A shrinkage principle for heavy-tailed data: High-dimensional robust low-rank matrix recovery

Fan, Jianqing; Wang, Weichen; Zhu, Ziwei

Publisher Website: 10.1214/20-AOS1980
Scopus: eid_2-s2.0-85102548958
WOS: WOS:000684378300001

Citations:
- Scopus: 0
- Web of Science: 0
Appears in Collections:
- Faculty of Business & Economics: Journal/Magazine Articles

Title	A shrinkage principle for heavy-tailed data: High-dimensional robust low-rank matrix recovery
Authors	Fan, Jianqing; Wang, Weichen; Zhu, Ziwei
Keywords	Trace regression; Heavy-tailed data; Shrinkage; Robust statistics; Low-rank matrix recovery; High-dimensional statistics
Issue Date	2021
Citation	Annals of Statistics, 2021, v. 49, n. 3, p. 1239-1266
DOI	http://dx.doi.org/10.1214/20-AOS1980
Abstract	This paper introduces a simple principle for robust statistical inference via appropriate shrinkage on the data. This widens the scope of high-dimensional techniques, reducing the distributional conditions from subexponential or sub-Gaussian to more relaxed bounded second or fourth moment. As an illustration of this principle, we focus on robust estimation of the low-rank matrix Θ∗ from the trace regression model Y = Tr(Θ∗X) + ε. It encompasses four popular problems: sparse linear model, compressed sensing, matrix completion and multitask learning. We propose to apply the penalized least-squares approach to the appropriately truncated or shrunk data. Under only bounded 2 + δ moment condition on the response, the proposed robust methodology yields an estimator that possesses the same statistical error rates as previous literature with sub-Gaussian errors. For sparse linear model and multitask regression, we further allow the design to have only bounded fourth moment and obtain the same statistical rates. As a byproduct, we give a robust covariance estimator with concentration inequality and optimal rate of convergence in terms of the spectral norm, when the samples only bear bounded fourth moment. This result is of its own interest and importance. We reveal that under high dimensions, the sample covariance matrix is not optimal whereas our proposed robust covariance can achieve optimality. Extensive simulations are carried out to support the theories.
Persistent Identifier	http://hdl.handle.net/10722/303766
ISSN	0090-5364
2023 Impact Factor	3.2
2023 SCImago Journal Rankings	5.335
ISI Accession Number ID	WOS:000684378300001
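
The shrinkage idea in the abstract can be illustrated with a short sketch. This is only an assumption-laden illustration, not the authors' implementation: the function names (`truncate`, `shrink_rows_l4`, `shrunk_covariance`) and the threshold `tau` are hypothetical, the choice of the l4 norm for shrinking sample vectors is an assumption not stated in the abstract, and the abstract gives no rule for choosing `tau`. The sketch clips a heavy-tailed response entrywise and rescales sample vectors with a large norm before forming a second-moment matrix; the shrunk data would then be passed to an ordinary penalized least-squares fit.

```python
import numpy as np


def truncate(values, tau):
    """Clip each entry to [-tau, tau], i.e. sign(v) * min(|v|, tau)."""
    values = np.asarray(values, dtype=float)
    return np.sign(values) * np.minimum(np.abs(values), tau)


def shrink_rows_l4(samples, tau):
    """Rescale each row whose l4 norm exceeds tau so that its norm equals tau.

    Rows with norm at most tau (including zero rows) are left unchanged.
    The l4 norm is an assumption made for this illustration.
    """
    samples = np.asarray(samples, dtype=float)
    norms = np.linalg.norm(samples, ord=4, axis=1)
    scale = np.ones_like(norms)
    too_large = norms > tau
    scale[too_large] = tau / norms[too_large]
    return samples * scale[:, None]


def shrunk_covariance(samples, tau):
    """Second-moment matrix of the shrunk rows (rows assumed to have mean zero)."""
    shrunk = shrink_rows_l4(samples, tau)
    return shrunk.T @ shrunk / shrunk.shape[0]


# Hypothetical toy data: one response and one sample vector are extreme.
y = np.array([0.5, -120.0, 3.0])
X = np.array([[1.0, 0.0], [100.0, 100.0], [0.0, 0.0]])

y_shrunk = truncate(y, tau=5.0)            # the entry -120.0 becomes -5.0
cov_hat = shrunk_covariance(X, tau=2.0)    # 2 x 2 symmetric matrix
```

In this sketch `tau` acts as a tuning parameter that trades bias (from shrinking genuine large values) against robustness to outliers; any concrete choice would have to come from the paper itself or from cross-validation.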

DC Field	Value	Language
dc.contributor.author	Fan, Jianqing	-
dc.contributor.author	Wang, Weichen	-
dc.contributor.author	Zhu, Ziwei	-
dc.date.accessioned	2021-09-15T08:25:58Z	-
dc.date.available	2021-09-15T08:25:58Z	-
dc.date.issued	2021	-
dc.identifier.citation	Annals of Statistics, 2021, v. 49, n. 3, p. 1239-1266	-
dc.identifier.issn	0090-5364	-
dc.identifier.uri	http://hdl.handle.net/10722/303766	-
dc.language	eng	-
dc.relation.ispartof	Annals of Statistics	-
dc.subject	Trace regression	-
dc.subject	Heavy-tailed data	-
dc.subject	Shrinkage	-
dc.subject	Robust statistics	-
dc.subject	Low-rank matrix recovery	-
dc.subject	High-dimensional statistics	-
dc.title	A shrinkage principle for heavy-tailed data: High-dimensional robust low-rank matrix recovery	-
dc.type	Article	-
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1214/20-AOS1980	-
dc.identifier.scopus	eid_2-s2.0-85102548958	-
dc.identifier.hkuros	327589	-
dc.identifier.volume	49	-
dc.identifier.issue	3	-
dc.identifier.spage	1239	-
dc.identifier.epage	1266	-
dc.identifier.eissn	2168-8966	-
dc.identifier.isi	WOS:000684378300001	-