Article: Spectral statistics of sample block correlation matrices

Title: Spectral statistics of sample block correlation matrices
Authors: Bao, Zhigang; Hu, Jiang; Xu, Xiaocong; Zhang, Xiaozhuo
Issue Date: 1-Oct-2024
Publisher: Institute of Mathematical Statistics
Citation: The Annals of Statistics, 2024, v. 52, n. 5, p. 1873-1898
Abstract

A fundamental concept in multivariate statistics, the sample correlation matrix, is often used to infer the correlation/dependence structure among random variables when the population mean and covariance are unknown. A natural block extension of it, the sample block correlation matrix, is proposed to take on the same role when random variables are generalized to random subvectors. In this paper, we establish a spectral theory of sample block correlation matrices and apply it to group independence tests and related problems in the high-dimensional setting. More specifically, we consider a random vector of dimension p consisting of k subvectors of dimensions p_t, where each p_t can vary from 1 to order p. Our primary goal is to investigate the dependence among the k subvectors. For this purpose, we construct a random matrix model, called the sample block correlation matrix, based on N samples. The spectral statistics of the sample block correlation matrix include the classical Wilks' statistic and Schott's statistic as special cases. It turns out that, under the null hypothesis that the subvectors are independent, the spectral statistics do not depend on the unknown population mean and covariance. Further, the limiting behavior of the spectral statistics can be described with the aid of free probability theory. Specifically, under three different settings of possibly N-dependent k and p_t, we show that the empirical spectral distribution of the sample block correlation matrix converges to the free Poisson binomial distribution, the free Poisson distribution (Marchenko–Pastur law) and the free Gaussian distribution (semicircle law), respectively. We then derive CLTs for the linear spectral statistics of the block correlation matrix under a general setting. Our results are established under a general distribution assumption on the random vector. It turns out that the CLTs are universal and do not depend on the fourth cumulants of the vector components, due to a self-normalizing effect of correlation-type matrices. We further derive the CLT under the alternative hypothesis and discuss the power of our statistics. Based on our theory, real data analyses of stock return data and gene data are also conducted.
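The construction described in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the defining formula, so the sketch assumes the usual block normalization B = D S D, where S is the sample covariance matrix and D is block diagonal with blocks S_tt^{-1/2}. The function name `sample_block_correlation`, the simulated data, and the two summary statistics (a Wilks-type log-determinant and a Schott-type sum of squared off-diagonal-block entries) are illustrative assumptions, not the paper's exact definitions or normalizations.

```python
import numpy as np

def sample_block_correlation(X, block_sizes):
    """Assumed definition: B = D S D with D = blockdiag(S_tt^{-1/2}).

    X           : (N, p) data matrix, one sample per row.
    block_sizes : sizes p_1, ..., p_k of the subvectors, summing to p.
    Requires N large enough that every diagonal block S_tt is invertible.
    """
    N, p = X.shape
    if sum(block_sizes) != p:
        raise ValueError("block sizes must sum to the number of columns")
    # Sample covariance; np.cov centers by the sample mean.
    S = np.atleast_2d(np.cov(X, rowvar=False))
    D = np.zeros((p, p))
    start = 0
    for pt in block_sizes:
        sl = slice(start, start + pt)
        # Symmetric inverse square root of the diagonal block S_tt.
        w, V = np.linalg.eigh(S[sl, sl])
        D[sl, sl] = (V / np.sqrt(w)) @ V.T
        start += pt
    B = D @ S @ D
    return (B + B.T) / 2  # symmetrize against round-off

# Simulated null case: independent standard normal coordinates.
rng = np.random.default_rng(0)
N, block_sizes = 200, [1, 2, 3]
p = sum(block_sizes)
X = rng.standard_normal((N, p))

B = sample_block_correlation(X, block_sizes)
eigs = np.linalg.eigvalsh(B)

# Two linear spectral statistics (illustrative forms only):
log_det = np.sum(np.log(eigs))     # Wilks-type: log det B
trace_sq = np.sum(eigs**2) - p     # Schott-type: squared off-diagonal-block entries
```

Under this assumed definition, the diagonal blocks of B are identity matrices, and when every p_t equals 1 the matrix reduces to the ordinary sample correlation matrix. The eigenvalues are also unchanged when each subvector undergoes its own invertible affine transformation, which is consistent with the abstract's remark that the spectral statistics do not depend on the unknown population mean and covariance under the null.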


Persistent Identifier: http://hdl.handle.net/10722/353580
ISSN: 0090-5364
2023 Impact Factor: 3.2
2023 SCImago Journal Rankings: 5.335
ISI Accession Number ID: WOS:001362323500001

 

DC Field                  Value                                                       Language
dc.contributor.author     Bao, Zhigang                                                -
dc.contributor.author     Hu, Jiang                                                   -
dc.contributor.author     Xu, Xiaocong                                                -
dc.contributor.author     Zhang, Xiaozhuo                                             -
dc.date.accessioned       2025-01-21T00:35:48Z                                        -
dc.date.available         2025-01-21T00:35:48Z                                        -
dc.date.issued            2024-10-01                                                  -
dc.identifier.citation    The Annals of Statistics, 2024, v. 52, n. 5, p. 1873-1898   -
dc.identifier.issn        0090-5364                                                   -
dc.identifier.uri         http://hdl.handle.net/10722/353580                          -
dc.language               eng                                                         -
dc.publisher              Institute of Mathematical Statistics                        -
dc.relation.ispartof      The Annals of Statistics                                    -
dc.title                  Spectral statistics of sample block correlation matrices   -
dc.type                   Article                                                     -
dc.identifier.doi         10.1214/24-AOS2375                                          -
dc.identifier.scopus      eid_2-s2.0-85210369460                                      -
dc.identifier.volume      52                                                          -
dc.identifier.issue       5                                                           -
dc.identifier.spage       1873                                                        -
dc.identifier.epage       1898                                                        -
dc.identifier.isi         WOS:001362323500001                                         -
dc.identifier.issnl       0090-5364                                                   -
