
Article: An imputation-regularized optimization algorithm for high dimensional missing data problems and beyond

Title: An imputation-regularized optimization algorithm for high dimensional missing data problems and beyond
Authors: Luo, Y; Liang, Faming; Jia, Bochao; Xue, Jingnan; Li, Qizai
Keywords: Expectation-maximization algorithm; Gaussian graphical model; Gibbs sampler; Imputation consistency; Random-coefficient model
Issue Date: 2018
Publisher: Wiley-Blackwell Publishing Ltd. The journal's web site is located at http://www.blackwellpublishing.com/journals/RSSB
Citation: Journal of the Royal Statistical Society. Series B: Statistical Methodology, 2018, v. 80 n. 5, p. 899-926
Abstract: Missing data are frequently encountered in high-dimensional problems, but they are usually difficult to deal with using standard algorithms such as the expectation-maximization (EM) algorithm and its variants. To tackle this difficulty, some problem-specific algorithms have been developed in the literature, but a general algorithm is still lacking. This work fills that gap: we propose a general algorithm for high-dimensional missing data problems. The proposed algorithm iterates between an imputation step and a consistency step. At the imputation step, the missing data are imputed conditional on the observed data and the current parameter estimate; at the consistency step, a consistent estimate is found for the minimizer of a Kullback-Leibler divergence defined on the pseudo-complete data. For high-dimensional problems, the consistent estimate can be found under sparsity constraints. Consistency of the averaged estimate for the true parameter can be established under quite general conditions. The proposed algorithm is illustrated using high-dimensional Gaussian graphical models, high-dimensional variable selection, and a random-coefficient model.
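The imputation/consistency iteration described in the abstract can be sketched on a toy problem. This is a minimal illustration under assumptions, not the authors' implementation: the bivariate Gaussian data, the 20% missing-at-random rate, and the plain maximum-likelihood consistency step are all choices made here for brevity. In the high-dimensional settings the paper targets, the consistency step would instead use a sparsity-regularized estimator.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate bivariate Gaussian data with entries missing at random.
n = 500
true_mean = np.array([1.0, -1.0])
true_cov = np.array([[1.0, 0.5], [0.5, 1.0]])
X = rng.multivariate_normal(true_mean, true_cov, size=n)
mask = rng.random(X.shape) < 0.2          # True where a value is missing
X_obs = np.where(mask, np.nan, X)

mean_est = np.nanmean(X_obs, axis=0)       # crude initializer
cov_est = np.eye(2)
n_iter = 50
mean_avg = np.zeros(2)                     # running average over iterations

for t in range(n_iter):
    # Imputation step: draw each row's missing entries from the Gaussian
    # conditional on its observed entries and the current estimates.
    X_imp = X_obs.copy()
    for i in range(n):
        miss = mask[i]
        if not miss.any():
            continue
        obs = ~miss
        if not obs.any():                  # entire row missing
            X_imp[i] = rng.multivariate_normal(mean_est, cov_est)
            continue
        S_oo = cov_est[np.ix_(obs, obs)]
        S_mo = cov_est[np.ix_(miss, obs)]
        S_mm = cov_est[np.ix_(miss, miss)]
        resid = X_obs[i, obs] - mean_est[obs]
        cond_mean = mean_est[miss] + S_mo @ np.linalg.solve(S_oo, resid)
        cond_cov = S_mm - S_mo @ np.linalg.solve(S_oo, S_mo.T)
        X_imp[i, miss] = rng.multivariate_normal(cond_mean, cond_cov)

    # Consistency step: estimate parameters on the pseudo-complete data.
    # (Plain MLE here; a high-dimensional version would add sparsity
    # regularization, e.g. graphical lasso for a Gaussian graphical model.)
    mean_est = X_imp.mean(axis=0)
    cov_est = np.cov(X_imp, rowvar=False)
    mean_avg += mean_est / n_iter

print(mean_avg)  # averaged estimate of the mean, close to [1, -1]
```

Averaging the per-iteration estimates, rather than taking the last one, mirrors the "averaged estimate" whose consistency the paper establishes.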
Persistent Identifier: http://hdl.handle.net/10722/272098
ISSN: 1369-7412
2023 Impact Factor: 3.1
2023 SCImago Journal Rankings: 4.330
ISI Accession Number ID: WOS:000448897700003

DC Field: Value
dc.contributor.author: Luo, Y
dc.contributor.author: Liang, Faming
dc.contributor.author: Jia, Bochao
dc.contributor.author: Xue, Jingnan
dc.contributor.author: Li, Qizai
dc.date.accessioned: 2019-07-20T10:35:38Z
dc.date.available: 2019-07-20T10:35:38Z
dc.date.issued: 2018
dc.identifier.citation: Journal of the Royal Statistical Society. Series B: Statistical Methodology, 2018, v. 80 n. 5, p. 899-926
dc.identifier.issn: 1369-7412
dc.identifier.uri: http://hdl.handle.net/10722/272098
dc.description.abstract: Missing data are frequently encountered in high-dimensional problems, but they are usually difficult to deal with using standard algorithms, such as the expectation-maximization (EM) algorithm and its variants. To tackle this difficulty, some problem-specific algorithms have been developed in the literature, but there still lacks a general algorithm. This work is to fill the gap: we propose a general algorithm for high-dimensional missing data problems. The proposed algorithm works by iterating between an imputation step and a consistency step. At the imputation step, the missing data are imputed conditional on the observed data and the current estimate of parameters; and at the consistency step, a consistent estimate is found for the minimizer of a Kullback-Leibler divergence defined on the pseudo-complete data. For high dimensional problems, the consistent estimate can be found under sparsity constraints. The consistency of the averaged estimate for the true parameter can be established under quite general conditions. The proposed algorithm is illustrated using high-dimensional Gaussian graphical models, high-dimensional variable selection, and a random coefficient model.
dc.language: eng
dc.publisher: Wiley-Blackwell Publishing Ltd. The Journal's web site is located at http://www.blackwellpublishing.com/journals/RSSB
dc.relation.ispartof: Journal of the Royal Statistical Society. Series B: Statistical Methodology
dc.rights: Preprint. This is the pre-peer-reviewed version of the following article: Journal of the Royal Statistical Society. Series B: Statistical Methodology, 2018, v. 80 n. 5, p. 899-926, which has been published in final form at http://dx.doi.org/10.1111/rssb.12279. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Use of Self-Archived Versions.
dc.subject: Expectation-maximization algorithm
dc.subject: Gaussian graphical model
dc.subject: Gibbs sampler
dc.subject: Imputation consistency
dc.subject: Random-coefficient model
dc.title: An imputation-regularized optimization algorithm for high dimensional missing data problems and beyond
dc.type: Article
dc.identifier.email: Luo, Y: kurtluo@hku.hk
dc.identifier.authority: Luo, Y=rp02428
dc.description.nature: preprint
dc.identifier.doi: 10.1111/rssb.12279
dc.identifier.scopus: eid_2-s2.0-85055455509
dc.identifier.hkuros: 299561
dc.identifier.volume: 80
dc.identifier.issue: 5
dc.identifier.spage: 899
dc.identifier.epage: 926
dc.identifier.isi: WOS:000448897700003
dc.publisher.place: United Kingdom
dc.identifier.issnl: 1369-7412
