
Conference Paper: Intra- and Inter-sparse Multiple Output Regression with Application on Environmental Microbial Community Study

Title: Intra- and Inter-sparse Multiple Output Regression with Application on Environmental Microbial Community Study
Authors: Yang, J; Leung, HCM; Yiu, SM; Cai, YP; Chin, FYL
Issue Date: 2013
Publisher: I E E E. The Journal's web site is located at http://ieeexplore.ieee.org/xpl/conhome.jsp?punumber=1001586
Citation: The IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Shanghai, China, 18-21 December 2013. In IEEE International Conference on Bioinformatics and Biomedicine Proceedings, 2013, p. 404-409, article no. 6732526
Abstract: Feature selection is important for many biological studies, especially when the number of available samples is limited (on the order of hundreds) while the number of input features is large (on the order of millions), as in eQTL (expression quantitative trait loci) mapping, GWAS (genome-wide association studies) and environmental microbial community studies. We study the problem of multiple output regression, which leverages the underlying common relationship shared by multiple output features, and propose an efficient and accurate approach for feature selection. Our approach considers both intra- and inter-group sparsity. Inter-group sparsity assumes that only a small set of input features is related to the output features. Intra-group sparsity assumes that each input feature may relate to multiple output features, and that these relationships should have different kinds of sparsity. Most existing methods do not model intra-group sparsity well: they either assume uniform regularization on each group, i.e. that each input feature relates to a similar number of output features, or require prior knowledge of the relationship between input and output features. By modelling the regression coefficients as a mixture of Laplacian and Gaussian distributions, we can adaptively shrink group regression coefficients and learn the inter-group sparsity, intra-group sparsity and shrinkage estimation patterns. Empirical studies on synthetic and real environmental microbial community datasets show that our model gives better predictions on test data than existing methods such as Lasso, Elastic Net, the dirty model and rMTFL (robust multi-task feature learning). Moreover, by using least angle regression or coordinate descent and projected gradient descent techniques for optimization, we can obtain the optimal regression efficiently. © 2013 IEEE.
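The two notions of sparsity named in the abstract can be illustrated with a small sketch. This is not the paper's method, which models the coefficients as a Laplacian and Gaussian mixture. It swaps in the simpler sparse-group-lasso penalty, solved by proximal gradient descent, only to show what "a whole input feature is dropped" (inter-group) and "an input feature affects only some outputs" (intra-group) look like in a coefficient matrix. The function names, the parameters lam_intra and lam_inter, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def sparse_group_prox(W, lam_intra, lam_inter):
    """Proximal operator of lam_intra*||W||_1 + lam_inter*sum_j ||W[j, :]||_2.

    Each row of W holds the coefficients of one input feature (one group).
    """
    # Intra-group step: soft-threshold every coefficient on its own.
    S = np.sign(W) * np.maximum(np.abs(W) - lam_intra, 0.0)
    # Inter-group step: shrink each row as a block; a row whose norm is
    # below lam_inter is set to zero entirely.
    norms = np.linalg.norm(S, axis=1, keepdims=True)
    scale = np.maximum(1.0 - lam_inter / np.maximum(norms, 1e-12), 0.0)
    return scale * S


def objective(X, Y, W, lam_intra, lam_inter):
    """Least-squares fit plus the sparse-group-lasso penalty."""
    n = X.shape[0]
    fit = 0.5 / n * np.sum((Y - X @ W) ** 2)
    penalty = lam_intra * np.abs(W).sum()
    penalty += lam_inter * np.linalg.norm(W, axis=1).sum()
    return fit + penalty


def fit_sparse_group(X, Y, lam_intra, lam_inter, n_iter=500):
    """Proximal gradient descent (ISTA) starting from W = 0."""
    n, p = X.shape
    W = np.zeros((p, Y.shape[1]))
    # Step size 1/L, where L is the Lipschitz constant of the gradient.
    step = n / np.linalg.norm(X, 2) ** 2
    for _ in range(n_iter):
        grad = X.T @ (X @ W - Y) / n
        W = sparse_group_prox(W - step * grad,
                              step * lam_intra, step * lam_inter)
    return W


# Synthetic example (hypothetical data): 30 input features, 4 outputs.
rng = np.random.default_rng(0)
X = rng.standard_normal((60, 30))
W_true = np.zeros((30, 4))
W_true[0] = [2.0, -2.0, 1.5, 1.0]   # feature 0 relates to every output
W_true[1] = [3.0, 0.0, 0.0, 0.0]    # feature 1 relates to one output only
Y = X @ W_true + 0.1 * rng.standard_normal((60, 4))

lam_intra, lam_inter = 0.05, 0.05
W_hat = fit_sparse_group(X, Y, lam_intra, lam_inter)
selected = np.flatnonzero(np.linalg.norm(W_hat, axis=1) > 0)
print("input features with a nonzero row:", selected)
print("nonzero outputs per selected feature:",
      (W_hat[selected] != 0).sum(axis=1))
```

A single pair of penalty weights applies the same regularization to every group, which is the uniform treatment the abstract criticises; the sketch shows the structure of the two sparsity types rather than the adaptive shrinkage the paper proposes.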
Persistent Identifier: http://hdl.handle.net/10722/201113
ISBN: 9781479913091

 

DC Field | Value | Language
dc.contributor.author | Yang, J | en_US
dc.contributor.author | Leung, HCM | en_US
dc.contributor.author | Yiu, SM | en_US
dc.contributor.author | Cai, YP | en_US
dc.contributor.author | Chin, FYL | en_US
dc.date.accessioned | 2014-08-21T07:13:35Z | -
dc.date.available | 2014-08-21T07:13:35Z | -
dc.date.issued | 2013 | en_US
dc.identifier.citation | The IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Shanghai, China, 18-21 December 2013. In IEEE International Conference on Bioinformatics and Biomedicine Proceedings, 2013, p. 404-409, article no. 6732526 | en_US
dc.identifier.isbn | 9781479913091 | -
dc.identifier.uri | http://hdl.handle.net/10722/201113 | -
dc.language | eng | en_US
dc.publisher | I E E E. The Journal's web site is located at http://ieeexplore.ieee.org/xpl/conhome.jsp?punumber=1001586 | -
dc.relation.ispartof | IEEE International Conference on Bioinformatics and Biomedicine Proceedings | en_US
dc.rights | IEEE International Conference on Bioinformatics and Biomedicine Proceedings. Copyright © I E E E. | -
dc.rights | ©2013 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE. | -
dc.rights | Creative Commons: Attribution 3.0 Hong Kong License | -
dc.title | Intra- and Inter-sparse Multiple Output Regression with Application on Environmental Microbial Community Study | en_US
dc.type | Conference_Paper | en_US
dc.identifier.email | Yang, J: lne1013@hku.hk | en_US
dc.identifier.email | Leung, HCM: cmleung2@cs.hku.hk | en_US
dc.identifier.email | Yiu, SM: smyiu@cs.hku.hk | en_US
dc.identifier.email | Chin, FYL: chin@cs.hku.hk | en_US
dc.identifier.authority | Leung, HCM=rp00144 | en_US
dc.identifier.authority | Yiu, SM=rp00207 | en_US
dc.identifier.authority | Chin, FYL=rp00105 | en_US
dc.description.nature | published_or_final_version | -
dc.identifier.doi | 10.1109/BIBM.2013.6732526 | -
dc.identifier.scopus | eid_2-s2.0-84894561395 | -
dc.identifier.hkuros | 235158 | en_US
dc.identifier.spage | 404, article no. 6732526 | en_US
dc.identifier.epage | 409, article no. 6732526 | en_US
dc.publisher.place | United States | -
