Article: Sample size considerations of prediction-validation methods in high-dimensional data for survival outcomes

Title: Sample size considerations of prediction-validation methods in high-dimensional data for survival outcomes
Authors: Pang, H; Jung, S-H
Keywords: Gene expression; GWAS; High-dimensional data; Prediction validation; Sample size; Survival
Issue Date: 2013
Citation: Genetic Epidemiology, 2013, v. 37 n. 3, p. 276-282
Abstract: A variety of prediction methods are used to relate high-dimensional genome data with a clinical outcome using a prediction model. Once a prediction model is developed from a data set, it should be validated using a resampling method or an independent data set. Although the existing prediction methods have been intensively evaluated by many investigators, there has not been a comprehensive study investigating the performance of the validation methods, especially with a survival clinical outcome. Understanding the properties of the various validation methods can allow researchers to perform more powerful validations while controlling for type I error. In addition, sample size calculation strategy based on these validation methods is lacking. We conduct extensive simulations to examine the statistical properties of these validation strategies. In both simulations and a real data example, we have found that 10-fold cross-validation with permutation gave the best power while controlling type I error close to the nominal level. Based on this, we have also developed a sample size calculation method that will be used to design a validation study with a user-chosen combination of prediction. Microarray and genome-wide association studies data are used as illustrations. The power calculation method in this presentation can be used for the design of any biomedical studies involving high-dimensional data and survival outcomes. © 2013 Wiley Periodicals, Inc.
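The abstract's "10-fold cross-validation with permutation" can be illustrated with a rough sketch. This is not the authors' implementation: the toy prediction rule (pick the single feature most concordant with risk in the training folds), the use of Harrell's concordance index as the validation statistic, and the names `c_index`, `cv_statistic`, and `cv_permutation_test` are all assumptions made here for illustration. The one point it aims to show is that the whole cross-validation, including feature selection, is repeated on every permuted data set.

```python
import random

def c_index(score, time, event):
    """Harrell's concordance: among usable pairs (i has an observed event
    before j's time), the share where i has the higher risk score."""
    conc, usable = 0.0, 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue  # a censored subject cannot be the earlier failure
        for j in range(n):
            if time[i] < time[j]:
                usable += 1
                if score[i] > score[j]:
                    conc += 1.0
                elif score[i] == score[j]:
                    conc += 0.5
    return conc / usable if usable else 0.5

def cv_statistic(X, time, event, k=10, seed=0):
    """Pooled concordance of k-fold cross-validated risk scores."""
    n, p = len(X), len(X[0])
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    cv_score = [0.0] * n
    for test in folds:
        held = set(test)
        train = [i for i in idx if i not in held]
        t_tr = [time[i] for i in train]
        e_tr = [event[i] for i in train]
        # Assumed toy model: keep the one feature whose training
        # concordance is farthest from 0.5, together with its direction.
        best_j, best_c = 0, 0.5
        for j in range(p):
            c = c_index([X[i][j] for i in train], t_tr, e_tr)
            if abs(c - 0.5) > abs(best_c - 0.5):
                best_j, best_c = j, c
        sign = 1.0 if best_c >= 0.5 else -1.0
        for i in test:
            cv_score[i] = sign * X[i][best_j]
    return c_index(cv_score, time, event)

def cv_permutation_test(X, time, event, k=10, n_perm=100, seed=0):
    """One-sided permutation p-value for the cross-validated statistic."""
    rng = random.Random(seed)
    observed = cv_statistic(X, time, event, k, seed)
    order = list(range(len(time)))
    hits = 0
    for _ in range(n_perm):
        # Shuffle (time, event) pairs jointly against the features,
        # then redo the full cross-validation from scratch.
        rng.shuffle(order)
        t_perm = [time[i] for i in order]
        e_perm = [event[i] for i in order]
        if cv_statistic(X, t_perm, e_perm, k, seed) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Calling `cv_permutation_test(X, time, event)` with `X` as a list of feature rows, `time` as follow-up times, and `event` as booleans returns the observed statistic and its permutation p-value. Pooling scores from folds that may select different features is a simplification; a real analysis would more likely fit a penalized Cox model in each training fold.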
Persistent Identifier: http://hdl.handle.net/10722/194384
ISSN: 0741-0395
2023 Impact Factor: 1.7
2023 SCImago Journal Rankings: 0.977
ISI Accession Number ID: WOS:000316810600006

 

DC Field: Value
dc.contributor.author: Pang, H
dc.contributor.author: Jung, S-H
dc.date.accessioned: 2014-01-30T03:32:31Z
dc.date.available: 2014-01-30T03:32:31Z
dc.date.issued: 2013
dc.identifier.citation: Genetic Epidemiology, 2013, v. 37 n. 3, p. 276-282
dc.identifier.issn: 0741-0395
dc.identifier.uri: http://hdl.handle.net/10722/194384
dc.language: eng
dc.relation.ispartof: Genetic Epidemiology
dc.subject: Gene expression
dc.subject: GWAS
dc.subject: High-dimensional data
dc.subject: Prediction validation
dc.subject: Sample size
dc.subject: Survival
dc.title: Sample size considerations of prediction-validation methods in high-dimensional data for survival outcomes
dc.type: Article
dc.description.nature: link_to_subscribed_fulltext
dc.identifier.doi: 10.1002/gepi.21721
dc.identifier.pmid: 23471879
dc.identifier.scopus: eid_2-s2.0-84875647843
dc.identifier.volume: 37
dc.identifier.issue: 3
dc.identifier.spage: 276
dc.identifier.epage: 282
dc.identifier.isi: WOS:000316810600006
dc.identifier.issnl: 0741-0395
