Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1002/gepi.21721
- Scopus: eid_2-s2.0-84875647843
- PMID: 23471879
- WOS: WOS:000316810600006
Article: Sample size considerations of prediction-validation methods in high-dimensional data for survival outcomes
Title | Sample size considerations of prediction-validation methods in high-dimensional data for survival outcomes |
---|---|
Authors | Pang, H; Jung, S-H |
Keywords | Gene expression; GWAS; High-dimensional data; Prediction validation; Sample size; Survival |
Issue Date | 2013 |
Citation | Genetic Epidemiology, 2013, v. 37 n. 3, p. 276-282 |
Abstract | A variety of prediction methods are used to relate high-dimensional genome data with a clinical outcome using a prediction model. Once a prediction model is developed from a data set, it should be validated using a resampling method or an independent data set. Although the existing prediction methods have been intensively evaluated by many investigators, there has not been a comprehensive study investigating the performance of the validation methods, especially with a survival clinical outcome. Understanding the properties of the various validation methods can allow researchers to perform more powerful validations while controlling for type I error. In addition, sample size calculation strategy based on these validation methods is lacking. We conduct extensive simulations to examine the statistical properties of these validation strategies. In both simulations and a real data example, we have found that 10-fold cross-validation with permutation gave the best power while controlling type I error close to the nominal level. Based on this, we have also developed a sample size calculation method that will be used to design a validation study with a user-chosen combination of prediction. Microarray and genome-wide association studies data are used as illustrations. The power calculation method in this presentation can be used for the design of any biomedical studies involving high-dimensional data and survival outcomes. © 2013 Wiley Periodicals, Inc. |
Persistent Identifier | http://hdl.handle.net/10722/194384 |
ISSN | 0741-0395 (2023 Impact Factor: 1.7; 2023 SCImago Journal Rankings: 0.977) |
ISI Accession Number ID | WOS:000316810600006 |
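The abstract reports that 10-fold cross-validation with permutation gave the best power while keeping type I error near the nominal level. The following is a minimal sketch of that general idea, not the authors' implementation: the record does not specify the prediction method or test statistic, so the toy feature-selection rule (`fit_signs`), the use of Harrell's concordance index as the statistic, and all function and parameter names here are assumptions made for illustration.

```python
import random


def c_index(risk, time, event):
    """Harrell's concordance: share of usable pairs in which the subject
    with the shorter observed event time has the higher risk score."""
    concordant, usable = 0.0, 0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue  # a pair is usable only if the earlier time is an event
        for j in range(n):
            if time[i] < time[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / usable if usable else 0.5


def fit_signs(X, time, event, rows, top):
    """Hypothetical training step (an assumption, not from the paper):
    keep the `top` features whose univariate C-index on the training rows
    is farthest from 0.5, each with a sign giving the direction of risk."""
    t = [time[r] for r in rows]
    e = [event[r] for r in rows]
    scored = []
    for g in range(len(X[0])):
        c = c_index([X[r][g] for r in rows], t, e)
        scored.append((abs(c - 0.5), g, 1.0 if c >= 0.5 else -1.0))
    scored.sort(reverse=True)
    return [(g, s) for _, g, s in scored[:top]]


def cv_c_index(X, time, event, k=10, top=5, seed=0):
    """k-fold cross-validation: every subject's risk score comes from a
    model fitted without that subject, then one C-index is computed."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    risk = [0.0] * len(X)
    for f in range(k):
        test = idx[f::k]
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        model = fit_signs(X, time, event, train, top)
        for i in test:
            risk[i] = sum(s * X[i][g] for g, s in model)
    return c_index(risk, time, event)


def permutation_p_value(X, time, event, n_perm=200, k=10, top=5, seed=0):
    """Permute (time, event) pairs against the features, repeat the whole
    cross-validation each time, and compare with the observed statistic."""
    rng = random.Random(seed)
    observed = cv_c_index(X, time, event, k, top, seed)
    outcome = list(zip(time, event))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(outcome)
        perm_time = [o[0] for o in outcome]
        perm_event = [o[1] for o in outcome]
        if cv_c_index(X, perm_time, perm_event, k, top, seed) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Here `X` is a list of per-subject feature rows, `time` the observed follow-up times, and `event` a 0/1 indicator (1 for an observed event, 0 for censoring). Feature selection is repeated inside every fold and every permutation, so the null distribution reflects the whole model-building procedure rather than only the final scoring step.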
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Pang, H | - |
dc.contributor.author | Jung, S-H | - |
dc.date.accessioned | 2014-01-30T03:32:31Z | - |
dc.date.available | 2014-01-30T03:32:31Z | - |
dc.date.issued | 2013 | - |
dc.identifier.citation | Genetic Epidemiology, 2013, v. 37 n. 3, p. 276-282 | - |
dc.identifier.issn | 0741-0395 | - |
dc.identifier.uri | http://hdl.handle.net/10722/194384 | - |
dc.description.abstract | A variety of prediction methods are used to relate high-dimensional genome data with a clinical outcome using a prediction model. Once a prediction model is developed from a data set, it should be validated using a resampling method or an independent data set. Although the existing prediction methods have been intensively evaluated by many investigators, there has not been a comprehensive study investigating the performance of the validation methods, especially with a survival clinical outcome. Understanding the properties of the various validation methods can allow researchers to perform more powerful validations while controlling for type I error. In addition, sample size calculation strategy based on these validation methods is lacking. We conduct extensive simulations to examine the statistical properties of these validation strategies. In both simulations and a real data example, we have found that 10-fold cross-validation with permutation gave the best power while controlling type I error close to the nominal level. Based on this, we have also developed a sample size calculation method that will be used to design a validation study with a user-chosen combination of prediction. Microarray and genome-wide association studies data are used as illustrations. The power calculation method in this presentation can be used for the design of any biomedical studies involving high-dimensional data and survival outcomes. © 2013 Wiley Periodicals, Inc. | - |
dc.language | eng | - |
dc.relation.ispartof | Genetic Epidemiology | - |
dc.subject | Gene expression | - |
dc.subject | GWAS | - |
dc.subject | High-dimensional data | - |
dc.subject | Prediction validation | - |
dc.subject | Sample size | - |
dc.subject | Survival | - |
dc.title | Sample size considerations of prediction-validation methods in high-dimensional data for survival outcomes | - |
dc.type | Article | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1002/gepi.21721 | - |
dc.identifier.pmid | 23471879 | - |
dc.identifier.scopus | eid_2-s2.0-84875647843 | - |
dc.identifier.volume | 37 | - |
dc.identifier.issue | 3 | - |
dc.identifier.spage | 276 | - |
dc.identifier.epage | 282 | - |
dc.identifier.isi | WOS:000316810600006 | - |
dc.identifier.issnl | 0741-0395 | - |