Identifying Frailty in Older Adults Receiving Home Care Assessment Using Machine Learning: Longitudinal Observational Study on the Role of Classifier, Feature Selection, and Sample Size

Pan, Cheng; Luo, Hao; Cheung, Gary; Zhou, Huiquan; Cheng, Reynold; Cullum, Sarah; Wu, Chuan

Links for fulltext

(May Require Subscription)

Publisher Website: 10.2196/44185
WOS: WOS:001374810600001
Citations:
- Web of Science: 0
Appears in Collections:
- Computer Science: Journal/Magazine Articles

Title	Identifying Frailty in Older Adults Receiving Home Care Assessment Using Machine Learning: Longitudinal Observational Study on the Role of Classifier, Feature Selection, and Sample Size
Authors	Pan, Cheng; Luo, Hao; Cheung, Gary; Zhou, Huiquan; Cheng, Reynold; Cullum, Sarah; Wu, Chuan
Issue Date	2-Jan-2024
Publisher	JMIR Publications
Citation	JMIR AI, 2024, v. 3. DOI: http://dx.doi.org/10.2196/44185
Abstract	Background: Machine learning techniques are starting to be used in various health care data sets to identify frail persons who may benefit from interventions. However, evidence about the performance of machine learning techniques compared to conventional regression is mixed. It is also unclear what methodological and database factors are associated with performance.
Objective: This study aimed to compare the mortality prediction accuracy of various machine learning classifiers for identifying frail older adults in different scenarios.
Methods: We used deidentified data collected from older adults (65 years of age and older) assessed with interRAI-Home Care instrument in New Zealand between January 1, 2012, and December 31, 2016. A total of 138 interRAI assessment items were used to predict 6-month and 12-month mortality, using 3 machine learning classifiers (random forest [RF], extreme gradient boosting [XGBoost], and multilayer perceptron [MLP]) and regularized logistic regression. We conducted a simulation study comparing the performance of machine learning models with logistic regression and interRAI Home Care Frailty Scale and examined the effects of sample sizes, the number of features, and train-test split ratios.
Results: A total of 95,042 older adults (median age 82.66 years, IQR 77.92-88.76; n=37,462, 39.42% male) receiving home care were analyzed. The average area under the curve (AUC) and sensitivities of 6-month mortality prediction showed that machine learning classifiers did not outperform regularized logistic regressions. In terms of AUC, regularized logistic regression had better performance than XGBoost, MLP, and RF when the number of features was ≤80 and the sample size ≤16,000; MLP outperformed regularized logistic regression in terms of sensitivities when the number of features was ≥40 and the sample size ≥4000. Conversely, RF and XGBoost demonstrated higher specificities than regularized logistic regression in all scenarios.
Conclusions: The study revealed that machine learning models exhibited significant variation in prediction performance when evaluated using different metrics. Regularized logistic regression was an effective model for identifying frail older adults receiving home care, as indicated by the AUC, particularly when the number of features and sample sizes were not excessively large. Conversely, MLP displayed superior sensitivity, while RF exhibited superior specificity when the number of features and sample sizes were large.
Persistent Identifier	http://hdl.handle.net/10722/347484
ISSN	2817-1705
ISI Accession Number ID	WOS:001374810600001
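
The abstract reports that model rankings differ depending on whether AUC, sensitivity, or specificity is used. The following is a minimal sketch of how those three metrics can be computed from predicted risk scores, written in plain Python for illustration only. It is not the authors' code: the function names, the toy labels and scores, and the decision thresholds are all hypothetical assumptions, and the study's actual evaluation pipeline is not described in this record.

```python
from typing import Sequence, Tuple


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive case gets the
    higher score; tied pairs count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Sensitivity (true positive rate) and specificity (true negative
    rate) when cases with score >= threshold are predicted positive."""
    tp = fn = tn = fp = 0
    for y, s in zip(labels, scores):
        predicted = 1 if s >= threshold else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("Both outcome classes must be present")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 1 = died within the follow-up window, 0 = survived.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2, 0.2, 0.1]

model_auc = auc(labels, scores)
sens_default, spec_default = sensitivity_specificity(labels, scores, threshold=0.5)
sens_low, spec_low = sensitivity_specificity(labels, scores, threshold=0.25)

print(f"AUC: {model_auc:.3f}")
print(f"threshold 0.50: sensitivity={sens_default:.3f}, specificity={spec_default:.3f}")
print(f"threshold 0.25: sensitivity={sens_low:.3f}, specificity={spec_low:.3f}")
```

AUC is independent of the threshold, while sensitivity and specificity trade off against each other as the threshold moves. This is one reason a model can look best on one metric and not on another, although the abstract does not state which thresholds the study used.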

DC Field	Value	Language
dc.contributor.author	Pan, Cheng	-
dc.contributor.author	Luo, Hao	-
dc.contributor.author	Cheung, Gary	-
dc.contributor.author	Zhou, Huiquan	-
dc.contributor.author	Cheng, Reynold	-
dc.contributor.author	Cullum, Sarah	-
dc.contributor.author	Wu, Chuan	-
dc.date.accessioned	2024-09-23T03:11:23Z	-
dc.date.available	2024-09-23T03:11:23Z	-
dc.date.issued	2024-01-02	-
dc.identifier.citation	JMIR AI, 2024, v. 3	-
dc.identifier.issn	2817-1705	-
dc.identifier.uri	http://hdl.handle.net/10722/347484	-
dc.language	eng	-
dc.publisher	JMIR Publications	-
dc.relation.ispartof	JMIR AI	-
dc.title	Identifying Frailty in Older Adults Receiving Home Care Assessment Using Machine Learning: Longitudinal Observational Study on the Role of Classifier, Feature Selection, and Sample Size	-
dc.type	Article	-
dc.identifier.doi	10.2196/44185	-
dc.identifier.volume	3	-
dc.identifier.eissn	2817-1705	-
dc.identifier.isi	WOS:001374810600001	-
