Article: Predicting the Risk of Lumbar Prolapsed Disc: A Gene Signature-Based Machine Learning Analysis

Title: Predicting the Risk of Lumbar Prolapsed Disc: A Gene Signature-Based Machine Learning Analysis
Authors: Wang, Fengfeng; Meng, Fei; Wong, Stanley Sau Ching
Keywords: Early prevention
Gene signature
Low back pain
Lumbar prolapsed disc
Machine learning
Risk prediction
Transcriptomics
Issue Date: 4-May-2025
Publisher: Springer Nature
Citation: Pain and Therapy, 2025
Abstract

Introduction

Lumbar prolapsed disc (LPD) is a leading cause of low back pain, contributing significantly to global disability and healthcare burden. This study aimed to develop machine learning models to predict the risk of LPD by analysing gene expression profiles for early detection.

Methods

Transcriptomic data from peripheral blood samples were obtained from the Gene Expression Omnibus (GEO) database, with dataset GSE150408 used for training and GSE124272 for testing. The training dataset included 17 patients with sciatica resulting from LPD, all of whom had magnetic resonance imaging confirmation of single-level LPD at either the L4/5 or L5/S1 levels. Data from 17 healthy volunteers were used as controls. Recursive feature elimination (RFE) was employed to identify the most relevant gene signatures among 23 pain-related genes. Machine learning models, including support vector machine (SVM), random forest, k-nearest neighbours (KNN), logistic regression, and Extreme Gradient Boosting (XGBoost), were trained and evaluated. Model performance was assessed using accuracy, area under the curve (AUC), F1 score, and Matthews correlation coefficient (MCC).
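Three of the four evaluation metrics named above can be computed directly from a binary confusion matrix. The following is a minimal sketch, not the authors' code: the function names and the example labels are made up for illustration, and AUC is left out because it needs predicted scores rather than hard class labels.

```python
from math import sqrt


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels (1 = LPD, 0 = control)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of samples classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def f1_score(tp, tn, fp, fn):
    """Harmonic mean of precision and recall for the positive class."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when any marginal is empty."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical labels, NOT data from the study.
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
print("accuracy:", accuracy(*counts))
print("F1:", f1_score(*counts))
print("MCC:", mcc(*counts))
```

MCC uses all four cells of the confusion matrix, so it is less flattering than accuracy when errors are concentrated in one class.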

Results

Eight key gene signatures were identified as significant predictors of LPD, with MMP9 exhibiting the highest importance score. Most of these genes were differentially expressed between patients with LPD and healthy controls (p < 0.05). Among the models, random forest demonstrated the highest accuracy (0.80, 95% CI 0.73–0.85) and MCC (0.64, 95% CI 0.53–0.76), followed by KNN, XGBoost, and SVM. Overall, the random forest model exhibited the most robust performance in predicting the risk of LPD.
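The abstract reports 95% confidence intervals but does not say how they were obtained. One common approach is the percentile bootstrap, sketched below under that assumption; the function names, the number of resamples, the seed, and the example labels are all hypothetical and are not taken from the study.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def bootstrap_ci(y_true, y_pred, metric, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for metric(y_true, y_pred).

    Samples are resampled with replacement, the metric is recomputed on
    each resample, and the alpha/2 and 1 - alpha/2 quantiles are returned.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(y_true)
    stats = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        stats.append(metric([y_true[i] for i in idx],
                            [y_pred[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Hypothetical predictions, NOT data from the study: 32 of 40 correct.
y_true = [1] * 20 + [0] * 20
y_pred = [1] * 16 + [0] * 4 + [0] * 16 + [1] * 4

point = accuracy(y_true, y_pred)
low, high = bootstrap_ci(y_true, y_pred, accuracy)
print(f"accuracy {point:.2f}, 95% CI {low:.2f} to {high:.2f}")
```

Any metric with the same `(y_true, y_pred)` signature, such as an MCC function, can be passed in place of `accuracy`.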

Conclusion

The results of our study suggest that machine learning models based on pain-related gene signatures may identify patients at high risk of developing LPD with reasonably high accuracy. These prediction models could perhaps be integrated into clinical diagnostic tools to enhance early diagnosis and prevention.


Persistent Identifier: http://hdl.handle.net/10722/355836
ISSN: 2193-8237
2023 Impact Factor: 4.1
2023 SCImago Journal Rankings: 0.847
ISI Accession Number ID: WOS:001480829600001

 

DC Field | Value | Language
dc.contributor.author | Wang, Fengfeng | -
dc.contributor.author | Meng, Fei | -
dc.contributor.author | Wong, Stanley Sau Ching | -
dc.date.accessioned | 2025-05-17T00:35:23Z | -
dc.date.available | 2025-05-17T00:35:23Z | -
dc.date.issued | 2025-05-04 | -
dc.identifier.citation | Pain and Therapy, 2025 | -
dc.identifier.issn | 2193-8237 | -
dc.identifier.uri | http://hdl.handle.net/10722/355836 | -
dc.description.abstract | Introduction: Lumbar prolapsed disc (LPD) is a leading cause of low back pain, contributing significantly to global disability and healthcare burden. This study aimed to develop machine learning models to predict the risk of LPD by analysing gene expression profiles for early detection. Methods: Transcriptomic data from peripheral blood samples were obtained from the Gene Expression Omnibus (GEO) database, with dataset GSE150408 used for training and GSE124272 for testing. The training dataset included 17 patients with sciatica resulting from LPD, all of whom had magnetic resonance imaging confirmation of single-level LPD at either the L4/5 or L5/S1 levels. Data from 17 healthy volunteers were used as controls. Recursive feature elimination (RFE) was employed to identify the most relevant gene signatures among 23 pain-related genes. Machine learning models, including support vector machine (SVM), random forest, k-nearest neighbours (KNN), logistic regression, and Extreme Gradient Boosting (XGBoost), were trained and evaluated. Model performance was assessed using accuracy, area under the curve (AUC), F1 score, and Matthews correlation coefficient (MCC). Results: Eight key gene signatures were identified as significant predictors of LPD, with MMP9 exhibiting the highest importance score. Most of these genes were differentially expressed between patients with LPD and healthy controls (p < 0.05). Among the models, random forest demonstrated the highest accuracy (0.80, 95% CI 0.73–0.85) and MCC (0.64, 95% CI 0.53–0.76), followed by KNN, XGBoost, and SVM. Overall, the random forest model exhibited the most robust performance in predicting the risk of LPD. Conclusion: The results of our study suggest that machine learning models based on pain-related gene signatures may identify patients at high risk of developing LPD with reasonably high accuracy. These prediction models could perhaps be integrated into clinical diagnostic tools to enhance early diagnosis and prevention. | -
dc.language | eng | -
dc.publisher | Springer Nature | -
dc.relation.ispartof | Pain and Therapy | -
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | -
dc.subject | Early prevention | -
dc.subject | Gene signature | -
dc.subject | Low back pain | -
dc.subject | Lumbar prolapsed disc | -
dc.subject | Machine learning | -
dc.subject | Risk prediction | -
dc.subject | Transcriptomics | -
dc.title | Predicting the Risk of Lumbar Prolapsed Disc: A Gene Signature-Based Machine Learning Analysis | -
dc.type | Article | -
dc.identifier.doi | 10.1007/s40122-025-00744-4 | -
dc.identifier.scopus | eid_2-s2.0-105004063235 | -
dc.identifier.eissn | 2193-651X | -
dc.identifier.isi | WOS:001480829600001 | -
dc.identifier.issnl | 2193-8237 | -
