Article: PCA-based missing information imputation for real-time crash likelihood prediction under imbalanced data

Title: PCA-based missing information imputation for real-time crash likelihood prediction under imbalanced data
Authors: Ke, Jintao; Zhang, Shuaichao; Yang, Hai; Chen, Xiqun
Keywords: adaboost; cost-sensitive learning; PCA-based missing data imputation; Real-time crash likelihood prediction; SMOTE; support vector machine
Issue Date: 2019
Citation: Transportmetrica A: Transport Science, 2019, v. 15, n. 2, p. 872-895
Abstract: Real-time crash likelihood prediction has been an important research topic for many years. However, few studies focus on missing data imputation in real-time crash likelihood prediction, although missing values are commonly observed due to sensor breakdowns or external interference. In addition, classifying imbalanced data is a critical issue in real-time crash likelihood prediction, since the number of crash-prone cases is much smaller than the number of non-crash cases. In this paper, three principal component analysis (PCA) based approaches are established for imputing missing values, and two kinds of solutions are developed to tackle the issue of imbalanced data. The results show that the proposed methods help the classifiers achieve better predictive performance when data are missing. The two solutions, namely cost-sensitive learning and the synthetic minority oversampling technique (SMOTE), help improve sensitivity by adjusting the classifiers to pay more attention to the minority class.
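The abstract does not describe the three PCA-based imputation approaches in detail. As a generic illustration only, and not the authors' method, the sketch below shows one common way PCA can fill missing values: start from column means, then repeatedly replace the missing entries with a low-rank PCA reconstruction. The function name `pca_impute` and its parameters (`n_components`, `n_iter`, `tol`) are hypothetical and chosen for this example.

```python
import numpy as np

def pca_impute(X, n_components=2, n_iter=50, tol=1e-6):
    """Fill NaNs in X by iterating a low-rank PCA reconstruction.

    Hypothetical helper for illustration; not taken from the paper.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Initial guess: column means computed from the observed entries.
    col_means = np.nanmean(X, axis=0)
    filled = np.where(missing, col_means, X)
    for _ in range(n_iter):
        mean = filled.mean(axis=0)
        centered = filled - mean
        # SVD of the centered data; the leading rows of Vt are the principal axes.
        U, s, Vt = np.linalg.svd(centered, full_matrices=False)
        k = n_components
        recon = (U[:, :k] * s[:k]) @ Vt[:k] + mean
        # Only missing cells are overwritten; observed values stay untouched.
        new_filled = np.where(missing, recon, X)
        change = np.max(np.abs(new_filled - filled))
        filled = new_filled
        if change < tol:
            break
    return filled

# Usage with made-up sensor readings (rows = time slices, columns = detectors).
readings = np.array([
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
    [3.0, np.nan, 9.0],
    [4.0, 8.0, 12.0],
])
completed = pca_impute(readings, n_components=1)
print(completed)
```

The number of retained components controls how much structure the reconstruction borrows from the observed columns; in practice it would be tuned on held-out data.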
Persistent Identifier: http://hdl.handle.net/10722/308771
ISSN: 2324-9935
2023 Impact Factor: 3.6
2023 SCImago Journal Rankings: 1.099
ISI Accession Number ID: WOS:000466722900001

 

DC Field | Value | Language
dc.contributor.author | Ke, Jintao | -
dc.contributor.author | Zhang, Shuaichao | -
dc.contributor.author | Yang, Hai | -
dc.contributor.author | Chen, Xiqun | -
dc.date.accessioned | 2021-12-08T07:50:06Z | -
dc.date.available | 2021-12-08T07:50:06Z | -
dc.date.issued | 2019 | -
dc.identifier.citation | Transportmetrica A: Transport Science, 2019, v. 15, n. 2, p. 872-895 | -
dc.identifier.issn | 2324-9935 | -
dc.identifier.uri | http://hdl.handle.net/10722/308771 | -
dc.description.abstract | Real-time crash likelihood prediction has been an important research topic for many years. However, few studies focus on missing data imputation in real-time crash likelihood prediction, although missing values are commonly observed due to sensor breakdowns or external interference. In addition, classifying imbalanced data is a critical issue in real-time crash likelihood prediction, since the number of crash-prone cases is much smaller than the number of non-crash cases. In this paper, three principal component analysis (PCA) based approaches are established for imputing missing values, and two kinds of solutions are developed to tackle the issue of imbalanced data. The results show that the proposed methods help the classifiers achieve better predictive performance when data are missing. The two solutions, namely cost-sensitive learning and the synthetic minority oversampling technique (SMOTE), help improve sensitivity by adjusting the classifiers to pay more attention to the minority class. | -
dc.language | eng | -
dc.relation.ispartof | Transportmetrica A: Transport Science | -
dc.subject | adaboost | -
dc.subject | cost-sensitive learning | -
dc.subject | PCA-based missing data imputation | -
dc.subject | Real-time crash likelihood prediction | -
dc.subject | SMOTE | -
dc.subject | support vector machine | -
dc.title | PCA-based missing information imputation for real-time crash likelihood prediction under imbalanced data | -
dc.type | Article | -
dc.description.nature | link_to_subscribed_fulltext | -
dc.identifier.doi | 10.1080/23249935.2018.1542414 | -
dc.identifier.scopus | eid_2-s2.0-85057246883 | -
dc.identifier.volume | 15 | -
dc.identifier.issue | 2 | -
dc.identifier.spage | 872 | -
dc.identifier.epage | 895 | -
dc.identifier.eissn | 2324-9943 | -
dc.identifier.isi | WOS:000466722900001 | -
