Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1108/ECAM-07-2025-1133
- Scopus: eid_2-s2.0-105025427009
Article: Vision language model (VLM)-enabled street view analytics: a systematic literature review
| Title | Vision language model (VLM)-enabled street view analytics: a systematic literature review |
|---|---|
| Authors | Peng, Ziyu; Lu, Weisheng; An, Hongda; Xia, Xianhua; Zhang, Yi; Xue, Fan; Chen, Junjie |
| Keywords | Large language model Multimodal learning Street view analytics Systematic literature review Vision language model |
| Issue Date | 9-Dec-2025 |
| Publisher | Emerald |
| Citation | Engineering, Construction and Architectural Management, 2025, p. 1-19 |
| Abstract | Purpose: Street view analytics (SVA) is an emerging field focusing on the systematic analysis of street-level imagery to understand urban environments, which has rapidly advanced with the advent of vision language models (VLMs). Despite the significant advancements, a critical review of the applications of VLMs for SVA is lacking. This paper aims to fill this gap by providing a comprehensive literature review on VLM-enabled SVA. Design/methodology/approach: This study adopts a Preferred Reporting Items for Systematic Reviews and Meta-Analyses-guided systematic literature review. After keyword retrieval, literature collection, thematic screening and a five-domain quality assessment (data representativeness, ground truth validity, model design and/or analytic rigor, validation and/or generalization and reporting and/or reproducibility), 69 VLM-enabled SVA studies (2020–2025) were selected. Five reviewers independently extracted and synthesized evidence, and inter-rater reliability was quantified to verify consistency. Findings: The systematic analysis underscores the transformative potential of VLMs in SVA, emphasizing their multimodal data handling and open-domain knowledge integration. However, key challenges, while rooted in broader SVA limitations, manifest distinctly in VLM contexts: temporal dynamics, contextual reliance, annotation inconsistencies, computational demands and process transparency. Handling remains task-dependent, with future research focusing on city- and year-held-out temporal evaluation, robustness to street-level variability, retrieval-augmented generation for consistency, hybrid edge-cloud models and chain-of-thought prompting. Originality/value: This study contributes to the field by synthesizing the latest development of VLMs for SVA, identifying avenues for future research and ultimately proposing an integrated workflow for enhancing VLMs' applications in SVA tasks. |
| Persistent Identifier | http://hdl.handle.net/10722/369161 |
| ISSN | 0969-9988 (2023 Impact Factor: 3.6; 2023 SCImago Journal Rankings: 0.896) |

| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Peng, Ziyu | - |
| dc.contributor.author | Lu, Weisheng | - |
| dc.contributor.author | An, Hongda | - |
| dc.contributor.author | Xia, Xianhua | - |
| dc.contributor.author | Zhang, Yi | - |
| dc.contributor.author | Xue, Fan | - |
| dc.contributor.author | Chen, Junjie | - |
| dc.date.accessioned | 2026-01-20T08:35:17Z | - |
| dc.date.available | 2026-01-20T08:35:17Z | - |
| dc.date.issued | 2025-12-09 | - |
| dc.identifier.citation | Engineering, Construction and Architectural Management, 2025, p. 1-19 | - |
| dc.identifier.issn | 0969-9988 | - |
| dc.identifier.uri | http://hdl.handle.net/10722/369161 | - |
| dc.description.abstract | <p>Purpose</p><p>Street view analytics (SVA) is an emerging field focusing on the systematic analysis of street-level imagery to understand urban environments, which has rapidly advanced with the advent of vision language models (VLMs). Despite the significant advancements, a critical review of the applications of VLMs for SVA is lacking. This paper aims to fill this gap by providing a comprehensive literature review on VLM-enabled SVA.</p><p>Design/methodology/approach</p><p>This study adopts a Preferred Reporting Items for Systematic Reviews and Meta-Analyses-guided systematic literature review. After keyword retrieval, literature collection, thematic screening and a five-domain quality assessment (data representativeness, ground truth validity, model design and/or analytic rigor, validation and/or generalization and reporting and/or reproducibility), 69 VLM-enabled SVA studies (2020–2025) were selected. Five reviewers independently extracted and synthesized evidence, and inter-rater reliability was quantified to verify consistency.</p><p>Findings</p><p>The systematic analysis underscores the transformative potential of VLMs in SVA, emphasizing their multimodal data handling and open-domain knowledge integration. However, key challenges, while rooted in broader SVA limitations, manifest distinctly in VLM contexts: temporal dynamics, contextual reliance, annotation inconsistencies, computational demands and process transparency. Handling remains task-dependent, with future research focusing on city- and year-held-out temporal evaluation, robustness to street-level variability, retrieval-augmented generation for consistency, hybrid edge-cloud models and chain-of-thought prompting.</p><p>Originality/value</p><p>This study contributes to the field by synthesizing the latest development of VLMs for SVA, identifying avenues for future research and ultimately proposing an integrated workflow for enhancing VLMs' applications in SVA tasks.</p> | - |
| dc.language | eng | - |
| dc.publisher | Emerald | - |
| dc.relation.ispartof | Engineering, Construction and Architectural Management | - |
| dc.subject | Large language model | - |
| dc.subject | Multimodal learning | - |
| dc.subject | Street view analytics | - |
| dc.subject | Systematic literature review | - |
| dc.subject | Vision language model | - |
| dc.title | Vision language model (VLM)-enabled street view analytics: a systematic literature review | - |
| dc.type | Article | - |
| dc.identifier.doi | 10.1108/ECAM-07-2025-1133 | - |
| dc.identifier.scopus | eid_2-s2.0-105025427009 | - |
| dc.identifier.spage | 1 | - |
| dc.identifier.epage | 19 | - |
| dc.identifier.eissn | 1365-232X | - |
| dc.identifier.issnl | 0969-9988 | - |
