- Appears in Collections: postgraduate thesis
Field | Value
---|---
Title | Real-world visual search in AI and humans : detecting vehicles and humans in driving scenarios |
Authors | Yang, Yumeng (楊瑜萌) |
Issue Date | 2024 |
Publisher | The University of Hong Kong (Pokfulam, Hong Kong) |
Citation | Yang, Y. [楊瑜萌]. (2024). Real-world visual search in AI and humans : detecting vehicles and humans in driving scenarios. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. |
Abstract | Looking for a red dot among black squares is much easier than looking for a friend in a big crowd. Previous laboratory studies have employed paradigms similar to the former task, in which participants search for a target among various distractors (Wolfe, 2020). However, those studies bear little resemblance to real-world visual search, which is far more complex. The theme of this MPhil thesis is the investigation of real-world visual search using complex stimuli of great importance to real-world applications. Study 1 investigated the differences in visual search performance and attention strategies between humans and AI in detecting vehicles and humans under difficult conditions such as occlusion and degradation. We measured humans’ attended features using eye tracking and visualized the AI’s decision-making processes using a saliency-based explainable AI (XAI) method, which generates saliency maps of the features the AI uses for its output. In human search, individuals differed in adopting focused or explorative attention strategies, with better performance associated with the focused strategy. The AI (YOLOv5s) showed higher attended-feature similarity to the focused than to the explorative strategy in humans, and achieved human-expert-level performance in vehicle detection even in difficult cases such as occlusion and degradation. In contrast, it performed much worse than humans in detecting humans and showed low attended-feature similarity, reflecting humans’ attention bias toward stimuli with evolutionary significance. Higher similarity to humans’ attended features was also associated with better AI performance, suggesting that human attention may be used to guide AI design. Study 2 investigated whether autistic individuals, known for their atypical gaze behavior and social attention difficulties, show an attention bias for social stimuli (humans, i.e., stimuli with evolutionary significance) versus non-social stimuli (vehicles). Recent research has found that autistic individuals show poorer performance and lower eye-movement consistency (i.e., dissimilar visual scanning routines across trials) in face recognition, which may be related to less face-processing experience due to a lack of social interest. In this study, we showed that this phenomenon does not extend to visual search tasks: autistic individuals and matched neurotypicals (NTs) had similar hit rates, precision, and gaze behavior when searching for either social or non-social stimuli. However, autistic individuals had longer search times and made more and longer fixations, suggesting difficulties in identifying potential targets. This difficulty was not limited to social stimuli, supporting a domain-general deficit. In conclusion, we found that the attention strategy of the AI (YOLOv5s) is more similar to the focused than to the explorative strategy in humans, and that it achieved human-expert-level performance in detecting vehicles even in difficult cases. However, it performed worse than humans in detecting humans and showed low attended-feature similarity, reflecting humans’ attention bias toward stimuli with evolutionary significance. In Study 2, although autistic individuals are reported to have atypical social attention to humans, they did not differ from NTs in search strategy or search accuracy when searching for social or non-social stimuli. Nevertheless, they were slower and made more fixations during visual search. These findings support a domain-general view of deficits in ASD rather than specifically social cognitive deficits. |
Degree | Master of Philosophy |
Subject | Visual perception; Artificial intelligence |
Dept/Program | Psychology |
Persistent Identifier | http://hdl.handle.net/10722/352641 |
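Study 1's comparison rests on measuring how similar the AI's attended features (saliency maps from the XAI method) are to humans' attended features (eye-tracking attention maps). The record does not name the similarity metric used in the thesis; the sketch below assumes one standard choice from saliency research, Pearson's correlation coefficient (CC) between two same-shape maps. The function name and the synthetic example data are hypothetical illustrations, not the thesis's confirmed procedure.

```python
import numpy as np

def pearson_cc(ai_saliency: np.ndarray, human_attention: np.ndarray) -> float:
    """Pearson correlation coefficient (CC) between an AI saliency map and a
    human attention map (e.g., a fixation density map) of the same shape.
    Higher CC means more similar attended features. Hypothetical helper; the
    thesis abstract does not specify its similarity metric."""
    a = ai_saliency.astype(float).ravel()
    h = human_attention.astype(float).ravel()
    a = (a - a.mean()) / (a.std() + 1e-12)  # z-normalize: zero mean, unit std
    h = (h - h.mean()) / (h.std() + 1e-12)
    return float(np.mean(a * h))  # mean product of z-scores = Pearson r

# Synthetic stand-ins for a model saliency map and a human fixation map:
rng = np.random.default_rng(0)
ai_map = rng.random((480, 640))
human_map = 0.7 * ai_map + 0.3 * rng.random((480, 640))  # partially overlapping attention
print(f"CC = {pearson_cc(ai_map, human_map):.3f}")
```

A higher CC for one human strategy than another (here, focused vs. explorative) is the kind of comparison the abstract reports; other common saliency-similarity measures (e.g., KL divergence, NSS) would slot into the same pipeline.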
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Yang, Yumeng | - |
dc.contributor.author | 楊瑜萌 | - |
dc.date.accessioned | 2024-12-19T09:26:55Z | - |
dc.date.available | 2024-12-19T09:26:55Z | - |
dc.date.issued | 2024 | - |
dc.identifier.citation | Yang, Y. [楊瑜萌]. (2024). Real-world visual search in AI and humans : detecting vehicles and humans in driving scenarios. (Thesis). University of Hong Kong, Pokfulam, Hong Kong SAR. | - |
dc.identifier.uri | http://hdl.handle.net/10722/352641 | - |
dc.language | eng | - |
dc.publisher | The University of Hong Kong (Pokfulam, Hong Kong) | - |
dc.relation.ispartof | HKU Theses Online (HKUTO) | - |
dc.rights | The author retains all proprietary rights (such as patent rights) and the right to use in future works. | - |
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. | - |
dc.subject.lcsh | Visual perception | - |
dc.subject.lcsh | Artificial intelligence | - |
dc.title | Real-world visual search in AI and humans : detecting vehicles and humans in driving scenarios | - |
dc.type | PG_Thesis | - |
dc.description.thesisname | Master of Philosophy | - |
dc.description.thesislevel | Master | - |
dc.description.thesisdiscipline | Psychology | - |
dc.description.nature | published_or_final_version | - |
dc.date.hkucongregation | 2024 | - |
dc.identifier.mmsid | 991044891409303414 | - |