TSP-Transformer: Task-Specific Prompts Boosted Transformer for Holistic Scene Understanding

Wang, Shuo; Li, Jing; Zhao, Zibo; Lian, Dongze; Huang, Binbin; Wang, Xiaomei; Li, Zhengxin; Gao, Shenghua

Links for fulltext

(May Require Subscription)

Publisher Website: 10.1109/WACV57701.2024.00097
Scopus: eid_2-s2.0-85191950349
WOS: WOS:001222964601003

Appears in Collections:
- Computer Science: Conference papers

Conference Paper: TSP-Transformer: Task-Specific Prompts Boosted Transformer for Holistic Scene Understanding

Title	TSP-Transformer: Task-Specific Prompts Boosted Transformer for Holistic Scene Understanding
Authors	Wang, Shuo; Li, Jing; Zhao, Zibo; Lian, Dongze; Huang, Binbin; Wang, Xiaomei; Li, Zhengxin; Gao, Shenghua
Keywords	Algorithms; Image recognition and understanding
Issue Date	2024
Citation	Proceedings - 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024, 2024, p. 914-923. DOI: http://dx.doi.org/10.1109/WACV57701.2024.00097
Abstract	Holistic scene understanding includes semantic segmentation, surface normal estimation, object boundary detection, depth estimation, etc. The key aspect of this problem is to learn representations effectively, as each subtask builds upon attributes that are not only correlated but also distinct. Inspired by visual-prompt tuning, we propose a Task-Specific Prompts Transformer, dubbed TSP-Transformer, for holistic scene understanding. It features a vanilla transformer in the early stage and a task-specific prompts transformer encoder in the later stage, where task-specific prompts are augmented. By doing so, the transformer layer learns the generic information from the shared parts and is endowed with task-specific capacity. First, the task-specific prompts effectively serve as induced priors for each task. Moreover, the task-specific prompts can be seen as switches that favor task-specific representation learning for different tasks. Extensive experiments on NYUD-v2 and PASCAL-Context show that our method achieves state-of-the-art performance, validating its effectiveness for holistic scene understanding. Our code is available at https://github.com/tb2-sy/TSP-Transformer.
Persistent Identifier	http://hdl.handle.net/10722/345384
ISI Accession Number ID	WOS:001222964601003
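
Illustrative sketch (not part of the record, and not the authors' implementation): the abstract describes a shared early stage followed by a later stage in which task-specific prompts are added. The toy Python below shows one way such a split could look, assuming the prompts are prepended as extra tokens in the style of visual-prompt tuning and dropped again after the task stage. The task names, layer counts, token sizes, identity attention projections, and every function name are assumptions made for illustration only; the authors' code is at the repository linked in the abstract.

```python
import math
import random

# Hypothetical task names, following the four subtasks listed in the abstract.
TASKS = ["semseg", "normals", "boundary", "depth"]


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def layer_norm(vec, eps=1e-6):
    """Normalize one token to zero mean and unit variance (no learned scale)."""
    mean = sum(vec) / len(vec)
    var = sum((v - mean) ** 2 for v in vec) / len(vec)
    return [(v - mean) / math.sqrt(var + eps) for v in vec]


def attention_layer(tokens):
    """Toy single-head self-attention: identity projections, residual, layer norm."""
    dim = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in tokens]
        weights = softmax(scores)
        mixed = [sum(w * v[d] for w, v in zip(weights, tokens)) for d in range(dim)]
        out.append(layer_norm([qi + mi for qi, mi in zip(q, mixed)]))
    return out


def shared_stage(patch_tokens, num_layers=2):
    """Early stage: plain layers applied once and shared by every task."""
    tokens = patch_tokens
    for _ in range(num_layers):
        tokens = attention_layer(tokens)
    return tokens


def task_stage(shared_tokens, prompts, num_layers=2):
    """Later stage: prepend one task's prompt tokens, attend, then drop the prompts."""
    tokens = prompts + shared_tokens
    for _ in range(num_layers):
        tokens = attention_layer(tokens)
    return tokens[len(prompts):]


def random_tokens(rng, count, dim):
    """Stand-in for patch embeddings or learned prompts (random values here)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(count)]


def forward(patch_tokens, task_prompts):
    """Run the shared stage once, then one prompted pass per task."""
    shared = shared_stage(patch_tokens)
    return {task: task_stage(shared, prompts) for task, prompts in task_prompts.items()}


rng = random.Random(0)
patches = random_tokens(rng, count=6, dim=4)
task_prompts = {task: random_tokens(rng, count=3, dim=4) for task in TASKS}
features = forward(patches, task_prompts)
for task, feats in features.items():
    print(task, len(feats), "tokens of dim", len(feats[0]))
```

In this sketch the shared stage is computed once per image, and only the prompt tokens differ between tasks, so each task receives its own features from the same shared representation. A real model would use learned projections, trained prompts, and task-specific decoders, none of which are specified in this record.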

DC Field	Value	Language
dc.contributor.author	Wang, Shuo	-
dc.contributor.author	Li, Jing	-
dc.contributor.author	Zhao, Zibo	-
dc.contributor.author	Lian, Dongze	-
dc.contributor.author	Huang, Binbin	-
dc.contributor.author	Wang, Xiaomei	-
dc.contributor.author	Li, Zhengxin	-
dc.contributor.author	Gao, Shenghua	-
dc.date.accessioned	2024-08-15T09:27:00Z	-
dc.date.available	2024-08-15T09:27:00Z	-
dc.date.issued	2024	-
dc.identifier.citation	Proceedings - 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024, 2024, p. 914-923	-
dc.identifier.uri	http://hdl.handle.net/10722/345384	-
dc.description.abstract	Holistic scene understanding includes semantic segmentation, surface normal estimation, object boundary detection, depth estimation, etc. The key aspect of this problem is to learn representations effectively, as each subtask builds upon attributes that are not only correlated but also distinct. Inspired by visual-prompt tuning, we propose a Task-Specific Prompts Transformer, dubbed TSP-Transformer, for holistic scene understanding. It features a vanilla transformer in the early stage and a task-specific prompts transformer encoder in the later stage, where task-specific prompts are augmented. By doing so, the transformer layer learns the generic information from the shared parts and is endowed with task-specific capacity. First, the task-specific prompts effectively serve as induced priors for each task. Moreover, the task-specific prompts can be seen as switches that favor task-specific representation learning for different tasks. Extensive experiments on NYUD-v2 and PASCAL-Context show that our method achieves state-of-the-art performance, validating its effectiveness for holistic scene understanding. Our code is available at https://github.com/tb2-sy/TSP-Transformer.	-
dc.language	eng	-
dc.relation.ispartof	Proceedings - 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024	-
dc.subject	Algorithms	-
dc.subject	Image recognition and understanding	-
dc.title	TSP-Transformer: Task-Specific Prompts Boosted Transformer for Holistic Scene Understanding	-
dc.type	Conference_Paper	-
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1109/WACV57701.2024.00097	-
dc.identifier.scopus	eid_2-s2.0-85191950349	-
dc.identifier.spage	914	-
dc.identifier.epage	923	-
dc.identifier.isi	WOS:001222964601003	-
