Segmenting Transparent Objects in the Wild with Transformer

Xie, E; Wang, WJ; Wang, WH; Sun, P; Xu, H; Liang, D; Luo, P

Publisher Website: 10.24963/ijcai.2021/165
Appears in Collections:
- Computer Science: Conference papers

Conference Paper: Segmenting Transparent Objects in the Wild with Transformer

Title	Segmenting Transparent Objects in the Wild with Transformer
Authors	Xie, E; Wang, WJ; Wang, WH; Sun, P; Xu, H; Liang, D; Luo, P
Keywords	Computer Vision: Perception; Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation
Issue Date	2021
Publisher	International Joint Conference on Artificial Intelligence. The Journal's web site is located at https://www.ijcai.org/past_proceedings
Citation	Proceedings of the 30th International Joint Conference on Artificial Intelligence (IJCAI-21), Virtual Conference, Montreal, Canada, 19-26 August 2021, p. 1194-1200. DOI: http://dx.doi.org/10.24963/ijcai.2021/165
Abstract	This work presents a new fine-grained transparent object segmentation dataset, termed Trans10K-v2, extending Trans10K-v1, the first large-scale transparent object segmentation dataset. Unlike Trans10K-v1, which has only two limited categories, our new dataset has several appealing benefits. (1) It has 11 fine-grained categories of transparent objects, commonly occurring in the human domestic environment, making it more practical for real-world application. (2) Trans10K-v2 brings more challenges for the current advanced segmentation methods than its former version. Furthermore, a novel Transformer-based segmentation pipeline termed Trans2Seg is proposed. Firstly, the Transformer encoder of Trans2Seg provides the global receptive field in contrast to CNN's local receptive field, which shows excellent advantages over pure CNN architectures. Secondly, by formulating semantic segmentation as a problem of dictionary look-up, we design a set of learnable prototypes as the query of Trans2Seg's Transformer decoder, where each prototype learns the statistics of one category in the whole dataset. We benchmark more than 20 recent semantic segmentation methods, demonstrating that Trans2Seg significantly outperforms all the CNN-based methods, showing the proposed algorithm's potential ability to solve transparent object segmentation. Code is available at https://github.com/xieenze/Trans2Seg.
Description	Main Track: Computer Vision II
Persistent Identifier	http://hdl.handle.net/10722/301471
ISSN	1045-0823
2020 SCImago Journal Rankings	0.649
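
The abstract's "dictionary look-up" formulation, in which one learnable prototype per category acts as a query against the encoder features, can be illustrated with a small sketch. This is a simplified illustration under stated assumptions, not the Trans2Seg implementation: the function names, the toy dimensions, the hand-set prototypes, and the choice to normalize scores over classes are all hypothetical, and the authors' actual code is at the repository linked in the abstract.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def prototype_lookup(prototypes, features):
    """Score every pixel feature against every class prototype.

    prototypes: list of num_classes vectors (stand-ins for decoder queries).
    features:   list of num_pixels vectors (stand-ins for encoder output).
    Returns one row per pixel; each row sums to 1 over the classes.
    """
    dim = len(prototypes[0])
    scale = 1.0 / math.sqrt(dim)  # scaled dot-product, as in attention
    attention = []
    for feat in features:
        scores = [scale * sum(p * f for p, f in zip(proto, feat))
                  for proto in prototypes]
        attention.append(softmax(scores))
    return attention


def segment(prototypes, features):
    """Assign each pixel the class whose prototype matches it best."""
    return [max(range(len(row)), key=row.__getitem__)
            for row in prototype_lookup(prototypes, features)]


# Toy setup (assumption): 3 classes, 4-dimensional features, 6 pixels.
# In a trained model the prototypes would be learned, not set by hand.
prototypes = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
true_labels = [0, 1, 2, 2, 1, 0]
# Each pixel feature is a scaled, slightly offset copy of its class prototype.
features = [[4.0 * v + 0.1 for v in prototypes[c]] for c in true_labels]

labels = segment(prototypes, features)
print(labels)
```

In this toy case every pixel is assigned the class of the prototype it was built from, which is the look-up behaviour the abstract describes at a high level.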

DC Field	Value	Language
dc.contributor.author	Xie, E	-
dc.contributor.author	Wang, WJ	-
dc.contributor.author	Wang, WH	-
dc.contributor.author	Sun, P	-
dc.contributor.author	Xu, H	-
dc.contributor.author	Liang, D	-
dc.contributor.author	Luo, P	-
dc.date.accessioned	2021-07-27T08:11:34Z	-
dc.date.available	2021-07-27T08:11:34Z	-
dc.date.issued	2021	-
dc.identifier.citation	Proceedings of the 30th International Joint Conference on Artificial Intelligence (IJCAI-21), Virtual Conference, Montreal, Canada, 19-26 August 2021, p. 1194-1200	-
dc.identifier.issn	1045-0823	-
dc.identifier.uri	http://hdl.handle.net/10722/301471	-
dc.description	Main Track: Computer Vision II	-
dc.description.abstract	This work presents a new fine-grained transparent object segmentation dataset, termed Trans10K-v2, extending Trans10K-v1, the first large-scale transparent object segmentation dataset. Unlike Trans10K-v1, which has only two limited categories, our new dataset has several appealing benefits. (1) It has 11 fine-grained categories of transparent objects, commonly occurring in the human domestic environment, making it more practical for real-world application. (2) Trans10K-v2 brings more challenges for the current advanced segmentation methods than its former version. Furthermore, a novel Transformer-based segmentation pipeline termed Trans2Seg is proposed. Firstly, the Transformer encoder of Trans2Seg provides the global receptive field in contrast to CNN's local receptive field, which shows excellent advantages over pure CNN architectures. Secondly, by formulating semantic segmentation as a problem of dictionary look-up, we design a set of learnable prototypes as the query of Trans2Seg's Transformer decoder, where each prototype learns the statistics of one category in the whole dataset. We benchmark more than 20 recent semantic segmentation methods, demonstrating that Trans2Seg significantly outperforms all the CNN-based methods, showing the proposed algorithm's potential ability to solve transparent object segmentation. Code is available at https://github.com/xieenze/Trans2Seg.	-
dc.language	eng	-
dc.publisher	International Joint Conference on Artificial Intelligence. The Journal's web site is located at https://www.ijcai.org/past_proceedings	-
dc.relation.ispartof	International Joint Conference on Artificial Intelligence. Proceedings	-
dc.subject	Computer Vision: Perception	-
dc.subject	Computer Vision: Recognition: Detection, Categorization, Indexing, Matching, Retrieval, Semantic Interpretation	-
dc.title	Segmenting Transparent Objects in the Wild with Transformer	-
dc.type	Conference_Paper	-
dc.identifier.email	Luo, P: pluo@hku.hk	-
dc.identifier.authority	Luo, P=rp02575	-
dc.identifier.doi	10.24963/ijcai.2021/165	-
dc.identifier.hkuros	323750	-
dc.identifier.spage	1194	-
dc.identifier.epage	1200	-
dc.publisher.place	United States	-
dc.identifier.eisbn	9780999241196	-
