
Conference Paper: Bottom-Up Shift and Reasoning for Referring Image Segmentation

Title: Bottom-Up Shift and Reasoning for Referring Image Segmentation
Authors: Yang, S; Xia, M; Li, G; Zhou, H; Yu, Y
Issue Date: 2021
Citation: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Virtual Conference, 19-25 June 2021
Abstract: Referring image segmentation aims to segment the referent, i.e., the object or stuff referred to by a natural language expression in an image. Its main challenge lies in how to effectively and efficiently differentiate between the referent and other objects of the same category. In this paper, we tackle the challenge by jointly performing compositional visual reasoning and accurate segmentation in a single stage via the proposed novel Bottom-Up Shift (BUS) and Bidirectional Attentive Refinement (BIAR) modules. Specifically, BUS progressively locates the referent along the hierarchical reasoning steps implied by the expression. At each step, it locates the corresponding visual region by disambiguating between similar regions, where the disambiguation is based on the relationships between regions. Through this explainable visual reasoning, BUS explicitly aligns linguistic components with visual regions so that it can identify all the entities mentioned in the expression. BIAR fuses multi-level features via two-way attentive message passing, which captures the visual details relevant to the referent to refine segmentation results. Experimental results demonstrate that the proposed method, consisting of the BUS and BIAR modules, not only consistently surpasses all existing state-of-the-art algorithms on common benchmark datasets but also visualizes interpretable reasoning steps for stepwise segmentation. Code is available at https://github.com/incredibleXM/BUSNet.
Description: Paper Session Eight: Paper ID 3484
Persistent Identifier: http://hdl.handle.net/10722/301188
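To make the two-way attentive message passing described in the abstract more concrete, below is a minimal, hypothetical PyTorch sketch of fusing a low-level (high-resolution) feature map with a high-level (semantic) one through mutual spatial gating. This is not the authors' BIAR implementation (see the linked repository for that); the class name, channel sizes, and gating scheme are illustrative assumptions only.

```python
# Hypothetical sketch of two-way attentive fusion between a low-level and a
# high-level feature map, loosely in the spirit of the BIAR module described
# in the abstract. Layer names and hyperparameters are illustrative, not the
# authors' design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoWayAttentiveFusion(nn.Module):
    def __init__(self, low_ch: int, high_ch: int, out_ch: int = 256):
        super().__init__()
        # Project both feature levels to a common channel width.
        self.low_proj = nn.Conv2d(low_ch, out_ch, kernel_size=1)
        self.high_proj = nn.Conv2d(high_ch, out_ch, kernel_size=1)
        # One spatial gate per message direction.
        self.low_to_high_gate = nn.Conv2d(out_ch, 1, kernel_size=1)
        self.high_to_low_gate = nn.Conv2d(out_ch, 1, kernel_size=1)
        self.fuse = nn.Conv2d(2 * out_ch, out_ch, kernel_size=3, padding=1)

    def forward(self, low_feat, high_feat):
        # low_feat: (B, low_ch, H, W); high_feat: (B, high_ch, H/s, W/s)
        low = self.low_proj(low_feat)
        high = F.interpolate(self.high_proj(high_feat), size=low.shape[-2:],
                             mode="bilinear", align_corners=False)
        # Each level produces a gate that selects information in the other:
        # semantics suppress irrelevant low-level detail, detail sharpens
        # the coarse high-level map.
        low_refined = low * torch.sigmoid(self.high_to_low_gate(high))
        high_refined = high * torch.sigmoid(self.low_to_high_gate(low))
        return self.fuse(torch.cat([low_refined, high_refined], dim=1))

if __name__ == "__main__":
    fusion = TwoWayAttentiveFusion(low_ch=256, high_ch=512)
    low = torch.randn(1, 256, 80, 80)    # early backbone stage
    high = torch.randn(1, 512, 20, 20)   # late, coarser stage
    print(fusion(low, high).shape)       # torch.Size([1, 256, 80, 80])
```

The mutual gating here only mirrors the stated goal of the abstract, i.e., recovering visual details relevant to the referent when refining the segmentation; the exact fusion used in the paper should be taken from the released code.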

 

DC Field: Value
dc.contributor.author: Yang, S
dc.contributor.author: Xia, M
dc.contributor.author: Li, G
dc.contributor.author: Zhou, H
dc.contributor.author: Yu, Y
dc.date.accessioned: 2021-07-27T08:07:25Z
dc.date.available: 2021-07-27T08:07:25Z
dc.date.issued: 2021
dc.identifier.citation: IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Virtual Conference, 19-25 June 2021
dc.identifier.uri: http://hdl.handle.net/10722/301188
dc.description: Paper Session Eight: Paper ID 3484
dc.description.abstract: Referring image segmentation aims to segment the referent, i.e., the object or stuff referred to by a natural language expression in an image. Its main challenge lies in how to effectively and efficiently differentiate between the referent and other objects of the same category. In this paper, we tackle the challenge by jointly performing compositional visual reasoning and accurate segmentation in a single stage via the proposed novel Bottom-Up Shift (BUS) and Bidirectional Attentive Refinement (BIAR) modules. Specifically, BUS progressively locates the referent along the hierarchical reasoning steps implied by the expression. At each step, it locates the corresponding visual region by disambiguating between similar regions, where the disambiguation is based on the relationships between regions. Through this explainable visual reasoning, BUS explicitly aligns linguistic components with visual regions so that it can identify all the entities mentioned in the expression. BIAR fuses multi-level features via two-way attentive message passing, which captures the visual details relevant to the referent to refine segmentation results. Experimental results demonstrate that the proposed method, consisting of the BUS and BIAR modules, not only consistently surpasses all existing state-of-the-art algorithms on common benchmark datasets but also visualizes interpretable reasoning steps for stepwise segmentation. Code is available at https://github.com/incredibleXM/BUSNet.
dc.language: eng
dc.relation.ispartof: IEEE Conference on Computer Vision and Pattern Recognition (CVPR)
dc.title: Bottom-Up Shift and Reasoning for Referring Image Segmentation
dc.type: Conference_Paper
dc.identifier.email: Yu, Y: yzyu@cs.hku.hk
dc.identifier.authority: Yu, Y=rp01415
dc.identifier.hkuros: 323541
