Links for fulltext
(May Require Subscription)
- Publisher Website: 10.1109/CVPR52688.2022.00791
- Scopus: eid_2-s2.0-85141783683
- WOS: WOS:000870759101013
Conference Paper: ADeLA: Automatic Dense Labeling with Attention for Viewpoint Shift in Semantic Segmentation
Title | ADeLA: Automatic Dense Labeling with Attention for Viewpoint Shift in Semantic Segmentation |
---|---|
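Authors | Ren, Hanxiang; Yang, Yanchao; Wang, He; Shen, Bokui; Fan, Qingnan; Zheng, Youyi; Liu, C. Karen; Guibas, Leonidas |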
Keywords | Datasets and evaluation; grouping and shape analysis; Image and video synthesis and generation; Machine learning; Robot vision; Scene analysis and understanding; Segmentation; Transfer/low-shot/long-tail learning; Vision applications and systems |
Issue Date | 2022 |
Citation | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2022, v. 2022-June, p. 8069-8079 |
Abstract | We describe a method to deal with performance drop in semantic segmentation caused by viewpoint changes within multi-camera systems, where temporally paired images are readily available, but the annotations may only be abundant for a few typical views. Existing methods alleviate performance drop via domain alignment in a shared space and assume that the mapping from the aligned space to the output is transferable. However, the novel content induced by viewpoint changes may nullify such a space for effective alignments, thus resulting in negative adaptation. Our method works without aligning any statistics of the images between the two domains. Instead, it utilizes a novel attention-based view transformation network trained only on color images to hallucinate the semantic images for the target. Despite the lack of supervision, the view transformation network can still generalize to semantic images thanks to the induced 'information transport' bias. Furthermore, to resolve ambiguities in converting the semantic images to semantic labels, we treat the view transformation network as a functional representation of an unknown mapping implied by the color images and propose functional label hallucination to generate pseudo-labels with uncertainties in the target domains. Our method surpasses baselines built on state-of-the-art correspondence estimation and view synthesis methods. Moreover, it outperforms the state-of-the-art unsupervised domain adaptation methods that utilize self-training and adversarial domain alignments. Our code and dataset will be made publicly available. |
Persistent Identifier | http://hdl.handle.net/10722/325582 |
ISSN | 1063-6919 (2023 SCImago Journal Rankings: 10.331) |
ISI Accession Number ID | WOS:000870759101013 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Ren, Hanxiang | - |
dc.contributor.author | Yang, Yanchao | - |
dc.contributor.author | Wang, He | - |
dc.contributor.author | Shen, Bokui | - |
dc.contributor.author | Fan, Qingnan | - |
dc.contributor.author | Zheng, Youyi | - |
dc.contributor.author | Liu, C. Karen | - |
dc.contributor.author | Guibas, Leonidas | - |
dc.date.accessioned | 2023-02-27T07:34:32Z | - |
dc.date.available | 2023-02-27T07:34:32Z | - |
dc.date.issued | 2022 | - |
dc.identifier.citation | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2022, v. 2022-June, p. 8069-8079 | - |
dc.identifier.issn | 1063-6919 | - |
dc.identifier.uri | http://hdl.handle.net/10722/325582 | - |
dc.description.abstract | We describe a method to deal with performance drop in semantic segmentation caused by viewpoint changes within multi-camera systems, where temporally paired images are readily available, but the annotations may only be abundant for a few typical views. Existing methods alleviate performance drop via domain alignment in a shared space and assume that the mapping from the aligned space to the output is transferable. However, the novel content induced by viewpoint changes may nullify such a space for effective alignments, thus resulting in negative adaptation. Our method works without aligning any statistics of the images between the two domains. Instead, it utilizes a novel attention-based view transformation network trained only on color images to hallucinate the semantic images for the target. Despite the lack of supervision, the view transformation network can still generalize to semantic images thanks to the induced 'information transport' bias. Furthermore, to resolve ambiguities in converting the semantic images to semantic labels, we treat the view transformation network as a functional representation of an unknown mapping implied by the color images and propose functional label hallucination to generate pseudo-labels with uncertainties in the target domains. Our method surpasses baselines built on state-of-the-art correspondence estimation and view synthesis methods. Moreover, it outperforms the state-of-the-art unsupervised domain adaptation methods that utilize self-training and adversarial domain alignments. Our code and dataset will be made publicly available. | - |
dc.language | eng | - |
dc.relation.ispartof | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition | - |
dc.subject | Datasets and evaluation | - |
dc.subject | grouping and shape analysis | - |
dc.subject | Image and video synthesis and generation | - |
dc.subject | Machine learning | - |
dc.subject | Robot vision | - |
dc.subject | Scene analysis and understanding | - |
dc.subject | Segmentation | - |
dc.subject | Transfer/low-shot/long-tail learning | - |
dc.subject | Vision applications and systems | - |
dc.title | ADeLA: Automatic Dense Labeling with Attention for Viewpoint Shift in Semantic Segmentation | - |
dc.type | Conference_Paper | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1109/CVPR52688.2022.00791 | - |
dc.identifier.scopus | eid_2-s2.0-85141783683 | - |
dc.identifier.volume | 2022-June | - |
dc.identifier.spage | 8069 | - |
dc.identifier.epage | 8079 | - |
dc.identifier.isi | WOS:000870759101013 | - |