Conference Paper: Unifying Training and Inference for Panoptic Segmentation
Title | Unifying Training and Inference for Panoptic Segmentation |
---|---|
Authors | Li, Q; Qi, X; Torr, P |
Keywords | Semantics; Feature extraction; Object detection; Image segmentation; Pipelines |
Issue Date | 2020 |
Publisher | IEEE Computer Society. The Journal's web site is located at http://ieeexplore.ieee.org/xpl/conhome.jsp?punumber=1000147 |
Citation | 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13-19 June 2020, p. 13317-13325 |
Abstract | We present an end-to-end network to bridge the gap between training and inference pipeline for panoptic segmentation, a task that seeks to partition an image into semantic regions for 'stuff' and object instances for 'things'. In contrast to recent works, our network exploits a parametrised, yet lightweight panoptic segmentation submodule, powered by an end-to-end learnt dense instance affinity, to capture the probability that any pair of pixels belong to the same instance. This panoptic submodule gives rise to a novel propagation mechanism for panoptic logits and enables the network to output a coherent panoptic segmentation map for both 'stuff' and 'thing' classes, without any post-processing. Reaping the benefits of end-to-end training, our full system sets new records on the popular street scene dataset, Cityscapes, achieving 61.4 PQ with a ResNet-50 backbone using only the fine annotations. On the challenging COCO dataset, our ResNet-50-based network also delivers state-of-the-art accuracy of 43.4 PQ. Moreover, our network flexibly works with and without object mask cues, performing competitively under both settings, which is of interest for applications with computation budgets. |
Persistent Identifier | http://hdl.handle.net/10722/288233 |
ISSN | 1063-6919 |
2023 SCImago Journal Rankings | 10.331 |
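The abstract describes a propagation mechanism in which each pixel's panoptic logits are mixed with those of other pixels according to a dense, learnt instance affinity (the probability that a pair of pixels belongs to the same instance). A minimal NumPy sketch of that idea, using row-normalised affinity as the mixing weights; the function name, shapes, and the specific normalisation are illustrative assumptions, not the paper's actual formulation:

```python
import numpy as np

def propagate_logits(affinity, logits):
    """Mix per-pixel logits using a dense pairwise affinity.

    affinity: (N, N) non-negative scores, affinity[i, j] ~ P(pixel i and
              pixel j belong to the same instance).
    logits:   (N, K) per-pixel panoptic logits over K categories.
    Returns an (N, K) array where each pixel's logits are a weighted
    average of the logits of its affine pixels (hypothetical scheme).
    """
    # Row-normalise so each pixel's mixing weights sum to 1.
    weights = affinity / affinity.sum(axis=1, keepdims=True)
    return weights @ logits

rng = np.random.default_rng(0)
N, K = 6, 3
affinity = rng.random((N, N))
affinity = (affinity + affinity.T) / 2  # pairwise affinity is symmetric
logits = rng.standard_normal((N, K))
out = propagate_logits(affinity, logits)
print(out.shape)  # (6, 3): one refined logit vector per pixel
```

In the paper's actual system the affinity is produced end-to-end by the network and the propagation feeds a parametrised panoptic submodule; the sketch only shows the weighted-aggregation step in isolation.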
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Li, Q | - |
dc.contributor.author | Qi, X | - |
dc.contributor.author | Torr, P | - |
dc.date.accessioned | 2020-10-05T12:09:51Z | - |
dc.date.available | 2020-10-05T12:09:51Z | - |
dc.date.issued | 2020 | - |
dc.identifier.citation | 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13-19 June 2020, p. 13317-13325 | - |
dc.identifier.issn | 1063-6919 | - |
dc.identifier.uri | http://hdl.handle.net/10722/288233 | - |
dc.description.abstract | We present an end-to-end network to bridge the gap between training and inference pipeline for panoptic segmentation, a task that seeks to partition an image into semantic regions for 'stuff' and object instances for 'things'. In contrast to recent works, our network exploits a parametrised, yet lightweight panoptic segmentation submodule, powered by an end-to-end learnt dense instance affinity, to capture the probability that any pair of pixels belong to the same instance. This panoptic submodule gives rise to a novel propagation mechanism for panoptic logits and enables the network to output a coherent panoptic segmentation map for both 'stuff' and 'thing' classes, without any post-processing. Reaping the benefits of end-to-end training, our full system sets new records on the popular street scene dataset, Cityscapes, achieving 61.4 PQ with a ResNet-50 backbone using only the fine annotations. On the challenging COCO dataset, our ResNet-50-based network also delivers state-of-the-art accuracy of 43.4 PQ. Moreover, our network flexibly works with and without object mask cues, performing competitively under both settings, which is of interest for applications with computation budgets. | - |
dc.language | eng | - |
dc.publisher | IEEE Computer Society. The Journal's web site is located at http://ieeexplore.ieee.org/xpl/conhome.jsp?punumber=1000147 | - |
dc.relation.ispartof | IEEE Conference on Computer Vision and Pattern Recognition. Proceedings | - |
dc.rights | IEEE Conference on Computer Vision and Pattern Recognition. Proceedings. Copyright © IEEE Computer Society. | - |
dc.rights | ©2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. | - |
dc.subject | Semantics | - |
dc.subject | Feature extraction | - |
dc.subject | Object detection | - |
dc.subject | Image segmentation | - |
dc.subject | Pipelines | - |
dc.title | Unifying Training and Inference for Panoptic Segmentation | - |
dc.type | Conference_Paper | - |
dc.identifier.email | Qi, X: xjqi@eee.hku.hk | - |
dc.identifier.authority | Qi, X=rp02666 | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1109/CVPR42600.2020.01333 | - |
dc.identifier.scopus | eid_2-s2.0-85094864769 | - |
dc.identifier.hkuros | 315444 | - |
dc.identifier.spage | 13317 | - |
dc.identifier.epage | 13325 | - |
dc.publisher.place | United States | - |
dc.identifier.issnl | 1063-6919 | - |