Every Frame Counts: Joint Learning of Video Segmentation and Optical Flow

Ding, M; Wang, Z; Zhou, B; Shi, J; Lu, Z; Luo, P

Publisher Website: 10.1609/aaai.v34i07.6699

Appears in Collections:
- Computer Science: Conference papers

Conference Paper: Every Frame Counts: Joint Learning of Video Segmentation and Optical Flow

Title	Every Frame Counts: Joint Learning of Video Segmentation and Optical Flow
Authors	Ding, M; Wang, Z; Zhou, B; Shi, J; Lu, Z; Luo, P
Issue Date	2020
Publisher	AAAI Press. The Journal's web site is located at https://aaai.org/Library/AAAI/aaai-library.php
Citation	Proceedings of the 34th AAAI Conference on Artificial Intelligence (AAAI-20), New York, NY, USA, 7-12 February 2020, v. 34 n. 7, p. 10713-10720. DOI: http://dx.doi.org/10.1609/aaai.v34i07.6699
Abstract	A major challenge for video semantic segmentation is the lack of labeled data. In most benchmark datasets, only one frame of a video clip is annotated, which makes most supervised methods fail to utilize information from the rest of the frames. To exploit the spatio-temporal information in videos, many previous works use pre-computed optical flows, which encode the temporal consistency to improve the video segmentation. However, the video segmentation and optical flow estimation are still considered as two separate tasks. In this paper, we propose a novel framework for joint video semantic segmentation and optical flow estimation. Semantic segmentation brings semantic information to handle occlusion for more robust optical flow estimation, while the non-occluded optical flow provides accurate pixel-level temporal correspondences to guarantee the temporal consistency of the segmentation. Moreover, our framework is able to utilize both labeled and unlabeled frames in the video through joint training, while no additional calculation is required in inference. Extensive experiments show that the proposed model makes the video semantic segmentation and optical flow estimation benefit from each other and outperforms existing methods under the same settings in both tasks.
Description	AAAI Technical Track: Vision
Persistent Identifier	http://hdl.handle.net/10722/284159
ISSN	2159-5399

DC Field	Value	Language
dc.contributor.author	Ding, M	-
dc.contributor.author	Wang, Z	-
dc.contributor.author	Zhou, B	-
dc.contributor.author	Shi, J	-
dc.contributor.author	Lu, Z	-
dc.contributor.author	Luo, P	-
dc.date.accessioned	2020-07-20T05:56:33Z	-
dc.date.available	2020-07-20T05:56:33Z	-
dc.date.issued	2020	-
dc.identifier.citation	Proceedings of the 34th AAAI Conference on Artificial Intelligence (AAAI-20), New York, NY, USA, 7-12 February 2020, v. 34 n. 7, p. 10713-10720	-
dc.identifier.issn	2159-5399	-
dc.identifier.uri	http://hdl.handle.net/10722/284159	-
dc.description	AAAI Technical Track: Vision	-
dc.description.abstract	A major challenge for video semantic segmentation is the lack of labeled data. In most benchmark datasets, only one frame of a video clip is annotated, which makes most supervised methods fail to utilize information from the rest of the frames. To exploit the spatio-temporal information in videos, many previous works use pre-computed optical flows, which encode the temporal consistency to improve the video segmentation. However, the video segmentation and optical flow estimation are still considered as two separate tasks. In this paper, we propose a novel framework for joint video semantic segmentation and optical flow estimation. Semantic segmentation brings semantic information to handle occlusion for more robust optical flow estimation, while the non-occluded optical flow provides accurate pixel-level temporal correspondences to guarantee the temporal consistency of the segmentation. Moreover, our framework is able to utilize both labeled and unlabeled frames in the video through joint training, while no additional calculation is required in inference. Extensive experiments show that the proposed model makes the video semantic segmentation and optical flow estimation benefit from each other and outperforms existing methods under the same settings in both tasks.	-
dc.language	eng	-
dc.publisher	AAAI Press. The Journal's web site is located at https://aaai.org/Library/AAAI/aaai-library.php	-
dc.relation.ispartof	Proceedings of the AAAI Conference on Artificial Intelligence	-
dc.rights	Copyright (c) 2019 Association for the Advancement of Artificial Intelligence	-
dc.title	Every Frame Counts: Joint Learning of Video Segmentation and Optical Flow	-
dc.type	Conference_Paper	-
dc.identifier.email	Luo, P: pluo@hku.hk	-
dc.identifier.authority	Luo, P=rp02575	-
dc.description.nature	link_to_OA_fulltext	-
dc.identifier.doi	10.1609/aaai.v34i07.6699	-
dc.identifier.hkuros	311019	-
dc.identifier.volume	34	-
dc.identifier.issue	7	-
dc.identifier.spage	10713	-
dc.identifier.epage	10720	-
dc.publisher.place	United States	-
dc.identifier.issnl	2159-5399	-
