Improving action localization by progressive cross-stream cooperation

Su, Rui; Ouyang, Wanli; Zhou, Luping; Xu, Dong

Links for fulltext

(May Require Subscription)

Publisher Website: 10.1109/CVPR.2019.01229
Scopus: eid_2-s2.0-85078810201
WOS: WOS:000542649305064

Citations:
- Scopus: 0
- Web of Science: 0
Appears in Collections:
- Computer Science: Conference papers

Conference Paper: Improving action localization by progressive cross-stream cooperation

Title	Improving action localization by progressive cross-stream cooperation
Authors	Su, Rui; Ouyang, Wanli; Zhou, Luping; Xu, Dong
Keywords	Categorization; Recognition: Detection; Retrieval; Video Analytics
Issue Date	2019
Citation	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2019, v. 2019-June, p. 12008-12017
DOI	http://dx.doi.org/10.1109/CVPR.2019.01229
Abstract	Spatio-temporal action localization consists of three levels of tasks: spatial localization, action classification, and temporal segmentation. In this work, we propose a new Progressive Cross-stream Cooperation (PCSC) framework to iteratively improve action localization results and generate better bounding boxes for one stream (i.e., Flow/RGB) by leveraging both region proposals and features from another stream (i.e., RGB/Flow). Specifically, we first generate a larger set of region proposals by combining the latest region proposals from both streams, from which we can readily obtain a larger set of labelled training samples to help learn better action detection models. Second, we also propose a new message passing approach to pass information from one stream to another stream in order to learn better representations, which also leads to better action detection models. As a result, our iterative framework progressively improves action localization results at the frame level. To improve action localization results at the video level, we additionally propose a new strategy to train class-specific actionness detectors for better temporal segmentation, which can be readily learnt by using the training samples around temporal boundaries. Comprehensive experiments on two benchmark datasets, UCF-101-24 and J-HMDB, demonstrate the effectiveness of our newly proposed approaches for spatio-temporal action localization in realistic scenarios.
Persistent Identifier	http://hdl.handle.net/10722/321877
ISSN	1063-6919
2023 SCImago Journal Rankings	10.331
ISI Accession Number ID	WOS:000542649305064
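
The first step described in the abstract, combining the latest region proposals from both streams to obtain a larger set of labelled training samples, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`combine_proposals`, `label_proposals`), the IoU-based duplicate removal, and both thresholds (`dup_iou`, `pos_iou`) are assumptions introduced here for illustration only.

```python
from typing import List, Tuple

# A box is (x1, y1, x2, y2) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def combine_proposals(rgb: List[Box], flow: List[Box],
                      dup_iou: float = 0.9) -> List[Box]:
    """Union of proposals from both streams.

    Near-duplicates (IoU >= dup_iou with an already kept box) are dropped;
    this de-duplication is an assumption, not stated in the abstract.
    """
    merged: List[Box] = []
    for box in list(rgb) + list(flow):
        if all(iou(box, kept) < dup_iou for kept in merged):
            merged.append(box)
    return merged


def label_proposals(proposals: List[Box], gt_boxes: List[Box],
                    pos_iou: float = 0.5) -> List[bool]:
    """Mark a proposal positive if it overlaps any ground-truth box enough.

    The 0.5 threshold is a common detection convention, assumed here.
    """
    return [any(iou(p, g) >= pos_iou for g in gt_boxes) for p in proposals]


# Hypothetical proposals for one frame from each stream.
rgb_props = [(10.0, 10.0, 50.0, 50.0)]
flow_props = [(10.0, 10.0, 50.0, 50.0), (60.0, 60.0, 100.0, 100.0)]
ground_truth = [(58.0, 58.0, 100.0, 100.0)]

merged = combine_proposals(rgb_props, flow_props)
labels = label_proposals(merged, ground_truth)
```

In this toy frame the RGB stream alone yields no positive sample, while the merged set contains one, which is the intuition behind drawing training samples from the combined proposals.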

DC Field	Value	Language
dc.contributor.author	Su, Rui	-
dc.contributor.author	Ouyang, Wanli	-
dc.contributor.author	Zhou, Luping	-
dc.contributor.author	Xu, Dong	-
dc.date.accessioned	2022-11-03T02:22:04Z	-
dc.date.available	2022-11-03T02:22:04Z	-
dc.date.issued	2019	-
dc.identifier.citation	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2019, v. 2019-June, p. 12008-12017	-
dc.identifier.issn	1063-6919	-
dc.identifier.uri	http://hdl.handle.net/10722/321877	-
dc.description.abstract	Spatio-temporal action localization consists of three levels of tasks: spatial localization, action classification, and temporal segmentation. In this work, we propose a new Progressive Cross-stream Cooperation (PCSC) framework to iteratively improve action localization results and generate better bounding boxes for one stream (i.e., Flow/RGB) by leveraging both region proposals and features from another stream (i.e., RGB/Flow). Specifically, we first generate a larger set of region proposals by combining the latest region proposals from both streams, from which we can readily obtain a larger set of labelled training samples to help learn better action detection models. Second, we also propose a new message passing approach to pass information from one stream to another stream in order to learn better representations, which also leads to better action detection models. As a result, our iterative framework progressively improves action localization results at the frame level. To improve action localization results at the video level, we additionally propose a new strategy to train class-specific actionness detectors for better temporal segmentation, which can be readily learnt by using the training samples around temporal boundaries. Comprehensive experiments on two benchmark datasets, UCF-101-24 and J-HMDB, demonstrate the effectiveness of our newly proposed approaches for spatio-temporal action localization in realistic scenarios.	-
dc.language	eng	-
dc.relation.ispartof	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition	-
dc.subject	Categorization	-
dc.subject	Recognition: Detection	-
dc.subject	Retrieval	-
dc.subject	Video Analytics	-
dc.title	Improving action localization by progressive cross-stream cooperation	-
dc.type	Conference_Paper	-
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1109/CVPR.2019.01229	-
dc.identifier.scopus	eid_2-s2.0-85078810201	-
dc.identifier.volume	2019-June	-
dc.identifier.spage	12008	-
dc.identifier.epage	12017	-
dc.identifier.isi	WOS:000542649305064	-