Article: Progressive Cross-Stream Cooperation in Spatial and Temporal Domain for Action Localization

Title: Progressive Cross-Stream Cooperation in Spatial and Temporal Domain for Action Localization
Authors: Su, Rui; Xu, Dong; Zhou, Luping; Ouyang, Wanli
Keywords: Action localization; spatio-temporal action localization; two-stream cooperation
Issue Date: 2021
Citation: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, v. 43, n. 12, p. 4477-4490
Abstract: Spatio-temporal action localization comprises three tasks: spatial localization, action classification, and temporal localization. In this work, we propose a new progressive cross-stream cooperation (PCSC) framework that improves all three tasks. The basic idea is to use both the region proposals (resp., temporal segment proposals) and the features from one stream (i.e., the Flow/RGB stream) to help the other stream (i.e., the RGB/Flow stream) iteratively generate better bounding boxes in the spatial domain (resp., temporal segments in the temporal domain). In this way, not only can the actions be localized more accurately in both space and time, but the action classes can also be predicted more precisely. Specifically, we first combine the latest region proposals (for spatial detection) or segment proposals (for temporal localization) from both streams to form a larger set of labelled training samples, which helps learn better action detection or segment detection models. Second, to learn better representations, we propose a new message passing approach that passes information from one stream to the other, which also leads to better action detection and segment detection models. By first applying our PCSC framework for spatial localization at the frame level and then applying our temporal PCSC framework for temporal localization at the tube level, the action localization results are progressively improved at both the frame level and the video level. Comprehensive experiments on two benchmark datasets, UCF-101-24 and J-HMDB, demonstrate the effectiveness of our approach for spatio-temporal action localization in realistic scenarios.
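The proposal-combination step described in the abstract (pooling the latest proposals from both streams, then letting each stream refine the shared set) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names, the list-of-box representation, and the `refine_*` callbacks are all hypothetical.

```python
def combine_proposals(rgb_props, flow_props):
    """Union of proposals from the RGB and Flow streams, duplicates removed.

    Each proposal is a box [x1, y1, x2, y2] (or a temporal segment [t1, t2]).
    """
    seen = set()
    combined = []
    for box in rgb_props + flow_props:
        key = tuple(box)
        if key not in seen:
            seen.add(key)
            combined.append(box)
    return combined


def pcsc_round(rgb_props, flow_props, refine_rgb, refine_flow):
    """One cooperation round: each stream refines the shared proposal set.

    refine_rgb / refine_flow stand in for the per-stream detectors, which in
    the paper are retrained on the combined (larger) set of labelled samples.
    """
    shared = combine_proposals(rgb_props, flow_props)
    return refine_rgb(shared), refine_flow(shared)
```

Iterating `pcsc_round` would correspond to the "progressive" aspect: each round starts from the union of both streams' latest outputs, so improvements in one stream feed the other.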
Persistent Identifier: http://hdl.handle.net/10722/322061
ISSN: 0162-8828
2023 Impact Factor: 20.8
2023 SCImago Journal Rankings: 6.158
ISI Accession Number ID: WOS:000714203900024

 

DC Field | Value | Language
dc.contributor.author | Su, Rui | -
dc.contributor.author | Xu, Dong | -
dc.contributor.author | Zhou, Luping | -
dc.contributor.author | Ouyang, Wanli | -
dc.date.accessioned | 2022-11-03T02:23:20Z | -
dc.date.available | 2022-11-03T02:23:20Z | -
dc.date.issued | 2021 | -
dc.identifier.citation | IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, v. 43, n. 12, p. 4477-4490 | -
dc.identifier.issn | 0162-8828 | -
dc.identifier.uri | http://hdl.handle.net/10722/322061 | -
dc.description.abstract | Spatio-temporal action localization comprises three tasks: spatial localization, action classification, and temporal localization. In this work, we propose a new progressive cross-stream cooperation (PCSC) framework that improves all three tasks. The basic idea is to use both the region proposals (resp., temporal segment proposals) and the features from one stream (i.e., the Flow/RGB stream) to help the other stream (i.e., the RGB/Flow stream) iteratively generate better bounding boxes in the spatial domain (resp., temporal segments in the temporal domain). In this way, not only can the actions be localized more accurately in both space and time, but the action classes can also be predicted more precisely. Specifically, we first combine the latest region proposals (for spatial detection) or segment proposals (for temporal localization) from both streams to form a larger set of labelled training samples, which helps learn better action detection or segment detection models. Second, to learn better representations, we propose a new message passing approach that passes information from one stream to the other, which also leads to better action detection and segment detection models. By first applying our PCSC framework for spatial localization at the frame level and then applying our temporal PCSC framework for temporal localization at the tube level, the action localization results are progressively improved at both the frame level and the video level. Comprehensive experiments on two benchmark datasets, UCF-101-24 and J-HMDB, demonstrate the effectiveness of our approach for spatio-temporal action localization in realistic scenarios. | -
dc.language | eng | -
dc.relation.ispartof | IEEE Transactions on Pattern Analysis and Machine Intelligence | -
dc.subject | Action localization | -
dc.subject | spatio-temporal action localization | -
dc.subject | two-stream cooperation | -
dc.title | Progressive Cross-Stream Cooperation in Spatial and Temporal Domain for Action Localization | -
dc.type | Article | -
dc.description.nature | link_to_subscribed_fulltext | -
dc.identifier.doi | 10.1109/TPAMI.2020.2997860 | -
dc.identifier.pmid | 32750775 | -
dc.identifier.scopus | eid_2-s2.0-85118604122 | -
dc.identifier.volume | 43 | -
dc.identifier.issue | 12 | -
dc.identifier.spage | 4477 | -
dc.identifier.epage | 4490 | -
dc.identifier.eissn | 1939-3539 | -
dc.identifier.isi | WOS:000714203900024 | -

Export via OAI-PMH Interface in XML Formats


OR


Export to Other Non-XML Formats