Links for fulltext (may require subscription):
- Publisher (DOI): 10.1109/TPAMI.2020.2997860
- Scopus: eid_2-s2.0-85118604122
- PMID: 32750775
- Web of Science: WOS:000714203900024
Article: Progressive Cross-Stream Cooperation in Spatial and Temporal Domain for Action Localization
Title | Progressive Cross-Stream Cooperation in Spatial and Temporal Domain for Action Localization |
---|---|
Authors | Su, Rui; Xu, Dong; Zhou, Luping; Ouyang, Wanli |
Keywords | Action localization; spatio-temporal action localization; two-stream cooperation |
Issue Date | 2021 |
Citation | IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, v. 43, n. 12, p. 4477-4490 |
Abstract | Spatio-temporal action localization consists of three levels of tasks: spatial localization, action classification, and temporal localization. In this work, we propose a new progressive cross-stream cooperation (PCSC) framework that improves all three tasks above. The basic idea is to utilize both spatial region proposals (resp., temporal segment proposals) and features from one stream (i.e., the Flow/RGB stream) to help the other stream (i.e., the RGB/Flow stream) iteratively generate better bounding boxes in the spatial domain (resp., temporal segments in the temporal domain). In this way, not only can the actions be localized more accurately both spatially and temporally, but the action classes can also be predicted more precisely. Specifically, we first combine the latest region proposals (for spatial detection) or segment proposals (for temporal localization) from both streams to form a larger set of labelled training samples to help learn better action detection or segment detection models. Second, to learn better representations, we also propose a new message passing approach to pass information from one stream to the other, which also leads to better action detection and segment detection models. By first using our newly proposed PCSC framework for spatial localization at the frame level and then applying our temporal PCSC framework for temporal localization at the tube level, the action localization results are progressively improved at both the frame level and the video level. Comprehensive experiments on two benchmark datasets, UCF-101-24 and J-HMDB, demonstrate the effectiveness of our newly proposed approaches for spatio-temporal action localization in realistic scenarios. |
Persistent Identifier | http://hdl.handle.net/10722/322061 |
ISSN | 0162-8828 (2023 Impact Factor: 20.8; 2023 SCImago Journal Rankings: 6.158) |
ISI Accession Number ID | WOS:000714203900024 |
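The abstract describes two mechanisms: pooling the latest proposals from both streams into one larger labelled training set, and passing a message from one stream's features to refine the other's. The toy sketch below illustrates only the general idea; it is not the authors' implementation, and the function names, tuple-valued proposals, and simple additive message are all hypothetical simplifications of the learned modules in the paper.

```python
# Toy illustration of the two cross-stream cooperation ideas from the
# abstract (hypothetical simplification, not the PCSC code).

def combine_proposals(rgb_proposals, flow_proposals):
    """Pool region/segment proposals from both streams into one larger,
    deduplicated set of training samples (idea 1 in the abstract)."""
    return sorted(set(rgb_proposals) | set(flow_proposals))

def pass_message(target_features, source_features, weight=0.5):
    """Refine one stream's features with a weighted message from the
    other stream; a toy additive stand-in for the learned
    message-passing module (idea 2 in the abstract)."""
    return [t + weight * s for t, s in zip(target_features, source_features)]

# One "progressive" iteration: each stream helps the other in turn.
rgb_feats = [0.2, 0.4, 0.6]
flow_feats = [0.1, 0.3, 0.5]
rgb_refined = pass_message(rgb_feats, flow_feats)
flow_refined = pass_message(flow_feats, rgb_refined)

# Proposals here are (start, end) segments; boxes would work the same way.
pooled = combine_proposals([(0, 10), (5, 20)], [(5, 20), (12, 30)])
```

In the actual framework this exchange is iterated, with the refined detections of one stream feeding the next round of the other stream at both the frame level (boxes) and the tube level (segments).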
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Su, Rui | - |
dc.contributor.author | Xu, Dong | - |
dc.contributor.author | Zhou, Luping | - |
dc.contributor.author | Ouyang, Wanli | - |
dc.date.accessioned | 2022-11-03T02:23:20Z | - |
dc.date.available | 2022-11-03T02:23:20Z | - |
dc.date.issued | 2021 | - |
dc.identifier.citation | IEEE Transactions on Pattern Analysis and Machine Intelligence, 2021, v. 43, n. 12, p. 4477-4490 | - |
dc.identifier.issn | 0162-8828 | - |
dc.identifier.uri | http://hdl.handle.net/10722/322061 | - |
dc.description.abstract | Spatio-temporal action localization consists of three levels of tasks: spatial localization, action classification, and temporal localization. In this work, we propose a new progressive cross-stream cooperation (PCSC) framework that improves all three tasks above. The basic idea is to utilize both spatial region proposals (resp., temporal segment proposals) and features from one stream (i.e., the Flow/RGB stream) to help the other stream (i.e., the RGB/Flow stream) iteratively generate better bounding boxes in the spatial domain (resp., temporal segments in the temporal domain). In this way, not only can the actions be localized more accurately both spatially and temporally, but the action classes can also be predicted more precisely. Specifically, we first combine the latest region proposals (for spatial detection) or segment proposals (for temporal localization) from both streams to form a larger set of labelled training samples to help learn better action detection or segment detection models. Second, to learn better representations, we also propose a new message passing approach to pass information from one stream to the other, which also leads to better action detection and segment detection models. By first using our newly proposed PCSC framework for spatial localization at the frame level and then applying our temporal PCSC framework for temporal localization at the tube level, the action localization results are progressively improved at both the frame level and the video level. Comprehensive experiments on two benchmark datasets, UCF-101-24 and J-HMDB, demonstrate the effectiveness of our newly proposed approaches for spatio-temporal action localization in realistic scenarios. | - |
dc.language | eng | - |
dc.relation.ispartof | IEEE Transactions on Pattern Analysis and Machine Intelligence | - |
dc.subject | Action localization | - |
dc.subject | spatio-temporal action localization | - |
dc.subject | two-stream cooperation | - |
dc.title | Progressive Cross-Stream Cooperation in Spatial and Temporal Domain for Action Localization | - |
dc.type | Article | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1109/TPAMI.2020.2997860 | - |
dc.identifier.pmid | 32750775 | - |
dc.identifier.scopus | eid_2-s2.0-85118604122 | - |
dc.identifier.volume | 43 | - |
dc.identifier.issue | 12 | - |
dc.identifier.spage | 4477 | - |
dc.identifier.epage | 4490 | - |
dc.identifier.eissn | 1939-3539 | - |
dc.identifier.isi | WOS:000714203900024 | - |