Links to fulltext (may require subscription):
- Publisher Website (DOI): 10.1109/CVPR52688.2022.00297
- Scopus: eid_2-s2.0-85136488690
Citations:
- Scopus: 0

Appears in Collections:
Conference Paper: Cross-Model Pseudo-Labeling for Semi-Supervised Action Recognition
Title | Cross-Model Pseudo-Labeling for Semi-Supervised Action Recognition |
---|---|
Authors | Xu, Yinghao; Wei, Fangyun; Sun, Xiao; Yang, Ceyuan; Shen, Yujun; Dai, Bo; Zhou, Bolei; Lin, Stephen |
Keywords | Self-, semi-, meta- & unsupervised learning; Video analysis and understanding |
Issue Date | 2022 |
Citation | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2022, v. 2022-June, p. 2949-2958 |
Abstract | Semi-supervised action recognition is a challenging but important task due to the high cost of data annotation. A common approach to this problem is to assign unlabeled data with pseudo-labels, which are then used as additional supervision in training. Typically in recent work, the pseudo-labels are obtained by training a model on the labeled data, and then using confident predictions from the model to teach itself. In this work, we propose a more effective pseudo-labeling scheme, called Cross-Model Pseudo-Labeling (CMPL). Concretely, we introduce a lightweight auxiliary network in addition to the primary backbone, and ask them to predict pseudo-labels for each other. We observe that, due to their different structural biases, these two models tend to learn complementary representations from the same video clips. Each model can thus benefit from its counterpart by utilizing cross-model predictions as supervision. Experiments on different data partition protocols demonstrate the significant improvement of our framework over existing alternatives. For example, CMPL achieves 17.6% and 25.1% Top-1 accuracy on Kinetics-400 and UCF-101 using only the RGB modality and 1% labeled data, outperforming our baseline model, FixMatch [17], by 9.0% and 10.3%, respectively. Project page: https://justimyhxu.github.io/projects/cmpl/ |
Persistent Identifier | http://hdl.handle.net/10722/352303 |
ISSN | 1063-6919 |
SCImago Journal Rankings (2023) | 10.331 |
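The abstract above describes the core mechanism: two models with different structural biases exchange confident predictions as pseudo-labels for each other. A minimal sketch of that cross-model swap follows, assuming a FixMatch-style confidence threshold; the function and variable names are illustrative, and the paper's actual training additionally uses strongly/weakly augmented clips and its own hyperparameters.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_model_pseudo_labels(logits_primary, logits_auxiliary, threshold=0.95):
    """Illustrative cross-model pseudo-labeling step.

    Unlike self-training, where a model teaches itself, each model here is
    supervised by the OTHER model's confident predictions on unlabeled clips.
    Returns (targets, confidence_mask) pairs for the primary and auxiliary
    models respectively.
    """
    probs_primary = softmax(logits_primary)
    probs_auxiliary = softmax(logits_auxiliary)

    # Pseudo-labels for the primary model come from the auxiliary model...
    targets_for_primary = probs_auxiliary.argmax(axis=-1)
    mask_for_primary = probs_auxiliary.max(axis=-1) >= threshold

    # ...and vice versa; only confident predictions contribute to the loss.
    targets_for_auxiliary = probs_primary.argmax(axis=-1)
    mask_for_auxiliary = probs_primary.max(axis=-1) >= threshold

    return (targets_for_primary, mask_for_primary), (targets_for_auxiliary, mask_for_auxiliary)
```

In training, each model's unsupervised loss would be a cross-entropy against its counterpart's pseudo-labels, masked by the confidence threshold, added to the supervised loss on the labeled subset.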
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Xu, Yinghao | - |
dc.contributor.author | Wei, Fangyun | - |
dc.contributor.author | Sun, Xiao | - |
dc.contributor.author | Yang, Ceyuan | - |
dc.contributor.author | Shen, Yujun | - |
dc.contributor.author | Dai, Bo | - |
dc.contributor.author | Zhou, Bolei | - |
dc.contributor.author | Lin, Stephen | - |
dc.date.accessioned | 2024-12-16T03:57:58Z | - |
dc.date.available | 2024-12-16T03:57:58Z | - |
dc.date.issued | 2022 | - |
dc.identifier.citation | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2022, v. 2022-June, p. 2949-2958 | - |
dc.identifier.issn | 1063-6919 | - |
dc.identifier.uri | http://hdl.handle.net/10722/352303 | - |
dc.description.abstract | Semi-supervised action recognition is a challenging but important task due to the high cost of data annotation. A common approach to this problem is to assign unlabeled data with pseudo-labels, which are then used as additional supervision in training. Typically in recent work, the pseudo-labels are obtained by training a model on the labeled data, and then using confident predictions from the model to teach itself. In this work, we propose a more effective pseudo-labeling scheme, called Cross-Model Pseudo-Labeling (CMPL). Concretely, we introduce a lightweight auxiliary network in addition to the primary backbone, and ask them to predict pseudo-labels for each other. We observe that, due to their different structural biases, these two models tend to learn complementary representations from the same video clips. Each model can thus benefit from its counterpart by utilizing cross-model predictions as supervision. Experiments on different data partition protocols demonstrate the significant improvement of our framework over existing alternatives. For example, CMPL achieves 17.6% and 25.1% Top-1 accuracy on Kinetics-400 and UCF-101 using only the RGB modality and 1% labeled data, outperforming our baseline model, FixMatch [17], by 9.0% and 10.3%, respectively. Project page: https://justimyhxu.github.io/projects/cmpl/ | -
dc.language | eng | - |
dc.relation.ispartof | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition | - |
dc.subject | Self-& semi-& meta- & unsupervised learning | - |
dc.subject | Video analysis and understanding | - |
dc.title | Cross-Model Pseudo-Labeling for Semi-Supervised Action Recognition | - |
dc.type | Conference_Paper | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1109/CVPR52688.2022.00297 | - |
dc.identifier.scopus | eid_2-s2.0-85136488690 | - |
dc.identifier.volume | 2022-June | - |
dc.identifier.spage | 2949 | - |
dc.identifier.epage | 2958 | - |