Action and Event Recognition in Videos by Learning From Heterogeneous Web Sources

Niu, Li; Xu, Xinxing; Chen, Lin; Duan, Lixin; Xu, Dong

Links for fulltext

(May Require Subscription)

Publisher Website: 10.1109/TNNLS.2016.2518700
Scopus: eid_2-s2.0-84960532368
PMID: 26978834
WOS: WOS:000401982100004
Appears in Collections:
- Computer Science: Journal/Magazine Articles

Article: Action and Event Recognition in Videos by Learning From Heterogeneous Web Sources

Title	Action and Event Recognition in Videos by Learning From Heterogeneous Web Sources
Authors	Niu, Li; Xu, Xinxing; Chen, Lin; Duan, Lixin; Xu, Dong
Keywords	Domain adaptation; learning using privileged information; multiple kernel learning
Issue Date	2017
Citation	IEEE Transactions on Neural Networks and Learning Systems, 2017, v. 28, n. 6, p. 1290-1304. DOI: http://dx.doi.org/10.1109/TNNLS.2016.2518700
Abstract	In this paper, we propose new approaches for action and event recognition by leveraging a large number of freely available Web videos (e.g., from the Flickr video search engine) and Web images (e.g., from the Bing and Google image search engines). We address this problem by formulating it as a new multi-domain adaptation problem, in which heterogeneous Web sources are provided. Specifically, we are given different types of visual features (e.g., the DeCAF features from Bing/Google images and the trajectory-based features from Flickr videos) from heterogeneous source domains and all types of visual features from the target domain. Considering that the target domain is more relevant to some source domains, we propose a new approach named multi-domain adaptation with heterogeneous sources (MDA-HS) to effectively make use of the heterogeneous sources. In MDA-HS, we simultaneously seek the optimal weights of multiple source domains, infer the labels of target domain samples, and learn an optimal target classifier. Moreover, as textual descriptions are often available for both Web videos and images, we propose a novel approach called MDA-HS using privileged information (MDA-HS+) to effectively incorporate the valuable textual information into our MDA-HS method, based on the recent learning using privileged information paradigm. MDA-HS+ can be further extended by using a new elastic-net-like regularization. We solve our MDA-HS and MDA-HS+ methods by using the cutting-plane algorithm, in which a multiple kernel learning problem is derived and solved. Extensive experiments on three benchmark data sets demonstrate that our proposed approaches are effective for action and event recognition without requiring any labeled samples from the target domain.
Persistent Identifier	http://hdl.handle.net/10722/321664
ISSN	2162-237X (2023 Impact Factor: 10.2; 2023 SCImago Journal Rankings: 4.170)
ISI Accession Number ID	WOS:000401982100004
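
Illustrative note: the abstract describes jointly weighting several source domains and inferring labels for unlabeled target samples. The toy sketch below shows only that general idea with a simple alternating heuristic. It is not the MDA-HS formulation and does not use the cutting-plane or multiple kernel learning solver mentioned in the abstract; the function name, the agreement-based reweighting rule, and the score values are all hypothetical.

```python
def sign(x):
    """Map a decision value to a label in {+1, -1} (zero counts as +1)."""
    return 1 if x >= 0 else -1


def weighted_pseudo_labels(scores, n_iters=10):
    """Toy alternating scheme (an assumption, not the paper's method).

    scores[s][i] is the decision value that a classifier trained on
    source domain s assigns to unlabeled target sample i.
    Returns (source_weights, inferred_target_labels).
    """
    n_sources = len(scores)
    n_samples = len(scores[0])
    weights = [1.0 / n_sources] * n_sources  # start from uniform weights
    labels = []
    for _ in range(n_iters):
        # Step 1: infer target labels from the weighted vote of all sources.
        labels = [
            sign(sum(weights[s] * scores[s][i] for s in range(n_sources)))
            for i in range(n_samples)
        ]
        # Step 2: reweight each source by how often it agrees with those labels.
        agreement = [
            sum(1 for i in range(n_samples) if sign(scores[s][i]) == labels[i])
            / n_samples
            for s in range(n_sources)
        ]
        total = sum(agreement)
        if total == 0:  # defensive guard; keep the previous weights
            break
        new_weights = [a / total for a in agreement]
        if new_weights == weights:  # fixed point reached
            break
        weights = new_weights
    return weights, labels


# Hypothetical scores from three source classifiers on four target samples.
scores = [
    [0.9, -0.8, 0.7, -0.6],   # source 0
    [0.8, -0.5, 0.6, -0.9],   # source 1
    [-0.2, 0.3, 0.4, -0.1],   # source 2, less consistent with the others
]
weights, labels = weighted_pseudo_labels(scores)
print(weights, labels)
```

In this toy run the source that disagrees with the consensus on half of the samples ends up with a smaller weight than the other two, which mirrors the abstract's remark that the target domain is more relevant to some source domains.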

DC Field	Value	Language
dc.contributor.author	Niu, Li	-
dc.contributor.author	Xu, Xinxing	-
dc.contributor.author	Chen, Lin	-
dc.contributor.author	Duan, Lixin	-
dc.contributor.author	Xu, Dong	-
dc.date.accessioned	2022-11-03T02:20:36Z	-
dc.date.available	2022-11-03T02:20:36Z	-
dc.date.issued	2017	-
dc.identifier.citation	IEEE Transactions on Neural Networks and Learning Systems, 2017, v. 28, n. 6, p. 1290-1304	-
dc.identifier.issn	2162-237X	-
dc.identifier.uri	http://hdl.handle.net/10722/321664	-
dc.description.abstract	In this paper, we propose new approaches for action and event recognition by leveraging a large number of freely available Web videos (e.g., from the Flickr video search engine) and Web images (e.g., from the Bing and Google image search engines). We address this problem by formulating it as a new multi-domain adaptation problem, in which heterogeneous Web sources are provided. Specifically, we are given different types of visual features (e.g., the DeCAF features from Bing/Google images and the trajectory-based features from Flickr videos) from heterogeneous source domains and all types of visual features from the target domain. Considering that the target domain is more relevant to some source domains, we propose a new approach named multi-domain adaptation with heterogeneous sources (MDA-HS) to effectively make use of the heterogeneous sources. In MDA-HS, we simultaneously seek the optimal weights of multiple source domains, infer the labels of target domain samples, and learn an optimal target classifier. Moreover, as textual descriptions are often available for both Web videos and images, we propose a novel approach called MDA-HS using privileged information (MDA-HS+) to effectively incorporate the valuable textual information into our MDA-HS method, based on the recent learning using privileged information paradigm. MDA-HS+ can be further extended by using a new elastic-net-like regularization. We solve our MDA-HS and MDA-HS+ methods by using the cutting-plane algorithm, in which a multiple kernel learning problem is derived and solved. Extensive experiments on three benchmark data sets demonstrate that our proposed approaches are effective for action and event recognition without requiring any labeled samples from the target domain.	-
dc.language	eng	-
dc.relation.ispartof	IEEE Transactions on Neural Networks and Learning Systems	-
dc.subject	Domain adaptation	-
dc.subject	learning using privileged information	-
dc.subject	multiple kernel learning	-
dc.title	Action and Event Recognition in Videos by Learning From Heterogeneous Web Sources	-
dc.type	Article	-
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1109/TNNLS.2016.2518700	-
dc.identifier.pmid	26978834	-
dc.identifier.scopus	eid_2-s2.0-84960532368	-
dc.identifier.volume	28	-
dc.identifier.issue	6	-
dc.identifier.spage	1290	-
dc.identifier.epage	1304	-
dc.identifier.eissn	2162-2388	-
dc.identifier.isi	WOS:000401982100004	-
