
Article: Action and Event Recognition in Videos by Learning From Heterogeneous Web Sources

Title: Action and Event Recognition in Videos by Learning From Heterogeneous Web Sources
Authors: Niu, Li; Xu, Xinxing; Chen, Lin; Duan, Lixin; Xu, Dong
Keywords: Domain adaptation; learning using privileged information; multiple kernel learning
Issue Date: 2017
Citation: IEEE Transactions on Neural Networks and Learning Systems, 2017, v. 28, n. 6, p. 1290-1304
Abstract: In this paper, we propose new approaches for action and event recognition by leveraging a large number of freely available Web videos (e.g., from the Flickr video search engine) and Web images (e.g., from the Bing and Google image search engines). We address this problem by formulating it as a new multi-domain adaptation problem, in which heterogeneous Web sources are provided. Specifically, we are given different types of visual features (e.g., the DeCAF features from Bing/Google images and the trajectory-based features from Flickr videos) from heterogeneous source domains and all types of visual features from the target domain. Considering that the target domain is more relevant to some source domains than to others, we propose a new approach named multi-domain adaptation with heterogeneous sources (MDA-HS) to effectively make use of the heterogeneous sources. In MDA-HS, we simultaneously seek the optimal weights of multiple source domains, infer the labels of target domain samples, and learn an optimal target classifier. Moreover, as textual descriptions are often available for both Web videos and images, we propose a novel approach called MDA-HS using privileged information (MDA-HS+) to effectively incorporate the valuable textual information into our MDA-HS method, based on the recent learning using privileged information paradigm. MDA-HS+ can be further extended by using a new elastic-net-like regularization. We solve our MDA-HS and MDA-HS+ methods by using the cutting-plane algorithm, in which a multiple kernel learning problem is derived and solved. Extensive experiments on three benchmark data sets demonstrate that our proposed approaches are effective for action and event recognition without requiring any labeled samples from the target domain.
Persistent Identifier: http://hdl.handle.net/10722/321664
ISSN: 2162-237X
2021 Impact Factor: 14.255
2020 SCImago Journal Rankings: 2.882

 

DC Field | Value | Language
dc.contributor.author | Niu, Li | -
dc.contributor.author | Xu, Xinxing | -
dc.contributor.author | Chen, Lin | -
dc.contributor.author | Duan, Lixin | -
dc.contributor.author | Xu, Dong | -
dc.date.accessioned | 2022-11-03T02:20:36Z | -
dc.date.available | 2022-11-03T02:20:36Z | -
dc.date.issued | 2017 | -
dc.identifier.citation | IEEE Transactions on Neural Networks and Learning Systems, 2017, v. 28, n. 6, p. 1290-1304 | -
dc.identifier.issn | 2162-237X | -
dc.identifier.uri | http://hdl.handle.net/10722/321664 | -
dc.language | eng | -
dc.relation.ispartof | IEEE Transactions on Neural Networks and Learning Systems | -
dc.subject | Domain adaptation | -
dc.subject | learning using privileged information | -
dc.subject | multiple kernel learning | -
dc.title | Action and Event Recognition in Videos by Learning From Heterogeneous Web Sources | -
dc.type | Article | -
dc.description.nature | link_to_subscribed_fulltext | -
dc.identifier.doi | 10.1109/TNNLS.2016.2518700 | -
dc.identifier.pmid | 26978834 | -
dc.identifier.scopus | eid_2-s2.0-84960532368 | -
dc.identifier.volume | 28 | -
dc.identifier.issue | 6 | -
dc.identifier.spage | 1290 | -
dc.identifier.epage | 1304 | -
dc.identifier.eissn | 2162-2388 | -
