Conference Paper: Visual event recognition in news video using kernel methods with multi-level temporal alignment

Title: Visual event recognition in news video using kernel methods with multi-level temporal alignment
Authors: Xu, Dong; Chang, Shih Fu
Issue Date: 2007
Citation: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2007, article no. 4270251
Abstract: In this work, we systematically study the problem of visual event recognition in unconstrained news video sequences. We adopt the discriminative kernel-based method for which video clip similarity plays an important role. First, we represent a video clip as a bag of orderless descriptors extracted from all of the constituent frames and apply Earth Mover's Distance (EMD) to integrate similarities among frames from two clips. Observing that a video clip is usually comprised of multiple sub-clips corresponding to event evolution over time, we further build a multilevel temporal pyramid. At each pyramid level, we integrate the information from different sub-clips with Integer-value-constrained EMD to explicitly align the sub-clips. By fusing the information from the different pyramid levels, we develop Temporally Aligned Pyramid Matching (TAPM) for measuring video similarity. We conduct comprehensive experiments on the Trecvid 2005 corpus, which contains more than 6,800 clips. Our experiments demonstrate that 1) the TAPM multi-level method clearly outperforms single-level EMD, and 2) single-level EMD outperforms by a large margin (43.0% in Mean Average Precision) basic detection methods that use only a single key-frame. Extensive analysis of the results also reveals an intuitive interpretation of subclip alignment at different levels. © 2007 IEEE.
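The sub-clip alignment step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes both clips are split into the same number of sub-clips with equal weights, in which case an integer-constrained flow reduces to a one-to-one assignment. The function name `align_subclips` and the distance values are hypothetical, and the pairwise sub-clip distances are taken as given rather than computed with EMD.

```python
from itertools import permutations


def align_subclips(dist):
    """Return (assignment, mean_cost) for the cheapest one-to-one matching.

    dist[i][j] is a precomputed distance between sub-clip i of the first
    clip and sub-clip j of the second clip (hypothetical input).
    """
    n = len(dist)
    if n == 0 or any(len(row) != n for row in dist):
        raise ValueError("dist must be a non-empty square matrix")
    best_perm, best_cost = None, float("inf")
    # Exhaustive search is adequate for the few sub-clips per pyramid level.
    for perm in permutations(range(n)):
        cost = sum(dist[i][perm[i]] for i in range(n))
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return list(best_perm), best_cost / n


# Made-up distances for a level with two sub-clips per clip.
example = [[0.9, 0.2],
           [0.3, 0.8]]
assignment, mean_cost = align_subclips(example)
```

Here `assignment[i]` is the index of the sub-clip in the second clip matched to sub-clip `i` of the first. For more than a handful of sub-clips, a polynomial-time assignment solver would be a better choice than exhaustive search.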
Persistent Identifier: http://hdl.handle.net/10722/321323
ISSN: 1063-6919
2023 SCImago Journal Rankings: 10.331

 

DC Field | Value | Language
dc.contributor.author | Xu, Dong | -
dc.contributor.author | Chang, Shih Fu | -
dc.date.accessioned | 2022-11-03T02:18:09Z | -
dc.date.available | 2022-11-03T02:18:09Z | -
dc.date.issued | 2007 | -
dc.identifier.citation | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2007, article no. 4270251 | -
dc.identifier.issn | 1063-6919 | -
dc.identifier.uri | http://hdl.handle.net/10722/321323 | -
dc.description.abstract | In this work, we systematically study the problem of visual event recognition in unconstrained news video sequences. We adopt the discriminative kernel-based method for which video clip similarity plays an important role. First, we represent a video clip as a bag of orderless descriptors extracted from all of the constituent frames and apply Earth Mover's Distance (EMD) to integrate similarities among frames from two clips. Observing that a video clip is usually comprised of multiple sub-clips corresponding to event evolution over time, we further build a multilevel temporal pyramid. At each pyramid level, we integrate the information from different sub-clips with Integer-value-constrained EMD to explicitly align the sub-clips. By fusing the information from the different pyramid levels, we develop Temporally Aligned Pyramid Matching (TAPM) for measuring video similarity. We conduct comprehensive experiments on the Trecvid 2005 corpus, which contains more than 6,800 clips. Our experiments demonstrate that 1) the TAPM multi-level method clearly outperforms single-level EMD, and 2) single-level EMD outperforms by a large margin (43.0% in Mean Average Precision) basic detection methods that use only a single key-frame. Extensive analysis of the results also reveals an intuitive interpretation of subclip alignment at different levels. © 2007 IEEE. | -
dc.language | eng | -
dc.relation.ispartof | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition | -
dc.title | Visual event recognition in news video using kernel methods with multi-level temporal alignment | -
dc.type | Conference_Paper | -
dc.description.nature | link_to_subscribed_fulltext | -
dc.identifier.doi | 10.1109/CVPR.2007.383226 | -
dc.identifier.scopus | eid_2-s2.0-34948823856 | -
dc.identifier.spage | article no. 4270251 | -
dc.identifier.epage | article no. 4270251 | -
