Conference Paper: End-to-End Video Text Spotting with Transformer
Title | End-to-End Video Text Spotting with Transformer |
---|---|
Authors | Wu, W; Cai, Y; Shen, C; Zhang, D; Fu, Y; Zhou, H; Luo, P |
Issue Date | 2022 |
Publisher | Ortra Ltd. |
Citation | European Conference on Computer Vision (Hybrid), Tel Aviv, Israel, October 23-27, 2022. In Proceedings of the European Conference on Computer Vision (ECCV), 2022 |
Abstract | Recent video text spotting methods usually require the three-staged pipeline, i.e., detecting text in individual images, recognizing localized text, tracking text streams with post-processing to generate final results. These methods typically follow the tracking-by-match paradigm and develop sophisticated pipelines. In this paper, rooted in Transformer sequence modeling, we propose a simple, but effective end-to-end video text DEtection, Tracking, and Recognition framework (TransDETR). TransDETR mainly includes two advantages: 1) Different from the explicit match paradigm in the adjacent frame, TransDETR tracks and recognizes each text implicitly by the different query termed text query over long-range temporal sequence (more than 7 frames). 2) TransDETR is the first end-to-end trainable video text spotting framework, which simultaneously addresses the three sub-tasks (e.g., text detection, tracking, recognition). Extensive experiments in four video text datasets (i.e., ICDAR2013 Video, ICDAR2015 Video, Minetto, and YouTube Video Text) are conducted to demonstrate that TransDETR achieves state-of-the-art performance with up to around 8.0% improvements on video text spotting tasks. |
Persistent Identifier | http://hdl.handle.net/10722/315806 |
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Wu, W | - |
dc.contributor.author | Cai, Y | - |
dc.contributor.author | Shen, C | - |
dc.contributor.author | Zhang, D | - |
dc.contributor.author | Fu, Y | - |
dc.contributor.author | Zhou, H | - |
dc.contributor.author | Luo, P | - |
dc.date.accessioned | 2022-08-19T09:04:47Z | - |
dc.date.available | 2022-08-19T09:04:47Z | - |
dc.date.issued | 2022 | - |
dc.identifier.citation | European Conference on Computer Vision (Hybrid), Tel Aviv, Israel, October 23-27, 2022. In Proceedings of the European Conference on Computer Vision (ECCV), 2022 | - |
dc.identifier.uri | http://hdl.handle.net/10722/315806 | - |
dc.description.abstract | Recent video text spotting methods usually require the three-staged pipeline, i.e., detecting text in individual images, recognizing localized text, tracking text streams with post-processing to generate final results. These methods typically follow the tracking-by-match paradigm and develop sophisticated pipelines. In this paper, rooted in Transformer sequence modeling, we propose a simple, but effective end-to-end video text DEtection, Tracking, and Recognition framework (TransDETR). TransDETR mainly includes two advantages: 1) Different from the explicit match paradigm in the adjacent frame, TransDETR tracks and recognizes each text implicitly by the different query termed text query over long-range temporal sequence (more than 7 frames). 2) TransDETR is the first end-to-end trainable video text spotting framework, which simultaneously addresses the three sub-tasks (e.g., text detection, tracking, recognition). Extensive experiments in four video text datasets (i.e., ICDAR2013 Video, ICDAR2015 Video, Minetto, and YouTube Video Text) are conducted to demonstrate that TransDETR achieves state-of-the-art performance with up to around 8.0% improvements on video text spotting tasks. | - |
dc.language | eng | - |
dc.publisher | Ortra Ltd. | - |
dc.relation.ispartof | Proceedings of the European Conference on Computer Vision (ECCV), 2022 | - |
dc.title | End-to-End Video Text Spotting with Transformer | - |
dc.type | Conference_Paper | - |
dc.identifier.email | Luo, P: pluo@hku.hk | - |
dc.identifier.authority | Luo, P=rp02575 | - |
dc.identifier.doi | 10.48550/arXiv.2203.10539 | - |
dc.identifier.hkuros | 335609 | - |
dc.publisher.place | Israel | - |