AE TextSpotter: Learning Visual and Linguistic Representation for Ambiguous Text Spotting

Wang, W; Liu, X; Ji, X; Xie, E; Liang, D; Yang, Z; Lu, T; Shen, C; Luo, P

Appears in Collections:
- Computer Science: Conference papers

Conference Paper: AE TextSpotter: Learning Visual and Linguistic Representation for Ambiguous Text Spotting

Title	AE TextSpotter: Learning Visual and Linguistic Representation for Ambiguous Text Spotting
Authors	Wang, W; Liu, X; Ji, X; Xie, E; Liang, D; Yang, Z; Lu, T; Shen, C; Luo, P
Keywords	Text Spotting; Text Detection; Text Recognition; Text Detection Ambiguity
Issue Date	2020
Citation	The 16th European Conference on Computer Vision (ECCV), Online, 23-28 August 2020
Abstract	Scene text spotting aims to detect and recognize the entire word or sentence with multiple characters in natural images. It is still challenging because ambiguity often occurs when the spacing between characters is large or the characters are evenly spread in multiple rows and columns, making many visually plausible groupings of the characters (e.g. 'BERLIN' is incorrectly detected as 'BERL' and 'IN' in Fig. 1(c)). Unlike previous works that merely employed visual features for text detection, this work proposes a novel text spotter, named Ambiguity Eliminating Text Spotter (AE TextSpotter), which learns both visual and linguistic features to significantly reduce ambiguity in text detection. The proposed AE TextSpotter has three important benefits. 1) The linguistic representation is learned together with the visual representation in a framework. To our knowledge, this is the first work to improve text detection by using a language model. 2) A carefully designed language module is utilized to reduce the detection confidence of incorrect text lines, making them easily pruned in the detection stage. 3) Extensive experiments show that AE TextSpotter outperforms other state-of-the-art methods by a large margin. For example, we carefully select a set of extremely ambiguous samples from the IC19-ReCTS dataset, where our approach surpasses other methods by more than 4%.
Description	ECCV 2020 took place virtually due to COVID-19. Poster Presentation - Paper ID: 2183
Persistent Identifier	http://hdl.handle.net/10722/284148

DC Field	Value	Language
dc.contributor.author	Wang, W	-
dc.contributor.author	Liu, X	-
dc.contributor.author	Ji, X	-
dc.contributor.author	Xie, E	-
dc.contributor.author	Liang, D	-
dc.contributor.author	Yang, Z	-
dc.contributor.author	Lu, T	-
dc.contributor.author	Shen, C	-
dc.contributor.author	Luo, P	-
dc.date.accessioned	2020-07-20T05:56:28Z	-
dc.date.available	2020-07-20T05:56:28Z	-
dc.date.issued	2020	-
dc.identifier.citation	The 16th European Conference on Computer Vision (ECCV), Online, 23-28 August 2020	-
dc.identifier.uri	http://hdl.handle.net/10722/284148	-
dc.description	ECCV 2020 took place virtually due to COVID-19	-
dc.description	Poster Presentation - Paper ID: 2183	-
dc.description.abstract	Scene text spotting aims to detect and recognize the entire word or sentence with multiple characters in natural images. It is still challenging because ambiguity often occurs when the spacing between characters is large or the characters are evenly spread in multiple rows and columns, making many visually plausible groupings of the characters (e.g. 'BERLIN' is incorrectly detected as 'BERL' and 'IN' in Fig. 1(c)). Unlike previous works that merely employed visual features for text detection, this work proposes a novel text spotter, named Ambiguity Eliminating Text Spotter (AE TextSpotter), which learns both visual and linguistic features to significantly reduce ambiguity in text detection. The proposed AE TextSpotter has three important benefits. 1) The linguistic representation is learned together with the visual representation in a framework. To our knowledge, this is the first work to improve text detection by using a language model. 2) A carefully designed language module is utilized to reduce the detection confidence of incorrect text lines, making them easily pruned in the detection stage. 3) Extensive experiments show that AE TextSpotter outperforms other state-of-the-art methods by a large margin. For example, we carefully select a set of extremely ambiguous samples from the IC19-ReCTS dataset, where our approach surpasses other methods by more than 4%.	-
dc.language	eng	-
dc.relation.ispartof	European Conference on Computer Vision (ECCV)	-
dc.subject	Text Spotting	-
dc.subject	Text Detection	-
dc.subject	Text Recognition	-
dc.subject	Text Detection Ambiguity	-
dc.title	AE TextSpotter: Learning Visual and Linguistic Representation for Ambiguous Text Spotting	-
dc.type	Conference_Paper	-
dc.identifier.email	Luo, P: pluo@hku.hk	-
dc.identifier.authority	Luo, P=rp02575	-
dc.identifier.hkuros	311006	-
