Improving Deep Visual Representation for Person Re-identification by Global and Local Image-language Association

Chen, Dapeng; Li, Hongsheng; Liu, Xihui; Shen, Yantao; Shao, Jing; Yuan, Zejian; Wang, Xiaogang


Publisher Website: 10.1007/978-3-030-01270-0_4
Scopus: eid_2-s2.0-85055102182
WOS: WOS:000603403700004

Citations:
- Scopus: 0
- Web of Science: 0
Appears in Collections:
- Electrical & Electronic Engineering: Conference papers

Conference Paper: Improving Deep Visual Representation for Person Re-identification by Global and Local Image-language Association

Title	Improving Deep Visual Representation for Person Re-identification by Global and Local Image-language Association
Authors	Chen, Dapeng; Li, Hongsheng; Liu, Xihui; Shen, Yantao; Shao, Jing; Yuan, Zejian; Wang, Xiaogang
Keywords	Image-text correspondence; Local-global language association; Person re-identification
Issue Date	2018
Publisher	Springer
Citation	15th European Conference on Computer Vision (ECCV 2018), Munich, Germany, September 8-14, 2018. In Ferrari, V, Hebert, M, Sminchisescu, C, et al. (Eds), Computer Vision – ECCV 2018: 15th European Conference, Munich, Germany, September 8-14, 2018, Proceedings, Part XVI, p. 56-73. Cham, Switzerland: Springer, 2018. DOI: http://dx.doi.org/10.1007/978-3-030-01270-0_4
Abstract	Person re-identification is an important task that requires learning discriminative visual features for distinguishing different person identities. Diverse auxiliary information has been utilized to improve the visual feature learning. In this paper, we propose to exploit natural language description as additional training supervisions for effective visual features. Compared with other auxiliary information, language can describe a specific person from more compact and semantic visual aspects, thus is complementary to the pixel-level image data. Our method not only learns better global visual feature with the supervision of the overall description but also enforces semantic consistencies between local visual and linguistic features, which is achieved by building global and local image-language associations. The global image-language association is established according to the identity labels, while the local association is based upon the implicit correspondences between image regions and noun phrases. Extensive experiments demonstrate the effectiveness of employing language as training supervisions with the two association schemes. Our method achieves state-of-the-art performance without utilizing any auxiliary information during testing and shows better performance than other joint embedding methods for the image-language association.
Persistent Identifier	http://hdl.handle.net/10722/316499
ISBN	9783030012694
ISSN	0302-9743 (2023 SCImago Journal Rankings: 0.606)
ISI Accession Number ID	WOS:000603403700004
Series/Report no.	Lecture Notes in Computer Science, 11220; LNCS Sublibrary. SL 6, Image Processing, Computer Vision, Pattern Recognition, and Graphics

DC Field	Value	Language
dc.contributor.author	Chen, Dapeng	-
dc.contributor.author	Li, Hongsheng	-
dc.contributor.author	Liu, Xihui	-
dc.contributor.author	Shen, Yantao	-
dc.contributor.author	Shao, Jing	-
dc.contributor.author	Yuan, Zejian	-
dc.contributor.author	Wang, Xiaogang	-
dc.date.accessioned	2022-09-14T11:40:36Z	-
dc.date.available	2022-09-14T11:40:36Z	-
dc.date.issued	2018	-
dc.identifier.citation	15th European Conference on Computer Vision (ECCV 2018), Munich, Germany, September 8-14, 2018. In Ferrari, V, Hebert, M, Sminchisescu, C, et al. (Eds), Computer Vision – ECCV 2018: 15th European Conference, Munich, Germany, September 8-14, 2018, Proceedings, Part XVI, p. 56-73. Cham, Switzerland: Springer, 2018	-
dc.identifier.isbn	9783030012694	-
dc.identifier.issn	0302-9743	-
dc.identifier.uri	http://hdl.handle.net/10722/316499	-
dc.description.abstract	Person re-identification is an important task that requires learning discriminative visual features for distinguishing different person identities. Diverse auxiliary information has been utilized to improve the visual feature learning. In this paper, we propose to exploit natural language description as additional training supervisions for effective visual features. Compared with other auxiliary information, language can describe a specific person from more compact and semantic visual aspects, thus is complementary to the pixel-level image data. Our method not only learns better global visual feature with the supervision of the overall description but also enforces semantic consistencies between local visual and linguistic features, which is achieved by building global and local image-language associations. The global image-language association is established according to the identity labels, while the local association is based upon the implicit correspondences between image regions and noun phrases. Extensive experiments demonstrate the effectiveness of employing language as training supervisions with the two association schemes. Our method achieves state-of-the-art performance without utilizing any auxiliary information during testing and shows better performance than other joint embedding methods for the image-language association.	-
dc.language	eng	-
dc.publisher	Springer	-
dc.relation.ispartof	Computer Vision – ECCV 2018: 15th European Conference, Munich, Germany, September 8-14, 2018, Proceedings, Part XVI	-
dc.relation.ispartofseries	Lecture Notes in Computer Science ; 11220	-
dc.relation.ispartofseries	LNCS Sublibrary. SL 6, Image Processing, Computer Vision, Pattern Recognition, and Graphics	-
dc.subject	Image-text correspondence	-
dc.subject	Local-global language association	-
dc.subject	Person re-identification	-
dc.title	Improving Deep Visual Representation for Person Re-identification by Global and Local Image-language Association	-
dc.type	Conference_Paper	-
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1007/978-3-030-01270-0_4	-
dc.identifier.scopus	eid_2-s2.0-85055102182	-
dc.identifier.spage	56	-
dc.identifier.epage	73	-
dc.identifier.eissn	1611-3349	-
dc.identifier.isi	WOS:000603403700004	-
dc.publisher.place	Cham, Switzerland	-
