Conference Paper: Improving Deep Visual Representation for Person Re-identification by Global and Local Image-language Association

Title: Improving Deep Visual Representation for Person Re-identification by Global and Local Image-language Association
Authors: Chen, Dapeng; Li, Hongsheng; Liu, Xihui; Shen, Yantao; Shao, Jing; Yuan, Zejian; Wang, Xiaogang
Keywords: Image-text correspondence; Local-global language association; Person re-identification
Issue Date: 2018
Publisher: Springer
Citation: 15th European Conference on Computer Vision (ECCV 2018), Munich, Germany, September 8-14, 2018. In Ferrari, V., Hebert, M., Sminchisescu, C., et al. (Eds.), Computer Vision – ECCV 2018: 15th European Conference, Munich, Germany, September 8-14, 2018, Proceedings, Part XVI, pp. 56-73. Cham, Switzerland: Springer, 2018
Abstract: Person re-identification is an important task that requires learning discriminative visual features for distinguishing different person identities. Diverse auxiliary information has been utilized to improve visual feature learning. In this paper, we propose to exploit natural language descriptions as additional training supervision for learning effective visual features. Compared with other auxiliary information, language can describe a specific person from more compact and semantic visual aspects, and is thus complementary to the pixel-level image data. Our method not only learns a better global visual feature under the supervision of the overall description but also enforces semantic consistencies between local visual and linguistic features, which is achieved by building global and local image-language associations. The global image-language association is established according to the identity labels, while the local association is based upon the implicit correspondences between image regions and noun phrases. Extensive experiments demonstrate the effectiveness of employing language as training supervision with the two association schemes. Our method achieves state-of-the-art performance without utilizing any auxiliary information during testing and shows better performance than other joint embedding methods for the image-language association.
Persistent Identifier: http://hdl.handle.net/10722/316499
ISBN: 9783030012694
ISSN: 0302-9743
2023 SCImago Journal Rankings: 0.606
ISI Accession Number ID: WOS:000603403700004
Series/Report no.: Lecture Notes in Computer Science; 11220
LNCS Sublibrary. SL 6, Image Processing, Computer Vision, Pattern Recognition, and Graphics

 

DC Field: Value
dc.contributor.author: Chen, Dapeng
dc.contributor.author: Li, Hongsheng
dc.contributor.author: Liu, Xihui
dc.contributor.author: Shen, Yantao
dc.contributor.author: Shao, Jing
dc.contributor.author: Yuan, Zejian
dc.contributor.author: Wang, Xiaogang
dc.date.accessioned: 2022-09-14T11:40:36Z
dc.date.available: 2022-09-14T11:40:36Z
dc.date.issued: 2018
dc.identifier.isbn: 9783030012694
dc.identifier.issn: 0302-9743
dc.identifier.uri: http://hdl.handle.net/10722/316499
dc.language: eng
dc.publisher: Springer
dc.relation.ispartof: Computer Vision – ECCV 2018: 15th European Conference, Munich, Germany, September 8-14, 2018, Proceedings, Part XVI
dc.relation.ispartofseries: Lecture Notes in Computer Science; 11220
dc.relation.ispartofseries: LNCS Sublibrary. SL 6, Image Processing, Computer Vision, Pattern Recognition, and Graphics
dc.subject: Image-text correspondence
dc.subject: Local-global language association
dc.subject: Person re-identification
dc.title: Improving Deep Visual Representation for Person Re-identification by Global and Local Image-language Association
dc.type: Conference_Paper
dc.description.nature: link_to_subscribed_fulltext
dc.identifier.doi: 10.1007/978-3-030-01270-0_4
dc.identifier.scopus: eid_2-s2.0-85055102182
dc.identifier.spage: 56
dc.identifier.epage: 73
dc.identifier.eissn: 1611-3349
dc.identifier.isi: WOS:000603403700004
dc.publisher.place: Cham, Switzerland
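Records like the one above are typically harvestable over the repository's OAI-PMH interface using the GetRecord verb with the oai_dc metadata format. As a minimal sketch only: the base URL and the oai: identifier below are assumptions for illustration, not values stated on this page.

```python
from urllib.parse import urlencode

# Hypothetical OAI-PMH endpoint for the hosting repository; the actual
# base URL is an assumption, not taken from the record itself.
BASE_URL = "https://hub.hku.hk/oai/request"

def get_record_url(identifier, metadata_prefix="oai_dc"):
    """Build an OAI-PMH GetRecord request URL for a single item."""
    params = {
        "verb": "GetRecord",
        "metadataPrefix": metadata_prefix,
        "identifier": identifier,
    }
    return BASE_URL + "?" + urlencode(params)

# Items are commonly addressed by an oai: identifier derived from the
# handle (hypothetical identifier scheme shown here).
url = get_record_url("oai:hub.hku.hk:10722/316499")
print(url)
```

Fetching the resulting URL would return the same Dublin Core fields listed in the record above as XML.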
