ManiGAN: Text-Guided Image Manipulation

Li, B; Qi, X; Lukasiewicz, T; Torr, P

Publisher Website: 10.1109/CVPR42600.2020.00790
Scopus: eid_2-s2.0-85094808832
Appears in Collections:
- Electrical & Electronic Engineering: Conference papers

Conference Paper: ManiGAN: Text-Guided Image Manipulation

Title	ManiGAN: Text-Guided Image Manipulation
Authors	Li, B; Qi, X; Lukasiewicz, T; Torr, P
Keywords	Image reconstruction; Visualization; Fuses; Image representation; Semantics
Issue Date	2020
Publisher	IEEE Computer Society. The Journal's web site is located at http://ieeexplore.ieee.org/xpl/conhome.jsp?punumber=1000147
Citation	2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13-19 June 2020, p. 7877-7886. DOI: http://dx.doi.org/10.1109/CVPR42600.2020.00790
Abstract	The goal of our paper is to semantically edit parts of an image matching a given text that describes desired attributes (e.g., texture, colour, and background), while preserving other contents that are irrelevant to the text. To achieve this, we propose a novel generative adversarial network (ManiGAN), which contains two key components: text-image affine combination module (ACM) and detail correction module (DCM). The ACM selects image regions relevant to the given text and then correlates the regions with corresponding semantic words for effective manipulation. Meanwhile, it encodes original image features to help reconstruct text-irrelevant contents. The DCM rectifies mismatched attributes and completes missing contents of the synthetic image. Finally, we suggest a new metric for evaluating image manipulation results, in terms of both the generation of new attributes and the reconstruction of text-irrelevant contents. Extensive experiments on the CUB and COCO datasets demonstrate the superior performance of the proposed method.
Persistent Identifier	http://hdl.handle.net/10722/288470
ISSN	1063-6919
SCImago Journal Rankings (2020)	4.658
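
The abstract names a text-image affine combination module (ACM) but this record does not give its formula. As a rough illustration only, one plausible reading is that text-derived features are scaled and shifted by terms computed from the image features. The sketch below assumes that reading; the function names, vector sizes, and projection weights are hypothetical and are not taken from the paper or its code.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def affine_combination(text_vec, image_vec, w_scale, w_shift):
    """Scale and shift text features using terms derived from image features.

    Assumption: scale and shift are linear projections of the image
    features; a real model would learn these projections.
    """
    scale = matvec(w_scale, image_vec)  # per-channel multiplicative term
    shift = matvec(w_shift, image_vec)  # per-channel additive term
    return [h * s + b for h, s, b in zip(text_vec, scale, shift)]


# Toy inputs (hypothetical): 2 text channels, 3 image channels.
text_vec = [1.0, 2.0]
image_vec = [0.5, -1.0, 2.0]
w_scale = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
w_shift = [[0.0, 0.0, 1.0], [0.0, 0.0, 0.5]]

combined = affine_combination(text_vec, image_vec, w_scale, w_shift)
print(combined)
```

In this toy setup the image features decide both how strongly each text channel contributes and what is added back, which is one way to picture the abstract's description of the module encoding original image features to help reconstruct text-irrelevant contents.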

DC Field	Value	Language
dc.contributor.author	Li, B	-
dc.contributor.author	Qi, X	-
dc.contributor.author	Lukasiewicz, T	-
dc.contributor.author	Torr, P	-
dc.date.accessioned	2020-10-05T12:13:23Z	-
dc.date.available	2020-10-05T12:13:23Z	-
dc.date.issued	2020	-
dc.identifier.citation	2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13-19 June 2020, p. 7877-7886	-
dc.identifier.issn	1063-6919	-
dc.identifier.uri	http://hdl.handle.net/10722/288470	-
dc.description.abstract	The goal of our paper is to semantically edit parts of an image matching a given text that describes desired attributes (e.g., texture, colour, and background), while preserving other contents that are irrelevant to the text. To achieve this, we propose a novel generative adversarial network (ManiGAN), which contains two key components: text-image affine combination module (ACM) and detail correction module (DCM). The ACM selects image regions relevant to the given text and then correlates the regions with corresponding semantic words for effective manipulation. Meanwhile, it encodes original image features to help reconstruct text-irrelevant contents. The DCM rectifies mismatched attributes and completes missing contents of the synthetic image. Finally, we suggest a new metric for evaluating image manipulation results, in terms of both the generation of new attributes and the reconstruction of text-irrelevant contents. Extensive experiments on the CUB and COCO datasets demonstrate the superior performance of the proposed method.	-
dc.language	eng	-
dc.publisher	IEEE Computer Society. The Journal's web site is located at http://ieeexplore.ieee.org/xpl/conhome.jsp?punumber=1000147	-
dc.relation.ispartof	IEEE Conference on Computer Vision and Pattern Recognition. Proceedings	-
dc.rights	IEEE Conference on Computer Vision and Pattern Recognition. Proceedings. Copyright © IEEE Computer Society.	-
dc.rights	©2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.	-
dc.subject	Image reconstruction	-
dc.subject	Visualization	-
dc.subject	Fuses	-
dc.subject	Image representation	-
dc.subject	Semantics	-
dc.title	ManiGAN: Text-Guided Image Manipulation	-
dc.type	Conference_Paper	-
dc.identifier.email	Qi, X: xjqi@eee.hku.hk	-
dc.identifier.authority	Qi, X=rp02666	-
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1109/CVPR42600.2020.00790	-
dc.identifier.scopus	eid_2-s2.0-85094808832	-
dc.identifier.hkuros	315443	-
dc.identifier.spage	7877	-
dc.identifier.epage	7886	-
dc.publisher.place	United States	-
dc.identifier.issnl	1063-6919	-
