Links for fulltext (may require subscription):
- Publisher Website: https://doi.org/10.1109/CVPR42600.2020.00790
- Scopus: eid_2-s2.0-85094808832
Appears in Collections:
- Conference Paper: ManiGAN: Text-Guided Image Manipulation
Title | ManiGAN: Text-Guided Image Manipulation |
---|---|
Authors | Li, B; Qi, X; Lukasiewicz, T; Torr, P |
Keywords | Image reconstruction; Visualization; Fuses; Image representation; Semantics |
Issue Date | 2020 |
Publisher | IEEE Computer Society. The Journal's web site is located at http://ieeexplore.ieee.org/xpl/conhome.jsp?punumber=1000147 |
Citation | 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13-19 June 2020, p. 7877-7886 |
Abstract | The goal of our paper is to semantically edit parts of an image matching a given text that describes desired attributes (e.g., texture, colour, and background), while preserving other contents that are irrelevant to the text. To achieve this, we propose a novel generative adversarial network (ManiGAN), which contains two key components: text-image affine combination module (ACM) and detail correction module (DCM). The ACM selects image regions relevant to the given text and then correlates the regions with corresponding semantic words for effective manipulation. Meanwhile, it encodes original image features to help reconstruct text-irrelevant contents. The DCM rectifies mismatched attributes and completes missing contents of the synthetic image. Finally, we suggest a new metric for evaluating image manipulation results, in terms of both the generation of new attributes and the reconstruction of text-irrelevant contents. Extensive experiments on the CUB and COCO datasets demonstrate the superior performance of the proposed method. |
Persistent Identifier | http://hdl.handle.net/10722/288470 |
ISSN | 1063-6919 (2023 SCImago Journal Rankings: 10.331) |
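The abstract describes the text-image affine combination module (ACM) as deriving an edit from image features while carrying text-irrelevant content through. A minimal numpy sketch of that idea, assuming the image features produce a per-location scale and bias that modulate the text-conditioned hidden features (function and weight names here are hypothetical, not from the paper):

```python
import numpy as np

def affine_combination(hidden, visual, W_gamma, W_beta):
    """Hypothetical sketch of an ACM-style fusion: image features
    produce a scale (gamma) and bias (beta) that affinely modulate
    the text-conditioned hidden features at every spatial location.

    hidden:  (C, H, W) text-conditioned hidden features
    visual:  (D, H, W) encoded image features
    W_gamma, W_beta: (C, D) linear maps standing in for the learned
    1x1 convolutions a real implementation would train (assumptions).
    """
    D, H, W = visual.shape
    v = visual.reshape(D, H * W)                # flatten spatial dims
    gamma = (W_gamma @ v).reshape(-1, H, W)     # scale from image features
    beta = (W_beta @ v).reshape(-1, H, W)       # bias from image features
    return hidden * gamma + beta                # element-wise affine fusion

# Toy usage with random features
rng = np.random.default_rng(0)
h = rng.standard_normal((8, 4, 4))
v = rng.standard_normal((16, 4, 4))
out = affine_combination(h, v,
                         rng.standard_normal((8, 16)),
                         rng.standard_normal((8, 16)))
print(out.shape)  # (8, 4, 4)
```

This is only the fusion arithmetic; the paper's ACM additionally learns which regions are text-relevant, which this sketch does not model.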
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Li, B | - |
dc.contributor.author | Qi, X | - |
dc.contributor.author | Lukasiewicz, T | - |
dc.contributor.author | Torr, P | - |
dc.date.accessioned | 2020-10-05T12:13:23Z | - |
dc.date.available | 2020-10-05T12:13:23Z | - |
dc.date.issued | 2020 | - |
dc.identifier.citation | 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13-19 June 2020, p. 7877-7886 | - |
dc.identifier.issn | 1063-6919 | - |
dc.identifier.uri | http://hdl.handle.net/10722/288470 | - |
dc.description.abstract | The goal of our paper is to semantically edit parts of an image matching a given text that describes desired attributes (e.g., texture, colour, and background), while preserving other contents that are irrelevant to the text. To achieve this, we propose a novel generative adversarial network (ManiGAN), which contains two key components: text-image affine combination module (ACM) and detail correction module (DCM). The ACM selects image regions relevant to the given text and then correlates the regions with corresponding semantic words for effective manipulation. Meanwhile, it encodes original image features to help reconstruct text-irrelevant contents. The DCM rectifies mismatched attributes and completes missing contents of the synthetic image. Finally, we suggest a new metric for evaluating image manipulation results, in terms of both the generation of new attributes and the reconstruction of text-irrelevant contents. Extensive experiments on the CUB and COCO datasets demonstrate the superior performance of the proposed method. | - |
dc.language | eng | - |
dc.publisher | IEEE Computer Society. The Journal's web site is located at http://ieeexplore.ieee.org/xpl/conhome.jsp?punumber=1000147 | - |
dc.relation.ispartof | IEEE Conference on Computer Vision and Pattern Recognition. Proceedings | - |
dc.rights | IEEE Conference on Computer Vision and Pattern Recognition. Proceedings. Copyright © IEEE Computer Society. | - |
dc.rights | ©2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. | - |
dc.subject | Image reconstruction | - |
dc.subject | Visualization | - |
dc.subject | Fuses | - |
dc.subject | Image representation | - |
dc.subject | Semantics | - |
dc.title | ManiGAN: Text-Guided Image Manipulation | - |
dc.type | Conference_Paper | - |
dc.identifier.email | Qi, X: xjqi@eee.hku.hk | - |
dc.identifier.authority | Qi, X=rp02666 | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1109/CVPR42600.2020.00790 | - |
dc.identifier.scopus | eid_2-s2.0-85094808832 | - |
dc.identifier.hkuros | 315443 | - |
dc.identifier.spage | 7877 | - |
dc.identifier.epage | 7886 | - |
dc.publisher.place | United States | - |
dc.identifier.issnl | 1063-6919 | - |