Open-Edit: Open-Domain Image Manipulation with Open-Vocabulary Instructions

Liu, Xihui; Lin, Zhe; Zhang, Jianming; Zhao, Handong; Tran, Quan; Wang, Xiaogang; Li, Hongsheng

Links for fulltext

(May Require Subscription)

Publisher Website: 10.1007/978-3-030-58621-8_6
Scopus: eid_2-s2.0-85097645652
WOS: WOS:001500594500006

Supplementary

Citations:
- Scopus: 0
- Web of Science: 0
Appears in Collections:
- Electrical & Electronic Engineering: Conference papers

Conference Paper: Open-Edit: Open-Domain Image Manipulation with Open-Vocabulary Instructions

Title	Open-Edit: Open-Domain Image Manipulation with Open-Vocabulary Instructions
Authors	Liu, Xihui; Lin, Zhe; Zhang, Jianming; Zhao, Handong; Tran, Quan; Wang, Xiaogang; Li, Hongsheng
Issue Date	2020
Publisher	Springer
Citation	16th European Conference on Computer Vision (ECCV 2020), Glasgow, UK, August 23-28, 2020. In Vedaldi, A, Bischof, H, Brox, T, et al. (Eds), Computer Vision – ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XI, p. 89-106. Cham: Springer, 2020. DOI: http://dx.doi.org/10.1007/978-3-030-58621-8_6
Abstract	We propose a novel algorithm, named Open-Edit, which is the first attempt at open-domain image manipulation with open-vocabulary instructions. It is a challenging task considering the large variation of image domains and the lack of training supervision. Our approach takes advantage of the unified visual-semantic embedding space pretrained on a general image-caption dataset, and manipulates the embedded visual features by applying text-guided vector arithmetic on the image feature maps. A structure-preserving image decoder then generates the manipulated images from the manipulated feature maps. We further propose an on-the-fly sample-specific optimization approach with cycle-consistency constraints to regularize the manipulated images and force them to preserve details of the source images. Our approach shows promising results in manipulating open-vocabulary color, texture, and high-level attributes for various scenarios of open-domain images. Code is released at https://github.com/xh-liu/Open-Edit.
Persistent Identifier	http://hdl.handle.net/10722/316564
ISBN	9783030586201
ISSN	0302-9743 (2023 SCImago Journal Rankings: 0.606)
ISI Accession Number ID	WOS:001500594500006
Series/Report no.	Lecture Notes in Computer Science ; 12356; LNCS Sublibrary. SL 6, Image Processing, Computer Vision, Pattern Recognition, and Graphics
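
Illustrative note: the abstract describes "text-guided vector arithmetic on the image feature maps" without giving a formula. The sketch below is only one plausible reading, not the authors' implementation: it assumes image features and text embeddings share one space and shifts each spatial feature along the direction from a source word to a target word. The function name edit_feature_map, the strength parameter alpha, and all numbers are hypothetical.

```python
from typing import List

Vector = List[float]
FeatureMap = List[List[Vector]]  # H x W grid of C-dimensional feature vectors


def edit_feature_map(features: FeatureMap, source_text: Vector,
                     target_text: Vector, alpha: float = 1.0) -> FeatureMap:
    """Shift every spatial feature along the (target - source) text direction.

    Assumption: this additive rule is a stand-in for the paper's arithmetic,
    which the abstract does not specify.
    """
    direction = [t - s for s, t in zip(source_text, target_text)]
    return [
        [[f + alpha * d for f, d in zip(vec, direction)] for vec in row]
        for row in features
    ]


# Toy 1x2 feature map with 3-dimensional features (made-up numbers).
features = [[[1.0, 0.0, 2.0], [0.5, 0.5, 0.5]]]
source_text = [1.0, 0.0, 0.0]  # stand-in embedding for a source word
target_text = [0.0, 1.0, 0.0]  # stand-in embedding for a target word

edited = edit_feature_map(features, source_text, target_text, alpha=0.5)
print(edited)
```

In the pipeline the abstract describes, a structure-preserving decoder would then turn the edited feature map back into an image; that stage is not sketched here.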

DC Field	Value	Language
dc.contributor.author	Liu, Xihui	-
dc.contributor.author	Lin, Zhe	-
dc.contributor.author	Zhang, Jianming	-
dc.contributor.author	Zhao, Handong	-
dc.contributor.author	Tran, Quan	-
dc.contributor.author	Wang, Xiaogang	-
dc.contributor.author	Li, Hongsheng	-
dc.date.accessioned	2022-09-14T11:40:45Z	-
dc.date.available	2022-09-14T11:40:45Z	-
dc.date.issued	2020	-
dc.identifier.citation	16th European Conference on Computer Vision (ECCV 2020), Glasgow, UK, August 23-28 2020. In Vedaldi, A, Bischof, H, Brox, T, et al. (Eds), Computer Vision – ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XI, p. 89-106. Cham: Springer, 2020	-
dc.identifier.isbn	9783030586201	-
dc.identifier.issn	0302-9743	-
dc.identifier.uri	http://hdl.handle.net/10722/316564	-
dc.description.abstract	We propose a novel algorithm, named Open-Edit, which is the first attempt at open-domain image manipulation with open-vocabulary instructions. It is a challenging task considering the large variation of image domains and the lack of training supervision. Our approach takes advantage of the unified visual-semantic embedding space pretrained on a general image-caption dataset, and manipulates the embedded visual features by applying text-guided vector arithmetic on the image feature maps. A structure-preserving image decoder then generates the manipulated images from the manipulated feature maps. We further propose an on-the-fly sample-specific optimization approach with cycle-consistency constraints to regularize the manipulated images and force them to preserve details of the source images. Our approach shows promising results in manipulating open-vocabulary color, texture, and high-level attributes for various scenarios of open-domain images. Code is released at https://github.com/xh-liu/Open-Edit.	-
dc.language	eng	-
dc.publisher	Springer	-
dc.relation.ispartof	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)	-
dc.relation.ispartof	Computer Vision – ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XI	-
dc.relation.ispartofseries	Lecture Notes in Computer Science ; 12356	-
dc.relation.ispartofseries	LNCS Sublibrary. SL 6, Image Processing, Computer Vision, Pattern Recognition, and Graphics	-
dc.title	Open-Edit: Open-Domain Image Manipulation with Open-Vocabulary Instructions	-
dc.type	Conference_Paper	-
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1007/978-3-030-58621-8_6	-
dc.identifier.scopus	eid_2-s2.0-85097645652	-
dc.identifier.spage	89	-
dc.identifier.epage	106	-
dc.identifier.eissn	1611-3349	-
dc.identifier.isi	WOS:001500594500006	-
dc.publisher.place	Cham	-
