DiffMorpher: Unleashing the Capability of Diffusion Models for Image Morphing

Zhang, Kaiwen; Zhou, Yifan; Xu, Xudong; Dai, Bo; Pan, Xingang

Links for fulltext

(May Require Subscription)

Publisher Website: 10.1109/CVPR52733.2024.00756
Scopus: eid_2-s2.0-85202201317

Citations:
- Scopus: 0
Appears in Collections:
- HKU Musketeers Foundation Institute of Data Science: Conference papers

Conference Paper: DiffMorpher: Unleashing the Capability of Diffusion Models for Image Morphing

Title	DiffMorpher: Unleashing the Capability of Diffusion Models for Image Morphing
Authors	Zhang, Kaiwen; Zhou, Yifan; Xu, Xudong; Dai, Bo; Pan, Xingang
Keywords	Diffusion models; Image morphing; video generation
Issue Date	2024
Citation	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2024, p. 7912-7921. DOI: http://dx.doi.org/10.1109/CVPR52733.2024.00756
Abstract	Diffusion models have achieved remarkable image generation quality surpassing previous generative models. However, a notable limitation of diffusion models, in comparison to GANs, is their difficulty in smoothly interpolating between two image samples, due to their highly unstructured latent space. Such a smooth interpolation is intriguing as it naturally serves as a solution for the image morphing task with many applications. In this work, we address this limitation via DiffMorpher, an approach that enables smooth and natural image interpolation by harnessing the prior knowledge of a pretrained diffusion model. Our key idea is to capture the semantics of the two images by fitting two LoRAs to them respectively, and interpolate between both the LoRA parameters and the latent noises to ensure a smooth semantic transition, where correspondence automatically emerges without the need for annotation. In addition, we propose an attention interpolation and injection technique, an adaptive normalization adjustment method, and a new sampling schedule to further enhance the smoothness between consecutive images. Extensive experiments demonstrate that DiffMorpher achieves starkly better image morphing effects than previous methods across a variety of object categories, bridging a critical functional gap that distinguished diffusion models from GANs.
Persistent Identifier	http://hdl.handle.net/10722/352462
ISSN	1063-6919
2023 SCImago Journal Rankings	10.331
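
Illustrative note: the abstract describes interpolating both the LoRA parameters and the latent noises of the two images. The sketch below shows one way such a blend could look. It is not the authors' code. It assumes a plain linear blend for the LoRA parameters, spherical interpolation (slerp) for the noises, and evenly spaced blend weights, none of which this record states; the paper's own sampling schedule, attention interpolation, and normalization adjustment are not modeled. All function names and the flat-list data layout are hypothetical.

```python
import math


def lerp(a, b, alpha):
    """Element-wise linear blend of two equal-length lists of floats."""
    return [(1.0 - alpha) * x + alpha * y for x, y in zip(a, b)]


def lerp_lora(lora_a, lora_b, alpha):
    """Blend two LoRA parameter sets stored as {name: flat list of floats}.

    Assumption: both sets share the same keys and shapes.
    """
    return {name: lerp(lora_a[name], lora_b[name], alpha) for name in lora_a}


def slerp(z_a, z_b, alpha, eps=1e-7):
    """Spherical interpolation between two flattened noise vectors.

    Assumption: slerp is a common choice for Gaussian noise; the record
    itself only says the latent noises are interpolated.
    """
    dot = sum(x * y for x, y in zip(z_a, z_b))
    norm_a = math.sqrt(sum(x * x for x in z_a))
    norm_b = math.sqrt(sum(y * y for y in z_b))
    cos_omega = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    omega = math.acos(cos_omega)
    sin_omega = math.sin(omega)
    if sin_omega < eps:
        # Vectors are (anti)parallel: fall back to a linear blend.
        return lerp(z_a, z_b, alpha)
    w_a = math.sin((1.0 - alpha) * omega) / sin_omega
    w_b = math.sin(alpha * omega) / sin_omega
    return [w_a * x + w_b * y for x, y in zip(z_a, z_b)]


def morph_steps(lora_a, lora_b, z_a, z_b, num_frames):
    """Return (alpha, blended LoRA, blended noise) for each frame.

    Assumption: alphas are evenly spaced in [0, 1].
    """
    if num_frames < 2:
        raise ValueError("num_frames must be at least 2")
    steps = []
    for i in range(num_frames):
        alpha = i / (num_frames - 1)
        steps.append(
            (alpha, lerp_lora(lora_a, lora_b, alpha), slerp(z_a, z_b, alpha))
        )
    return steps
```

In a full pipeline, each (blended LoRA, blended noise) pair would be handed to a pretrained diffusion model to denoise one intermediate frame; that step is omitted here.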

DC Field	Value	Language
dc.contributor.author	Zhang, Kaiwen	-
dc.contributor.author	Zhou, Yifan	-
dc.contributor.author	Xu, Xudong	-
dc.contributor.author	Dai, Bo	-
dc.contributor.author	Pan, Xingang	-
dc.date.accessioned	2024-12-16T03:59:11Z	-
dc.date.available	2024-12-16T03:59:11Z	-
dc.date.issued	2024	-
dc.identifier.citation	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2024, p. 7912-7921	-
dc.identifier.issn	1063-6919	-
dc.identifier.uri	http://hdl.handle.net/10722/352462	-
dc.description.abstract	Diffusion models have achieved remarkable image generation quality surpassing previous generative models. However, a notable limitation of diffusion models, in comparison to GANs, is their difficulty in smoothly interpolating between two image samples, due to their highly unstructured latent space. Such a smooth interpolation is intriguing as it naturally serves as a solution for the image morphing task with many applications. In this work, we address this limitation via DiffMorpher, an approach that enables smooth and natural image interpolation by harnessing the prior knowledge of a pretrained diffusion model. Our key idea is to capture the semantics of the two images by fitting two LoRAs to them respectively, and interpolate between both the LoRA parameters and the latent noises to ensure a smooth semantic transition, where correspondence automatically emerges without the need for annotation. In addition, we propose an attention interpolation and injection technique, an adaptive normalization adjustment method, and a new sampling schedule to further enhance the smoothness between consecutive images. Extensive experiments demonstrate that DiffMorpher achieves starkly better image morphing effects than previous methods across a variety of object categories, bridging a critical functional gap that distinguished diffusion models from GANs.	-
dc.language	eng	-
dc.relation.ispartof	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition	-
dc.subject	Diffusion models	-
dc.subject	Image morphing	-
dc.subject	video generation	-
dc.title	DiffMorpher: Unleashing the Capability of Diffusion Models for Image Morphing	-
dc.type	Conference_Paper	-
dc.description.nature	link_to_subscribed_fulltext	-
dc.identifier.doi	10.1109/CVPR52733.2024.00756	-
dc.identifier.scopus	eid_2-s2.0-85202201317	-
dc.identifier.spage	7912	-
dc.identifier.epage	7921	-