File Download
There are no files associated with this item.
Links for fulltext (may require subscription):
- Publisher Website (DOI): https://doi.org/10.1109/CVPR52733.2024.00756
- Scopus: eid_2-s2.0-85202201317
Citations:
- Scopus: 0
Appears in Collections:
Conference Paper: DiffMorpher: Unleashing the Capability of Diffusion Models for Image Morphing
Title | DiffMorpher: Unleashing the Capability of Diffusion Models for Image Morphing |
---|---|
Authors | Zhang, Kaiwen; Zhou, Yifan; Xu, Xudong; Dai, Bo; Pan, Xingang |
Keywords | Diffusion models; Image morphing; video generation |
Issue Date | 2024 |
Citation | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2024, p. 7912-7921 |
Abstract | Diffusion models have achieved remarkable image generation quality surpassing previous generative models. However, a notable limitation of diffusion models, in comparison to GANs, is their difficulty in smoothly interpolating between two image samples, due to their highly unstructured latent space. Such a smooth interpolation is intriguing as it naturally serves as a solution for the image morphing task with many applications. In this work, we address this limitation via DiffMorpher, an approach that enables smooth and natural image interpolation by harnessing the prior knowledge of a pretrained diffusion model. Our key idea is to capture the semantics of the two images by fitting two LoRAs to them respectively, and interpolate between both the LoRA parameters and the latent noises to ensure a smooth semantic transition, where correspondence automatically emerges without the need for annotation. In addition, we propose an attention interpolation and injection technique, an adaptive normalization adjustment method, and a new sampling schedule to further enhance the smoothness between consecutive images. Extensive experiments demonstrate that DiffMorpher achieves starkly better image morphing effects than previous methods across a variety of object categories, bridging a critical functional gap that distinguished diffusion models from GANs. |
Persistent Identifier | http://hdl.handle.net/10722/352462 |
ISSN | 1063-6919 (2023 SCImago Journal Rankings: 10.331) |
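
The abstract describes interpolating both the LoRA parameters and the latent noises between the two input images. A minimal, illustrative PyTorch sketch of that core idea (hypothetical helper names; not the authors' released implementation, and assuming two already-fitted LoRA state dicts with matching keys):

```python
# Illustrative sketch: blend two LoRA parameter sets linearly and two
# Gaussian latent noise tensors spherically, as described in the abstract.
import torch


def lerp_lora(lora_a: dict, lora_b: dict, alpha: float) -> dict:
    """Linearly interpolate two LoRA state dicts with identical keys/shapes."""
    return {k: (1 - alpha) * lora_a[k] + alpha * lora_b[k] for k in lora_a}


def slerp(z_a: torch.Tensor, z_b: torch.Tensor, alpha: float) -> torch.Tensor:
    """Spherical interpolation between two latent noise tensors."""
    a, b = z_a.flatten(), z_b.flatten()
    cos = (a @ b) / (a.norm() * b.norm())
    omega = torch.acos(cos.clamp(-1 + 1e-7, 1 - 1e-7))
    out = (torch.sin((1 - alpha) * omega) * a + torch.sin(alpha * omega) * b) / torch.sin(omega)
    return out.view_as(z_a)


if __name__ == "__main__":
    # Toy stand-ins for the fitted LoRA weights and the two inverted latents.
    lora_a = {"up": torch.randn(4, 8), "down": torch.randn(8, 4)}
    lora_b = {"up": torch.randn(4, 8), "down": torch.randn(8, 4)}
    z_a, z_b = torch.randn(1, 4, 64, 64), torch.randn(1, 4, 64, 64)
    for alpha in torch.linspace(0, 1, 5):
        lora_t = lerp_lora(lora_a, lora_b, float(alpha))  # blended adapter weights
        z_t = slerp(z_a, z_b, float(alpha))               # blended latent noise
        # A pretrained diffusion model would denoise z_t with lora_t applied
        # to produce one intermediate frame of the morphing sequence.
```

The sketch omits the paper's additional components (attention interpolation and injection, adaptive normalization adjustment, and the new sampling schedule), which the abstract credits with further smoothing consecutive frames.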
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Zhang, Kaiwen | - |
dc.contributor.author | Zhou, Yifan | - |
dc.contributor.author | Xu, Xudong | - |
dc.contributor.author | Dai, Bo | - |
dc.contributor.author | Pan, Xingang | - |
dc.date.accessioned | 2024-12-16T03:59:11Z | - |
dc.date.available | 2024-12-16T03:59:11Z | - |
dc.date.issued | 2024 | - |
dc.identifier.citation | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2024, p. 7912-7921 | - |
dc.identifier.issn | 1063-6919 | - |
dc.identifier.uri | http://hdl.handle.net/10722/352462 | - |
dc.description.abstract | Diffusion models have achieved remarkable image generation quality surpassing previous generative models. However, a notable limitation of diffusion models, in comparison to GANs, is their difficulty in smoothly interpolating between two image samples, due to their highly unstructured latent space. Such a smooth interpolation is intriguing as it naturally serves as a solution for the image morphing task with many applications. In this work, we address this limitation via DiffMorpher, an approach that enables smooth and natural image interpolation by harnessing the prior knowledge of a pretrained diffusion model. Our key idea is to capture the semantics of the two images by fitting two LoRAs to them respectively, and interpolate between both the LoRA parameters and the latent noises to ensure a smooth semantic transition, where correspondence automatically emerges without the need for annotation. In addition, we propose an attention interpolation and injection technique, an adaptive normalization adjustment method, and a new sampling schedule to further enhance the smoothness between consecutive images. Extensive experiments demonstrate that DiffMorpher achieves starkly better image morphing effects than previous methods across a variety of object categories, bridging a critical functional gap that distinguished diffusion models from GANs. | - |
dc.language | eng | - |
dc.relation.ispartof | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition | - |
dc.subject | Diffusion models | - |
dc.subject | Image morphing | - |
dc.subject | video generation | - |
dc.title | DiffMorpher: Unleashing the Capability of Diffusion Models for Image Morphing | - |
dc.type | Conference_Paper | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1109/CVPR52733.2024.00756 | - |
dc.identifier.scopus | eid_2-s2.0-85202201317 | - |
dc.identifier.spage | 7912 | - |
dc.identifier.epage | 7921 | - |