Links for fulltext (may require subscription):
- Publisher Website (DOI): 10.1109/TPAMI.2021.3078270
- Scopus: eid_2-s2.0-85105882963
- PMID: 33961551
Article: Liquid Warping GAN with Attention: A Unified Framework for Human Image Synthesis
Title | Liquid Warping GAN with Attention: A Unified Framework for Human Image Synthesis |
---|---|
Authors | Liu, Wen; Piao, Zhixin; Tu, Zhi; Luo, Wenhan; Ma, Lin; Gao, Shenghua |
Keywords | appearance transfer; generative adversarial network; human image synthesis; motion imitation; novel view synthesis |
Issue Date | 2022 |
Citation | IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, v. 44, n. 9, p. 5114-5131 |
Abstract | We tackle human image synthesis, including human motion imitation, appearance transfer, and novel view synthesis, within a unified framework: once trained, the model can handle all of these tasks. Existing task-specific methods mainly use 2D keypoints (pose) to estimate the human body structure. However, keypoints express only position; they can neither characterize the personalized shape of the person nor model limb rotations. In this paper, we propose to use a 3D body mesh recovery module to disentangle pose and shape, which models not only joint locations and rotations but also the personalized body shape. To preserve source information such as texture, style, color, and face identity, we propose an Attentional Liquid Warping GAN with an Attentional Liquid Warping Block (AttLWB) that propagates the source information in both image and feature spaces to the synthesized reference. Specifically, the source features are extracted by a denoising convolutional auto-encoder to characterize the source identity well. Furthermore, the proposed method supports more flexible warping from multiple sources. To further improve generalization to unseen source images, one/few-shot adversarial learning is applied: a model is first trained on an extensive training set and then fine-tuned on one or a few unseen images in a self-supervised way to generate high-resolution (512×512 and 1024×1024) results. We also build a new dataset, the Impersonator (iPER) dataset, for evaluating human motion imitation, appearance transfer, and novel view synthesis. Extensive experiments demonstrate the effectiveness of our methods in preserving face identity, shape consistency, and clothing details. All code and the dataset are available at https://impersonator.org/work/impersonator-plus-plus.html. |
Persistent Identifier | http://hdl.handle.net/10722/345032 |
ISSN | 0162-8828 (2023 Impact Factor: 20.8; 2023 SCImago Journal Rankings: 6.158) |
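The abstract's central mechanism is warping source information so it aligns with the target pose. The sketch below illustrates only the generic building block behind that idea: bilinear resampling of a feature map along a dense flow field. All names here (`bilinear_sample`, `warp`, `flow`) are illustrative, not from the paper; the actual AttLWB operates on learned CNN features with a transformation flow derived from 3D body-mesh correspondences and an attention mechanism for fusing multiple sources.

```python
def bilinear_sample(feat, x, y):
    """Sample a 2D feature map at fractional coordinates (x, y)."""
    h, w = len(feat), len(feat[0])
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    dx, dy = x - x0, y - y0
    return (feat[y0][x0] * (1 - dx) * (1 - dy)
            + feat[y0][x1] * dx * (1 - dy)
            + feat[y1][x0] * (1 - dx) * dy
            + feat[y1][x1] * dx * dy)

def warp(feat, flow):
    """Warp a feature map by a dense flow field, where flow[y][x] = (sx, sy)
    gives the source location each target pixel should copy from."""
    h, w = len(feat), len(feat[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sx, sy = flow[y][x]
            # Clamp coordinates so sampling stays inside the source map.
            sx = min(max(sx, 0.0), w - 1.0)
            sy = min(max(sy, 0.0), h - 1.0)
            out[y][x] = bilinear_sample(feat, sx, sy)
    return out
```

Because sampling is differentiable in the coordinates, a warp like this can sit inside a generator and be trained end to end, which is what makes feature-space propagation of source identity possible.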
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Liu, Wen | - |
dc.contributor.author | Piao, Zhixin | - |
dc.contributor.author | Tu, Zhi | - |
dc.contributor.author | Luo, Wenhan | - |
dc.contributor.author | Ma, Lin | - |
dc.contributor.author | Gao, Shenghua | - |
dc.date.accessioned | 2024-08-15T09:24:47Z | - |
dc.date.available | 2024-08-15T09:24:47Z | - |
dc.date.issued | 2022 | - |
dc.identifier.citation | IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, v. 44, n. 9, p. 5114-5131 | - |
dc.identifier.issn | 0162-8828 | - |
dc.identifier.uri | http://hdl.handle.net/10722/345032 | - |
dc.description.abstract | We tackle human image synthesis, including human motion imitation, appearance transfer, and novel view synthesis, within a unified framework: once trained, the model can handle all of these tasks. Existing task-specific methods mainly use 2D keypoints (pose) to estimate the human body structure. However, keypoints express only position; they can neither characterize the personalized shape of the person nor model limb rotations. In this paper, we propose to use a 3D body mesh recovery module to disentangle pose and shape, which models not only joint locations and rotations but also the personalized body shape. To preserve source information such as texture, style, color, and face identity, we propose an Attentional Liquid Warping GAN with an Attentional Liquid Warping Block (AttLWB) that propagates the source information in both image and feature spaces to the synthesized reference. Specifically, the source features are extracted by a denoising convolutional auto-encoder to characterize the source identity well. Furthermore, the proposed method supports more flexible warping from multiple sources. To further improve generalization to unseen source images, one/few-shot adversarial learning is applied: a model is first trained on an extensive training set and then fine-tuned on one or a few unseen images in a self-supervised way to generate high-resolution (512×512 and 1024×1024) results. We also build a new dataset, the Impersonator (iPER) dataset, for evaluating human motion imitation, appearance transfer, and novel view synthesis. Extensive experiments demonstrate the effectiveness of our methods in preserving face identity, shape consistency, and clothing details. All code and the dataset are available at https://impersonator.org/work/impersonator-plus-plus.html. | - |
dc.language | eng | - |
dc.relation.ispartof | IEEE Transactions on Pattern Analysis and Machine Intelligence | - |
dc.subject | appearance transfer | - |
dc.subject | generative adversarial network | - |
dc.subject | Human image synthesis | - |
dc.subject | motion imitation | - |
dc.subject | novel view synthesis | - |
dc.title | Liquid Warping GAN with Attention: A Unified Framework for Human Image Synthesis | - |
dc.type | Article | - |
dc.description.nature | link_to_subscribed_fulltext | - |
dc.identifier.doi | 10.1109/TPAMI.2021.3078270 | - |
dc.identifier.pmid | 33961551 | - |
dc.identifier.scopus | eid_2-s2.0-85105882963 | - |
dc.identifier.volume | 44 | - |
dc.identifier.issue | 9 | - |
dc.identifier.spage | 5114 | - |
dc.identifier.epage | 5131 | - |
dc.identifier.eissn | 1939-3539 | - |