Duomin Wang (王多民)

Email: wangduomin[at]xiaobing.ai       Google Scholar      Github

I have been a researcher at Xiaobing.ai since 2021. My research interests include talking head synthesis, representation learning, disentanglement, and 3D face reconstruction.

Before joining Xiaobing, I worked at OPPO Research Institute for three years, where my research was deployed as the core face algorithms in the camera software of OPPO mobile phones.

I am looking for research interns working on talking head synthesis and representation learning. Feel free to send me an email if you are interested.

News

2024/02/27   Had two papers accepted by CVPR 2024: one on 4D avatar synthesis (Portrait4D), the other on unconstrained virtual try-on (PICTURE).

2023/07/14   Had one paper accepted by ICCV 2023 on talking head synthesis (TH-PAD).

2023/07/10   The code and model of our CVPR 2023 work PD-FGC have been released, check them out!

2023/02/28   Had one paper accepted by CVPR 2023 on talking head synthesis (PD-FGC).

Publications
Portrait4D-v2: Pseudo Multi-View Data Creates Better 4D Head Synthesizer
Yu Deng, Duomin Wang, Baoyuan Wang
arXiv,
[PDF] [Project] [Code]

We learn a lifelike 4D head synthesizer by creating pseudo multi-view videos from monocular ones as supervision.

PICTURE: PhotorealistIC Virtual Try-on from UnconstRained dEsigns
Shuliang Ning, Duomin Wang, Yipeng Qin, Zirong Jin, Baoyuan Wang, Xiaoguang Han
2024 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2024,
[PDF] [Project] [Code] [BibTeX]

We propose a novel virtual try-on from unconstrained designs (ucVTON) task to enable photorealistic synthesis of personalized composite clothing on an input human image.

Learning One-Shot 4D Head Avatar Synthesis using Synthetic Data
Yu Deng, Duomin Wang, Xiaohang Ren, Xingyu Chen, Baoyuan Wang
2024 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2024,
[PDF] [Project] [Code] [BibTeX]

We propose a one-shot 4D head synthesis approach that reconstructs high-fidelity 4D head avatars while being trained on large-scale synthetic data.

Disentangling Planning, Driving and Rendering for Photorealistic Avatar Agents
Duomin Wang, Bin Dai, Yu Deng, Baoyuan Wang
arXiv 2023,
[PDF] [Project] [Code] [BibTeX]

We introduce a system that harnesses LLMs to produce a series of detailed text descriptions of the avatar agents' facial motions. These descriptions are then processed by our task-agnostic driving engine into motion token sequences, which are subsequently converted into continuous motion embeddings and finally consumed by our standalone neural-based renderer to generate the photorealistic avatar animations.

Talking Head Generation with Probabilistic Audio-to-Visual Diffusion Priors
Zhentao Yu, Zixin Yin, Deyu Zhou, Duomin Wang, Finn Wong, Baoyuan Wang
2023 IEEE International Conference on Computer Vision, ICCV 2023,
[PDF] [Project] [Code(coming soon)] [BibTeX]

We introduce a simple and novel framework for one-shot audio-driven talking head generation. Unlike prior works that require additional driving sources for controlled synthesis in a deterministic manner, we instead probabilistically sample all the holistic lip-irrelevant facial motions (i.e. pose, expression, blink, gaze, etc.) to semantically match the input audio while still maintaining both the photo-realism of audio-lip synchronization and the overall naturalness.

Progressive Disentangled Representation Learning for Fine-Grained Controllable Talking Head Synthesis
Duomin Wang, Yu Deng, Zixin Yin, Heung-Yeung Shum, Baoyuan Wang
2023 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2023,
[PDF] [Project] [Code] [BibTeX]

We present a novel one-shot talking head synthesis method that achieves disentangled and fine-grained control over lip motion, eye gaze & blink, head pose, and emotional expression. We represent different motions via disentangled latent representations and leverage an image generator to synthesize talking heads from them.


The website template was adapted from Yu Deng.