EmoTalker: Emotionally Editable Talking Face Generation via Diffusion Model

The overall framework of EmoTalker


In recent years, the field of talking face generation has attracted considerable attention, and some methods can generate virtual faces that convincingly imitate human expressions. However, existing methods still generalize poorly, particularly on challenging identities. Furthermore, methods for editing expressions are often limited to a single emotion and fail to handle complex, mixed emotions. To overcome these challenges, this paper proposes EmoTalker, an emotionally editable portrait animation approach based on a diffusion model. EmoTalker modifies the denoising process to preserve the original portrait's identity during inference. To improve emotion comprehension from text input, an Emotion Intensity Block is introduced to analyze fine-grained emotions and their intensities derived from prompts. Additionally, a purpose-built dataset is used to strengthen the model's understanding of emotions in prompts. Experiments show the effectiveness of EmoTalker in generating high-quality, emotionally customizable facial expressions.
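No code accompanies this page; purely as an illustrative sketch of the general idea of mapping a text prompt to fine-grained emotion intensities that could condition a diffusion denoiser, the toy module below uses made-up names, dimensions, and architecture that are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class EmotionIntensityBlock(nn.Module):
    """Toy stand-in: map a tokenized emotion prompt to per-emotion
    intensity weights (vocabulary, dimensions, and structure are assumed)."""
    def __init__(self, vocab_size=1000, embed_dim=128, num_emotions=8):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.proj = nn.Linear(embed_dim, num_emotions)

    def forward(self, prompt_tokens):
        # Average token embeddings, then predict a soft intensity per emotion.
        h = self.embed(prompt_tokens).mean(dim=1)
        return torch.sigmoid(self.proj(h))  # (batch, num_emotions), values in [0, 1]

# Hypothetical usage: intensities would condition the denoising steps.
block = EmotionIntensityBlock()
tokens = torch.randint(0, 1000, (1, 6))   # placeholder for a tokenized prompt
intensities = block(tokens)               # fine-grained emotion strengths
print(intensities.shape)                  # torch.Size([1, 8])
```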

In 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2024)
Bingyuan Zhang
University of Science and Technology of China
Ning Cheng