CP-EB: Talking Face Generation with Controllable Pose and Eye Blinking Embedding

Jianzong Wang, Yimin Deng, Ziqi Liang, Xulong Zhang, Ning Cheng, Jing Xiao

October 2023

Pipeline of the proposed generation framework

Abstract

This paper proposes a talking face generation method named “CP-EB” that takes an audio signal as input and a person image as reference, and synthesizes a photo-realistic talking face video with head poses controlled by a short video clip and with proper eye blinking embedded. Both head pose and eye blinking are important cues for deepfake detection. Implicit control of poses by video has already been achieved by state-of-the-art work. According to recent research, eye blinking is weakly correlated with the input audio, which means that blink features can be extracted from audio and used for generation. Hence, we propose a GAN-based architecture that extracts eye blink features from the input audio and the reference video respectively, applies contrastive training between them, and then embeds the blink feature into the concatenated identity and pose features to generate talking face images. Experimental results show that the proposed method can generate photo-realistic talking faces with synchronized lip motion, natural head poses, and blinking eyes.
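
The contrastive training and feature embedding described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the InfoNCE-style loss, the cosine similarity, the temperature value, and the helper names `info_nce` and `embed_blink` are all assumptions chosen for illustration, and the abstract does not specify the loss or how the blink feature is combined with the identity and pose features (plain concatenation is assumed here).

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-8)


def info_nce(audio_feats, video_feats, temperature=0.1):
    """Hypothetical contrastive loss between blink features.

    audio_feats[i] and video_feats[i] are treated as a positive pair;
    every other video feature in the batch acts as a negative.
    """
    losses = []
    for i, a in enumerate(audio_feats):
        logits = [cosine(a, v) / temperature for v in video_feats]
        # Numerically stable log-sum-exp over the batch.
        m = max(logits)
        log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
        losses.append(log_sum - logits[i])
    return sum(losses) / len(losses)


def embed_blink(identity, pose, blink):
    """Assumed fusion step: append the blink feature to identity + pose."""
    return list(identity) + list(pose) + list(blink)


# Toy blink features: two audio clips and their matching video clips.
audio = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]
swapped = [aligned[1], aligned[0]]

loss_aligned = info_nce(audio, aligned)
loss_swapped = info_nce(audio, swapped)
fused = embed_blink([0.1, 0.2], [0.3], [0.4])
```

In this toy setup the loss is lower when each audio blink feature is paired with its own video blink feature than when the pairs are swapped, which is the behavior contrastive training is meant to encourage.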

Publication

In 2023 International Symposium on Parallel and Distributed Processing with Applications

Jianzong Wang

Honorary Director

Yimin Deng

University of Science and Technology of China

Ziqi Liang

University of Science and Technology of China

Xulong Zhang

Executive Director