Coarse-to-Fine Auto-encoder Networks (CFAN) for Real-time Face Alignment |
|
Course URL: | http://videolectures.net/eccv2014_zhang_face_alignment/ |
Lecturer: | Jie Zhang |
Institution: | Chinese Academy of Sciences |
Date: | 2014-10-29 |
Language: | English |
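Chinese summary (translated): | Accurate face alignment is an important prerequisite for most face perception tasks, such as face recognition, facial expression analysis, and non-realistic face re-rendering. It can be formulated as the nonlinear inference of facial landmarks from the detected face region. A deep network seems a good choice for modeling this nonlinearity, but applying one directly is not trivial. Rather than applying a deep network directly, this paper proposes a Coarse-to-Fine Auto-encoder Networks (CFAN) approach, which cascades several successive Stacked Auto-encoder Networks (SANs). Specifically, the first SAN takes a low-resolution version of the detected face as holistic input and gives a quick, reasonably accurate preliminary prediction of the landmarks. Each subsequent SAN then refines the landmarks step by step, taking as input local features extracted at progressively higher resolution around the current landmarks (the output of the previous SAN). Extensive experiments on three challenging datasets show that CFAN outperforms the state of the art and runs in real time (40+ fps, excluding face detection, on a desktop). |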
Course description: | Accurate face alignment is a vital prerequisite step for most face perception tasks such as face recognition, facial expression analysis and non-realistic face re-rendering. It can be formulated as the nonlinear inference of the facial landmarks from the detected face region. A deep network seems a good choice to model the nonlinearity, but applying one directly is nontrivial. In this paper, instead of a straightforward application of a deep network, we propose a Coarse-to-Fine Auto-encoder Networks (CFAN) approach, which cascades a few successive Stacked Auto-encoder Networks (SANs). Specifically, the first SAN takes as input a low-resolution version of the detected face holistically and predicts the landmarks quickly, with enough accuracy to serve as a preliminary estimate. The following SANs then progressively refine the landmarks by taking as input the local features extracted around the current landmarks (the output of the previous SAN) at higher and higher resolution. Extensive experiments conducted on three challenging datasets demonstrate that our CFAN outperforms the state-of-the-art methods and runs in real time (40+ fps, excluding face detection, on a desktop). |
Keywords: | auto-encoding; datasets |
Source: | VideoLectures.NET |
Data collected: | 2020-11-15:zyk |
Last reviewed: | 2020-12-20:yumf |
Views: | 47 |
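
The coarse-to-fine cascade described in the abstract can be sketched roughly as follows. This is only an illustration of the control flow, not the authors' implementation: the networks are small untrained multilayer perceptrons with random weights, raw pixel patches stand in for the "local features", and the additive landmark update, the resolutions, the layer sizes, and all function names (`resize_nearest`, `mlp`, `init_weights`, `local_patches`, `cfan_predict`) are assumptions made for the example.

```python
import numpy as np

def resize_nearest(img, size):
    """Nearest-neighbour resize of a 2-D grayscale image to size x size."""
    h, w = img.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[np.ix_(rows, cols)]

def init_weights(sizes, rng, scale=0.01):
    """Random (untrained) weights for a stand-in network; hypothetical helper."""
    return [(rng.normal(0.0, scale, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(x, weights):
    """Feed-forward pass: sigmoid hidden layers, linear output layer."""
    for W, b in weights[:-1]:
        x = 1.0 / (1.0 + np.exp(-(x @ W + b)))
    W, b = weights[-1]
    return x @ W + b

def local_patches(img, landmarks, half):
    """Concatenate square pixel patches centred on each landmark.

    Landmarks are (x, y) pairs in normalized [0, 1] coordinates. Raw pixels
    are an assumption here; the abstract only says "local features".
    """
    size = img.shape[0]
    padded = np.pad(img, half, mode="edge")  # so border patches stay in bounds
    feats = []
    for x, y in landmarks:
        cx = int(round(float(np.clip(x, 0.0, 1.0)) * (size - 1))) + half
        cy = int(round(float(np.clip(y, 0.0, 1.0)) * (size - 1))) + half
        patch = padded[cy - half:cy + half + 1, cx - half:cx + half + 1]
        feats.append(patch.ravel())
    return np.concatenate(feats)

def cfan_predict(face, global_net, local_nets, resolutions, n_landmarks, half=2):
    """Coarse-to-fine cascade: one holistic stage, then local refinement stages."""
    # Stage 1: the whole low-resolution face gives a preliminary landmark estimate.
    low = resize_nearest(face, resolutions[0])
    shape = mlp(low.ravel(), global_net).reshape(n_landmarks, 2)
    # Later stages: features around the current landmarks at increasing
    # resolution; each stage is assumed to predict a correction that is added.
    for net, res in zip(local_nets, resolutions[1:]):
        img = resize_nearest(face, res)
        feats = local_patches(img, shape, half)
        shape = shape + mlp(feats, net).reshape(n_landmarks, 2)
    return shape

# Usage with made-up sizes: 5 landmarks, three stages at 16, 32 and 64 pixels.
rng = np.random.default_rng(0)
n_landmarks = 5
resolutions = [16, 32, 64]
half = 2
patch_dim = n_landmarks * (2 * half + 1) ** 2
global_net = init_weights([resolutions[0] ** 2, 32, 2 * n_landmarks], rng)
local_nets = [init_weights([patch_dim, 32, 2 * n_landmarks], rng)
              for _ in resolutions[1:]]
face = rng.random((128, 128))  # placeholder for a detected, cropped face
landmarks = cfan_predict(face, global_net, local_nets, resolutions, n_landmarks, half)
```

Because the weights are random, the returned coordinates are meaningless; the point is only that the first stage sees the whole face at low resolution, while each later stage sees a small neighbourhood of every landmark at a finer scale, which keeps the input to each network small.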