Facial Image Inpainting Method Based on Attention Mechanism and Multi-scale Feature Aggregation Module

Xu Kaili<sup>1,2</sup>, Zhang Qian<sup>1,2</sup>, He Jian<sup>1,2</sup>, Teng Lin<sup>1,2</sup>

Article Abstract

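Xu Kaili, Zhang Qian, He Jian, Teng Lin. Facial Image Inpainting Method Based on Attention Mechanism and Multi-scale Feature Aggregation Module[J]. Journal of Hainan Normal University (Natural Science), 2024, 37(3): 338-347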

Facial Image Inpainting Method Based on Attention Mechanism and Multi-scale Feature Aggregation Module

DOI: 10.12051/j.issn.1674-4942.2024.03.011

Chinese keywords: image inpainting; attention mechanism; multi-scale feature fusion; dropout method; face image

English keywords: image inpainting; attention mechanism; multi-scale feature aggregation; dropout method; face image

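Funding: Guizhou Minzu University university-level research project (GZMUZK[2021]YB23); Key Laboratory of Big Data Analysis and Intelligent Computing of Guizhou Provincial Higher Education Institutions (Qianjiaoji [2023] No. 012)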

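Authors	Affiliations
Xu Kaili^1,2, Zhang Qian^1,2, He Jian^1,2, Teng Lin^1,2	1. School of Data Science and Information Engineering, Guizhou Minzu University, Guiyang 550025, Guizhou, China; 2. Guizhou Key Laboratory of Pattern Recognition and Intelligent Systems, Guiyang 550025, Guizhou, China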

Chinese abstract:

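Deep-learning-based face image inpainting algorithms lose information when extracting deep features and tend to overlook semantic image features, which produces structurally implausible results and hinders the restoration of texture detail. To address these problems, this paper proposes improving the face image inpainting network with a convolutional attention module and a multi-scale feature fusion module. First, a face image inpainting method based on the convolutional attention module is proposed to strengthen semantic inpainting of face images and ensure that the model generates clear texture results; at the same time, a multi-scale feature fusion module is used to obtain deep image features, and fusing multi-scale features reduces the information lost during convolution. Second, a regularized CNN encoder-decoder structure is designed to address overfitting in the inpainting network and improve its generalization ability. In quantitative experiments on the FFHQ dataset, at larger mask ratios the peak signal-to-noise ratio, structural similarity index, and mean absolute error reach 21.704 2 dB, 0.749 2, and 0.041 8, respectively, showing that the proposed method outperforms existing image inpainting methods and can reconstruct face images with complete information and clear texture detail.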

English abstract:

Face image inpainting algorithms based on deep learning often lose information when capturing deep features. They can therefore neglect the semantic features of an image and produce structurally implausible results, which hinders the restoration of texture detail. To address these issues, we propose an improved face image inpainting network that incorporates a convolutional block attention module and a multi-scale feature aggregation module. First, a face image inpainting method based on the convolutional block attention module is introduced to enhance semantic inpainting and ensure that the model generates clear textures. At the same time, a multi-scale feature aggregation module is used to capture deep image features and reduce the information lost during convolution. Second, a CNN encoder-decoder structure with regularization is designed to mitigate overfitting in the inpainting network and improve its generalization ability. Quantitative experiments on the FFHQ dataset show that, at larger mask ratios, the peak signal-to-noise ratio, structural similarity index, and mean absolute error reach 21.704 2 dB, 0.749 2, and 0.041 8, respectively. These results indicate that the proposed method outperforms existing image inpainting approaches and reconstructs face images with complete information and clear texture detail.
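
For readers unfamiliar with the reported metrics, the following is a minimal sketch of how peak signal-to-noise ratio (PSNR) and mean absolute error (MAE) are commonly computed between a reference image and a restored image. It is not the authors' evaluation code. It assumes pixel values normalized to [0, 1] and images flattened into plain lists; the function names and the sample values are hypothetical. The structural similarity index is omitted because it needs windowed local statistics.

```python
import math


def mae(reference, restored):
    """Mean absolute error between two equally sized pixel sequences."""
    if len(reference) != len(restored) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(r - s) for r, s in zip(reference, restored)) / len(reference)


def psnr(reference, restored, max_value=1.0):
    """Peak signal-to-noise ratio in dB; infinite when the inputs are identical."""
    if len(reference) != len(restored) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf
    # max_value is the largest possible pixel value (assumed 1.0 here).
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical flattened pixel values in [0, 1]; not data from the paper.
reference = [0.0, 0.5, 1.0, 0.25]
restored = [0.1, 0.5, 0.9, 0.25]

print(f"MAE:  {mae(reference, restored):.4f}")
print(f"PSNR: {psnr(reference, restored):.2f} dB")
```

Higher PSNR and lower MAE both indicate a restored image closer to the reference, which is the direction in which the abstract's figures should be read.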
