Article Abstract
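XU Kaili, ZHANG Qian, HE Jian, TENG Lin. Facial Image Inpainting Method Based on Attention Mechanism and Multi-scale Feature Aggregation Module[J]. Journal of Hainan Normal University (Natural Science), 2024, 37(3): 338-347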
Facial Image Inpainting Method Based on Attention Mechanism and Multi-scale Feature Aggregation Module
  
DOI:10.12051/j.issn.1674-4942.2024.03.011
Keywords: image inpainting; attention mechanism; multi-scale feature aggregation; dropout method; face image
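Funding: University-level Research Project of Guizhou Minzu University (GZMUZK[2021]YB23); Key Laboratory of Big Data Analysis and Intelligent Computing of Guizhou Higher Education Institutions (Qian Jiao Ji [2023] No. 012)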
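Authors and affiliations
XU Kaili1,2, ZHANG Qian1,2, HE Jian1,2, TENG Lin1,2
1. School of Data Science and Information Engineering, Guizhou Minzu University, Guiyang, Guizhou 550025
2. Guizhou Key Laboratory of Pattern Recognition and Intelligent Systems, Guiyang, Guizhou 550025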
Abstract:
      Deep-learning-based face image inpainting algorithms tend to lose information when extracting deep features and can neglect the semantic features of the image, which leads to structurally implausible results and hinders the restoration of texture details. To address these problems, this paper proposes improving the face image inpainting network with a convolutional block attention module and a multi-scale feature aggregation module. First, a face image inpainting method based on the convolutional block attention module is proposed to strengthen semantic inpainting and ensure that the model generates results with clear textures; at the same time, a multi-scale feature aggregation module is used to capture deep image features, and fusing features across scales reduces the information lost during convolution. Second, a regularized CNN encoder-decoder structure is designed to counter overfitting in the inpainting network and improve its generalization ability. In quantitative experiments on the FFHQ dataset with large mask ratios, the peak signal-to-noise ratio, structural similarity index, and mean absolute error reach 21.7042 dB, 0.7492, and 0.0418, respectively, demonstrating that the proposed method outperforms existing image inpainting methods and reconstructs face images with complete information and clear texture details.
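To make two of the reported metrics concrete, the following is a minimal sketch of how mean absolute error and peak signal-to-noise ratio are commonly computed between an original image and its inpainted version. It is not the authors' evaluation code. The function names, the toy 2x2 images, and the assumption that pixel intensities are normalized to [0, 1] are illustrative choices, not details taken from the paper. The structural similarity index is omitted because it needs a windowed computation that does not fit a short example.

```python
import math


def mean_absolute_error(reference, restored):
    """Mean absolute pixel difference between two equally sized images."""
    diffs = [abs(r - s)
             for ref_row, res_row in zip(reference, restored)
             for r, s in zip(ref_row, res_row)]
    return sum(diffs) / len(diffs)


def psnr(reference, restored, max_value=1.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    squared = [(r - s) ** 2
               for ref_row, res_row in zip(reference, restored)
               for r, s in zip(ref_row, res_row)]
    mse = sum(squared) / len(squared)
    if mse == 0:
        return math.inf
    # max_value is the largest possible pixel intensity (assumed 1.0 here).
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 2x2 grayscale images with intensities in [0, 1] (illustrative values only).
original = [[0.0, 0.5], [1.0, 0.25]]
inpainted = [[0.1, 0.5], [0.9, 0.25]]

mae_value = mean_absolute_error(original, inpainted)
psnr_value = psnr(original, inpainted)
print(f"MAE = {mae_value:.4f}, PSNR = {psnr_value:.2f} dB")
```

Lower MAE and higher PSNR both indicate a restored image closer to the original, which is the direction in which the figures in the abstract should be read.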