Article Abstract
Content recognition method for AI animation production based on deep learning
Submitted: 2026-01-07  Revised: 2026-04-25
DOI:
Keywords: Anime content recognition  Bimodal fusion  Spatiotemporal attention  3D convolution  ConvLSTM
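Funding: 2023 Anhui Provincial Higher Education Scientific Research Project, Key Project in Applied Basic Research, "Research on Car-Following Strategies for Autonomous Vehicles Based on Supervised Learning Methods" (2023AH052128)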
Author  Affiliation  Postal code
熊帆*  Anhui Yangtze Vocational and Technical College  241000
张业柱  Anhui Polytechnic University
毛钰婷  Google
Abstract:
      Traditional methods struggle to capture spatiotemporal dynamics and to cope with complex background interference in anime video content recognition. To address this, the study proposes a deep learning recognition method that fuses bimodal temporal features with attention mechanisms. The method uses a three-dimensional convolutional network to extract short-term spatiotemporal features and a convolutional long short-term memory (ConvLSTM) network to capture long-term temporal context; it introduces a motion excitation module to strengthen the perception of motion information and applies spatiotemporal coordinate attention to focus on key spatiotemporal regions. Experimental results show strong performance along several dimensions. For recognition stability, the mean average precision (mAP) reaches 93.2% and remains at 85.6% at 12 fps. For spatiotemporal modeling, spatial attention coverage reaches 89.2% at 32 frames. For robustness to background interference, recognition accuracy reaches 95.1% when the proportion of complex background is 30%-40%. These results indicate that the method outperforms mainstream comparison methods in recognition stability, spatial perception, and adaptability to complex environments, and that it has theoretical and practical value for the intelligent upgrading of animation production workflows.
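For readers unfamiliar with the headline metric, the abstract's recognition-stability figure is a mean average precision (平均精度均值, mAP). The abstract does not say which variant of average precision was used, so the following is only a minimal sketch assuming the non-interpolated, ranking-based definition. The function names, class labels, and scores are hypothetical illustrations and are not taken from the paper.

```python
from typing import Dict, List, Tuple


def average_precision(scored: List[Tuple[float, bool]]) -> float:
    """Non-interpolated average precision for one class.

    `scored` holds (confidence, is_relevant) pairs, one per sample.
    """
    # Rank samples from most to least confident.
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, relevant) in enumerate(ranked, start=1):
        if relevant:
            hits += 1
            # Precision measured at the rank of each relevant sample.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(
    per_class: Dict[str, List[Tuple[float, bool]]]
) -> float:
    """Unweighted mean of the per-class average precision values."""
    if not per_class:
        return 0.0
    total = sum(average_precision(s) for s in per_class.values())
    return total / len(per_class)


# Hypothetical toy scores for two made-up classes; not data from the paper.
toy = {
    "character": [(0.9, True), (0.8, False), (0.7, True)],
    "background": [(0.6, True), (0.4, True), (0.2, False)],
}
toy_map = mean_average_precision(toy)
```

Under this definition, a class whose relevant samples are all ranked ahead of the irrelevant ones scores 1.0, and each misranked sample lowers the precision recorded at every later relevant sample.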