BDMFuse: Multi-Scale Network Fusion for Infrared and Visible Images Based on Base and Detail Features
Author:
Affiliation:

1. Henan Agricultural University; 2. Universidade Nova de Lisboa

Fund Project:

Henan Province Key Research and Development Project; Henan Province Central Government-Guided Local Science and Technology Development Fund; Henan Provincial Outstanding Foreign Scientist Studio


Abstract:

The result of infrared and visible image fusion should highlight the salient targets of the infrared image while preserving the texture details of the visible image. To meet these requirements, this paper proposes an autoencoder-based infrared and visible image fusion method. Following the optimization objective, the encoder is split into a base encoder, which extracts the low-frequency information of the image, and a detail encoder, which captures the high-frequency information. Since this decomposition may leave some information uncaptured, we introduce a compensation encoder to supplement it. We also apply multi-scale decomposition so that the encoders extract image features more comprehensively. The features obtained by the encoders are then fed into the decoder, which first sums the low-frequency, high-frequency, and compensatory information to obtain multi-scale features. An attention map is derived from these multi-scale features and multiplied with the fused image at the corresponding scale, and a Fusion module is introduced into the multi-scale fusion process to achieve image reconstruction. The effectiveness of the proposed network is demonstrated on the TNO, RoadScene, and LLVIP datasets. Experiments show that our network better perceives illumination changes, extracts image detail information more effectively, and produces fused images that align more closely with human visual perception.
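
The abstract describes the pipeline only in prose. Below is a minimal PyTorch sketch of one plausible wiring of that pipeline: three parallel encoders (base, detail, compensation) applied at several scales, and a decoder that sums their outputs, derives an attention map per scale, and merges the attended infrared and visible features through a Fusion block. Every concrete choice here (the `conv_block` helper, single-layer encoders, a shared attention head, the channel width `ch`) is an illustrative assumption, not the paper's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block(in_ch, out_ch):
    # 3x3 conv + ReLU; stand-in for the paper's actual encoder blocks
    return nn.Sequential(nn.Conv2d(in_ch, out_ch, 3, padding=1),
                         nn.ReLU(inplace=True))

class BDMFuseSketch(nn.Module):
    """Hypothetical reading of the BDMFuse data flow described in the abstract."""

    def __init__(self, ch=16, scales=3):
        super().__init__()
        self.scales = scales
        # Three parallel encoders: base (low-frequency), detail (high-frequency),
        # and compensation (information missed by the base/detail split).
        self.base_enc = conv_block(1, ch)
        self.detail_enc = conv_block(1, ch)
        self.comp_enc = conv_block(1, ch)
        # Attention head: one spatial map per scale, derived from the features
        # (shared between the two modalities here, purely for brevity).
        self.attn = nn.Sequential(nn.Conv2d(ch, 1, 1), nn.Sigmoid())
        # "Fusion module": merges the attended infrared and visible features.
        self.fuse = conv_block(2 * ch, ch)
        self.out = nn.Conv2d(ch, 1, 1)

    def encode(self, x):
        # Multi-scale decomposition: encode the image at several resolutions;
        # at each scale, sum low-frequency, high-frequency and compensatory info.
        feats = []
        for s in range(self.scales):
            xs = F.interpolate(x, scale_factor=0.5 ** s, mode='bilinear',
                               align_corners=False) if s > 0 else x
            feats.append(self.base_enc(xs) + self.detail_enc(xs) + self.comp_enc(xs))
        return feats

    def forward(self, ir, vis):
        fused = None
        # Coarse-to-fine reconstruction across the scales.
        for f_ir, f_vis in zip(reversed(self.encode(ir)), reversed(self.encode(vis))):
            # Attention maps multiplied with the features of the matching scale,
            # then merged by the Fusion block.
            f = self.fuse(torch.cat([self.attn(f_ir) * f_ir,
                                     self.attn(f_vis) * f_vis], dim=1))
            fused = f if fused is None else f + F.interpolate(
                fused, size=f.shape[-2:], mode='bilinear', align_corners=False)
        return torch.sigmoid(self.out(fused))

# Smoke test with random single-channel images.
ir = torch.rand(1, 1, 128, 128)
vis = torch.rand(1, 1, 128, 128)
print(BDMFuseSketch()(ir, vis).shape)  # torch.Size([1, 1, 128, 128])
```

Note that in the paper the base and detail encoders are driven apart by the optimization objective (low- vs. high-frequency losses) rather than by architecture alone; the sketch reproduces only the data flow, not that training setup.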

History
  • Received: 2024-07-25
  • Revised: 2024-09-11
  • Accepted: 2024-09-13