SDDFusion: Segmentation and Detection-Driven Infrared and Visible Image Fusion Network
DOI:
CSTR:
Author:
Affiliation:

School of Electronic and Optical Engineering, Nanjing University of Science and Technology, Nanjing 210094, Jiangsu, China

Author biography:

Corresponding author:

CLC number:

Fund project:

National Natural Science Foundation of China (General Program, Key Program, Major Program)




    Abstract:

    Infrared and visible image fusion aims to integrate complementary information from thermal radiation and reflected-light imaging across spectral bands, simultaneously highlighting salient targets and preserving texture details in complex scenes, thereby providing more comprehensive input for both human perception and machine vision. To further improve the quality of fused images and their performance on downstream tasks, this paper proposes a segmentation- and detection-driven infrared and visible image fusion network. The unified framework consists of a fusion network and two task-driven branches, a target discriminator and a segmentation branch, which guide the fusion network to retain richer high-level semantics through their respective loss functions. To enhance feature representation, we design a dense connection and gradient residual module (DCGRM) based on dense blocks for deep feature extraction, and introduce a large kernel attention (LKA) module in the decoding stage to focus on key regions and reduce information loss, further improving the quality of the fused images. Experiments on three public datasets demonstrate that the proposed method effectively integrates the complementary strengths of both modalities, highlighting salient targets while preserving rich details; it outperforms the compared methods on multiple fusion metrics and achieves real-time inference speed. Moreover, benefiting from its co-design with downstream tasks, the proposed method also exhibits performance advantages on downstream vision tasks such as segmentation and detection.
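For readers who want a concrete picture of the LKA module named in the abstract, below is a minimal PyTorch sketch, assuming the standard large kernel attention decomposition popularized by the Visual Attention Network line of work: a 5x5 depthwise convolution, a 7x7 depthwise dilated convolution (dilation 3), and a 1x1 pointwise convolution, followed by element-wise gating of the input. The kernel sizes and channel count are illustrative assumptions; the paper's exact configuration is not reproduced here.

import torch
import torch.nn as nn

class LKA(nn.Module):
    """Large kernel attention sketch: approximates a large-kernel
    convolution with a 5x5 depthwise conv, a 7x7 depthwise dilated conv
    (dilation 3, effective receptive field 19x19), and a 1x1 pointwise
    conv, then reweights the input feature map element-wise."""
    def __init__(self, channels: int):
        super().__init__()
        self.dw = nn.Conv2d(channels, channels, kernel_size=5,
                            padding=2, groups=channels)
        self.dw_dilated = nn.Conv2d(channels, channels, kernel_size=7,
                                    padding=9, dilation=3, groups=channels)
        self.pw = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        attn = self.pw(self.dw_dilated(self.dw(x)))
        return x * attn  # gate decoder features by the attention map

if __name__ == "__main__":
    feats = torch.randn(1, 64, 128, 128)  # hypothetical decoder feature map
    out = LKA(64)(feats)
    print(out.shape)  # torch.Size([1, 64, 128, 128]); spatial size preserved

Likewise, joint frameworks of the kind described above are typically trained with a composite objective of the form L_total = L_fusion + λ_seg·L_seg + λ_det·L_det, where the segmentation and discriminator losses back-propagate semantic guidance into the fusion network; the weights and symbols here are illustrative, not the paper's notation.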

History
  • Received: 2025-09-25
  • Revised: 2025-10-26
  • Accepted: 2025-10-28
  • Online publication date:
  • Publication date: