Multi-scale spatial attention mechanism-guided feature distillation method for infrared small target detection networks

Affiliation:

1. Institute of Remote Sensing Satellite, China Academy of Space Technology; 2. College of Electronic Science and Technology, National University of Defence Technology; 3. Research Center for Space Optical Engineering, Harbin Institute of Technology

Fund Project:

National Natural Science Foundation of China (61972435, 61401474, 61921001, 62001478)

Abstract:

Existing deep learning methods have achieved notable results in infrared small target detection, but their high computational cost makes them hard to deploy in resource-constrained scenarios. Knowledge distillation methods that balance a lightweight design with high accuracy are therefore needed to improve the runtime efficiency of infrared small target detection networks. However, because of the extreme characteristics of infrared small targets, applying conventional distillation methods directly to lightweight infrared detection networks causes two problems: small-target information is lost or diffused during knowledge transfer, and the hierarchical feature representations of the teacher and student networks are mismatched. Both impair the student network's ability to learn small-target features and limit further gains in detection performance. To address these issues, this paper proposes a feature distillation method guided by a multi-scale spatial attention mechanism. First, a multi-scale spatial attention (MSA) mechanism is designed to capture and fuse multi-scale information of target features, thereby effectively locating the target region. Then, an L2 normalization strategy for image features is designed, which focuses on the differences in feature distribution between the teacher and student networks. Finally, an adaptive weighted mean square error (AWMSE) distillation loss is proposed to guide the student network to strengthen its learning of key target regions. Experimental results on two widely used datasets (NUDT-SIRST, NUAA-SIRST) show that the proposed distillation method achieves better detection performance: the student network reaches detection performance comparable to that of the teacher network, and the lightweight model, deployed for inference on HUAWEI and NVIDIA edge devices, runs more than 2x faster than the teacher network. Its overall performance is better than that of existing distillation methods.
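The abstract names three components (a multi-scale spatial attention map, L2 normalization of features, and an attention-weighted mean square error loss) but does not give their formulas. The following is only a minimal pure-Python sketch of how such a pipeline could fit together, under assumptions that are not taken from the paper: attention is computed from the teacher's channel-averaged absolute activations, the scales are box filters of size 1, 3 and 5 fused by averaging, normalization is per channel, and all function names (`multi_scale_attention`, `l2_normalize`, `attention_weighted_mse`) are hypothetical. It should not be read as the authors' MSA or AWMSE definitions.

```python
import math

# Feature maps are nested lists shaped [channels][height][width].

def spatial_energy(feat):
    """Mean absolute activation over channels, giving an H x W map."""
    c, h, w = len(feat), len(feat[0]), len(feat[0][0])
    return [[sum(abs(feat[k][i][j]) for k in range(c)) / c for j in range(w)]
            for i in range(h)]

def box_blur(m, k):
    """Average over a k x k window (k odd), using only in-bounds neighbours."""
    h, w, r = len(m), len(m[0]), k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            vals = [m[y][x]
                    for y in range(max(0, i - r), min(h, i + r + 1))
                    for x in range(max(0, j - r), min(w, j + r + 1))]
            out[i][j] = sum(vals) / len(vals)
    return out

def multi_scale_attention(feat, scales=(1, 3, 5)):
    """Assumed attention: average of box-blurred energy maps, rescaled to mean 1."""
    energy = spatial_energy(feat)
    maps = [box_blur(energy, k) for k in scales]
    h, w = len(energy), len(energy[0])
    fused = [[sum(m[i][j] for m in maps) / len(maps) for j in range(w)]
             for i in range(h)]
    mean = sum(map(sum, fused)) / (h * w)
    if mean == 0.0:  # no activation anywhere: fall back to uniform weights
        return [[1.0] * w for _ in range(h)]
    return [[v / mean for v in row] for row in fused]

def l2_normalize(feat, eps=1e-12):
    """Divide each channel by its L2 norm so teacher and student scales match."""
    out = []
    for ch in feat:
        norm = math.sqrt(sum(v * v for row in ch for v in row)) + eps
        out.append([[v / norm for v in row] for row in ch])
    return out

def attention_weighted_mse(teacher, student):
    """Squared error between normalized features, weighted by teacher attention."""
    att = multi_scale_attention(teacher)
    t, s = l2_normalize(teacher), l2_normalize(student)
    c, h, w = len(t), len(t[0]), len(t[0][0])
    total = sum(att[i][j] * (t[k][i][j] - s[k][i][j]) ** 2
                for k in range(c) for i in range(h) for j in range(w))
    return total / (c * h * w)

# Toy example: a single bright pixel stands in for a small infrared target.
teacher = [[[0.0] * 5 for _ in range(5)]]
teacher[0][2][2] = 1.0
student = [[[0.1] * 5 for _ in range(5)]]
loss = attention_weighted_mse(teacher, student)
```

In this sketch the attention map peaks at the target pixel, so errors there contribute more to the loss than errors in the background, and the per-channel normalization makes the loss insensitive to a global rescaling of the student features. A real implementation would operate on batched tensors in a deep learning framework and would follow the definitions given in the full paper.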

History
  • Received: 2025-12-03
  • Last revised: 2026-02-13
  • Accepted: 2026-02-15