Lightweight remote sensing multimodal large language model based on knowledge distillation
Affiliation:

1. Key Laboratory for Information Science of Electromagnetic Waves (MoE), Fudan University, Shanghai 200433, China; 2. Image and Intelligence Laboratory, School of Information Science and Technology, Fudan University, Shanghai 200433, China

Fund Project:

Supported by the National Natural Science Foundation of China (62371140) and the National Key Research and Development Program of China (2022YFB3903404)

    Abstract:

    Remote sensing multimodal large language models (MLLMs) integrate rich visual and linguistic information and have shown great potential for remote sensing image analysis and interpretation. However, existing knowledge distillation methods focus primarily on compressing unimodal large language models and neglect the alignment of features across modalities, which limits the performance of the distilled models on cross-modal tasks. To address this issue, a knowledge-distillation-based method for building lightweight remote sensing MLLMs is proposed. The method aligns the outputs of the different modalities at the feature level, achieving effective alignment of multimodal information. The reverse Kullback-Leibler divergence is used as the loss function and is combined with optimization strategies such as teacher mixed sampling and single-step decomposition, which further improve the generalization and stability of the student model. Experimental results show that the proposed method achieves higher accuracy and efficiency on four downstream tasks (remote sensing image scene classification, visual question answering, visual localization, and image description) while significantly reducing the number of model parameters and the demand for computational resources, providing a new solution for the efficient application of MLLMs in remote sensing.
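
The abstract names the reverse Kullback-Leibler divergence as the distillation loss without giving its form. As a minimal sketch, assuming the loss is computed between the student's and teacher's softmax distributions over a single token's vocabulary, the reverse direction KL(student || teacher) can be contrasted with the usual forward direction as follows. The function names, the toy logits, and the absence of a temperature or sequence-level aggregation are illustrative assumptions, not details taken from the paper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(a, b, eps=1e-12):
    """KL(a || b) for two discrete distributions given as lists."""
    return sum(ai * (math.log(ai + eps) - math.log(bi + eps))
               for ai, bi in zip(a, b))


def reverse_kl(student_logits, teacher_logits):
    """Reverse KL: expectation is taken under the STUDENT distribution."""
    q = softmax(student_logits)  # student
    p = softmax(teacher_logits)  # teacher
    return kl_divergence(q, p)


def forward_kl(student_logits, teacher_logits):
    """Forward KL: expectation is taken under the TEACHER distribution."""
    q = softmax(student_logits)
    p = softmax(teacher_logits)
    return kl_divergence(p, q)


# Toy example (hypothetical logits over a 3-token vocabulary).
teacher = [2.0, 0.0, -2.0]
student = [0.0, 0.0, 0.0]
rkl = reverse_kl(student, teacher)
fkl = forward_kl(student, teacher)
print(f"reverse KL = {rkl:.4f}, forward KL = {fkl:.4f}")
```

Because the expectation in the reverse direction is taken under the student, the student is penalized heavily for placing probability where the teacher places little, which is commonly described as mode-seeking behavior; whether this is the motivation in the paper is not stated in the abstract.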

History
  • Received: November 14, 2024
  • Revised: November 28, 2025
  • Adopted: December 31, 2024
  • Online: November 28, 2025