Abstract: Accurate retrieval of Land Surface Temperature (LST) from satellite thermal infrared data remains challenging because physical models rely on real-time atmospheric profiles and surface emissivity is difficult to characterize over heterogeneous landscapes. To address these limitations, this study proposes SE-ResUNet, a deep learning framework for Landsat 9 thermal infrared images. To overcome the scarcity of large-scale in-situ measurements for training, we construct a synthetic dataset by coupling the MODTRAN 5 radiative transfer model with ERA5 atmospheric reanalysis data. The network adopts a U-Net encoder-decoder structure with a modified ResNet50 backbone to capture multi-scale features. Squeeze-and-Excitation (SE) attention modules are embedded in the residual blocks, and physical prior knowledge is incorporated directly into the input tensor. By combining skip connections with an adaptive, physically constrained calibration mechanism for thermal signals, the method achieves precise pixel-wise temperature reconstruction. Experiments show that SE-ResUNet effectively mitigates the overfitting associated with spatial autocorrelation and remains robust to simulated noise and complex terrain variability. Evaluations on multiple datasets yield a Root Mean Square Error (RMSE) of around 0.7 K and a Mean Absolute Error (MAE) of 0.5 K. These results confirm the effectiveness of SE-ResUNet as a high-precision, end-to-end solution for LST retrieval that requires no real-time external atmospheric inputs at the inference stage.
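To illustrate the channel recalibration performed by a Squeeze-and-Excitation module of the kind named in the abstract, the following is a minimal NumPy sketch of the generic SE operation, not the authors' implementation. The function name `se_recalibrate`, the tensor sizes, the reduction ratio `r`, and the random weights are all assumptions chosen for illustration; in a trained network the weights would be learned and the block would sit inside each residual unit.

```python
import numpy as np


def se_recalibrate(x, w1, b1, w2, b2):
    """Apply squeeze-and-excitation channel reweighting to a (C, H, W) feature map.

    Hypothetical helper for illustration only.
    """
    # Squeeze: global average pooling over the spatial axes -> shape (C,)
    z = x.mean(axis=(1, 2))
    # Excitation: bottleneck layer with ReLU, then a sigmoid gate in (0, 1)
    h = np.maximum(w1 @ z + b1, 0.0)
    s = 1.0 / (1.0 + np.exp(-(w2 @ h + b2)))
    # Scale: multiply every channel by its gate value
    return x * s[:, None, None], s


# Assumed sizes: 8 channels, 4x4 spatial grid, reduction ratio r = 4
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 4
x = rng.normal(size=(C, H, W))

# Randomly initialised weights stand in for learned parameters
w1 = rng.normal(size=(C // r, C)) * 0.1
b1 = np.zeros(C // r)
w2 = rng.normal(size=(C, C // r)) * 0.1
b2 = np.zeros(C)

y, gates = se_recalibrate(x, w1, b1, w2, b2)
```

The output `y` has the same shape as the input, and each channel is scaled by a single gate value between 0 and 1, which is how an SE module emphasizes informative feature channels and suppresses less useful ones.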