A near-infrared wavelength selection method combining variable stability and population analysis
Received: 2019-12-19  Revised: 2020-01-13
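Author  Affiliation  E-mail
Zhang Feng (张峰)  State Key Laboratory of Electrical Insulation and Power Equipment, Xi'an Jiaotong University, Xi'an, Shaanxi 710049  774149296@qq.com
Tang Xiaojun (汤晓君)  State Key Laboratory of Electrical Insulation and Power Equipment, Xi'an Jiaotong University, Xi'an, Shaanxi 710049  xiaojun_tang@mail.xjtu.edu.cn
Tong Angxin (仝昂鑫)  State Key Laboratory of Electrical Insulation and Power Equipment, Xi'an Jiaotong University, Xi'an, Shaanxi 710049
Wang Bin (王斌)  State Key Laboratory of Electrical Insulation and Power Equipment, Xi'an Jiaotong University, Xi'an, Shaanxi 710049
Wang Jingwei (王经纬)  State Key Laboratory of Electrical Insulation and Power Equipment, Xi'an Jiaotong University, Xi'an, Shaanxi 710049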
Funding: National Key Basic Research Program of China (973 Program)
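Chinese abstract: To improve the efficiency and performance of the analysis model, this paper proposes a characteristic wavelength selection method that combines variable stability with population analysis (VSPA); the algorithm runs for multiple iterations. First, the variables are treated in two spaces, a sample space and a variable space. In the sample space, the stability of each variable is calculated, and according to the stability values, weighted bootstrap sampling is used to divide the variables into useful and useless variables. Then, in the variable space, the root mean square error (RMSE) is calculated, the models with smaller RMSE values are selected, the frequency with which each variable appears is counted, and an exponentially decreasing function is used to remove the low-frequency variables from the useless variables. Finally, the proposed algorithm is applied to the near-infrared corn dataset to predict the starch content of corn, and the results are compared with those of Monte Carlo uninformative variable elimination (MCUVE), competitive adaptive reweighted sampling (CARS), and bootstrapping soft shrinkage (BOSS). The experimental results show that the proposed method gives the best predictions, with a root mean square error of prediction (RMSEP) of 0.0409 and a correlation coefficient (Rp) of 0.9974, and that the selected feature variables amount to only 2.7% of the original spectral data. This indicates that the proposed variable selection method improves the computational efficiency and predictive ability of the model and is an effective variable selection method.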
Chinese keywords: wavelength selection; weighted bootstrap sampling; near-infrared spectroscopy; partial least squares
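The two figures of merit quoted in the abstract, RMSEP and Rp, can be illustrated with a minimal sketch. The abstract does not give the formulas, so this assumes the common chemometrics definitions: RMSEP as the root mean square of the prediction-set residuals, and Rp as the Pearson correlation between reference and predicted values. The function names `rmsep` and `corr_coef` are hypothetical.

```python
import numpy as np

def rmsep(y_true, y_pred):
    """Root mean square error of prediction (assumed definition)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def corr_coef(y_true, y_pred):
    """Pearson correlation between reference and predicted values
    (assumed definition of Rp; some papers report a different statistic)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.corrcoef(y_true, y_pred)[0, 1])
```

A lower RMSEP and an Rp closer to 1 both indicate better agreement on the prediction set, which is the sense in which the abstract compares the methods.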
 
A near-infrared wavelength selection method based on variable stability and population analysis
Abstract: To improve the efficiency and performance of the analysis model, this paper proposes a feature wavelength selection method based on variable stability and population analysis (VSPA). The algorithm runs for multiple iterations. First, the variables are treated in a sample space and a variable space, and the stability of each variable is calculated in the sample space. According to the stability values, the variables are divided into useful and useless variables by weighted bootstrap sampling. Then, in the variable space, the root mean square error (RMSE) of the PLS models is calculated, the models with smaller RMSE values are selected to compute the frequency of each variable, and an exponentially decreasing function is used to remove the lower-frequency variables from the useless variables. Finally, the proposed algorithm is applied to the corn NIR dataset to predict starch content, and the results are compared with those of Monte Carlo uninformative variable elimination (MCUVE), competitive adaptive reweighted sampling (CARS), and bootstrapping soft shrinkage (BOSS). The experimental results show that the proposed method gives the best predictions, with a root mean square error of prediction (RMSEP) of 0.0409 and a prediction correlation coefficient (Rp) of 0.9974. The selected feature variables amount to only 2.7% of the original spectral data. These results indicate that the proposed variable selection method improves the computational efficiency and prediction accuracy of the model and is an effective variable selection method.
Keywords: wavelength selection; weighted bootstrap sampling; near-infrared spectroscopy; partial least squares
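The three building blocks named in the abstract (variable stability, weighted bootstrap sampling, and an exponentially decreasing function) can be sketched as follows. This is not the authors' implementation, because the abstract gives no formulas. It assumes the MC-UVE-style stability (mean of a regression coefficient over random sample subsets divided by its standard deviation), uses ordinary least squares from NumPy in place of PLS to stay dependency-free, and borrows the form of the exponentially decreasing function from CARS. All function and parameter names are hypothetical.

```python
import numpy as np

def variable_stability(X, y, n_runs=200, frac=0.8, rng=None):
    """Assumed stability: mean/std of each regression coefficient over
    random sample subsets. Ordinary least squares stands in for PLS here."""
    rng = np.random.default_rng(rng)
    n, p = X.shape
    m = max(2, int(frac * n))
    coefs = np.empty((n_runs, p))
    for r in range(n_runs):
        idx = rng.choice(n, size=m, replace=False)  # random sample subset
        Xs = X[idx] - X[idx].mean(axis=0)           # mean-center the subset
        ys = y[idx] - y[idx].mean()
        coefs[r] = np.linalg.lstsq(Xs, ys, rcond=None)[0]
    return coefs.mean(axis=0) / (coefs.std(axis=0) + 1e-12)

def weighted_bootstrap(weights, rng=None):
    """Draw variable indices with replacement, with probability proportional
    to |weight|, and return the unique indices that were drawn."""
    rng = np.random.default_rng(rng)
    w = np.abs(np.asarray(weights, dtype=float))
    drawn = rng.choice(w.size, size=w.size, replace=True, p=w / w.sum())
    return np.unique(drawn)

def edf_ratio(i, n_iter, p):
    """Fraction of variables kept at iteration i (1-based), in the CARS form
    a * exp(-k * i): equals 1 at i = 1 and 2 / p at i = n_iter."""
    a = (p / 2) ** (1 / (n_iter - 1))
    k = np.log(p / 2) / (n_iter - 1)
    return a * np.exp(-k * i)
```

In an iterative scheme of this kind, the stability values would serve as sampling weights, so that stable variables are drawn more often, while `edf_ratio` would set how many variables survive each iteration.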

Copyright: Editorial Office of the Journal of Infrared and Millimeter Waves (《红外与毫米波学报》)