
Spectrum Quantitative Analysis Based on Bootstrap-SVM Model with Small Sample Set

2016-07-12 12:43 MA Xiao, ZHAO Zhong, XIONG Shanhai
Spectroscopy and Spectral Analysis (光谱学与光谱分析), 2016, Issue 5
Keywords: penetration; root mean square; spectrum

MA Xiao, ZHAO Zhong*, XIONG Shan-hai

College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China


A new spectrum quantitative analysis method based on a Bootstrap-SVM model for small sample sets is proposed in this paper. To build the spectrum quantitative analysis model for the bitumen penetration index, 29 bitumen samples were collected from 6 companies. Based on these 29 samples, a spectrum quantitative analysis model for predicting the bitumen penetration index was built with the proposed method. To verify the feasibility and effectiveness of the proposed method, comparative experiments on predicting the penetration index of the bitumen samples were carried out with the proposed method, partial least squares (PLS) and support vector machine (SVM). The comparative results show that the proposed Bootstrap-SVM model achieves the lowest prediction root mean squared error (RMSE) on the small sample set. The proposed method provides a new way to build spectrum quantitative analysis models from small sample sets.

Keywords: Spectrum quantitative analysis; Small sample set; Bootstrap; Support vector machines; Partial least squares

Introduction

Spectrum quantitative analysis is an important research area in spectroscopy. Building a stable and accurate prediction model is the prerequisite for the spectrum quantitative analysis of unknown samples. Spectrum quantitative analysis methods such as multiple linear regression (MLR)[1], principal component regression (PCR)[2], partial least squares (PLS)[3], artificial neural networks (ANN)[4] and support vector machine (SVM)[4] have been applied successfully in a wide variety of areas. MLR, PCR and PLS are usually used to build linear prediction models, while ANN and SVM can be used to build nonlinear prediction models. In real applications, it is often difficult to obtain complete information from samples because the sample sources are limited. Spectrum quantitative analysis based on large sample sets has been well studied[1-4], whereas much less effort has been devoted to small sample sets. With a small sample set, it is usually difficult to build stable and accurate prediction models with traditional methods. Hence, it is important to study modeling methods for spectrum quantitative analysis with small sample sets.

This paper studies how to build a spectrum quantitative analysis model of the bitumen penetration index with a small sample set. Bitumen is widely used in road engineering as a pavement binder. The bitumen penetration index is one of the important indicators that reflect the hardness, consistency and shear resistance of the bitumen. Although the bitumen penetration index is a physical property, it is closely related to the content of the bitumen components. Saturates and aromatics have high penetration indexes, while the penetration indexes of resins and asphaltenes are very low. According to JTG F40-2004, issued by the Ministry of Transportation of the People's Republic of China, the bitumen penetration index is measured by the “Standard Test Methods of Bitumen and Bituminous Mixtures for Highway Engineering (JTJ 052-2000)”. This procedure is time-consuming, difficult to operate and involves toxic solvents. Therefore, a fast, clean and convenient method for measuring the bitumen penetration index is needed. Infrared spectroscopy is a nondestructive and rapid analysis method that can be applied to measure the bitumen penetration index. In this paper, a new spectrum quantitative analysis method based on a Bootstrap-SVM model with a small sample set is proposed for building the bitumen penetration index prediction model. The paper is organized as follows: Section 1 presents the sample processing with the Bootstrap algorithm and machine learning with SVM. Section 2 describes the experiment in detail. Section 3 is devoted to comparative experiments and discussion. Section 4 concludes the paper.

1 Algorithms and theory

1.1 Sample processing

In the sample processing, Bootstrap resampling was applied to expand the sample set. Bootstrap resampling was proposed by Professor Efron[6]. It is essentially a non-parametric resampling method that needs no assumption about the sample distribution. The basic idea of Bootstrap resampling is to simulate the sample generation process by repeatedly resampling the data. Due to the limitations of the sample sources, the spectrum quantitative analysis model for predicting the bitumen penetration index has to be built from a small sample set, which is why resampling is used here. The steps of sample processing with Bootstrap resampling are as follows:

(1) Define the original sample set as $X=(X_1,X_2,\ldots,X_n)$. Randomly generate the integers $i_1,i_2,\ldots,i_n\in[1,n]$;
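The resampling described above can be sketched as follows. Only step (1) survives in the text, so the remaining parts of this sketch follow the standard bootstrap and are assumptions: the drawn indices select samples with replacement, and the draw is repeated until a target size is reached. The function names, the placeholder sample labels and the fixed seed are hypothetical.

```python
import random

def bootstrap_indices(n, rng):
    """Draw n integers uniformly from 1..n, with replacement (step (1))."""
    return [rng.randint(1, n) for _ in range(n)]

def bootstrap_resample(samples, rng):
    """Build one resampled set (X_i1, ..., X_in) from the drawn indices."""
    idx = bootstrap_indices(len(samples), rng)
    return [samples[i - 1] for i in idx]  # indices are 1-based, as in the text

def expand_sample_set(samples, target_size, rng):
    """Assumed continuation: repeat the resampling until target_size is reached."""
    expanded = []
    while len(expanded) < target_size:
        expanded.extend(bootstrap_resample(samples, rng))
    return expanded[:target_size]

rng = random.Random(0)  # fixed seed only to make the sketch reproducible
original = [f"X{k}" for k in range(1, 20)]  # 19 placeholder calibration samples
expanded = expand_sample_set(original, 200, rng)
```

Because the draws are made with replacement, the expanded set contains repeated copies of the original samples and nothing else; any new variation has to come from a later step.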

1.2 Noise injection

In order to simulate the sampling process and improve the stability of the spectrum quantitative analysis model, noise injection[7-8] was applied to the expanded samples after resampling. Noise can be injected in three ways: into the input values, into the output values, or into both. The noise injection can be described as

$$Z_V = Z + V \qquad (1)$$

$Z_V$ is the data matrix after the noise injection, $Z$ is the source data matrix and $V$ represents the noise matrix. So, with $Z_V=(zv_{ij})$, $Z=(z_{ij})$ and $V=(v_{ij})$,

then,

$$zv_{ij} = z_{ij} + v_{ij}, \quad i=1,\ldots,M,\ j=1,\ldots,p \qquad (2)$$

$M$ is the total number of samples. $p$ is the length of each data sample. $zv_{ij}$ denotes the data item after noise injection, $z_{ij}$ denotes the original data item and $v_{ij}$ denotes the noise added to $z_{ij}$. In this paper, a Gaussian white noise matrix with $V_i \sim N(0,\sigma^2)$ was chosen as the noise matrix. The noise intensity can be adjusted by $\sigma$.
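As a minimal sketch of the noise injection above, each entry of the data matrix receives an independent zero-mean Gaussian perturbation with standard deviation σ. The function name, the toy matrix and the seed are hypothetical, and plain nested lists stand in for the data matrix.

```python
import random

def inject_noise(Z, sigma, rng):
    """Return Z_V = Z + V, where every v_ij is drawn from N(0, sigma^2)."""
    return [[z_ij + rng.gauss(0.0, sigma) for z_ij in row] for row in Z]

rng = random.Random(0)  # fixed seed only to make the sketch reproducible
Z = [[0.10, 0.20, 0.30],
     [0.40, 0.50, 0.60]]  # M = 2 samples, p = 3 data points each
Z_V = inject_noise(Z, sigma=0.001, rng=rng)
```

Setting `sigma` to zero returns the original matrix unchanged, which makes the role of σ as the noise intensity explicit.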

1.3 Support Vector Machine

Support vector machine (SVM) was proposed by Vapnik[9] on the basis of statistical learning theory. SVM is a machine learning method based on structural risk minimization, which can deal with small-sample, nonlinear and high-dimensional learning problems. To obtain the best generalization ability, SVM trades off the precision of data approximation against the complexity of the approximating function during learning, and the learning process is transformed into a convex quadratic programming problem. Therefore, the global optimum can be found, and the local-minima problem of traditional machine learning methods such as multilayer feed-forward neural networks is avoided. In SVM, a nonlinear transformation maps the samples into a high-dimensional feature space, where a linear decision function can be constructed to classify the original samples. Therefore, the complexity of the learning process does not depend on the dimension of the sample set. In this paper, SVM is applied to build the spectrum quantitative analysis model for predicting the bitumen penetration index.
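For reference, the convex quadratic program mentioned above can be written out for the regression case. This is the standard ε-SVR primal, shown on the assumption that it is the form relevant here, since ε-SVR is the model type named in Section 3.3. The symbols $w$, $b$, $\phi$, $\xi_i$ and $\xi_i^*$ do not appear in the paper and are introduced only for this sketch; $C$ and $\varepsilon$ are the penalty and tube-width parameters, and $M$ is the number of training samples.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi,\,\xi^*}\quad & \tfrac{1}{2}\lVert w\rVert^{2} + C\sum_{i=1}^{M}\left(\xi_i + \xi_i^{*}\right) \\
\text{s.t.}\quad & y_i - w^{\mathsf T}\phi(x_i) - b \le \varepsilon + \xi_i, \\
& w^{\mathsf T}\phi(x_i) + b - y_i \le \varepsilon + \xi_i^{*}, \\
& \xi_i,\ \xi_i^{*} \ge 0, \qquad i = 1,\ldots,M.
\end{aligned}
```

The first term controls the complexity of the approximating function and the second penalizes deviations larger than $\varepsilon$, which is the compromise between complexity and approximation precision described in the paragraph.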

2 Experiment

2.1 Sample information

The 29 bitumen samples were collected from different factories. According to the source of the crude oil, the collected samples fall into two classes: South American heavy oil and Xinjiang thickened oil. The penetration indexes of the samples were measured according to the “Standard Test Methods of Bitumen and Bituminous Mixtures for Highway Engineering (JTJ 052-2000)”. The calibration set and validation set are shown in Table 1.

Table 1 Bitumen samples category and distribution

2.2 Instrument and working conditions

The spectra of the bitumen samples were collected by attenuated total reflectance (ATR) infrared spectroscopy in the analytical instrumentation center of Beijing University of Chemical Technology. The instrument parameters were set as follows: the wavenumber range was 4000 to 650 cm$^{-1}$, the resolution was 4 cm$^{-1}$ and the number of scans was 32. Each sample was heated to 70 ℃ for the measurement, and a small amount of it was coated evenly on the surface of the ATR crystal. Each sample was measured three times, and the average spectrum was used as the infrared spectrum of the sample.

2.3 Data processing

The quantitative models of PLS, SVM and Bootstrap-SVM are compared in this paper. First-order differencing, data smoothing and mean centering were applied to the data for PLS. Data normalization was applied to the data for SVM and Bootstrap-SVM.
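The pre-processing steps named above can be illustrated with a small sketch. The paper does not specify the smoothing filter or the kind of normalization, so smoothing is left out here and min-max scaling per wavenumber is an assumption; all function names and the toy spectra are hypothetical.

```python
def first_difference(spectrum):
    """First-order difference along the wavenumber axis."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]

def mean_center(spectra):
    """Subtract the column mean (taken over samples) from every data point."""
    n = len(spectra)
    means = [sum(col) / n for col in zip(*spectra)]
    return [[x - m for x, m in zip(row, means)] for row in spectra]

def min_max_normalize(spectra):
    """Assumed normalization: scale each column to [0, 1]; constant columns map to 0."""
    cols = list(zip(*spectra))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(x - l) / (h - l) if h > l else 0.0
             for x, l, h in zip(row, lo, hi)] for row in spectra]

spectra = [[1.0, 2.0, 4.0],
           [2.0, 4.0, 5.0]]  # two toy "spectra" of three points each
pls_input = mean_center([first_difference(s) for s in spectra])
svm_input = min_max_normalize(spectra)
```

In practice the column statistics would be computed on the calibration set and then reused for the validation set, so that no validation information leaks into the model.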

3 Results and discussion

3.1 Spectrum analysis

The main components of the road bitumen samples studied in this work are hydrocarbons, hydroxyl compounds and oxygenated compounds. The penetration index is one of the physical properties of bitumen, but it is closely related to the chemical composition of the bitumen and the content of its components. The infrared spectrum reflects basic information on the molecular vibrations and rotations of a material. Therefore, a quantitative prediction model of the penetration index can be built with infrared spectrum analysis. The infrared absorption spectrum of bitumen is shown in Fig. 1.

Fig. 1 ATR IR spectrum of bitumen samples

3.2 The spectrum quantitative analysis model with PLS

PLS is currently widely applied to the quantitative analysis of infrared spectra. The PLS model in this paper was built with the pre-processed data. The first three principal components were selected by cross-validation and the input and output data mapping. The input and output principal components and the proportion of eigenvalues are shown in Fig. 2 and Fig. 3, respectively. The prediction result of PLS is shown in Table 2.

Fig. 2 Eigenvalue vs. PC number

Fig. 3 Eigenvalue vs. PC number

Table 2 Result of PLS

No.     Sample    Prediction
1       65        67.021
2       68.8      65.729
3       69.8      63.918
4       62.1      64.198
5       66        67.744
6       70        73.296
7       66.9      66.658
8       71        68.874
9       65        67.094
10      65        67.842
RMSE              2.889
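The RMSE reported in Table 2 can be checked directly from the ten validation samples. The sketch below assumes that the "sample" column holds the measured penetration values and the "prediction" column the PLS outputs; the function and variable names are hypothetical.

```python
import math

def rmse(measured, predicted):
    """Root mean squared error between measured and predicted values."""
    squared = [(p - m) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Values read from Table 2 (validation set, PLS model).
measured = [65, 68.8, 69.8, 62.1, 66, 70, 66.9, 71, 65, 65]
predicted = [67.021, 65.729, 63.918, 64.198, 67.744,
             73.296, 66.658, 68.874, 67.094, 67.842]

pls_rmse = rmse(measured, predicted)
print(round(pls_rmse, 3))  # prints 2.889
```

The result agrees with the tabulated RMSE, which also confirms how the table columns are grouped.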

3.3 The spectrum quantitative analysis model with SVM

For convenience, the LIBSVM tools developed by Professor Lin Chih-Jen were used to build the spectrum quantitative model with SVM. The parameters were set as follows: the SVM model type was ε-SVR, the kernel function was RBF, and the options were -p 1.5 and -c 0.01. The prediction results are shown in Table 3.
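To make the settings above concrete, the sketch below spells out what they correspond to. In LIBSVM's `svm-train`, `-s 3` selects ε-SVR and `-t 2` the RBF kernel, `-p` is the ε of the loss function and `-c` the cost parameter. The kernel width is not reported in the paper, so `gamma` is left as a free argument here (LIBSVM defaults to 1/number of features). The helper functions are hypothetical illustrations, not part of LIBSVM.

```python
import math

# Option string matching the text: epsilon-SVR, RBF kernel, epsilon 1.5, cost 0.01.
LIBSVM_OPTIONS = "-s 3 -t 2 -p 1.5 -c 0.01"

def rbf_kernel(x, z, gamma):
    """K(x, z) = exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def epsilon_insensitive_loss(y_true, y_pred, epsilon=1.5):
    """Loss used by epsilon-SVR: zero inside the tube |y - f(x)| <= epsilon."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

inside_tube = epsilon_insensitive_loss(65.0, 66.0)   # error of 1.0, not penalized
outside_tube = epsilon_insensitive_loss(65.0, 68.0)  # error of 3.0, penalized by 1.5
```

With ε = 1.5, prediction errors of up to 1.5 penetration units on the training samples carry no penalty, so the choice of `-p` directly sets how closely the model is asked to follow the calibration data.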

Table 3 Result of SVM

3.4 The spectrum quantitative analysis model with Bootstrap-SVM

Firstly, the original sample set was expanded by the resampling method described in Section 1.1. The calibration set of 19 samples was expanded to 200. Then, noise was injected into the 200 samples as described in Section 1.2. The noise intensity has to be tuned because the noise level strongly influences the accuracy of the analysis model. If the noise is too weak, the samples after noise injection are nearly identical to the original samples. If the noise is too strong, abnormal samples are generated. Human factors, instrument factors, temperature and other factors may cause subtle differences in the measured spectrum, and such subtle differences were found to cause large prediction errors. The noise intensity was therefore determined by several trials. In this paper, the noise intensity was taken as $\sigma_x=0.001$, $\sigma_y=0.1$. The SVM model was built with the LIBSVM tool. The parameters were chosen as -p 2.0 and -c 0.03. The prediction results for the 10 validation samples are shown in Table 4.
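The expansion step described above can be sketched end to end: 19 calibration samples are resampled with replacement to 200, and Gaussian noise with separate intensities is added to the spectra and to the penetration values. The spectra and target values below are made-up placeholders rather than the paper's data, and the function name and seed are hypothetical.

```python
import random

def expand_with_noise(X, y, target_size, sigma_x, sigma_y, rng):
    """Resample (spectrum, value) pairs with replacement, then add Gaussian noise."""
    n = len(X)
    X_new, y_new = [], []
    for _ in range(target_size):
        i = rng.randrange(n)  # bootstrap draw of one original sample
        X_new.append([x + rng.gauss(0.0, sigma_x) for x in X[i]])
        y_new.append(y[i] + rng.gauss(0.0, sigma_y))
    return X_new, y_new

rng = random.Random(0)  # fixed seed only to make the sketch reproducible
# Placeholder data: 19 spectra of 5 points each, with made-up target values.
X = [[rng.random() for _ in range(5)] for _ in range(19)]
y = [60.0 + 12.0 * rng.random() for _ in range(19)]

X_big, y_big = expand_with_noise(X, y, 200, sigma_x=0.001, sigma_y=0.1, rng=rng)
```

The expanded pairs would then be passed to the SVM training step; the noise levels are the values quoted in the text and would need to be re-tuned for other data.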

Table 4 Result of Bootstrap-SVM

4 Conclusion

In this paper, a new spectrum quantitative analysis method based on a Bootstrap-SVM model with a small sample set is proposed. Based on the 29 collected bitumen samples, a spectrum quantitative analysis model for predicting the bitumen penetration index was built with the proposed method. Comparative experiments on predicting the penetration index of the bitumen samples were carried out with the proposed method, partial least squares (PLS) and support vector machine (SVM). The comparative results show that the proposed Bootstrap-SVM model achieves the lowest prediction root mean squared error (RMSE) on the small sample set. It is also found that nonlinear models such as SVM and Bootstrap-SVM predict the bitumen penetration index more precisely. Although SVM, which is based on statistical learning theory, can be used to build a prediction model from a small sample set, the accuracy and generalization ability of the SVM model can be improved markedly by Bootstrap resampling and noise injection.

[1] BIAN Zhao-qi, ZHANG Xue-gong. Pattern Recognition. Beijing: Tsinghua University Publishing Company, 2000. 192.

[2] Luo Wentao, Liu Guili. Modern Scientific Instruments, 2013, 6(3): 94.

[3] Roggo Y, Roeseler C, Ulmschneider M. J. Pharm. Biomed. Anal., 2004, 36(4): 777.

[4] Fontalvo-Gomez M, Colucci J A, Velez Natasha, Romanach R J. Applied Spectroscopy, 2013, 67(10): 1142.

[5] Mao R, Zhu H, Zhang L, Chen A. Proc. ISDA, 2006, (1): 17.

[6] Lanouette R, Thibault J, Valade J L. Comput. Chem. Eng., 1999, 23(9): 1167.

[7] Luigi Fortuna, Salvatore Graziani, Maria Gabriella Xibilia. IEEE Transactions on Instrumentation and Measurement, 2009, 58(8): 2444.

[8] Efron B. The Annals of Statistics, 1979, 7(1): 1.

[9] Grandvalet Y, Boucheron S. Neural Comput., 1997, 9(5): 1093.

*Corresponding author

CLC number: O657.3

Document code: A

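Study of Spectrum Quantitative Analysis under Small-Sample Conditions Based on Bootstrap-SVM

MA Xiao, ZHAO Zhong*, XIONG Shan-hai

College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029

A new method for building spectrum quantitative analysis models under small-sample conditions, the Bootstrap-SVM model, is proposed. Taking road bitumen as the object of study, 29 bitumen samples were collected from 6 different organizations, and a quantitative analysis model of bitumen penetration was built with the proposed method. Bootstrap-SVM consists of three steps: Bootstrap resampling, noise injection and SVM. To show the advantage of the proposed method, it was compared with the commonly used PLS and SVM models. The results show that the prediction root mean squared errors of Bootstrap-SVM, PLS and SVM are 0.7735, 2.889 and 1.7844, respectively. The proposed method gives the best prediction accuracy and provides a new and effective approach to spectrum quantitative analysis under small-sample conditions.

Small sample; Bootstrap; Support vector machine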


Foundation item: Fundamental Research Funds for the Central Universities (YS1404)

DOI: 10.3964/j.issn.1000-0593(2016)05-1571-05

Received: 2015-03-02; accepted: 2015-07-09

Biography: MA Xiao (1990-), Master's degree candidate at Beijing University of Chemical Technology, e-mail: maxiao2014job@163.com *Corresponding author, e-mail: zhaozhong@mail.buct.edu.cn
