Abstract: To explore how data processing methods affect the identification of rice from different geographical origins by Raman spectroscopy, this paper took Panjin rice, Xiangshui rice, Xijiang rice, Jiansanjiang rice, Wuchang rice, and Yanbian rice as examples and investigated the accuracy of identification models built under different data processing methods. Rice samples were carefully prepared, crushed, and sieved to collect rice flour with a particle size of 100-140 mesh. Raman spectra were collected at five measuring points for each sample. Relative standard deviation analysis and hierarchical cluster analysis were then used to eliminate discrepant data. Finally, the data before and after eliminating the discrepant data, and before and after averaging, were used separately to build classification models with a support vector machine. The results showed that hierarchical cluster analysis could identify potentially discrepant data, while relative standard deviation analysis could preliminarily indicate whether discrepant data were present and ultimately verify whether the flagged data were indeed discrepant. In addition, averaging narrowed the differences within the same rice sample and widened the differences between different rice samples, which effectively improved the recognition accuracy of the model. The data processing method explored in this paper, eliminating the discrepant data and then averaging, improved the recognition accuracy by 12.89% and provided more accurate and effective data for identifying rice from different geographical origins.
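The screen-then-average idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state which quantity the relative standard deviation (RSD) is computed on or what cutoff is used, so the peak index `peak_index`, the threshold `rsd_threshold`, the minimum of three retained replicates, and the synthetic spectra are all assumptions made for illustration. The hierarchical clustering step is replaced here by a much simpler stand-in (dropping the replicate farthest from the median), and the support vector machine step is omitted.

```python
from statistics import mean, median, stdev


def relative_std_dev(values):
    """Return the relative standard deviation (RSD) in percent."""
    return 100.0 * stdev(values) / mean(values)


def screen_and_average(spectra, peak_index, rsd_threshold=10.0):
    """Drop deviating replicate spectra, then average the rest point by point.

    spectra: list of equal-length intensity lists, one per measuring point.
    peak_index: hypothetical index of the Raman band used for the RSD check.
    rsd_threshold: hypothetical cutoff in percent (not taken from the paper).
    """
    kept = list(spectra)
    # Assumption: keep at least three replicates so the average stays meaningful.
    while len(kept) > 3:
        peaks = [s[peak_index] for s in kept]
        if relative_std_dev(peaks) <= rsd_threshold:
            break
        # Stand-in for hierarchical clustering: remove the replicate whose
        # peak intensity lies farthest from the median of the group.
        centre = median(peaks)
        worst = max(range(len(kept)), key=lambda i: abs(peaks[i] - centre))
        kept.pop(worst)
    # Point-by-point mean of the retained replicates.
    averaged = [mean(column) for column in zip(*kept)]
    return kept, averaged


# Five synthetic replicate spectra (three channels each); the last one deviates.
replicates = [
    [10.0, 100.0, 20.0],
    [11.0, 102.0, 21.0],
    [9.0, 98.0, 19.0],
    [10.0, 101.0, 20.0],
    [30.0, 180.0, 45.0],
]
kept, averaged = screen_and_average(replicates, peak_index=1, rsd_threshold=10.0)
```

In this toy input the fifth replicate pushes the RSD of the chosen band well above the assumed cutoff, so it is removed before averaging. In a fuller pipeline the averaged spectra, one per sample, would then be passed to a classifier such as a support vector machine.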