The need for effective methods to classify high-dimensional spectral data is increasing in tasks such as the rapid, non-destructive detection of object features and chemical species using spectroscopy. Partial least squares discriminant analysis (PLS-DA) is an effective method for spectral data classification based on multivariate regression. Although powerful, PLS-DA suffers from performance degradation under complex conditions such as nonlinearity, class imbalance, and multiclass problems, all of which are common in real-world applications. The collaborative representation-based classifier (CRC) is a new machine learning algorithm that represents a query as a linear combination of training samples and classifies the query based on that representation. It offers the possibility of good classification performance even under nonlinearity, class imbalance, and multiclass conditions. In this paper, we present a novel method for spectral data classification, namely CRC-WPLS, which combines the benefits of PLS regression and CRC. The method uses PLS regression to find a weighted linear combination of all training samples that represents the query, and then assigns the query to the class that yields the smallest approximation error. CRC-WPLS is compared to PLS-DA, kernel PLS-DA, support vector machine (SVM), random forest (RF), and representation-based classifiers on fourteen general machine learning datasets and three spectral datasets. Experimental results show that the proposed method can outperform seven baseline methods in most cases and can achieve high classification accuracy (>90%) on low-grade spectra obtained from portable instrumentation.
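The representation-then-residual idea described in the abstract can be illustrated with a minimal sketch. It is not the CRC-WPLS algorithm itself: the abstract does not specify the weighting scheme, so this version is unweighted, applies no centering or scaling, and uses a plain PLS1 (NIPALS) regression. The function names `pls1_coefficients` and `classify_by_residual`, the default of three components, and the data layout are all assumptions made for illustration.

```python
import numpy as np


def pls1_coefficients(X, y, n_components):
    """Regress y on the columns of X with PLS1 (NIPALS) and return the coefficients.

    Assumption of this sketch: no centering or scaling is applied.
    """
    X = np.array(X, dtype=float)  # np.array copies, so deflation leaves the inputs intact
    y = np.array(y, dtype=float)
    weights, loadings, inner = [], [], []
    for _ in range(n_components):
        w = X.T @ y                      # direction of maximal covariance with y
        norm = np.linalg.norm(w)
        if norm < 1e-12:                 # nothing left to explain
            break
        w /= norm
        t = X @ w                        # score vector
        tt = t @ t
        p = X.T @ t / tt                 # X loading
        c = (y @ t) / tt                 # regression of y on the score
        X -= np.outer(t, p)              # deflate X and y
        y -= c * t
        weights.append(w)
        loadings.append(p)
        inner.append(c)
    if not weights:
        return np.zeros(X.shape[1])
    W = np.column_stack(weights)
    P = np.column_stack(loadings)
    # Standard PLS1 closed form: beta = W (P^T W)^{-1} c
    return W @ np.linalg.solve(P.T @ W, np.array(inner))


def classify_by_residual(train, labels, query, n_components=3):
    """Represent `query` by all training samples, then pick the class with the smallest residual.

    train: (n_train, n_features) array; labels: length n_train; query: (n_features,).
    Returns the predicted label and a dict of per-class residual norms.
    """
    labels = np.asarray(labels)
    D = np.asarray(train, dtype=float).T          # columns are training samples
    query = np.asarray(query, dtype=float)
    beta = pls1_coefficients(D, query, n_components)
    residuals = {}
    for cls in np.unique(labels):
        mask = labels == cls
        approx = D[:, mask] @ beta[mask]          # reconstruction from this class only
        residuals[cls] = np.linalg.norm(query - approx)
    return min(residuals, key=residuals.get), residuals
```

Calling `classify_by_residual(train, labels, query)` returns the label whose training samples reconstruct the query with the smallest error. The number of PLS components acts as the regularizer here, playing the role that the ridge penalty plays in the usual least-squares form of CRC.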
|Number of pages|8|
|Journal|Chemometrics and Intelligent Laboratory Systems|
|Early online date|21 Aug 2018|
|Publication status|E-pub ahead of print - 21 Aug 2018|
- Partial least squares
- Collaborative representation
- Spectral data