ID:
publications-306
Type:
PEER REVIEWED ARTICLE
Year:
2010
Authors:
Gowen, A., Downey, G., Esquerre, C., O'Donnell, C.
Title:
Preventing over-fitting in PLS calibration models of near infrared (NIR) spectroscopy data using regression coefficients
Venue/Journal:
DOI:
10.1002/cem.1349
Research type:
Simulation & Modeling
Water System:
River Basins
Technical Focus:
Abstract:
Selection of the number of latent variables (LVs) to include in a partial least squares (PLS) model is an important step in the data analysis. Inclusion of too few or too many LVs may lead to, respectively, under- or over-fitting of the data and subsequently result in poor future model performance. One well-known sign of over-fitting is the appearance of noise in regression coefficients; this often takes the form of a reduction in apparent structure and the presence of sharp peaks with a high degree of directional oscillation, features which are usually estimated subjectively. In this work, a simple method for quantifying the shape and size of a regression coefficient is presented. This measure can be combined with an indicator of model bias (e.g. root mean square error) to aid in estimation of the appropriate number of LVs to include in a PLS model. The performance of the proposed method is evaluated on simulated and real NIR spectroscopy datasets and compared with several existing methods. Copyright © 2010 John Wiley & Sons, Ltd.
Link with Projects:
237819
Link with Tools:
Related policies:
ID: