Department of Statistics, The Ohio State University
Statistics and Biostatistics Colloquium Series
A New Robust Partial Least Squares Regression Method: RoPLS
Asuman Turkmen
Department of Mathematics & Statistics, Auburn University
3:30PM - Thursday, January 24, 2008
Room 240, Cockins Hall (CH 240)
ABSTRACT
Most traditional statistical techniques are designed specifically
for low-dimensional data sets where the number of observations
(n) is greater than the number of variables (p). Applying these
methods to problems such as predicting the survival time or the
tumor class of a patient from high-dimensional data (n << p)
is a challenging task. The partial least
squares regression (PLSR) method is gaining importance in many
scientific fields that require preprocessing and analyzing high-
dimensional data. The main idea in PLSR is to summarize high-
dimensional and/or collinear predictor variables into a smaller set
of uncorrelated, so-called latent variables that have the best
predictive power. Although PLSR handles the
multicollinearity problem, it fails to deal with data containing outliers
because it is based on maximizing the sample covariance between
the response(s) and a set of predictor variables, a quantity known to
be sensitive to outliers. Multicollinearity and outliers are
common in real data sets, which creates a need for robust
PLSR methods.
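
As a rough illustration of the latent-variable idea described above, the following is a minimal sketch of classical (non-robust) PLSR for a single response, written in Python with NumPy. It is not the RoPLS method of the talk; the function name pls1_fit and its arguments are assumptions made for this example only. Each weight vector is the direction in predictor space with maximal sample covariance with the current response residual, and the resulting scores are mutually uncorrelated.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Classical single-response PLSR via NIPALS-style deflation (illustrative only)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean          # centered working copies
    n, p = X.shape
    W = np.zeros((p, n_components))          # weight vectors
    P = np.zeros((p, n_components))          # X loadings
    q = np.zeros(n_components)               # y loadings
    T = np.zeros((n, n_components))          # scores (latent variables)
    for a in range(n_components):
        w = Xk.T @ yk                        # direction of maximal sample covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                           # score vector for this component
        tt = t @ t
        p_a = Xk.T @ t / tt
        q_a = yk @ t / tt
        Xk = Xk - np.outer(t, p_a)           # deflate X
        yk = yk - q_a * t                    # deflate y
        W[:, a], P[:, a], q[a], T[:, a] = w, p_a, q_a, t
    coef = W @ np.linalg.solve(P.T @ W, q)   # coefficients on the original predictors
    intercept = y_mean - x_mean @ coef
    return coef, intercept, T

# Usage on simulated data with more variables than observations (n < p).
rng = np.random.default_rng(0)
X_wide = rng.normal(size=(20, 50))
y_wide = X_wide[:, 0] - 2.0 * X_wide[:, 1] + rng.normal(scale=0.1, size=20)
coef_wide, intercept_wide, T_wide = pls1_fit(X_wide, y_wide, n_components=3)
y_hat = X_wide @ coef_wide + intercept_wide
```

Because the weights are built from the ordinary sample covariance, a single extreme observation can pull them arbitrarily far, which is the weakness the abstract attributes to classical PLSR.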
The aim of this presentation is to provide a brief overview of PLSR and
to introduce the proposed robust PLSR method, RoPLS, which is based on
the weights calculated by the BACON or PCOUT algorithms, together with
robust criteria for determining the optimal number of components,
a very important issue in building the PLSR model.
Benchmark data sets and simulation studies are employed to
demonstrate the performance of the proposed method along with
diagnostic plots to visualize and classify the outliers. The non-robustness
of classical PLSR is illustrated by its unbounded sensitivity curve,
whereas RoPLS, which yields a bounded sensitivity curve, is shown to
be a robust method.
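
To make the notion of a bounded versus unbounded sensitivity curve concrete, here is a small Python sketch. It uses the usual empirical definition, n times the change in an estimate when one extra point is added to the sample, and applies it to the sample covariance that classical PLSR relies on. The helper trimmed_cov is a hypothetical stand-in for a robust estimator, introduced for this example only; it is not BACON, PCOUT, or RoPLS.

```python
import numpy as np

def sensitivity_curve(estimator, x, y, points):
    """Empirical sensitivity curve: n * (T(sample plus z) - T(sample)) for each added point z."""
    base = estimator(x, y)
    n = len(x) + 1                            # size of the augmented sample
    return np.array([
        n * (estimator(np.append(x, zx), np.append(y, zy)) - base)
        for zx, zy in points
    ])

def classical_cov(x, y):
    """Ordinary sample covariance between x and y."""
    return np.cov(x, y)[0, 1]

def _robust_z(v):
    """Absolute deviation from the median, scaled by the normal-consistent MAD."""
    med = np.median(v)
    mad = 1.4826 * np.median(np.abs(v - med))
    return np.abs(v - med) / mad

def trimmed_cov(x, y, cutoff=3.0):
    """Hypothetical robust stand-in: covariance after dropping flagged points."""
    keep = (_robust_z(x) <= cutoff) & (_robust_z(y) <= cutoff)
    return np.cov(x[keep], y[keep])[0, 1]

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 0.5 * x + rng.normal(scale=0.5, size=50)

magnitudes = np.array([10.0, 100.0, 1000.0])
points = [(c, c) for c in magnitudes]         # increasingly extreme added points

sc_classical = sensitivity_curve(classical_cov, x, y, points)
sc_trimmed = sensitivity_curve(trimmed_cov, x, y, points)
print("classical:", sc_classical)
print("trimmed:  ", sc_trimmed)
```

In this sketch the classical curve keeps growing as the added point moves farther out, while the trimmed version stops changing once the point is flagged and dropped.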
Meet the speaker in Room 212 Cockins Hall at 4:30
p.m. Refreshments will be served.