The support vector machine is a widely used tool for classification.
In this talk, we consider two types of the support vector machine:
the 1-norm SVM and the standard 2-norm SVM. Both types can be
written as regularized optimization problems. In all current
implementations for fitting an SVM model, the user has to supply a
value for the regularization parameter. To select an appropriate
regularization parameter, in practice, people usually pre-specify a
finite set of values for the regularization parameter that covers a
wide range, then either use a separate validation data set or use
cross-validation to select a value for the regularization parameter
that gives the best performance among the given set. In this talk,
we argue that the choice of the regularization parameter can be
critical. We also argue that the 1-norm SVM may have some advantage
over the standard 2-norm SVM under certain situations, especially
when there are redundant noise features. We show that the solution
paths for both the 1-norm SVM and the 2-norm SVM are piecewise
linear functions of the regularization parameter. We then derive two
algorithms, respectively for the 1-norm SVM and the 2-norm SVM, that
can fit the entire paths of SVM solutions for every value of the
regularization parameter, hence facilitate adaptive selection of the
regularization parameter for SVM models.
It turns out that the piecewise linear solution path property is not
unique to the SVM models. We will propose some general conditions
on the generic regularized optimization problem for the solution
path to be piecewise linear, which suggest some new useful
predictive statistical modeling tools.
This is joint work with Saharon Rosset (IBM T.J.Watson), Trevor
Hastie (Stanford U.), and Rob Tibshirani (Stanford U.)