Department of Statistics, The Ohio State University
Statistics and Biostatistics Colloquium Series
High Dimension, Low Sample Size Data Analysis
Jeongyoun Ahn
University of North Carolina
3:30PM - Thursday, February 23, 2006
Room 170, Eighteenth Avenue Bldg. (EA 170)
ABSTRACT
Most of the literature on the statistical analysis of High
Dimension, Low Sample Size (HDLSS) data deals with the situation
where both the dimension d and the sample size n go to infinity
together. In this talk the case where d tends to infinity while n is
fixed is examined. We show that the sample covariance matrix behaves
as if the underlying distribution were spherical when d is much larger
than n. This result plays a key role in extending to more general
settings the asymptotic geometric representation of HDLSS data, which
says that the randomness of the data lies only in random rotations of a
regular n-simplex. The classification problem with HDLSS data is also
considered in this presentation. There exists a one-dimensional
direction in the data space (i.e., the n-dimensional subspace
generated by the data vectors) such that the projected data have only
two distinct values. This direction is uniquely determined in the data
space and lies within the affine set of the data. Its formula is
similar to that of Fisher's linear discriminant direction, and the two
are shown to be equivalent in non-HDLSS cases.
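
The two geometric claims above can be checked numerically. The following is a minimal sketch, not the speaker's implementation: it assumes standard Gaussian data, and it assumes the two-valued direction takes the Fisher-like form pinv(S_T) (mean_1 - mean_0), with S_T the covariance of the pooled data. The abstract gives no formula, so that form, the function names, and the simulation settings are all illustrative assumptions.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distances between all distinct pairs of rows of X."""
    i, j = np.triu_indices(X.shape[0], k=1)
    return np.linalg.norm(X[i] - X[j], axis=1)


def piling_direction(X, y):
    """Unit vector proportional to pinv(S_T) @ (mean_1 - mean_0).

    X is (n, d) with d >= n - 1 and y holds labels 0/1. The form of the
    direction is an assumption; the abstract only calls it Fisher-like.
    """
    mean_diff = X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)
    Xc = X - X.mean(axis=0)  # center by the pooled mean
    # (Xc.T @ Xc)^+ equals Xc^+ @ (Xc^+).T, so no d x d matrix is formed.
    # rcond discards the near-zero singular value that centering creates.
    Xp = np.linalg.pinv(Xc, rcond=1e-10)
    w = Xp @ (Xp.T @ mean_diff)
    return w / np.linalg.norm(w)


# Hypothetical simulation settings: n fixed and small, d much larger.
rng = np.random.default_rng(0)
n, d = 20, 2000
X = rng.standard_normal((n, d))
y = np.repeat([0, 1], n // 2)
X[y == 1, 0] += 3.0  # illustrative mean shift for the second class

# Simplex-like geometry: every pairwise distance is close to sqrt(2 d).
ratios = pairwise_distances(X) / np.sqrt(2 * d)
print("distance / sqrt(2d): min", ratios.min(), "max", ratios.max())

# Two-valued projection: each class collapses to a single projected value.
proj = X @ piling_direction(X, y)
print("spread within class 0:", np.ptp(proj[y == 0]))
print("spread within class 1:", np.ptp(proj[y == 1]))
```

Under these assumptions the distance ratios should cluster near 1 and the within-class spreads should be zero up to floating-point error. The pseudoinverse is used because the pooled covariance is singular whenever d exceeds n.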
In the second part of the talk the bandwidth selection problem in the
kernel method is considered. The usual cross validation method is
observed to be subject to sampling variation and computationally
expensive. A new method is proposed, based on the geometrical
understanding of kernel based classification: a nonlinear
classification that is actually a linear one in the embedded feature
space. A bandwidth that makes this linear classification task the
easiest is chosen. This method is empirically shown to be robust
to sampling variation and to take much less computing time.
Meet the speaker in Room 212 Cockins Hall at 4:30
p.m. Refreshments will be served.