Department of Statistics, The Ohio State University
Statistics and Biostatistics Colloquium Series

Hosted jointly with the Initiative in Population Research

A Rank-Based Clustering Method for the Analysis of Social Inequality Data

Tim F. Liao
Chair, Department of Sociology, University of Illinois

3:30PM - Thursday, October 23, 2008
Room 170, Eighteenth Avenue Bldg. (EA 170)

ABSTRACT

When studying social, economic or health inequality, the analyst must estimate clusters or classes contained in the data. Commonly used methods such as latent class/cluster models or the k-means method assume a multivariate normal distribution. Most inequality data, however, are non-normal in distribution. This paper proposes a rank-based cluster analysis, which can take the form of a latent class/finite mixture model or a basic cluster method such as the k-means algorithm; in either case, the multivariate normal distributional assumption is no longer crucial. There are two theoretical foundations for the proposed method: relative deprivation theory in sociology and the relative income concept in economics on the one hand, and topological distance in mathematical thinking on the other. This method offers an alternative view on inequality, and is nonparametric in essence. A simulation analysis of three-cluster mixtures indicated by two or three variables using three different data-generating mechanisms shows that when data are normal, the (real) value-based and rank-based methods produce similar results. When data depart from normality, the results are more mixed: finite mixture models do somewhat better for data of real values while the k-means method performs much better for ranked data. Three empirical data applications further demonstrate the usefulness of the rank-based method: an analysis of the 1991 British Household Panel Survey data with three variables for socioeconomic classification, a re-analysis of the classic diabetes data, and an exploration of fertility inequality using the 2006 U.S. General Social Survey data. All three examples suggest some new substantive insights unobtainable from the parametric analysis of the original data and require much reduced computation time for estimation.
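The general idea of clustering on ranks rather than raw values can be illustrated with a minimal sketch. This is not the speaker's implementation; it only assumes that each variable is replaced by its within-variable rank (ties receive the average rank) before a plain k-means is run. The function names, the toy data, and the choice of the first k points as starting centers are all hypothetical and chosen for illustration only.

```python
def average_ranks(values):
    """Return 1-based ranks of values; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def rank_transform(rows):
    """Replace every column of a row-major data set by its ranks."""
    ranked_columns = [average_ranks(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*ranked_columns)]


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iterations=100):
    """Plain Lloyd's algorithm; the first k points serve as starting centers
    (an illustrative simplification, not a recommended initialisation)."""
    centers = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iterations):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c]))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical toy data: two variables, with one extreme value (5000).
raw = [[1, 1], [2, 2], [3, 1], [100, 90], [110, 5000], [120, 95]]
ranked = rank_transform(raw)
labels = kmeans(ranked, k=2)
print(ranked)
print(labels)
```

After the rank transformation the extreme value 5000 is simply the largest rank in its column, so it no longer dominates the squared distances that k-means minimises. The sketch covers only the k-means variant mentioned in the abstract, not the latent class/finite mixture variant.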

Meet the speaker in Room 212 Cockins Hall at 4:30 p.m. Refreshments will be served.


