|
Published Articles >> Table of Contents >> Abstract
47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06)
pp. 165-176
The Effectiveness of Lloyd-Type Methods for the k-Means Problem
Rafail Ostrovsky, UCLA, USA
Yuval Rabani, Israel Institute of Technology, Israel
Leonard J. Schulman, Caltech, USA
Chaitanya Swamy, Caltech, USA
Full Article Text:

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/FOCS.2006.75
Send link to a friend
| Abstract |
|
We investigate variants of Lloyd's heuristic for clustering
high dimensional data in an attempt to explain its popularity
(a half century after its introduction) among practitioners,
and in order to suggest improvements in its application.
We propose and justify a clusterability criterion for data
sets. We present variants of Lloyd's heuristic that quickly
lead to provably near-optimal clustering solutions when applied
to well-clusterable instances. This is the first performance
guarantee for a variant of Lloyd's heuristic. The
provision of a guarantee on output quality does not come
at the expense of speed: some of our algorithms are candidates
for being faster in practice than currently used variants
of Lloyd's method. In addition, our other algorithms
are faster on well-clusterable instances than recently proposed
approximation algorithms, while maintaining similar
guarantees on clustering quality. Our main algorithmic
contribution is a novel probabilistic seeding process for the
starting configuration of a Lloyd-type iteration.
|
Additional Information
|
Citation:
Rafail Ostrovsky, Yuval Rabani, Leonard J. Schulman, Chaitanya Swamy,
"The Effectiveness of Lloyd-Type Methods for the k-Means Problem,"
focs,
pp. 165-176,
47th Annual IEEE Symposium on Foundations of Computer Science (FOCS'06),
2006
|
|