Cluster Structure of K-means Clustering via Principal Component Analysis
K-means clustering is a popular data clustering algorithm. Principal component analysis (PCA) is a widely used statistical technique for dimension reduction. Here we prove that principal components are the continuous solutions to the discrete cluster membership indicators for K-means clustering, with a clear simplex cluster structure. Our results show that PCA-based dimension reduction is particularly effective for K-means clustering. New lower bounds for the K-means objective function are derived, given by the total variance minus the eigenvalues of the data covariance matrix.
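The lower bound stated in the abstract can be checked numerically. The following is a minimal sketch, assuming the bound takes the form "total scatter of the centered data minus the sum of the K-1 largest eigenvalues of the scatter matrix" (the unnormalized covariance matrix). The function names `kmeans_objective`, `lloyd`, and `pca_lower_bound` are hypothetical, and the plain Lloyd iteration is only a stand-in for whichever K-means implementation is actually used.

```python
import numpy as np

def kmeans_objective(X, labels, k):
    """Sum of squared distances from each point to its cluster centroid."""
    total = 0.0
    for c in range(k):
        members = X[labels == c]
        if len(members):
            total += ((members - members.mean(axis=0)) ** 2).sum()
    return total

def lloyd(X, k, n_iter=50, seed=0):
    """Plain Lloyd iteration (illustrative stand-in, not the paper's method)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared distance from every point to every center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels

def pca_lower_bound(X, k):
    """Return (lower bound, total scatter) for the K-means objective.

    Assumption: the bound is total scatter minus the K-1 largest
    eigenvalues of the scatter matrix of the centered data.
    """
    Xc = X - X.mean(axis=0)
    scatter = Xc.T @ Xc                    # unnormalized covariance
    eigvals = np.linalg.eigvalsh(scatter)  # ascending order
    total = np.trace(scatter)
    return total - eigvals[::-1][: k - 1].sum(), total

# Synthetic data: three Gaussian blobs in five dimensions.
rng = np.random.default_rng(42)
means = rng.normal(scale=5.0, size=(3, 5))
X = np.vstack([m + rng.normal(size=(40, 5)) for m in means])

k = 3
labels = lloyd(X, k)
J = kmeans_objective(X, labels, k)
bound, total = pca_lower_bound(X, k)
print(f"lower bound = {bound:.3f}, K-means objective = {J:.3f}, total = {total:.3f}")
```

Under these assumptions, the objective of any partition into K clusters should fall between the printed lower bound and the total scatter, so the gap between the bound and the objective gives a rough indication of how close a given clustering is to optimal.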