K-Means Cluster Algorithm
Move! Average! Cluster! Move! Average! Cluster! …
The k-means clustering algorithm can be regarded as a series of iterations of three steps: finding the cluster centers, computing the distances between the sample points and those centers, and redefining cluster membership by assigning each point to its nearest center.
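The iteration described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the code behind kmeans.ani(): the helper name simple_kmeans, the random choice of starting centers and the iteration cap max_iter are all made up for the example, and a cluster that becomes empty simply keeps its previous center.

```r
# Minimal sketch of the k-means loop (Lloyd-style iteration).
# simple_kmeans is a hypothetical helper, not part of any package.
simple_kmeans <- function(x, k, max_iter = 50) {
  x <- as.matrix(x)
  # start from k distinct rows picked at random
  centers <- x[sample(nrow(x), k), , drop = FALSE]
  cluster <- rep(0L, nrow(x))
  for (iter in seq_len(max_iter)) {
    # squared distance from every point to every center (n x k matrix)
    d <- sapply(seq_len(k), function(j) {
      rowSums((x - matrix(centers[j, ], nrow(x), ncol(x), byrow = TRUE))^2)
    })
    # cluster: assign each point to its nearest center
    new_cluster <- max.col(-d, ties.method = "first")
    # stop once no point changes cluster
    if (all(new_cluster == cluster)) break
    cluster <- new_cluster
    # move + average: each center becomes the mean of its points
    for (j in seq_len(k)) {
      members <- cluster == j
      if (any(members)) centers[j, ] <- colMeans(x[members, , drop = FALSE])
    }
  }
  list(cluster = cluster, centers = centers)
}

set.seed(1)
x <- cbind(X1 = c(rnorm(50, -2), rnorm(50, 2)), X2 = rnorm(100))
fit <- simple_kmeans(x, k = 2)
table(fit$cluster)   # cluster sizes
fit$centers          # final cluster centers
```

In practice the built-in kmeans() function is the better choice; the sketch is only meant to make the "Move! Average! Cluster!" cycle concrete.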
The data given by x are clustered by the k-means method, which aims to partition the points into k groups such that the sum of squares from the points to their assigned cluster centres is minimized. At the minimum, every cluster centre is at the mean of its Voronoi set (the set of data points that are nearest to that cluster centre).
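Both statements can be checked numerically with the kmeans() function from the stats package. The snippet below is a small sketch on simulated data; the seed, the sample sizes and the nstart value are arbitrary choices made for the example.

```r
# Check the two statements above on simulated data using stats::kmeans().
set.seed(42)
x <- cbind(X1 = c(rnorm(100, -2), rnorm(100, 2)), X2 = rnorm(200))
fit <- kmeans(x, centers = 2, nstart = 10)

# each fitted centre should equal the mean of the points assigned to it
means_by_cluster <- apply(x, 2, function(col) tapply(col, fit$cluster, mean))
all.equal(unname(fit$centers), unname(means_by_cluster))

# the minimized objective: total squared distance to the assigned centres
wss <- sum((x - fit$centers[fit$cluster, ])^2)
all.equal(wss, fit$tot.withinss)
```

Both comparisons should agree up to floating-point tolerance once the algorithm has converged.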
Animation
R code
library(animation)

saveHTML({
  ani.options(interval = 2, nmax = 50)
  par(mar = c(3, 3, 1, 1.5), mgp = c(1.5, 0.5, 0))
  ## simulate 4 groups of 25 points around (+/-1.5, +/-1.5)
  cent = 1.5 * c(1, 1, -1, -1, 1, -1, 1, -1)
  x = NULL
  for (i in 1:8) x = c(x, rnorm(25, mean = cent[i]))
  x = matrix(x, ncol = 2)
  colnames(x) = c("X1", "X2")
  kmeans.ani(x, centers = 4, pch = 1:4, col = 1:4)
}, img.name = "kmeans_ani", htmlfile = "kmeans.html",
  ani.height = 500, ani.width = 600,
  title = "K-means Cluster Algorithm",
  description = c("Move! Average! Cluster! Move! Average! Cluster!...",
    "Which might be helpful in understanding the concept of K-means Cluster Algorithm Analysis."))

## R version 2.12.1 (2010-12-16)
## Platform: i386-pc-mingw32/i386 (32-bit)
## Other packages: animation 2.0-0

df = cbind(x1 = c(rnorm(100, -2), rnorm(100, 2)), x2 = rnorm(200))
library(animation)

## 2 centers
saveHTML({
  kmeans.ani(df, centers = 2)
})

## 4 centers
saveHTML({
  kmeans.ani(df, centers = 4)
})