Eric S Bullington

Software Engineering and Data Visualization

K-means Clustering Visualization

I’ve recently been trying to build an educational visualization of the k-means algorithm. It seemed like the kind of algorithm that would lend itself well to visualization. Unfortunately, it hasn’t been so easy. Have a look below the fold to see what I’ve come up with so far.

This is it:

Please note that I wrote only the d3-based visualization; the k-means clustering algorithm itself comes from harthur’s clusterfck JavaScript cluster analysis library.

My plan is eventually to use this as part of a tutorial on machine learning techniques. However, I’m not satisfied with how the visualization came out. The problem, as you may or may not have noticed, is that although it’s very clear when a new centroid is generated (those are the red dots), the resulting changes in the clusters (the groups of black, blue, and green dots) are not very noticeable. If you look very closely, you can see the clusters change after each new centroid flies onto the graph. But the change is almost imperceptible to me, and I know what to look for. I need to find a way to emphasize the underlying clusters.
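For reference, here is a minimal sketch of the loop the animation is trying to show: assign each point to its nearest centroid, move each centroid to the mean of its points, and repeat until nothing changes. This is not clusterfck’s code or API. The names `kmeansFrames`, `nearestCentroid`, and `squaredDistance` are hypothetical, and I’m assuming 2D points stored as `[x, y]` arrays and starting centroids supplied by the caller. Each frame also records which points switched cluster, since those are exactly the changes that are so hard to see.

```javascript
// Squared Euclidean distance between two points given as numeric arrays.
function squaredDistance(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

// Index of the centroid closest to `point`.
function nearestCentroid(point, centroids) {
  let best = 0;
  let bestDist = Infinity;
  centroids.forEach((c, i) => {
    const d = squaredDistance(point, c);
    if (d < bestDist) {
      bestDist = d;
      best = i;
    }
  });
  return best;
}

// Run k-means from the given starting centroids and return one frame per
// iteration: the centroids used, the resulting cluster label of every point,
// and the indices of the points whose label changed in that iteration.
function kmeansFrames(points, initialCentroids, maxIterations = 50) {
  let centroids = initialCentroids.map(c => c.slice());
  let assignments = points.map(() => -1); // -1 means "not yet assigned"
  const frames = [];

  for (let iter = 0; iter < maxIterations; iter++) {
    // Assignment step: label every point with its nearest centroid.
    const next = points.map(p => nearestCentroid(p, centroids));
    const changed = [];
    next.forEach((label, i) => {
      if (label !== assignments[i]) changed.push(i);
    });
    assignments = next;

    frames.push({
      centroids: centroids.map(c => c.slice()),
      assignments: assignments.slice(),
      changed: changed
    });

    // Converged: no point switched cluster.
    if (changed.length === 0) break;

    // Update step: move each centroid to the mean of its assigned points.
    centroids = centroids.map((c, k) => {
      const members = points.filter((_, i) => assignments[i] === k);
      if (members.length === 0) return c; // keep an empty cluster's centroid in place
      return c.map((_, dim) =>
        members.reduce((sum, p) => sum + p[dim], 0) / members.length);
    });
  }
  return frames;
}
```

With frames like these, one option would be to briefly enlarge or outline the points listed in `changed` before recoloring them, so the eye is drawn to the reassignment rather than only to the moving centroid.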

Some people have suggested trying a Voronoi diagram, but that’s essentially just a continuous version of this graph: it colors every location on the plane by its nearest centroid instead of coloring only the data points. Any other suggestions on how to illustrate k-means clustering visually?
