Hard and Soft Clustering Explained

November 17, 2016

I read “An Introduction to Clustering and Different Methods of Clustering.” Clustering, it seems, remains a popular topic among the quasi-search and content processing crowd. What’s interesting about this write up is that it introduces hard clustering and soft clustering. I had assumed that clustering was neither hard nor soft. Here’s the distinction:

  • In hard clustering, each data point either belongs to a cluster completely or not. For example, in the above example each customer is put into one group out of the 10 groups.
  • In soft clustering, instead of putting each data point into a separate cluster, a probability or likelihood of that data point to be in those clusters is assigned.
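To make the distinction concrete, here is a minimal sketch in Python. It is not taken from the write up: the two cluster centers, the sample point, and the function names hard_assign and soft_assign are all hypothetical, and the soft weights come from a softmax over negative squared distance, which is only one of several ways to produce membership probabilities.

```python
import math

def hard_assign(point, centers):
    """Hard clustering: return the index of the single nearest center."""
    distances = [math.dist(point, c) for c in centers]
    return distances.index(min(distances))

def soft_assign(point, centers, temperature=1.0):
    """Soft clustering: return one membership weight per center, summing to 1.

    Assumption: weights are a softmax over negative squared distance.
    """
    scores = [-math.dist(point, c) ** 2 / temperature for c in centers]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical data: two centers and one point lying closer to the first.
centers = [(0.0, 0.0), (4.0, 0.0)]
point = (1.5, 0.0)

hard_label = hard_assign(point, centers)    # a single cluster index
soft_weights = soft_assign(point, centers)  # a weight for every cluster
print(hard_label, soft_weights)
```

The hard version commits the point to exactly one group. The soft version keeps a weight for every group, so a point near a boundary shows up as split rather than forced to one side.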

The write up then highlights these go-to methods of clustering:

  • K means clustering
  • Hierarchical clustering.
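For readers who want to see the first of these in action, the following is a rough sketch of K means in plain Python, using the standard assign-then-update loop usually called Lloyd's algorithm. The sample points, the starting centers, and the function name kmeans are assumptions made for illustration; the write up does not supply code.

```python
import math

def kmeans(points, centers, iterations=10):
    """Alternate between assigning points and moving centers (Lloyd's algorithm)."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points
        ]
        # Update step: move each center to the mean of its assigned points.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centers.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centers.append(centers[k])  # leave an empty cluster's center alone
        if new_centers == centers:
            break  # nothing moved, so the assignments are stable
        centers = new_centers
    return labels, centers

# Hypothetical data: two obvious groups and two rough starting guesses.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]
labels, centers = kmeans(points, centers=[(0.0, 0.0), (10.0, 10.0)])
print(labels, centers)
```

Note that this is hard clustering: every point ends up with exactly one label. The number of clusters and the starting centers are chosen by the user, and different starting points can give different results.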

The write up introduces the idea of supervised learning. I noted that the article did not point out that training is a time-consuming and often expensive exercise. The omission complements the “quick look” approach in the write up.

I am not sure that a person interested in clustering will be able to make a giant leap forward. Perhaps the effort will result in a hard soft landing?

Stephen E Arnold, November 17, 2016
