Hard and Soft Clustering Explained

November 17, 2016

I read “An Introduction to Clustering and Different Methods of Clustering.” Clustering, it seems, remains a popular topic among the quasi-search and content processing crowd. What’s interesting about this write up is that it introduces hard clustering and soft clustering. I had assumed that clustering was neither hard nor soft. Here’s the distinction:

In hard clustering, each data point either belongs to a cluster completely or not. For example, in the above example each customer is put into one group out of the 10 groups.
In soft clustering, instead of putting each data point into a separate cluster, a probability or likelihood of that data point to be in those clusters is assigned.
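
To make the distinction concrete, here is a minimal Python sketch. It is not from the write up. The two cluster centers, the sample point, and the helper names hard_assign and soft_assign are hypothetical. The soft version uses a softmax over negative squared distances, one simple way to turn distances into membership weights; real soft clustering methods such as Gaussian mixture models estimate these weights differently.

```python
import math

def hard_assign(point, centers):
    """Hard clustering: return the index of the single nearest center."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))

def soft_assign(point, centers, temperature=1.0):
    """Soft clustering: return one membership weight per center, summing to 1.

    Assumption: weights come from a softmax over negative squared distances.
    The temperature parameter is illustrative; larger values flatten the weights.
    """
    scores = [-math.dist(point, c) ** 2 / temperature for c in centers]
    top = max(scores)  # subtract the max before exponentiating for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical data: two cluster centers and one point close to the first.
centers = [(0.0, 0.0), (4.0, 0.0)]
point = (1.0, 0.0)

hard = hard_assign(point, centers)   # a single cluster index
soft = soft_assign(point, centers)   # a weight for every cluster

print("hard:", hard)
print("soft:", [round(w, 4) for w in soft])
```

The hard result is a single label. The soft result keeps a weight for every cluster, so a point sitting halfway between two centers would get an even split instead of being forced to one side.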

The write up then highlights these go-to methods of clustering:

K-means clustering
Hierarchical clustering
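
For readers who want to see the first of these in action, here is a rough Python sketch of k-means. It is my own illustration, not code from the write up. The six sample points, the function name kmeans, and the evenly spaced initialization are assumptions; production implementations typically use smarter seeding and a convergence check.

```python
import math

def kmeans(points, k, iterations=10):
    """Minimal k-means sketch: alternate between assigning and averaging."""
    # Assumption: naive initialization from evenly spaced input points.
    centers = [points[i * len(points) // k] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center (a hard assignment).
        labels = [
            min(range(k), key=lambda i: math.dist(p, centers[i]))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:  # leave an empty cluster's center where it is
                centers[i] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centers

# Hypothetical data: two obvious groups of three points each.
points = [
    (1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
    (5.0, 5.0), (5.2, 4.9), (4.8, 5.1),
]

labels, centers = kmeans(points, k=2)
print(labels)
print(centers)
```

Note that k-means is a hard clustering method: every point ends up with exactly one label.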

The write up introduces the idea of supervised learning. I noted that the article did not point out that training is a time-consuming and often expensive exercise. The omission fits the “quick look” approach in the write up.

I am not sure that a person interested in clustering will be able to make a giant leap forward. Perhaps the effort will result in a hard soft landing?

Stephen E Arnold, November 17, 2016

Written by Stephen E. Arnold · Filed Under algorithms, News

