Why Good Enough Is the New Norm in Search

September 29, 2014

Navigate to “Postgres Full Text Search Is Good Enough.” I first heard this argument at a German information technology conference a few years ago. The idea is surprisingly easy to understand. As long as a user can bang in a couple of keywords, scan a result list, and locate information that the user finds helpful, job done. The search results may consist of flawed or manipulated information. The search results may be off point for the user’s query when evaluated by old-fashioned methods such as precision and recall. The user may be dumb and rely on whatever the user finds accurate.

Whatever.

This write up explains the good enough approach in terms of PostgreSQL, a useful open source Codd-type data management system. Please note: I am not uncomfortable with good enough search. I understand that when the herd stampedes, it is not particularly easy to stop the run. Prudence suggests that one take cover.

Here’s the guts of the write up:

What do I mean by ‘good enough’? I mean a search engine with the following features:

  • Stemming
  • Ranking / Boost
  • Support Multiple languages
  • Fuzzy search for misspelling
  • Accent support

Luckily PostgreSQL supports all these features.
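For readers who want to see how the five features map onto PostgreSQL, here is a minimal sketch in Python that only builds the SQL text and does not connect to a database. It is not the article's code. The table and column names (`posts`, `title`, `body`, `unique_words`, `word`) and the helper `build_search_sql` are hypothetical. The SQL assumes the `unaccent` and `pg_trgm` extensions are installed, and the `%(q)s` placeholder assumes a psycopg-style driver.

```python
# Built-in text search configurations; each brings its own stemmer
# (covers "Stemming" and "Support Multiple languages").
ALLOWED_CONFIGS = {"simple", "english", "french", "german"}

# Hypothetical schema: posts(id, title, body).
SEARCH_SQL_TEMPLATE = """
SELECT id, title, ts_rank(doc, query) AS rank
FROM (
    SELECT id, title,
           -- setweight boosts title matches (weight A) over body matches (B)
           setweight(to_tsvector('{config}', unaccent(title)), 'A') ||
           setweight(to_tsvector('{config}', unaccent(body)), 'B') AS doc
    FROM posts
) AS indexed,
plainto_tsquery('{config}', unaccent(%(q)s)) AS query
WHERE doc @@ query
ORDER BY rank DESC
LIMIT 10
"""

# Fuzzy fallback for misspellings via pg_trgm.
# Hypothetical table: unique_words(word), filled from the indexed documents.
FUZZY_SQL = """
SELECT word
FROM unique_words
WHERE similarity(word, %(q)s) > 0.5
ORDER BY word <-> %(q)s
LIMIT 1
"""


def build_search_sql(config: str = "english") -> str:
    """Return the ranked search query for one text search configuration."""
    if config not in ALLOWED_CONFIGS:
        # The config name is spliced into the SQL text, so only accept known values.
        raise ValueError(f"unsupported text search configuration: {config!r}")
    return SEARCH_SQL_TEMPLATE.format(config=config)


if __name__ == "__main__":
    print(build_search_sql("french"))
    print(FUZZY_SQL)
```

With a live connection, one would pass the user's text as the `q` parameter, for example `cursor.execute(build_search_sql("french"), {"q": user_text})`, and run the fuzzy query first when the search returns nothing.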

The write up contains some useful code snippets to make use of search features. The discussion of full text search is coherent and addresses a vast swath of content. Note that proprietary vendors have tilled acres of marketing earth and fertilizer to convert search into a mind boggling range of functions.

The article tackles full text within PostgreSQL and covers querying as well, with code snippets for both. (My teenage advisors said, “Very useful snippets.” Okay. Good.)

The write up concludes:

We have seen how to build a decent multi-language search engine based on a non-trivial document. This article is only an overview but it should give you enough background and examples to get you started with your own…. Postgres is not as advanced as ElasticSearch and SOLR but these two are dedicated full-text search tools whereas full-text search is only a feature of PostgreSQL and a pretty good one.

Reasonable observation. Worth reading.

If you are a vendor of proprietary search technology, there will be more individuals infused with the spirit of open source, not fewer. How many experts are there for proprietary systems? Fewer than the cadres of open source volk, I surmise.

Stephen E Arnold, September 29, 2014
