How Do I Search Thee? Let Me Count the Ways
June 14, 2011
“Open Source Search Engines Every Developer Should Know About” provides snapshots of about a dozen different “open source systems.”
There are quite a few enterprise search and Web search systems available. Confusing the enterprise products with the Web products is an all too common error. But within each of these two types of search engines, some vendors provide free or hobbled systems to allow system managers to do some tire kicking.
Overviews like this one are useful because new finds do turn up. Unfortunately, overviews can mix up the different types of search systems and increase the confusion instead of clarifying the situation.
Paul Anthony, on the blog webdistortion.com, describes a number of search engines and names some sites that use the systems. He lists the individual search engines with a brief write up on each, along with a video showing how the engine operates. Among those included are:
- Constellio, www.constellio.com. The system is, according to Constellio, the “first complete open source enterprise content search solution.”
- Search Blox, www.searchblox.com. The system is built around Lucene.
- Sphinx, http://sphinxsearch.com. The system is “an open source full text search server, designed from the ground up with performance, relevance (aka search quality), and integration simplicity in mind.”
The author is critical of how search is implemented on what seems to be a large number of web sites. He writes:
typically search is one of the most poorly implemented pieces of technology on a site, with developers opting for the standard the out of the box solution which comes with most modern content management systems – and in many cases doesn’t do justice to your content. I thought I’d take a look at what other enterprise level and open source search engines out there to find and index the information on your site faster, and provide users with a deeper, more relevant result set.
The write up does include some red herrings; for example, Coveo. The company has emphasized customer service applications, not search, if I recall the PR person’s description of the “new direction” for Coveo. Also, I expect the investors to be somewhat surprised to see Coveo listed as an open source search system. You can download a version of Coveo that is limited to the number of documents on my iPad. To get the “real deal,” you have to pay a license fee for this proprietary system. There are also some unusual omissions. I don’t expect many of today’s analysts to be familiar with the Lucene-based Tesuji.eu system, but I do expect a reference to FLAX.
Like most write ups about search and retrieval, the attempts to explain, categorize, and clarify usually increase the confusion. This is, in my opinion, a highly desirable condition for the unemployed “real journalists”, the pundits, the failed CMS system administrators, and the majors in dance theory. Consultants in enterprise search have to come from somewhere other than computer science programs.
Stephen E Arnold, June 14, 2011
Sponsored by ArnoldIT.com, the resource for enterprise search information and current news about data fusion