The Open Source Search Ostriches
April 30, 2012
ArnoldIT, located in Harrod’s Creek, Kentucky, has spotted a new species of search, content processing, and text mining vendor: The Scrutans Struthioniformes. Believed to be related to the ratites, this new subspecies is known to be indifferent to or ignorant of the predator from the open source jungle.
The proprietary search vendor, Scrutans Struthioniformes, ignores the impact of open source search and information retrieval systems.
ArnoldIT has completed a couple of exploratory expeditions through the wilds of open source search, clustering, and related disciplines. The expeditions were sparked by the bimonthly feature on open source search which is currently appearing in Information Today’s Online Magazine. The discovery of the Scrutans Struthioniformes was unexpected.
For almost 50 years, information retrieval meant proprietary systems built upon innovations by academic researchers. Whether the influence was from the number crunching of the Cornell school or the semantic shenanigans from Stanford, search and retrieval translated to:
- Expensive to license, install, optimize, and maintain systems
- Licensing restrictions which prevented client-specific tailoring and fast cycle problem remediation or feature addition
- High levels of user dissatisfaction from the CFO’s office (the lady who pays the bills) to the user in the sales department (the person who has to find out what happened to a particular customer’s order).
What’s changed, according to ArnoldIT, is that open source options are readily available. Smart outfits like IBM killed off in-house, brute-force search efforts and embraced the open source Lucene/Solr technology. IBM is a proprietary outfit, but the use of Lucene/Solr allowed more effort to be put into value-adding projects such as the “wrappers” which make Watson a game show winner. IBM has also used its billions to purchase proprietary vendors to deliver “additional value.” The purchase of Vivisimo is a good example of a quick way to get clustering, deduping, and federating functions to bolt onto the open source plumbing. IBM may disagree, but we have our views.
Other vendors have built businesses on open source search. One example is the emergence of Lucid Imagination and its LucidWorks Enterprise 2.0 solution. Licensees get speedy search and retrieval, a staff able to answer questions, and the rapid cycle innovation of the open source Lucene/Solr software.
Clever Amazon is a “sort of” open outfit. On one hand, the company uses open source software to make the Amazon cloud work. On the other hand, the CloudSearch solution is based on A9. Amazon does, however, provide “sort of” open application programming interfaces. Open source as a business angle is part of the CloudSearch play along with making life easy for developers to deliver “good enough” search.
The Basho Riak Search angle is a variation. Riak Search is proprietary but Basho has made it open source. (A free profile of Basho is available by registering at TheSeed2020, an ArnoldIT content delivery Web site.) Good citizens and good marketing. For a company with a problem which requires Basho data management, the Riak Search solution is available, and it is open source.
There are other variations as well, and these are explained in the ArnoldIT briefing about open source search, its opportunities, and its challenges. Unlike the technology payloads delivered by blogs, the ArnoldIT briefing focuses on the business angle of open source search, and the research has delivered some shockers; for example:
- In a sample of 35 proprietary search vendors, 25 assert that their systems are in some way open source. Good marketing, better technology, or great hyperbole?
- In a sample of 100 search vendors, two thirds of those pinged by ArnoldIT know about or are on top of open source search. Quite an assertion as the Lucid Imagination Lucene Revolution approaches with dozens of case studies that reveal large companies’ willingness to shift from proprietary solutions to open source search. Are most vendors of proprietary search systems ignoring reality? Sure looks like some are confident the search world tomorrow will look the way it did in 2003.
- Hosted search is gaining traction in some specific niches. Two of these niches have long been dominated by proprietary systems. More surprising is the fact that the greatest inroads are being made among the Fortune 1000. That’s the market where the money often is for enterprise software vendors.
Will vendors of proprietary search and retrieval systems be able to keep their investors and stakeholders happy as open source becomes a greater force in 2013? The briefing considers the scenario in which firms pour more funds into open source search and content processing startups. If this happens, life becomes more difficult for “on the bubble” vendors of taxonomy, clustering, search, and basic information retrieval systems.
Net net: Another search revolution is brewing. Is your proprietary search vendor a Scrutans Struthioniformes? A better question: Are you? For more information about the ArnoldIT open source search briefing, write seaky2000 at yahoo dot com for options and fees. ArnoldIT may create an open source search ostrich T shirt. Stay tuned. Max and Tess are working on this project now.
Stephen E Arnold, April 30, 2012
Sponsored by Ikanow