Weird Math: Open Source Cost Estimates
February 11, 2009
IT Business Edge ran a story by Ann All called “Want More Openness in Enterprise Search? Open Source May Fill Bill?” If you are an IT person named Bill and you don’t know much about open source search, open source may turn “fill bill” into “kill Bill.” On the surface, open source offers quite a few advantages. First, there are lots of volunteers who maintain the code. The reality is that a few people carry the load and others cheerlead. For Lucene, SOLR, and other open source search systems, that works pretty well. (More about this point in a later paragraph.) Second, the “cost” of open source looks like a deal. Ms. All quotes various experts from the azure chip consulting firms and the trophy generation to buttress her arguments. I am not sure the facts in some enterprise environments line up with the assertions but that’s the nature of folks who disguise deep understanding with buzzword cosmetics. Third, some search systems like the Google Search Appliance cost $30,000. I almost want to insert exclamation points. How outrageous. Open source costs less, specifically $18,000. Like some of the Yahoo math, this number is conceptually aligned with Jello. The license fee is not the fully burdened cost of an enterprise search system. (Keep in mind that this type of search is more appropriately called “behind the firewall search”.)
What’s the Beyond Search view of open source?
In my opinion, open source is fine when certain conditions are met; namely:
- The client is comfortable with scripts and familiar with the conventions of open source. Even the consulting firms supporting open source can be a trifle technical. A call for help yields an engineer who may prefer repeating Unix commands in a monotone. Good if you are on that wavelength. Not so good if you are a corporate IT manager who delegates tech stuff to contractors.
- The security and regulatory net thrown over an organization permits open source. Ah, you may think. Open source code is no big deal. Sorry. Open source is a big deal because some organizations have to guarantee that code used for certain projects cannot have backdoors or a murky provenance. Not us, you may think. My suggestion is that you may want to check with your lawyer who presumably has read your contracts with government agencies or the regulations governing certain businesses.
- The top brass understand that some functionality may not be possible until a volunteer codes up what’s needed or until your local computer contractor writes scripts. Then, you need to scurry back to your lawyer to make sure that the code and scripts are really yours. There are some strings attached to open source.
Does open source code work? Absolutely. I have few reservations tapping my pal Otto for SOLR, Charles Hull at Lemur Consulting for FLAX, or Anna Tothfalusi at Tesuji.eu for Lucene. Notice that these folks are not all in the good old US of A, which may be a consideration for some organizations. There are some open source search outfits like Lucid Imagination and specialists at various companies who can make open source search sit up and roll over.
It is just a matter of money.
Now, let’s think about the $18,000 versus the Google Search Appliance. The cost of implementing a search system breaks into some categories. License fees are in one category along with maintenance. You have to do your homework to understand that most of the big gun systems, including Google and others, have variable pricing in place. Indexing 500,000 documents is one type of system. Boosting that system to handle 300 million documents is another type of system.
The Google Search Appliance costs as much as, if not more than, Autonomy or Endeca at scale. So, an enterprise search system easily hits six or seven figures in the first year for large scale document collections. Open source can knock these license fees out of the goal. Free beats $1 million worth of Google Search Appliances, right? What about free beating Autonomy or Endeca when one considers the fully burdened cost of the system?
Let’s look at the other costs of an enterprise search system, excluding the license fees for the system. Here’s a short list:
- Setting up, tuning, and deploying the system. The cost is the same for any search system.
- Customizing the system. This is a killer because most licensees don’t know what they don’t know and usually end up in a sea of red ink trying to get the deployed search system to deliver on the Star Trek fantasies that the procurement team and the sales person generated. Remember time X rate = cost. Since programming is often an unknown in terms of time, it’s easy to blow through tens of thousands of dollars before realizing that a task is beyond the reach of the licensee.
- Maintaining the system. Search systems break. There are many reasons. If you want more detail, take a look at the discussions of search elsewhere in this Web log. Open source systems break just like commercial systems. Here’s why. The customized system is unknown to the search vendor’s engineer. What works on the vendor’s system may not work on the licensee’s customized system. This is bad and it costs time X rate to figure out the problem and remediate it. If the glitch can’t be fixed within the licensee’s budget, the system has to be rolled back. More time X rate. It doesn’t take a CFO long to pinpoint the cause of this problem and take corrective action.
- Scaling the system. Even the Google Search Appliance is expensive to scale. Here’s why. You buy an appliance, say for instance, the GB 5005 for five million documents. This puppy will set you back $190,000, maybe more. In order to make this puppy redundant, you need a hot spare. That’s another $130,000. The GB 5005s are fast but let’s assume you have a heavy search load. No problem. You just order up another production gizmo and another hot spare or buy Google’s clustering solution. Same deal occurs with Autonomy and Endeca. You add more documents, then you build up and out on the technical infrastructure. Think capital expenditure. The same capital equipment cost holds true for Microsoft SharePoint and the oh so wonderful Fast ESP system.
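To make the arithmetic concrete, here is a rough back-of-the-envelope sketch in Python. The $18,000 and $30,000 license figures and the $190,000 and $130,000 appliance prices come from this post. The hours per task and the $150 hourly rate are made-up placeholders, not data, and the function names are hypothetical. Plug in your own numbers.

```python
# Back-of-the-envelope sketch of "time X rate = cost".
# ASSUMPTION: the hours and the hourly rate below are invented placeholders.

def labor_cost(hours, hourly_rate):
    """time X rate = cost"""
    return hours * hourly_rate

def first_year_cost(license_fee, task_hours, hourly_rate):
    """License fee plus the labor for every task in task_hours."""
    labor = sum(labor_cost(h, hourly_rate) for h in task_hours.values())
    return license_fee + labor

def appliance_scaling_cost(pairs, production_price=190_000, hot_spare_price=130_000):
    """Each unit of capacity is one production appliance plus one hot spare."""
    return pairs * (production_price + hot_spare_price)

# Hypothetical effort, identical for both systems per the list above.
task_hours = {"setup": 200, "customizing": 600, "maintaining": 300}
hourly_rate = 150  # hypothetical contractor rate in dollars

open_source = first_year_cost(18_000, task_hours, hourly_rate)
appliance = first_year_cost(30_000, task_hours, hourly_rate)

print(open_source)                # 183000
print(appliance)                  # 195000
print(appliance - open_source)    # 12000, the license gap
print(appliance_scaling_cost(2))  # 640000 for two production units and two hot spares
```

With these placeholder inputs, labor is $165,000 in both cases, so the $12,000 license difference is a small slice of either total. Change the hours and the picture changes, which is the point: the hours are the unknown.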
That’s enough of the cost picture. My point is that generalizations about license fees are important to mom and pop shops. Mid-sized and large organizations rack up significant costs that dwarf the license fees. Open source makes sense when most of the tech work can be done with existing staff. As soon as the open source licensee turns to outside experts, the costs begin their gentle rise. Goof up and those costs shift into overdrive.
I don’t want to rain on the open source parade. I do want those who read about open source from nice and probably well intentioned individuals to keep the costs firmly in mind. Let me repeat. Under certain circumstances, open source is a good choice for search. In other situations, the problems one encounters will be identical to the challenges with commercial search systems. The total costs for the search system are what’s important. Asserting that open source is great because it reduces a license fee is misleading. Don’t believe me? Download Lucene and slap it into your mid-sized organization. Send me a summary of your costs at the end of the first year. If my data are accurate, the costs will be hefty unless you are an avid developer for the open source software you use. The easiest way to control cost is to chop headcount. Want my view on who will be among the first to go?
Stephen Arnold, February 11, 2009
Comments
One Response to “Weird Math: Open Source Cost Estimates”
I think the following statement is problematic:
“If my data are accurate, the costs will be hefty unless you are an avid developer for the open source software you use.”
Hefty is relative. For most companies even the relatively cheap GSA is expensive. Endeca, FAST, etc. charge hundreds of thousands of dollars. You could hire a *pile* of contractors to make a *pile* of customizations just for your needs for that kind of money.
What happens when you buy a “solution” from an enterprise search vendor? You pay a steep license price to get a “solution” and then pay their expensive professional services people to make the needed customizations.