Open Source Scaling Metrics
May 21, 2010
Big day for open source. A reader sent me a link to the Lucid Imagination blog post “Search News–Content Would Be King.” The post launches from the firm’s open source search conference in Prague the week of May 17. At that conference the UK’s Guardian shared some information about its use of the open source Solr search system. The key passage for me was:
To put not too fine a point on it: The Guardian Open Platform disintermediates Google’s pay-per-click-to-see-news model. Guardian developers innovate using open source Lucene/Solr to match users with data for competitive advantage; application developers build new apps with the Guardian API. Open delivers the innovation, Lucid delivers the foundation: in working with Guardian to tune their Solr implementation, we reduced index time from 15 hours on their prior Commercial search engine to less than an hour with Solr. Schmidt, Brin, and Page lose sleep? Maybe, maybe not.
I then spotted “Economics Of Scaling”, which presented some useful open source scaling metrics. Hard cost data can be as scarce as hen’s teeth. Here’s what this write-up revealed:
It’s running a 50-node cluster, which spans three data centers on Amazon’s EC2 service for about $10,000 a month, says CTO Joe Stump, who previously used Cassandra at Digg. By contrast, MySQL premium support would cost about $5,000 per year per node, or $250,000 per year–more than double the Cassandra setup, Stump says, and Microsoft SQL Server can cost as much as $55,000 per processor per year.
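The arithmetic behind the "more than double" claim is easy to check. Here is a minimal sketch in Python that uses only the figures quoted above. The variable names are mine, and I assume the $10,000 monthly figure covers the whole 50-node cluster, as the passage implies. Note that the passage sets an EC2 hosting bill against a support contract, so treat the ratio as a rough comparison, not an audited one.

```python
# Figures taken from the quoted passage; names are illustrative only.
NODES = 50
CASSANDRA_EC2_PER_MONTH = 10_000     # USD, assumed to cover the full cluster
MYSQL_SUPPORT_PER_NODE_YEAR = 5_000  # USD, premium support per node per year

# Annualize the monthly EC2 bill and scale the per-node support fee.
cassandra_per_year = CASSANDRA_EC2_PER_MONTH * 12
mysql_support_per_year = MYSQL_SUPPORT_PER_NODE_YEAR * NODES
ratio = mysql_support_per_year / cassandra_per_year

print(f"Cassandra on EC2:      ${cassandra_per_year:,} per year")
print(f"MySQL premium support: ${mysql_support_per_year:,} per year")
print(f"Ratio: {ratio:.2f}x")
```

On these numbers the Cassandra setup runs about $120,000 a year against $250,000 for MySQL support, which squares with the "more than double" remark.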
My takeaway: flexibility and economics will add evidence to the conversation about the merits of open source versus commercial software. In today’s unsettled financial weather, the Guardian example and the cost data flash like a signal flare for some CFOs.
Stephen E Arnold, May 22, 2010
Freebie