Open Source Scaling Metrics

May 21, 2010

Big day for open source. A reader sent me a link to the Lucid Imagination blog post “Search News–Content Would Be King.” The post launches from the firm’s open source search conference in Prague the week of May 17. At that conference the UK’s Guardian shared some information about its use of the open source Solr search system. The key passage for me was:

To put not too fine a point on it: The Guardian Open Platform disintermediates Google’s pay-per-click-to-see-news model. Guardian developers innovate using open source Lucene/Solr to match users with data for competitive advantage; application developers build new apps with the Guardian API. Open delivers the innovation, Lucid delivers the foundation: in working with Guardian to tune their Solr implementation, we reduced index time from 15 hours on their prior Commercial search engine to less than an hour with Solr. Schmidt, Brin, and Page lose sleep? Maybe, maybe not.

I then spotted “Economics Of Scaling”, which presented some useful open source scaling metrics. Hard cost data can be scarce as hen’s teeth. Here’s what this write up revealed:

It’s running a 50-node cluster, which spans three data centers on Amazon’s EC2 service for about $10,000 a month, says CTO Joe Stump, who previously used Cassandra at Digg. By contrast, MySQL premium support would cost about $5,000 per year per node, or $250,000 per year–more than double the Cassandra setup, Stump says, and Microsoft SQL Server can cost as much as $55,000 per processor per year.
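A quick check of the arithmetic in that passage, as a minimal Python sketch. The dollar figures and node count come straight from the quote; the variable names are mine. Note the assumption baked into the quote itself: it sets EC2 hosting for the Cassandra cluster against support fees alone for MySQL, so this is a rough comparison, not a full cost model.

```python
# Figures taken from the quoted passage; names are illustrative only.
NODES = 50
CASSANDRA_EC2_PER_MONTH = 10_000      # USD, whole 50-node cluster on EC2
MYSQL_SUPPORT_PER_NODE_YEAR = 5_000   # USD, premium support per node

# Annualize the monthly EC2 bill and scale MySQL support to all nodes.
cassandra_per_year = CASSANDRA_EC2_PER_MONTH * 12
mysql_per_year = MYSQL_SUPPORT_PER_NODE_YEAR * NODES
ratio = mysql_per_year / cassandra_per_year

print(f"Cassandra on EC2: ${cassandra_per_year:,} per year")
print(f"MySQL premium support: ${mysql_per_year:,} per year")
print(f"MySQL support / Cassandra hosting: {ratio:.2f}x")
```

On these numbers the “more than double” claim holds, though only just.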

My take away. Flexibility and economics will add evidence to the conversation about the merits of open source versus commercial software. In the unsettled financial weather systems, the Guardian and economics data flash like a signal flare for some CFOs.

Stephen E Arnold, May 22, 2010

Freebie

Quote to Note: Android and the Future

May 21, 2010

I tucked this in my speech file. Source: “Why Google Made Android.” Here’s the quote from a Googler:

“If we did not act, we faced a draconian future. Where one man, one company, one carrier was the future.” — Google

This will raise the level of discourse and improve upon the civility among commercial enterprises, right? Discourse? What’s that? Snicker. Snort.

Stephen E Arnold, May 22, 2010

Freebie

400 Million Users and One Scull in the Closet

May 21, 2010

I seem to be reading a lot about legal eagles and their “matters”. Microsoft tackles Salesforce. Google is struggling with Viacom and who knows who else? Now Facebook’s founder is shadowed by circling birds of prey. Navigate to “Facebook CEO’s Latest Woe: Accusations of Securities Fraud.” Here is the passage I found interesting:

The real question here is why Facebook’s lawyers haven’t succeeded in making this lawsuit go away. Before, ConnectU’s founders were just after a piece of the Facebook pie. Now, the stakes keep getting higher as the case drags on. An actual finding of securities fraud would make it difficult for Zuckerberg to remain Facebook’s CEO if it were to go public. However unlikely that is, why take the risk?

With Facebook maybe pointing the way to a different approach to search and retrieval, a boat anchor has been hooked to the Facebook scull.

Stephen E Arnold, May 22, 2010

Freebie.

Google Satisfying Every Search

May 21, 2010

Sam Hartman, one of the goslings with an advanced degree in math, sent me this picture. The scene is Louisville, Kentucky. Location is a church named “St Stephen Martyr”. No relation to me, Stephen E Arnold, obviously. The message to those driving by the church on Audubon Parkway is, “Google cannot satisfy every search.” Interesting assertion. I wonder if the Googlers agree?

A happy quack to Mr. Hartman for his quick thinking and his instant upload:

[Image: the church sign]

This is the actual sign. I am posting it to illustrate the diffusion of Google in today’s social fabric. I fully support any religious institution that is named “Stephen” and evidences awareness of Google’s strengths and limitations.

Stephen E Arnold, May 21, 2010

Freebie, although I sometimes have to pay the goslings. Bread crumbs for everyone in the goose pond.

Open as a Concept Questioned

May 21, 2010

I read “Investor Dave McClure: Open Is for Losers” and was a bit surprised. I was thinking about RedHat, which seems to be doing okay. I also know that there are some open source search companies generating revenue and growing. These include Lucid Imagination and the lesser known Tesuji, run by a friend of mine. I also “debated” Charlie Hull at Lemur Consulting in December 2009, and he knocked down one of my arguments with his statement that Lemur Consulting doubled its revenue in a handful of months. I did what any addled goose would do. I ignored him and changed the argument to the lousy interfaces some search vendors push on their customers.

In the Mobile Beat write up, the argument was a reverse back flip. In short, the assertion that “open source is a positive” has been reversed so that “open source is bad.” Extending the argument, the closed methods of Apple and many other proprietary software vendors are the path that leads to the treasure chest filled with gold, diamonds, and iPads.

One presumes these are underachievers in this drunk tank. Source: http://mylalife.files.wordpress.com/2008/12/today-in-photos-dui-checks-communism-and-other-moments-in-l-a-history287875856.jpg

Here’s the key point: “Open is for losers”.

The argument continued in comments to the original write up (see no news here, gentle reader):

While I was being intentionally bombastic, I do believe some open standards foundations generate network effects & benefits for all… However, all things equal I believe the default case is that proprietary IP is useful to preserve some value for the IP creator, and thereby provide sustainability the company. Open standards are great once you get there, but the initial climb to get there can be long & expensive (for the climber), and when you get to the summit the benefits are usually for everyone. While that’s good for most everyone, the costs of climbing are usually borne by the climber alone…

What’s this tell the goose?

  1. Great link bait. I wish the goslings were this creative.
  2. Glittering generalities contain both truth and falsehood. Many folks under the age of 40 use this rhetorical method. Since few of these folks were the top performers on their college’s debate team, the notion of demographic and financial vectors deflecting traditional business models won’t make much sense. See, in this goose pond, there’s a reason open source exists and a reason for RedHat’s revenues.
  3. The quote is an interesting one. More of this ad hominem stuff seems to be coming each day. Dr. Johnson, come back!

My position is that “open” is just one of many possible business models. As Enterprise 1.0 companies bite the dust, maybe some of the Enterprise 2.0 outfits are trying a different angle. Neither good nor bad. Just adaptive like azure chip consultants desperate to keep their jobs.

Stephen E Arnold, May 21, 2010

Freebie.

Free and Quality: Google and Its Open Video Codec

May 21, 2010

I am not too interested in video. Too old. Eyes bad. Brain does not process short ADD-shaded outputs. Nevertheless, I know a couple of my two or three readers are into digital video. A couple of the goslings watch Netflix stuff on the iPads. Seems like a waste of time to me, but to each goose his own paddling area. If you are a video lover, you probably think about the visual experience. Fuzzy video is annoying to many people, but I think fuzzy video improves many of the programs I have seen.

The author of “Diary of an x264 Developer” is a person deeply interested in video in general and its deeper gears and wheels. In fact, the write up provides considerable detail about the differences between Google’s free video codec and the not free H.264. The issues boil down to “quality”, which is a difficult concept in information. There are, after all, azure chip consultants, who know about “real” and “quality”, two notions I avoid like the Ohio River. The write up said in “Addendum C: Summary for the Lazy”, which is definitely this goose:

  • VP8, as a spec, should be a bit better than H.264 Baseline Profile and VC-1. It’s not even close to competitive with H.264 Main or High Profile. If Google is willing to revise the spec, this can probably be improved.
  • VP8, as an encoder, is somewhere between Xvid and Microsoft’s VC-1 in terms of visual quality. This can definitely be improved a lot, but not via conventional means.
  • VP8, as a decoder, decodes even slower than ffmpeg’s H.264. This probably can’t be improved that much.
  • With regard to patents, VP8 copies way too much from H.264 for anyone sane to be comfortable with it, no matter whose word is behind the claim of being patent-free.
  • VP8 is definitely better compression-wise than Theora and Dirac, so if its claim to being patent-free does stand up, it’s an upgrade with regard to patent-free video formats.
  • VP8 is not ready for prime-time; the spec is a pile of copy-pasted C code and the encoder’s interface is lacking in features and buggy. They aren’t even ready to finalize the bitstream format, let alone switch the world over to VP8.
  • With the lack of a real spec, the VP8 software basically is the spec–and with the spec being “final”, any bugs are now set in stone. Such bugs have already been found and Google has rejected fixes.
  • Google made the right decision to pick Matroska and Vorbis for its HTML5 video proposal.

Fascinating but not germane to the goose. However, for those who want a piece of the big Internet video file, the Diary’s author suggests that Google’s marketing is a little ahead of the code analyzed by the author of the write up.

With Amazon, Apple, Netflix, and probably your mom getting into digital video, Google may have to take even more bold steps to create a viable revenue stream from its investments in its digital video push. Casting a shadow over the YouTube.com footprint is Google’s nemesis, Viacom. Fascinating business situation with legalities, technology, and other issues mashed up.

Stephen E Arnold, May 21, 2010

Freebie.

ArnoldIT Podcasts

May 20, 2010

The ArnoldIT and Beyond Search goslings have dipped their beaks in podcasts. These are short audio programs on topics related to information management. We are trying to be somewhat broader than search because – quite frankly – search is being absorbed into other types of software. You can find the podcasts on the ArnoldIT.com Web site at http://arnoldit.com/podcasts/. We have posted two podcasts. After we did some research, we have settled on the 15 minute program. Long enough to get the point across. Short enough to fit into a session on an exercise machine.

The first is with Erik S. Arnold, president of Adhere Solutions, and the son of Stephen E Arnold. I always find it interesting when people wonder if I am objective. Of course not. Erik is a talented individual but I am his dad, and I want to showcase his company and his expertise. Think he pays me for this? Think again!

The other podcast is with Sam Mefford, who runs the search practice at Avalon Consulting. In theory, I am going to send Avalon Consulting a huge bill, but the founder is from my home town, and I like what these folks are doing with their middleware and search implementation methods.

The podcasts are fun to do, and the goslings have more planned with Access Innovations, Exalead, and several other companies. If you want to know more about what we do to get these podcasts in circulation or want to talk about a podcast or videocast, write me at seaky2000@yahoo.com.

These are not “webinars”. We have a different angle of attack that produces traffic and gives those who participate a marketing “hook”.

Stephen E Arnold, May 20, 2010

Right, I am going to pay myself, bill my son, and charge a guy who is from my hometown for this post. Regulators and journalists, I am not of your ilk. Free write up.

Google, StreetView, and Allegations in the US

May 20, 2010

A happy quack to the reader who sent me a link to TechEye.net’s “Google Sued over Snaffled Street View Data.” I am not an attorney, not a journalist, not qualified to do much more than point to this write up. According to the article,

Google has received a writ from Vicki Van Valin and Neil Mertz as part of a class action that their privacy was violated by Street View vehicles picking up data from open wireless internet connections used at home. They also want a court to prevent Google from destroying the data that’s been collected.

The article includes quite a few references to legal things. I did recognize the phrase “class action.”

Assume that the article is accurate and that the legal references in it are germane to the allegations. Here are the questions I want to capture before they slip from my goose brain:

  1. Are the Department of Justice or the Federal Trade Commission likely to take an interest in this matter?
  2. What happens if the legal eagles move the matter into court and some of the alleged “information” is deleted or otherwise unavailable?
  3. How will the “we’re sorry” and “we goofed” method work in the face of international and US actions related to the alleged Google StreetView data collection scope?

I don’t know, but I remember one person said in a lunch conversation, “Never ask for permission. Do it. It is easier to ask for forgiveness.”

Will this work as a method of deflecting the allegations?

Stephen E Arnold, May 20, 2010

Freebie.

Cognition and Bing

May 20, 2010

“Cognition Technologies to power Microsoft’s Bing now!” discloses Cognition Technologies’ semantic technology as applied to Microsoft’s “decision engine” Bing. How will this improve Bing? At the core, the technology will help Bing deal with an “understanding” of the English language, says the official press release.

The “semantic map,” as it is dubbed, contains a gigantic collection of semantic contexts (over ten million), including representations, taxonomy, and word meaning distinctions. Cognition writes in their press release that over “540,000 word senses; 75,000 concept classes; 8,000 nodes; and 510,000 word stems” and other high-level features of semantic processing exist to help Bing process queries properly. The resources were codified and reviewed by lexicographers and linguists over a period of 25 years.

Will the semantic map make Bing understand our garbled search pecks instantaneously and deliver accurate results? Maybe, but with Google’s “humongous amount of data it indexes” and loyal site traffic, it may be a long battle. According to the blog article’s author, what Bing does have going for it is a clean interface, excellent “information aggregation,” and solid concept/summary extraction. The semantic technology should only add to that and make Bing stronger.

Samuel Hartman, May 20, 2010

Freebie

How Does Social Network Content Affect the Quality of Search?

May 20, 2010

Punch Communications proposes that the addition of real time social media factors into search results could serve to galvanize the quality of an organic search.

The evolution of real time social networks affects the speed at which we receive news and current events, and it produces more online discussion around those topics. This is one of the most popular attributes of social media. However, according to Benzinga’s article “Social Search May Galvanise Organic Results,” Punch Communications, a PR, search and social media agency, suggests that the appearance of social network content in search engines has the potential to make their results much more organic than ever before. The agency suggests that the influence of algorithms and Google’s page rank system on non-paid search responses could be lessened, and that this shift will only continue to grow in tandem with the popularity of social media. This is a good thing, right?

Melody K. Smith, May 20, 2010

Note: Post was not sponsored.
