Exalead Acquired by Dassault

June 11, 2010

I have done some work for Exalead over the last five years, and I have gone down in history as one of the few people from Kentucky to talk my way into the Exalead offices in Paris without an appointment. L’horreur. I had a bucket of KY Fry in my hand and was guzzling a Coca Lite.

Out of that exciting moment in American courtesy, I met François Bourdoncle, a former AltaVista.com wizard. He watched in horror as I gobbled a crispy leg and asked him about the origins of Exalead, his work with then-Googler Louis Monier, and his vision for 64 bit computing. I wrote up some of the information in the first edition of the Enterprise Search Report, a publication now shaped into a quasi-New Age Cliff’s Notes for the under 30 crowd. I followed up with M. Bourdoncle in February 2008, and published that interview as part of the ArnoldIT.com Search Wizards Speak series. The last time I was in Paris, I dropped by the Exalead offices and had a nice chat. I even made a video. Several Exaleaders took me to dinner, pointing out that McDo was not an option. Rats.

So what’s with the sale of Exalead to Dassault Systèmes?

The azure chip crowd has weighed in, and I will ignore those observations. There is some spectacular baloney being converted into expensive consulting burgers, and I will leave you and them to your intellectual picnic.

Here’s my take:

Differentiator

There are lots of outfits asserting that their search and content processing system will work wonders. I don’t want to list these companies, but you can find them by navigating either to Google.com or Exalead.com/search and running a query for enterprise search. The problem is that most of these outfits come with what I call an “interesting history.” Examples range from natural language processing companies created from the ashes of not-so-successful search vendors to Frankenstein companies created with “no cash mergers.” I know. Wild, right? Other companies have ongoing investigations snapping like cocker spaniels at their heels. A few are giant roll-ups, in effect 21st century Ling-Temco-Vought clones. A few are delivering solid value for specific applications. I can cite examples in XML search, eDiscovery, and enhancements for the Google constructs. (Okay, I will mention my son’s company, Adhere Solutions, a leader in this Google space.)

The point for me is that Exalead combined a number of working functions into a platform. The platform delivers search enabled applications; that is, the licensee has an information problem and doesn’t know how to cope with costs, data flows, and the need for continuous index updating. The Exalead technology makes it easy to suck in information and give different users access to the information they need to do their job. For some Exalead customers, the solution allows people to track packages and shipments. For other licensees, the Exalead technology sucks in information and generates reports in the form of restaurant reviews or competitive profiles. The terminology is less important than solving the problem.
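A search enabled application of this sort boils down to two moving parts: continuous ingestion and per-user access to the processed content. Here is a toy sketch of the idea (my own illustration with made-up document names and departments, not Exalead’s actual API):

```python
# Toy in-memory index: term -> set of doc ids. A real engine persists and
# compresses this, but the continuous-update idea looks the same: documents
# flow in, the index is refreshed in place, and each user sees only the
# slice relevant to the job at hand.
index: dict[str, set[int]] = {}
docs: dict[int, dict] = {}

def ingest(doc_id: int, text: str, department: str) -> None:
    """Add or refresh one document without rebuilding the whole index."""
    docs[doc_id] = {"text": text, "department": department}
    for term in text.lower().split():
        index.setdefault(term, set()).add(doc_id)

def search(term: str, department: str) -> list[int]:
    """Return only the documents a given user group is allowed to see."""
    hits = index.get(term.lower(), set())
    return [d for d in sorted(hits) if docs[d]["department"] == department]

ingest(1, "Package 42 shipped via rail", "logistics")
ingest(2, "Competitor opened a new bistro", "marketing")
print(search("package", "logistics"))  # [1]
print(search("package", "marketing"))  # []
```

The package-tracking and competitive-profile examples above are just different ingest pipelines and different views over the same kind of index.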

That’s a key differentiator.

Technology

Google and Exalead were two outfits able to learn from the mistakes at AltaVista.com. Early on I learned that the founder of Exalead could have become a Googler. The reason Exalead exists is that M. Bourdoncle wanted to build a French company in France without the wackiness that goes along with tackling this mission in the US of A. Americans don’t fully understand the French, and I can’t do much more than remind you, gentle reader, that French waiters behave a certain way because of the “approach” many Americans make to the task of getting a jambon sandwich and a bottle of water.

I understood that M. Bourdoncle wanted to do the job his way, and he focused on coding for a 64 bit world when there were few 64 bit processors in the paws of enterprise information technology departments. He tackled a number of tough technical problems in order to make possible high performance, low cost scaling, and mostly painless tailoring of the system to information problems, not just search. Sure, search is part of the DNA, but Exalead has connectors, text to voice, image recognition, etc. And, happily, Exalead’s approach plays well with other enterprise systems. Exalead can add value with fewer engineering hassles than some of the firm’s competitors. Implementation can be done in days or weeks, sometimes months, not the years some vendors require.

So the plumbing is good.

That’s a high value asset.

Read more

Quote to Note: Data Pig

June 10, 2010

I don’t use an iPhone. Yes, I pay AT&T for one of my broadband landlines. Yes, I have an AT&T landline. I am not sure I sympathize with people who make a conscious choice to purchase services which can impose punitive variable pricing. Maybe most people don’t remember the pre-Judge Greene days when a person rented a Western Electric telephone device and never owned it? I was at the Piscataway IBM facility when the order was enforced, with one part of the building becoming Bellcore and the other part remaining Bell Labs. The object of the company was to make money, pay for the fancy stuff like PICS, and build phones you could toss from the second floor of the Western Electric building, confident that the clunky thing would work after the 26 foot fall to the concrete below.

Money.

When a telephone carrier with the “old” AT&T DNA offers a deal, I chuckle. I used to put on my Young Pioneers hat, but Tess ate it. Sigh. Memories of a monopoly don’t fade quickly.

Point your browser at “AT&T Learns Exactly The Wrong Thing About Data Usage.” Agree or disagree with the write up. What I noted was:

AT&T says that 65% of its users use less than 200 megabytes per month; a whopping 98% use less than 2 gigabytes. (NYT) AT&T looked at these numbers and concluded it was time for tiered pricing; time to soak these “data pigs”.

Now that’s a quote to note: “data pigs.” You can take the old AT&T out of the phone business, but you can’t alter that DNA easily. Ah, “data pigs”.

Stephen E Arnold, June 10, 2010

Freebie, unlike a long distance call in 1950 when a ringy dingy to Brazil was a major event. Remember differential pricing by class of customer? Ah, remember.

Is Outsourced Search a Money Saver?

June 10, 2010

I found “The Outsourcing Low Cost Lie” thought provoking. I conducted an interview with an integrator for a podcast and learned quite a bit about search outsourcing. I have never given outsourced search much thought. We have been using Blossom.com’s service for years. I do not think of my use as outsourcing, probably because I know Dr. Alan Feuer and his operation. After reading the article, I realized that the goslings and I had been early adopters of outsourced search, a method used in a number of engagements, including the index for FirstGov.gov, now USA.gov.

The angle the article placed in front of me was that outsourcing was a loser. Here’s a passage I noted:

Nearly 50% of all outsourced projects fail outright or fail to meet expectations in the first place. Essentially, you’re taking the same gamble as red vs. black in Roulette about your project’s success right off the bat, and only then if you pass that hurdle, you’ll get on average, 25% savings over having it done locally.

I don’t want to throw water on this parade, but the original FirstGov.gov deal involved Inktomi and worked pretty well based on my information. We then used the Blossom.com system for the Threat Open Source Intelligence Gateway, and my recollection is that the law enforcement and intelligence professionals with access to TOSIG were pretty happy. We also use “outsourced search” for clients today and for the search system on Beyond Search.

In each of these instances, the horrors described in the write up did not manifest themselves. If you are running a project and crater, maybe the reason is you, not the notion of outsourcing?

Stephen E Arnold, June 10, 2010

Freebie

Google Addresses Index Staleness

June 10, 2010

Next week, I am giving two lectures about what is now one of the touchstones of 2010: real time. I will put up some extracts from these lectures in the next week or so. What I want to do this morning is call your attention to a post from Google called “Our New Search Index: Caffeine.” I think the nod to the fizzy drinks that give club goers and sluggish 20 somethings a boost is interesting.

Most users of a search and retrieval system have zero clue about when the index was updated or assembled. The 20 something wizards assume that if an index is available from an electronic device, that index is up to the minute or even more current.

Most online system users have zero clue about when data were created, when those data were processed, when those index pointers were updated, or what other factors may have slammed on the search system’s air brakes. Ever hear this in an organization: “I know my version of the PowerPoint should be in the system but I can’t find it.” I do. Frequently.
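The “I know it should be in the system” complaint usually comes down to exactly this gap between when a document changed and when the index last saw it. A toy sketch of the check (hypothetical file names and timestamps of my own invention):

```python
# Each index entry records when the crawler last saw the document; compare
# that with the document's last-modified time to flag stale results. The
# "missing" PowerPoint is often just newer than the index.
indexed_at = {"q3_deck.pptx": 1_000_000.0}   # when it was last indexed
modified_at = {"q3_deck.pptx": 1_000_500.0}  # saved again after that crawl

def is_stale(name: str) -> bool:
    """True if the document changed after it was last indexed."""
    return modified_at[name] > indexed_at[name]

print(is_stale("q3_deck.pptx"))  # True
```

A system that surfaced this one bit of metadata to the user would defuse a lot of “search is broken” complaints.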

The Google write up makes clear, in a Googley sort of way, that Google wants to try to cope with streams of information from Twitter and Facebook. Traffic from social sites has either reached parity with search traffic or surpassed it. I have some information in Overflight, and I will post one or two items that document this usage shift. Users seem to prefer what looks to most people like “real time”. A traditional indexing system does not do real time with Mikhail Nikolaevich Baryshnikov’s agility.

Here’s what the Googlers said:

Caffeine lets us index web pages on an enormous scale. In fact, every second Caffeine processes hundreds of thousands of pages in parallel. If this were a pile of paper it would grow three miles taller every second. Caffeine takes up nearly 100 million gigabytes of storage in one database and adds new information at a rate of hundreds of thousands of gigabytes per day. You would need 625,000 of the largest iPods to store that much information; if these were stacked end-to-end they would go for more than 40 miles. We’ve built Caffeine with the future in mind. Not only is it fresher, it’s a robust foundation that makes it possible for us to build an even faster and comprehensive search engine that scales with the growth of information online, and delivers even more relevant search results to you. So stay tuned, and look for more improvements in the months to come.

The idea is that Google, which has numerous ways of skinning the content processing cat, has grabbed the digital Red Bull.

image

© Google 2010.

I have no doubt that the freshness of certain types of content is going to benefit. However, I am not sure that Google will be able to handle its vast content processing needs with the ballet grace the nuclear logo in the blog post suggests. Furthermore, I don’t think that most users understand that whatever Google does to process content more quickly and update its indexes addresses only some of the thorny underlying issues. I address these in my lectures, but the user is unlikely to know about latency elsewhere in the content ecosystem.

The notion of “real time” is slippery. The notion of an index’s “freshness” is slippery. The problem is a complex one. Why do you think financial institutions pay really big bucks for products from Exegy and Thomson Reuters to deal with freshness? The reason? Speed that can be documented from the moment of information creation through acquisition, processing, and availability. For freshness, be prepared to spend big money.
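That creation-to-availability chain is measurable, which is the point the expensive low-latency vendors sell on. A toy sketch with made-up numbers (the stage names and timings are my illustration, not any vendor’s figures):

```python
# Hypothetical pipeline timestamps, in seconds after the item was created.
stages = [
    ("created",   0.0),
    ("acquired",  2.5),   # feed or crawler picks the item up
    ("processed", 4.0),   # parsed, enriched, tokenized
    ("available", 9.5),   # index segment published to query servers
]

# Per-hop latency; the end-to-end figure, not raw query speed, is what
# determines how "real time" the system actually is.
times = [t for _, t in stages]
hops = [b - a for a, b in zip(times, times[1:])]
print(hops)       # [2.5, 1.5, 5.5]
print(times[-1])  # 9.5 seconds from creation to searchable
```

Shaving the slowest hop, here the publish step, is where the big money goes.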

For a temporary pick me up, guzzle the caffeine-laced beverages from the 7-11. I might just recommend that you turn to http://search.twitter.com and look for a tip on where to buy a Jolt at a discount. Just my opinion.

Stephen E Arnold, June 10, 2010

A freebie. No coupons for a complimentary can of Jolt.

Another Upstart Nation State Bans Google

June 10, 2010

I may have to fire up my old copy of XyWrite III+, create a template, and assign standing text to an Alt key. I read “Turkey Bans Use of Google, Services.” If I weren’t so busy with my World Cup paperwork, I would create a chart with such categories as “banned”, “sued”, “threatened”, and probably a couple of other categories.

The most recent nation state to get frisky with Google is Turkey. Long viewed by the US as a cheerleader, Turkey seems to be willing to make pals with certain countries which are annoyed with the United States.

Here’s the passage I noted:

In an official statement, Turkey’s Telecommunications Presidency said it has banned access to many of Google IP addresses without assigning clear reasons. The statement did not confirm if the ban is temporary or permanent….The banned IP addresses include translate.google.com, books.google.com, Google-analytics.com, tools.google.com and docs.google.com.

I thought companies had an obligation to shareholders to maximize returns. Getting in hot water in countries where there are potentially lucrative markets strikes me as losing an opportunity to make money. After the World Cup, I will work through the countries in which Google faces push back. Fascinating that a single company can become the focal point for frequent hassles with nation states.

Maybe this is a trend, not an outlier? A good question in my opinion: “Who is at fault? The country, a politician, a government, a company?” I can hear my seventh grade teacher now: Discuss in less than 250 words. What’s next? Educational institutions?

Stephen E Arnold, June 10, 2010

Freebie

The UX Crowd Does Harvey the Rabbit

June 9, 2010

I like the blinking dot interface. The 20 somethings poke with fingers. Sigh. The user experience chatter goes unheard by me. I find the cartoons, the mini motion pictures, and cluttered “assisted navigational aids” annoying. The future of interfaces is certainly less cluttered. Point your browser at “‘Imaginary’ Interface Could Replace Real Thing.” And, for a bonus, you get the “real thing”, a phrase much loved by some azure chip poobahs. The point of the write up is that the interface is – well – imaginary. For me, the key passage was:

Researchers are experimenting with a new interface system for mobile devices that could replace the screen and even the keyboard with gestures supported by our visual memory.

I have a gesture in mind.

Stephen E Arnold, June 9, 2010

Freebie.

Yahoo Is Committed to Search—Cutbacks, That Is

June 9, 2010

Short honk: Lots of pre-Microsoft deal chatter about Yahoo search. I thought that the Yahooligan search wizards lacked experience getting bought. Lots of excitement when a big outfit cuts a big deal. Point your browser at “Another Round Of Layoffs At Yahoo, Search Team Gets Hit.” Simple story. Search experts are expendable. What’s that mean for search at Yahoo? Your guess is better than mine.

Stephen E Arnold, June 9, 2010

Freebie

Google Builds Buzz for Wave, Er, Google Builds Wave for Buzz?

June 9, 2010

Addled geese are easily confused. Google, it seems, is building buzz for Wave. When I first read “Google Launches Tools to Get People Using Wave More”, I knew it was buzz. You may have a way to differentiate between Wave and buzz or Buzz and a wave. For me, the most interesting comment in the write up was:

“Now, a wave wouldn’t be a wave if all you could do was copy over some plain old text,” says Osinga [a Googler]. “Websites that want to incorporate some interactivity into the resulting waves can specify a helper gadget.”

Okay, I think I get it: “a wave wouldn’t be a wave if all you could do was copy over some plain old text”—like this blog post.

Stephen E Arnold, June 9, 2010

Freebie

Exclusive Interview with Seth Grimes, Alta Plana Now Available

June 9, 2010

In April 2010, I spoke with Seth Grimes, the founder of Alta Plana. Mr. Grimes is an analytics strategy consultant. He is founding chair of the Sentiment Analysis Symposium and of the Text Analytics Summit, contributing editor at Intelligent Enterprise magazine, and text analytics channel expert at the Business Intelligence Network. He founded Washington DC-based Alta Plana Corporation in 1997. Mr. Grimes consults, writes, and speaks on information-systems strategy, data management and analysis systems, industry trends, and emerging analytical technologies.

In the interview, he highlighted one of the challenges search and content processing systems face. He said:

I’ve in the past characterized search as evidence of a failure of design. If information were correctly and adequately categorized and organized and made accessible, we wouldn’t need search, would we?  I’ve retreated from that view as I’ve seen search evolve into information access, into technology that not only finds but also organizes results from sources the user likely-as-not didn’t know about.  Yet I’d call my statement still largely true when it comes to the enterprise’s own data holdings: Search is necessitated by a failure of design.  Do a better job organizing information as it’s created or acquired, and also, by the way, stop allowing application vendors to bring in siloed search applications, and the in-organization situation will improve.

To read the full interview, navigate to Search Wizards Speak and click on the Seth Grimes interview or click this link.

Stephen E Arnold, June 9, 2010

Unsponsored post.

Evidence of an Open Source Boomlet?

June 8, 2010

I read “What Is Data Science?” with interest. This is a long O’Reilly Radar essay by Mike Loukides. The write up has a message that is going to be of interest to those looking for the next big thing and to giant companies with data and not much leverage from that asset. The key point in the write up is that there is money to be made by converting data into products. Note that this is not the tired old data-information-knowledge mantra. That quasi-intellectual approach to making money is not sufficiently pragmatic for these economic times. The key is to take data and make a product. When I read the essay, I thought about various online vendors who are doing this now. Candidates for poster children include Google, Facebook, and Yahoo, along with lots of other folks. Statistics Canada once signed a deal with a vendor to crunch the StatsCan stuff into more saleable products.

But for me the most interesting item in the write up was a chart that showed the number of job listings for a couple of open source products; specifically, Hadoop and Cassandra.

image

You can see the lines trending upwards.

My take: there is some tangible data that indicates open source software in the data management sector is gaining traction. I am not sure what this means for other open source software. But I found this factoid interesting.

Stephen E Arnold, June 8, 2010

Freebie
