Intel’s Interest in Medical Terminology Translation

October 4, 2008

Intel continues to be a slippery fish when it comes to search and content processing. The ill-fated Convera deal burned through millions in the early 2000s. Earlier this year, Intel pumped cash into Endeca, one of the two high profile enterprise search systems, known for their ecommerce and information access systems. (The other vendor is Autonomy. Fast Search & Transfer seems to be shifting from a vendor to an R&D role, but its trajectory remains unclear to me.)

Intel has one engineer thinking about language. The posting on an Intel Software Network Web log “Designing for Gray Scale: Under the Hood of Medical Terminology Translation” is suggestive. The author is Joshua Painter, who identifies himself with Intel. You can read this post here. Translation of scientific, technical, and medical terminology is somewhat easier than translating general business writing. Even so, the task is difficult, particularly when a large pharmaceutical company wants to monitor references to a drug’s formal and casual names in English and non-English document sets.
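To make the drug name monitoring problem concrete, here is a minimal sketch of the idea, not anything Intel or Mr. Painter has described. It assumes a hand-built alias table that maps one canonical drug name to its formal, casual, and non-English names. The table, the function names, and the sample sentence are all hypothetical illustrations.

```python
import re

# Hypothetical alias table: one canonical name mapped to formal, brand,
# and non-English names. A real system would load this from a maintained
# medical vocabulary rather than hard-code it.
DRUG_ALIASES = {
    "acetaminophen": ["acetaminophen", "paracetamol", "paracétamol", "Tylenol"],
}


def build_pattern(aliases):
    """Compile one case-insensitive pattern that matches any alias as a whole word."""
    # Longest aliases first so a longer name wins over a shorter one sharing a prefix.
    ordered = sorted(aliases, key=len, reverse=True)
    alternation = "|".join(re.escape(alias) for alias in ordered)
    return re.compile(r"\b(" + alternation + r")\b", re.IGNORECASE)


def find_mentions(text, alias_table=DRUG_ALIASES):
    """Return (canonical name, matched text, offset) tuples in document order."""
    hits = []
    for canonical, aliases in alias_table.items():
        for match in build_pattern(aliases).finditer(text):
            hits.append((canonical, match.group(0), match.start()))
    return sorted(hits, key=lambda hit: hit[2])


if __name__ == "__main__":
    sample = "Le paracétamol (Tylenol in the US) is acetaminophen."
    for canonical, found, offset in find_mentions(sample):
        print(f"{offset:>3}  {found}  ->  {canonical}")
```

Exact matching on a fixed list is the easy part. Misspellings, slang, and inflected forms in other languages are where the gray scale comes in, and a lookup table like this one does not handle them.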

Mr. Painter’s write up concerns standards; specifically, “data standards in enabling interoperability in healthcare.” For me the interesting passage in this write up was:

An architecture for Health Information Exchange must accommodate choice and dealing with change – it must be designed for grayscale. This includes choice of medical vocabularies, messaging standards, and other terminology interchange considerations. In my last post I introduced the notion of a Common Terminology Services to deliver a set of capabilities in this space. In this post, I will discuss a technical architecture for enabling this.

The word grayscale, I think, means fuzziness. Intel makes these tantalizing pieces of information available, and I continue to watch for them. My hunch is that Intel wants to put some content centric operations in silicon. Imagine. Endeca on a multi core chip. So far this is speculation, but it is clear that juiced hardware can deliver some impressive content processing performance boosts. Exegy’s appliance demonstrates the value of this hardware angle.

Stephen Arnold, October 4, 2008

Birst Says We Are First with On Demand, Automated Business Intelligence

October 3, 2008

I am interested in companies with true innovations. The headline in this October 1, 2008, article here caught my attention. Birst (clever name, that) asserts that the service provides a “complete, end-to-end BI solution that solves the two greatest challenges of BI: cost and complexity.” I agree that most of the business intelligence systems with which I have experimented are complicated. The reasons extend beyond the SAS, SPSS, Cognos or Business Objects software and interfaces. The problems I recall are:

  1. Obtaining valid data
  2. Forming statistically valid subsets or cubes
  3. Knowing what specific statistical methods are appropriate for the data
  4. Making intelligent decisions about the statistical operations themselves.

The easy part is looking at the graphs, charts, and tables. An error in the data means that the ease of use and eye candy are going to give the user information that may be misleading at best and completely incorrect at worst. Business intelligence, then, is more than indexing some unstructured content, performing entity extraction, and merging structured and unstructured data. These are quite challenging tasks. The real issue is the selection and management of the mathematical methods. With the black swan problem a grim reminder that the clever creators of the Black-Scholes partial differential equation were not really as informed as the lads convinced themselves they were, one would think that “automated” business intelligence vendors would be more conservative in their claims.

Success Metrics, the developer of Birst, uses this diagram to explain how the system ingests data and makes analyses and reports available as a cloud service:

[Diagram: Birst overview]

The process involves three steps. You upload your data to Birst. Via the browser interface, you analyze and generate reports. The reports can be shared with colleagues via a “shared space” on the Birst servers. Prices for the service range from no charge for an account processing 10 megabytes per month to $200 a month for 100 megabytes of data per month. Custom quotes are available for data transfers greater than 100 megabytes. If you bump into a limit, Birst offers a reasonable price for upgrading an account; for example, to move from 100 megabytes to 500 megabytes per month, the additional fee is $399 per month.
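A quick back-of-the-envelope check of those prices, as a rough sketch. It assumes the $399 upgrade fee is added on top of the $200 base, so the 500 megabyte level costs $599 a month. That reading is my assumption, not a published Birst price list, and the names below are made up for illustration.

```python
# Tiers as quoted in the post: (megabytes per month, dollars per month).
# The 500 MB figure assumes the $399 upgrade fee is added to the $200 base.
TIERS = [
    (10, 0),
    (100, 200),
    (500, 200 + 399),
]


def cost_per_megabyte(megabytes, dollars):
    """Monthly dollars divided by the monthly megabyte allowance."""
    return dollars / megabytes


if __name__ == "__main__":
    for megabytes, dollars in TIERS:
        rate = cost_per_megabyte(megabytes, dollars)
        print(f"{megabytes:>4} MB/month: ${dollars}/month, ${rate:.2f} per MB")
```

Under that assumption the rate falls from $2.00 per megabyte at the 100 megabyte level to about $1.20 per megabyte at 500 megabytes.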

According to the report on Cardholders:

As a general BI solution, Birst handles any type of data, including finance, operations, marketing, customer service, and sales information. It also accepts columnar data held in csv, Access databases, or Excel files. Fully automated, it’s easy and quick to get started with Birst. Within a few minutes of signing up and uploading data, users can be creating their own reports and analysis. Birst even creates some dashboard reports for users automatically; these automatic dashboards are known as Quick Dashboards.

A Birst report showing sales data appears below:

[Image: Birst sales report]

Other screen shots are available at www.birst.com/media.

My view is that Birst and similar services will become increasingly popular in the coming months. The ability to move data from one cloud service such as Salesforce.com to Birst.com is particularly interesting to me. On premises business intelligence systems are not likely to be displaced quickly, but for some applications, the Birst-type service makes sense and slashes costs because no dedicated SAS, SPSS, or Cognos expert has to be available to support a field office, a department, or other entity with a need to analyze data via the Web.

Stephen Arnold, October 4, 2008

Mindbreeze Profile Now Available

October 3, 2008

A profile of the Austrian enterprise search system Mindbreeze is now available here. A unit of Fabasoft, Mindbreeze offers enterprise and Web search systems. The company was recently recognized as “hot” by KMWorld Magazine. The system supports open standards, features an SDK, and ships with connectors to permit the system to index email, Web sites, and standard office document types. The company’s Web site is www.mindbreeze.com.

Stephen Arnold, October 3, 2008

Mindbreeze Enterprise Search

October 3, 2008

Mindbreeze, headquartered in Linz, Austria, has caught the fancy of several of my European readers. The name was familiar to me, but I knew nothing about the company. KMWorld, the outfit that pays me to write a monthly column about Google, identified Mindbreeze Enterprise Search as a “trend setting product of 2008.” I thought I was able to keep up to date on trend setting search systems, but Mindbreeze was a new player to me. You can read the news release about this recognition here. The news release–perhaps in the adrenaline rush of receiving the KMWorld award–said, “US magazine KMWorld acclaims the ‘hottest’ products of the year.” Mindbreeze’s parent company–Fabasoft–seems to be working to reverse a decline in revenues.

With US search engines singing happy tunes to me, I have heard that several of the vendors are really struggling to “make their numbers.” Mindbreeze, it seems, is chugging along quite happily. Earlier this year, Intellisearch reallocated its resources. When I pinged the Intellisearch offices in San Francisco, I was redirected to the company’s offices in Europe. The flagships in search and content processing, Autonomy (more of a diversified services vendor) and Exalead (a real challenger to the GOOG in engineering), dominate the European scene. I know there are many specialized vendors–for instance, Polderland in the Netherlands–and the revivified Oslo operation for Microsoft Fast Search & Transfer. I heard on my last trip to Europe of a number of new search and content processing vendors, and I will try to cover these as I get more information.

MES (Mindbreeze Enterprise Search)

Mindbreeze has an office in Beverly, Massachusetts. The US contact is David Cloyd, according to the document I reviewed. The managing director of the main company is Daniel Fallmann, who works from the Linz office. The marketing angle is what the company’s brochure calls CEVA or Content Enable Vertical Application. Here’s a diagram of how this works. The various components refer to software available from other Fabasoft companies.

[Diagram: CEVA]

The components shown are a work flow component (Folio), a compliance archive (iArchive), a case management system (DUCX), and an operations manager. In this context, Mindbreeze seems to be heading in the same direction as MarkLogic, but I need to do more digging.

And what about Mindbreeze? (For more details you can download the Mindbreeze product brochure here.)

The company’s Web site provides no information about “latest news” as of October 3, 2008, at 8:30 am Eastern time. I expect that the news about the KMWorld award will be posted at some time in the future. The news archive reported on March 30, 2006 (the most recent entry) that a service pack was available for Mindbreeze Enterprise Search 1.6. My first thought was, “No news in two years. Hmmmm.”

I did locate FAQs for Versions 1.6, 2.x, and 3.x. The most interesting items I noted were:


Endeca Pursues Publishers

October 3, 2008

MarkLogic has been making headway in the world of publishing. I know that I have predicted the demise of traditional newspaper, magazine, and book publishers, but there is life in a number of publishing sectors. Publishers–spurred by amateur journalists like this addled goose and fast changing Web companies like Google–have been increasingly open to new technology. Nstein, a former content processing vendor, has worked hard to reposition some of its technology specifically for the publishing industry. Now Endeca is hopping on the bandwagon. One of the early entrants from the search and content processing sector was Fast Search & Transfer. The company acquired a company in Utah and created a remarkable PowerPoint presentation showing Fast ESP (enterprise search platform) as the foundation of a next-generation newspaper. I’m not sure what happened to that initiative since Microsoft gobbled up Fast Search and turned Oslo’s engineers into the heart of Redmond’s search innovation effort.

Endeca, therefore, years ago made a well considered move to tailor its technology to the needs of publishers. I heard that the company has more than 150 publishing clients. You can read about the services in the company’s news release here or a boiled down version from Customer Interaction Solutions here. According to Endeca’s Steve Papa:

Media and publishing represents one of Endeca’s largest and fastest growing areas of focus. Web and mobile platforms, once seen as a required complement to traditional print and broadcast mediums, have rapidly become the primary area for new product creation and revenue growth. We’re working closely with our most innovative clients and partners to develop next-generation offerings that deliver a differentiated cross-medium experience, simplify the re-use of content across media platforms, and create new opportunities to monetize text, audio and video assets.

The question becomes, “With more search and content processing vendors chasing publishing companies, will the vendors be able to deliver enough value to warrant the high license fees some vendors charge?” What may happen is that price competition may force some of the smaller, less well known vendors to park on the side of the information highway hoping another ride comes along. “Value”, as I use the term, means that these potent systems scale economically, deliver good performance, and accommodate change without requiring a Roman legion of programmers. In my experience, publishers often lack a good understanding of the problems their own content creates for them. Publishers often don’t want search; publishers want the ability to create new information products from existing content. The ideal system delivers what publishers call “content repurposing” without requiring expensive, vain, and erratic human editors. Publishers would prefer life without equally expensive, vain, and erratic authors if possible. Publishing looks like an ideal market, but in some ways it is a difficult sector in which to gain traction and make sales. Sci-tech publishers want to “own” a solution so competitors can’t enjoy the benefits of a level playing field.

You can learn more about Endeca here.

Stephen Arnold, October 3, 2008

Attensity and Tremendous Momentum

October 3, 2008

With the economy in the US stumbling along, I found Attensity’s September 30, 2008, “Momentum” news release intriguing. The information issued by the analytics company is here. I had to struggle to decipher some of the jargon. For example, First Person Intelligence. This is a product name with a trademark. The idea is that email or phone calls from a customer are analyzed by Attensity. The resulting insights yield information about a particular customer; hence, First Person Intelligence. You can see FPI in action by clicking here. The company won an award called the Stevie. If you are curious or you want to enter to compete to snag the 2009 award, click here. I think I know what text analytics is, so I jumped to VoC. The acronym means “voice of the customer.” I think the notion is that a company pays attention to emails, call center notes, and survey data. I’m not certain if VoC is a subset of FPI or if VoC is the broader concept and FPI is a subset of VoC.

The core of the news release is that Attensity has landed some major accounts. Customer names are tough to come by, so you may want to note these organizations who have licensed the Attensity technology but hopefully not the jargon:

  • JetBlue
  • Royal Bank of Canada
  • Travelocity

For me, the most useful part of the company-written article was this passage:

The text analytics market is rapidly moving out of the early adopter stage. Industry analyst firm Hurwitz & Associates estimates an annual growth rate for this market at 30 to 50 percent. According to a survey conducted last year by the firm, the largest growth area is in customer care-related applications. In fact, over 70 percent of the companies surveyed that had deployed, or were considering deploying the technology, cited customer care as a key application area.

The growth rate does not match my calculation which pegs growth at a more leisurely 10 to 18 percent on an annual basis. The Hurwitz organization is much larger than this single goose operation. Endangered species like this addled goose are more conservative, and its estimates in a grim financial market are less optimistic than other consultants’ and analysts’.

In my Beyond Search study for the Gilbane Group, published in April 2008, I gave Attensity high marks. Its deep extraction technology yields useful metadata. Since my early 2008 analysis, Attensity has worked hard to productize its system. Call centers are a market segment in need of help. Most companies want to contain support costs.

In my opinion, Attensity’s technology is better than its explanation of its products and those products’ names. I wonder if the addition of marketers to a technology-centric company is a benefit or a drawback. Thoughts?

Stephen Arnold, October 3, 2008

Autonomy and the European Union

October 3, 2008

Not many details. AFX is reporting that Autonomy has reached an agreement to license its IDOL system to the European Union. The brief news item is here. When I checked Autonomy’s Web site, the story had not yet been posted.

Stephen Arnold, October 3, 2008

Microsoft Sees Google as Goliath

October 2, 2008

Imagine my surprise when the $65 billion Microsoft allegedly characterized Google as “Goliath.” By the time I flapped from my nest of reeds and mud, my newsreader refreshed with another 15 stories on this topic. I, quite naturally for an addled goose, dived in. Here’s a quick rundown of the “Goliath” metaphor. I will wrap up with several observations about this wordsmithing. I think the larger issue behind the trope has been overlooked, which says more about how pundits perceive both Google and Microsoft.

The Zero Ambiguity of Goliath

Rory Cellan-Jones, technology correspondent, BBC News, wrote “Google Goliath Microsoft Says.” You can find the article here. In an interview, Mr. Ballmer allegedly characterized Microsoft as “David” in search. Google, Mr. Cellan-Jones reports, is “Goliath.” Mr. Ballmer, the BBC story reports, said: “We may be the David up against Goliath but we’re working on it…. We probably missed the power of the advertising model, not so much the technology.” My quotes don’t do justice to this excellent article.

The Guardian, a paper that is quite a bit bigger than our local weekly Harrod’s Creek shopper, picks up the theme in “Ballmer Says Microsoft is David to Google’s Goliath.” The Guardian piece added for me a useful item of information: “Ballmer says that search is his ‘favourite business’ because when you have nothing the only way is up: ‘Everything is possible, we have nothing to lose.’ (Of course, you can also just continue along flatlining. But his salesman’s instinct probably won’t let him consider that.)”

Silicon Alley Insider, a Web log I quite like, picks up the theme of Microsoft’s response to Google in its “Ballmer Talks Up Windows Cloud. Don’t Believe It.” You can read Eric Krangel’s article here. Mr. Krangel focuses on the wisp-like Cloud OS, but it’s clear to me that Mr. Ballmer is setting the stage for a major announcement at the upcoming Windows conference on October 27. For me, the most interesting point in the piece was this statement attributed to Mr. Ballmer: “The last thing we want is for somebody else to obsolete us, if we’re gonna get obseleted [sic] we better do it to ourselves.” The somebody else, in my reading, is our pal Goliath.

image

In my opinion, Google equals Goliath.

What’s with Goliath?

In Kentucky, despite the high rate of illiteracy and the miserable education system, there’s no shortage of opinions about the David and Goliath clash. These range from Goliath won to there were two Goliaths and David only nailed one of them.

My hunch is that the purpose of the metaphor is to make clear that Microsoft, with its control of 90 percent or more of traditional personal computer operating systems and common applications like word processing, its 100 million or so SharePoint licenses, its thousands of resellers, its hundreds of thousands of VisualStudio.Net developers, and its activities in games, mobile software, and consumer audio players, is an underdog. David is the underdog, a wimp, a Mr. Peepers. Some of the sources I had to grind through in a required ancient history class said he was a musician. He wasn’t a rapper wearing shades, sporting tats, and wearing FBI sunglasses and prison clothes. David played a harp. He was, as I recall, untrained for war. In short, a wimp.

Goliath, on the other hand, is your classic André the Giant professional wrestler. Slow moving and slow of speech, Goliath was the equivalent of a roid-crazed street fighter. Goliath would have made a good power forward for a pick up game in the Bronx. The key point was that this fellow Golyat (standard Hebrew) was a Philistine. Forget Goliath’s size. His real transgression may have been that he was perceived as an invader or intruder with access to hot technology; specifically, iron smithing. Goliath had armor; David wore a cotton tunic. Although cool, cotton does not withstand a sword thrust too well.

The metaphor, then, operates for me on two levels. The little guy (David) has to fight off the big guy (Goliath or Golyat). And, Goliath was an outsider, at least to David and his pals.

The rest of the story is well known even in Kentucky. David uses a sling and throws a stone at Goliath. The stone knocks Goliath down. Then David, depending on your preference for murky sources, chops off Goliath’s head or walks up to the prone Goliath and checks out the prostrate enemy. The sling, the stone, the unexpected victory–that’s the metaphor.

The Reality

Google’s revenue for 2008 will be in the $20 billion range or close enough for horse shoes. Microsoft’s revenue for 2008 will be north of $65 billion. Google has 19,000 full time equivalents, give or take 2,000. Microsoft has 55,000 full time equivalents, give or take 5,000 happy workers. Microsoft has a de facto monopoly in desktop operating systems, standard office software for word processing and spreadsheets, and the 100 million SharePoint licenses. Other Microsoft businesses are big, but none is in the monopoly category.

Google, on the other hand, has about 70 percent of the Web search market. Google touches more than two-thirds of the Web search related advertising. Google has a modest footprint in several other businesses, but it is a one-trick Goliath in terms of revenue.

The big difference between the two companies is that Microsoft represents the status quo in personal computing. Google represents the next-generation in personal computing. In 2005, I created this diagram for my The Google Legacy study.

[Diagram: Google three eras]

© Stephen E. Arnold and Infonortics Ltd., 2005

The conclusion of that analysis was that most of the companies in the software business were blissfully ignorant of Google’s single minded build out of an application infrastructure. Furthermore, most pundits looked at Google as a one trick revenue pony and did not abstract that revenue model into a broader business model; that is, someone pays to get access to Google’s systems and users. As a result, Google was running free with no significant oversight, competition, or technical challenges since 1995. Yes, 1995. The Google kids were fiddling with BackRub in the mid 1990s and learning from the AltaVista.com service. Google’s biggest technical guns have roots in one of three companies: AltaVista (Digital Equipment), Bell Labs (AT&T), and Sun Microsystems. What these clever folks did was take the best from research computing and integrate those insights into a distributed, massively parallel architecture. The Internet was the equivalent of the connections in a desktop PC. The Google infrastructure was the computer just as Scott McNealy (Sun Microsystems) allegedly said.

What’s happening is that Microsoft’s business model, not its technology, is colliding with the Google business model. Furthermore, the collision has nothing to do with David and Goliath. The issue is Darwinian. Dragging metaphors into what is a strategic confrontation after a decade of inattention is misleading and indicative of why Microsoft can’t bridge the gap. Microsoft cannot catch up by following its present 10,000 sailboats going in the same general direction approach. Google is doing what it has done for a decade, and the company is now finding itself pulled into new, potentially lucrative opportunities. David needs to get a Ph.D. in math, publish a couple of important papers, and apply for work at Google in my opinion.

Stephen Arnold, October 2, 2008

Google’s Big Competitive Advantage

October 2, 2008

Larry Dignan’s “Google Talks Efficient Data Centers” here makes a point that some people overlook, even ignore. The write up points to a Googler’s Web log post. The Googler in question is Urs Hölzle, a definite wizard. The topic is the water cooling technology disclosed in US20080209234 here. Two thoughts:

  • Floating data centers outside the three mile limit might raise the question, “Who has jurisdiction over these constructs?”
  • Several times in the last 12 months Googlers and former Microsoft executives have told me that a patent does not really mean very much. I stand by my assertion that Google does not patent trivial systems and methods.

The GOOG is an engineering firm. Talented engineers innovate and their employers seek to protect that intellectual property. The excitement about floating data centers and water cooling underscores Google’s ability to seize journalists’ and competitors’ attention.

Stephen Arnold, October 2, 2008

Microsoft’s Search Ambitions

October 2, 2008

I hope that the Daily Telegraph survives the coming Ice Age for newspapers. This paper often contains articles I find useful and often amusing. British humor. Nothing like it. Click here to read “Microsoft’s Steve Ballmer Sets Out Internet Search Ambitions” by Dominic White, the communications editor. (I don’t know what that title means, but Mr. White does a good interview.)

Several points jumped out at me:

Mr. White reports that Microsoft wants to become number two in search. The stunning point was, “Acquiring Yahoo! is not key to becoming number two in search.” Wow. Quite a statement after the stock buy back and the tanking of Yahoo’s shares.

Next, Mr. White presents this Mr. Ballmer statement: “Sure, should we have embraced the opportunity in search and online advertising a few years earlier? The answer is yes,” admitted Mr Ballmer. “But there is nothing to be afraid of. It’s all upside, we have a small market share, we are David, Goliath is out there, the opportunity is ours and we need to seize it.” Could time be running out?

The last point that adhered to my addled goose brain was this passage:

The Microsoft boss also had few words for the first ‘Googlephone’, the mobile handset which was unveiled last week and uses Android, the search engine company’s rival operating system to Windows Mobile. “It’s a V1,” laughed Mr Ballmer. “They got a long way to go. A long way to”.

Harsh words for a product that is not yet available, a bit like the forthcoming Microsoft Cloud OS.

Stephen Arnold, October 2, 2008
