Linguistic Agents Unveils RoboCrunch

August 21, 2008

Linguistic Agents, based in Jerusalem, has rolled out RoboCrunch.com, a semantic search engine. You can read the company news release here. Like Powerset and Hakia, Linguistic Agents’ technology makes it possible to locate information by asking the system a question. The platform enables software to respond to and act upon natural human language in a fashion intuitive for users.

As of August 21, 2008, the system operates with two functions:

  1. Natural language inquiries are transformed into advanced search queries.
  2. Results are semantically sorted by relevance.

The developer, founded in 1999, plans to change the present method of Web navigation by using its advanced semantic technology to better understand users’ information requests. The company says, “Linguistic Agents has developed an integrative language platform that is based on the most current research in the field of theoretical linguistics.”

I have written about this company before. Check out the demo. Let me know your impressions.

Stephen Arnold, August 21, 2008

Search Points of View: One Home Run, Three Singles

August 21, 2008

Keri Morgret (I think) prepared a summary of a round table session devoted to the joys and heartaches of operating an in-house search engine for behind-the-firewall search and retrieval. The participants included MarkLogic, Vivisimo, and SearchTools.com, among others.

The article is “Enterprise Search: Running Your Own Search Engine”, and it appeared in the August 18, 2008, Search Engine Roundtable. Ms. Morgret’s summary identifies the key points of each participant’s remarks. I found this summary quite interesting and very suggestive. I agree with many of the points contained in the compendium of remarks. Several issues warrant a comment from the addled goose.

First, the assumption implicit in this summary is that organizations need an on-premises search system. I don’t agree. My research suggests that organizations need multiple search, content processing, and text analytics systems. The idea of a “one size fits all” solution doesn’t work now and won’t work in the future. In fact, my understanding of MarkLogic’s point is that success comes from defining a problem and solving it. Each of the examples offered, customer support among them, can benefit from a focused solution. One system could deliver several distinct solutions, but my understanding is that success comes with focus. My view is that an organization will have multiple systems. Some will be tightly integrated. Some will not.

Second, I am not too happy when search engine optimization techniques find their way into enterprise or behind-the-firewall search. The focus in an organization is making the system support a user’s business information need. MarkLogic’s approach seems anchored in this type of real world focus. Conflating the fiddling done to spoof Google’s crawler with the challenge of information access in an organization is intellectually messy. Apples are apples; oranges are oranges. Enterprise search is a distinct, complex challenge.

Third, I struggle with the co-mingling of search, communication, and collaboration. I’m not sure which is more misleading: blending enterprise search with search engine optimization or turning search into some Groove-variant. Search is sufficiently complex without adding dollops of social functionality. Adding a metatag (index term) is not collaboration. It is indexing. Suggesting otherwise triggers an asthma attack for me.

I think the value of these multi-player discussions is generally high. I know I learn from the different points of view even if I don’t agree with them. For this set of remarks, I’m pleased with how MarkLogic approaches the problem of information access in an organization. The other participants offer some good ideas, but I think that other agendas are informing their remarks. For example, Vivisimo wants to emphasize that its search platform supports collaboration. I think it permits a user to add an index term, and I am reluctant to make the leap that Vivisimo and its approach to search empower collaboration in an enterprise. SearchTools.com’s presentation is a bit of a laundry list of ideas. Some of these ideas are solid; others seem out of place. Mixing and matching buzzwords may not clarify search or contribute to enterprise information access. MarkLogic’s approach follows a path that makes business sense. Consultants are–well–consultants.

Read this interesting article yourself. For me, the bulk of the information in this summary was of interest because it makes clear the difficulty of discussing search, content processing, and text analytics without a clear definition and scope to bound the remarks. My thought is, “Give MarkLogic more time on the next panel.”

Stephen Arnold, August 20, 2008

Metadata Modeling

August 21, 2008

Embarcadero, in my opinion, is a software tools company. The company’s products allow developers and database administrators to design, build, and run software applications in the environment they choose. The company says it has more than three million users of its CodeGear and DatabaseGear products.

The firm announced on August 19, 2008, that it had rolled out its ER/Studio Enterprise Portal. As I read the announcement here, ER/Studio Enterprise Portal is a search engine for data. The system “transforms the way metadata, business rules and models are located and accessed in the enterprise.”

As I thought about this phrasing, it struck me that Embarcadero wants to piggyback on the interest in search, portals, and metadata–all major problems in many organizations. The news story released on Business Wire includes this statement:

‘We’re doing for metadata what Google did for Web search. Today’s enterprise data explosion has made collecting and refining information time consuming for the architect and hard to understand for the business user,’ said Michael Swindell, Vice President of Products, Embarcadero Technologies. ‘ER/Studio Enterprise Portal dramatically simplifies the process for assembling, finding and communicating technical metadata.’

A couple of thoughts. Embarcadero tools can assist developers. No question about it. I am unsettled by two points:

  1. The suggestion that ER/Studio Enterprise Portal is a search engine. Search is a commodity in many ways. The term is ambiguous. I find it hard to figure out what this “portal” delivers. My hunch is that it is a metadata management tool.
  2. The suggestion that Embarcadero, founded in 1993, is “doing for metadata what Google did for Web search” is an example of wordsmithing of an interesting nature. “Google” is a magic word. The company generates billions of dollars and unnerves outfits like Verizon and Viacom. The notion that a software tool for managing metadata will have a “Google” effect amused me.

I find it harder and harder to figure out what business a company is in (“portal”, “search”, “metadata”) and what specific problem a company’s product solves. I’m a supporter of reasonably clear writing. Metaphors like “addled goose” can be useful as well. But mixing a stew of buzzwords leaves me somewhat confused and perhaps a bit suspicious of some product assertions.

Other companies in the metadata game are Wren and Access Innovations. What do you think?

Stephen Arnold, August 21, 2008

Powerset as Antigen: Can Google Resist Microsoft’s New Threat?

August 20, 2008

I found the write ups about Satya Nadella’s observations about Microsoft’s use of the Powerset technology in WebProNews, Webware.com, and Business Week magnetizing. Each of these write ups converged on a single key idea; namely, Microsoft will use the Powerset / Xerox PARC technology to exploit Google’s inability to tailor the search experience to an individual user. The media attention directed at a conference focused on generating traffic to a Web site without regard to the content on that site, its provenance, or its accuracy is downright remarkable. Add the assertion that Powerset will hobble the Google, and I may have to extend my anti-baloney shields another 5,000 kilometers.

Let’s tackle some realities:

  1. To kill Google, a company has to jump over, leap frog, or out-innovate Google. Using technology that dates from the 1990s, poses scaling challenges, and must be “hooked” into the existing Microsoft infrastructure is a way to narrow a gap, but it’s not enough to wound, impair, or kill Google. If you know something about the Xerox PARC technology that I’m missing, please tell me. I profiled Inxight Software in one of my studies. Although different from the Xerox PARC technology used by Powerset, it was close enough to identify some strengths and weaknesses. One issue is the computational load the system imposes. Maybe I’m wrong, but scaling is a big deal when extending “context” to lots of users.
  2. Microsoft is slipping further behind Google. The company is paying users, and it is still losing market share. Read my short post on this subject here. Even if the data are off by an order of magnitude, Microsoft is not making headway in the Web search market share.
  3. Cost is a big deal. Microsoft appears to have unlimited resources. I’m not so sure. If Google’s $1 of infrastructure investment buys 4X the performance that a Microsoft $1 does, Microsoft has an infrastructure challenge that could cost more than even Microsoft can afford.
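The 4X price-performance figure is the hinge of the cost argument, so a back-of-the-envelope sketch may make it concrete. Everything here is illustrative: the 4X ratio is the hypothetical from the point above, and the target capacity is an arbitrary number, not measured data.

```python
# Sketch of the infrastructure-cost argument: if $1 of Google spend buys
# 4x the serving performance of $1 of Microsoft spend, matching capacity
# costs Microsoft 4x as much. The numbers are hypothetical.

def spend_for_capacity(target_units: float, units_per_dollar: float) -> float:
    """Dollars needed to reach a given serving capacity."""
    return target_units / units_per_dollar

GOOGLE_UNITS_PER_DOLLAR = 4.0   # hypothetical 4x price-performance edge
MSFT_UNITS_PER_DOLLAR = 1.0     # baseline

target = 1_000_000  # arbitrary capacity units for the same query load

google_cost = spend_for_capacity(target, GOOGLE_UNITS_PER_DOLLAR)
msft_cost = spend_for_capacity(target, MSFT_UNITS_PER_DOLLAR)

print(f"Google:    ${google_cost:,.0f}")   # $250,000
print(f"Microsoft: ${msft_cost:,.0f}")     # $1,000,000
print(f"Gap:       {msft_cost / google_cost:.0f}x")
```

The gap compounds: to pull ahead rather than merely match, the higher-cost player must outspend by more than the ratio, which is the affordability problem the point above describes.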

So, there are computational load issues. There are cost issues. There are innovation issues. There are market issues. I must be the only person on the planet who is willing to assert that small scale search tweaks will not have the large scale effects Microsoft needs.

Forget the assertion that Business Week offers when it says that Google is moving forward. Google is not moving forward; Google is morphing into a different type of company. “Moving forward” tells only part of the story. I wonder if I should extend my shields of protection to filter the baloney about search emanating from a conference focused on tricking algorithms into putting a lousy site at the top of a results list.

Agree? Disagree? I’m willing to learn if my opinions are scrambled.

Stephen Arnold, August 20, 2008

Clarabridge: Cash Infusion

August 20, 2008

Following in the footsteps of Endeca and Vivisimo, Clarabridge nailed $12 million in financing. The additional infusion marks the company’s third round of funding. The money came from Grotech Ventures, Harbert Venture Partners, Boulder Ventures, and Intersouth Partners. You can read the company’s official news release here. VCAonline provides additional color here. The money will allow Clarabridge to increase its presence in a crowded market. The expectations for fast growth are often high, and pumping Clarabridge to $100 million in revenue in 24 to 36 months seems to be a challenging job. At my age, it makes my heart palpitate just thinking of the work ahead.

The question that nags at me is, “Will these cash infusions in text and content processing pay off?” Despite the lousy market, smart money seems to say, “Absolutely.” The challenge will be to break through the glass ceiling that keeps most text crunching companies well below Autonomy’s lofty $300 million in revenue, achieved after a decade of hard work, and Endeca’s $110 million (another 10 years of effort as well). Agree? Disagree? Use the comments section to offer your views.

Stephen Arnold, August 20, 2008

Attensity Lassos Brands with BzzAgent Tie Up

August 20, 2008

Attensity, a text analytics and content processing company, applies its “deep extraction” methods to law enforcement and customer support tasks. The company has formed a partnership with BzzAgent. You can find out more about this firm here. This Boston-based firm specializes in the delightfully named art of WOM, shorthand for “word of mouth” marketing. The company’s secret sauce is more than 400,000 WOM volunteers. Attensity’s technology can process BzzAgent’s inputs and deliver useful brand cues. Helen Leggatt covers the deal in “Marketers to Get ‘Unrivaled Insights’ into WOM.” You can read this interesting article here. For me, the most interesting point in Ms. Leggatt’s article was:

Each month, BzzAgent’s volunteers submit around 100,000 reports. Attensity’s text analytics technology will analyze the data contained within these reports to identify “facts, sentiment, opinions, requests, trends, and trouble spots”.

Like other content processing companies, Attensity is looking for ways to expand into new markets with its extraction and analytic technology. Is this a sign of vitality, or is it a hint that content processing companies are beginning to experience a slow down in other market sectors? Anyone have thoughts on this type of market friction?

Stephen Arnold, August 20, 2008

Microslump: If Search Data Are Accurate, Bad News for Microsoft

August 20, 2008

Statistics are malleable. Data about online usage are not just malleable, they are diaphanous. Silicon Alley Insider reported Web search market share data here. The article by Michael Learmonth was “Google Takes 60% of Search Market, While MSN Loses Share.” The highlight of the write up is a chart, which I am reluctant to reproduce. I can, I believe, quote one statement that struck me as particularly important; namely:

MSN, which lost more than two percentage points of market share from month to month, going from 14.1% of searches to 11.9%. So if Microsoft’s “Cashback” search engine shopping gimmick actually helped boost search share in May and June, its impact seems to be dropping.

The data come from Nielsen Online, specifically the cleverly named MegaView Search report. Wow, after pumping big money into data centers, buying Fast Search & Transfer and Powerset, and ramping up search research and development, the data suggest that:

  • A desktop monopoly doesn’t matter in search
  • Microsoft’s billions don’t matter in search
  • Aggressive marketing such as the forced download for the Olympic content doesn’t matter

Google is like one of those weird quantum functions that defy comprehension. What else must Redmond do? Send me your ideas for closing the gap between Microsoft and Google.

Stephen Arnold, August 20, 2008

Five Tips for Reducing Search Risk

August 20, 2008

In September 2008, I will be participating in a conference organized by Dr. Erik M. Hartman. One of the questions he asked me today might be of interest to readers of this Web log. He queried by email: “What are five tips for anyone who wants to start with enterprise search but has no clue?”

Here’s my answer.

That’s a tough question. Let me tell you what I have found useful when starting a new project with an organization that has a flawed information access system.

First, identify a specific problem and do a basic business school or consulting firm analysis of the problem. This is actually hard to do, so many organizations assume “We know everything about our needs.” That’s wrong. Inside of a set you can’t see much other than other elements of the set. Problem analysis gives you a better view of the universe of options; that is, other perspectives and context for the problem.

Second, get management commitment to solve the problem. We live in a world with many uncertainties. If management is not behind solving the specific problem you have analyzed, you will fail. When a project needs more money, management won’t provide it. Without investment, any search and content processing system will sink under the weight of itself and the growing body of content it must process and make available. I won’t participate in projects unless top management buys in. Nothing worthwhile comes easy or economically today.

Read more

Silverlight Analysis: Not Quite Gold, Not too Light

August 19, 2008

In my keynote at Information Today’s eContent conference in April 2008, I referenced Silverlight’s importance to Microsoft. Since most organizations rely on Windows desktop operating systems and applications, Silverlight becomes a good fit for some organizations. I also suggested that Silverlight would play a much larger role in online rich media. I was not able at the time to reference the role Silverlight would play in the Beijing Olympics. Most in the audience of about 150 big time media executives were not familiar with the technology, nor did those in attendance see much relevance between their traditional media operations and Silverlight. Now that the Olympics have been deemed a success for both Microsoft and NBC, I hope that some of the big media mavens understand that rich media may be important to the survival of many information organizations. I’m all for printed books and journals, but the future beckons with video and other types of TV-type material.

Tim Anderson’s excellent analysis of Silverlight is available in The Register, one of my favorite news services. The analysis is “Microsoft Silverlight: 10 Reasons to Love It, 10 Reasons to Hate It”, and you should read it here. Unlike most of the top 10 lists that are increasingly common on Web logs, Mr. Anderson’s analysis is based on a solid understanding of what Silverlight does and how it goes about its business. The write up provides the advertised 10 items of strengths and weaknesses, but he supports each point with a useful technical comment.

Let me illustrate just one of his 20 points, and then you can navigate to The Register for the other 19 items. Consider item five in the plus column: XAML, Microsoft’s extensible application markup language, is interpreted directly by Silverlight “whereas Adobe’s XML GUI language, MXML, gets converted to SWF at compiling time. In fact, XAML pages are included as resources in the compiled .XAP binary used for deploying Silverlight applications.”

Mr. Anderson also includes one of those wonderful Microsoft diagrams that show how Microsoft’s various moving parts fit together. I download these immediately because they come in handy when explaining why it costs an arm and a leg to troubleshoot some Microsoft enterprise applications. This version of the chart about Silverlight requires that you install Silverlight. Now you get the idea about Microsoft’s technique for getting its proprietary technology on your PC.

A happy quack to Tim Anderson for a useful analysis.

Stephen Arnold, August 19, 2008

When Old Media Finally Think Hard about Google

August 19, 2008

You need to read David Smith’s long Web log essay about Google. I’m not going to try to summarize it, nor will I pull out a single interesting comment. Published by the Guardian, a UK old media company, the essay combines the best and not so good aspects of old media’s take on Google. The essay is “Google, 10 Years in: Big, Friendly Giant or a Greedy Goliath?” You can find it here. You will get the received wisdom on garage to riches in Silicon Valley. You will learn that Google is a power politics player. You will get the run down on the highest profile products. The most important part of the essay is that it makes clear to me why traditional media are like deer in headlights. The balanced view of Google doesn’t work because the understanding of what Google has built has gone missing. The article is amazing, and it includes the PR fodder of a quirky picture of the Google guys hanging out and having fun. Magicians use misdirection to awe and entertain. Mr. Smith likes that magic, and it works on traditional media quite well too.

Stephen Arnold, August 19, 2008
