Autonomy and Xerox in Tie Up

January 20, 2009

Big news in the world of content processing and search: Xerox and Autonomy have struck a deal. According to this news story on Forbes.com, “Xerox DocuShare Enters into OEM Agreement with Autonomy,” “The new license will allow Xerox to integrate Autonomy’s Intelligent Data Operating Layer (IDOL) technology into its DocuShare enterprise content management (ECM) platform.” DocuShare is a content management system. The IDOL server will be integrated into the existing DocuShare accounts worldwide.

David Smith, Xerox VP, said:

Content management technologies and services that help organizations save money, better manage content and improve efficiencies are essential in today’s business climate… The integration of Autonomy’s IDOL Server takes DocuShare’s ability to meet the needs of our global customer base to the next level.

Information about DocuShare is here. Information about Autonomy IDOL is here. The content management sector has been hit by Microsoft’s SharePoint push. Other CMS vendors have beefed up their search and content processing services to withstand the “good enough” system available at competitive rates from Microsoft and its resellers. For example, Interwoven has a deal with Vivisimo.

The challenge for Xerox will be to hold on to its existing customers. The opportunity for Autonomy is to make upsells for other Autonomy functionality. If this deal works, perhaps Xerox will step forward and acquire Autonomy. The vendor has more than 16,000 licensees and a number of lucrative deals. Xerox has dabbled in search and content processing for many years. In fact, Microsoft licensed some of the Xerox search and content processing technology as part of Microsoft’s purchase of Powerset in 2008.

My question is, “What does Xerox know about Xerox PARC technology that prevents Xerox from using its own technology in the DocuShare product?” This raises another question, “Does Microsoft know that Xerox has sidestepped Xerox PARC technology for the Autonomy IDOL system?”

Autonomy has a strong business in litigation support. I wonder if Xerox Litigation Services will avail itself of the Autonomy technology to address some of the shortcomings in the Xerox eDiscovery offerings. I don’t have any color for the financial terms of the deal. If I get some substantive information, I will post it.

Stephen Arnold, January 20, 2009

Search Spending Disambiguation

January 19, 2009

Search and content processing struggle with ambiguity; that is, figuring out the meaning of metaphors, homonyms, and other tricky parts of human utterance. The headline "Report: Search Spending Off 8 Percent in Q4" here needs disambiguation. The "search" referenced is online pay for placement and some other bits of the online advertising business. Beyond Search tracks spending in the enterprise search and content processing sector. Our update of the search and content processing vendors is prompting posts about companies that seem to have fallen by the wayside in enterprise search and enterprise content management. If you want a figure for how the financial crisis is affecting online advertising, the data reported by Greg Sterling are for you. For other slices of the industry, you will have to look elsewhere.

Stephen Arnold, January 19, 2009

Etymon: Maybe Another Lost Search Vendor

January 19, 2009

Etymon Systems Inc. was founded in 1998. The company set out to apply information systems research to solve problems through innovative software and consulting. The company’s name means “the source word of a given word.” In 2005, the company alerted me to its text retrieval systems: Amberfish and Isearch.

At that time, I learned that:

Amberfish was general purpose text retrieval software, developed at Etymon by Nassib Nassar and distributed as open source software under the terms of version 2 of the GNU General Public License (GPL). Its distinguishing features are indexing/search of semi-structured text (i.e. both free text and multiply nested fields), built-in support for XML documents using the Xerces library, structured queries allowing generalized field/tag paths, hierarchical result sets (XML only), automatic searching across multiple databases (allowing modular indexing), TREC format results, efficient indexing, and relatively low memory requirements during indexing (and the ability to index documents larger than available memory). Z39.50 support was available. Other features included support for Boolean queries, right truncation, phrase searching, relevance ranking, support for multiple documents per file, incremental indexing, and easy integration with other UNIX tools.

You can download from SourceForge.net a version of Amberfish here.

Isearch was:

open source text retrieval software developed in 1994 by Nassib Nassar at the Clearinghouse for Networked Information Discovery and Retrieval (CNIDR), which was funded by the National Science Foundation. Isearch was designed as a proof-of-concept software architecture for use in distributed information retrieval, known at the time as wide-area information systems, or WAIS. Isearch formed the text retrieval component of the Isite software, which was a complete prototype implementation of ANSI/NISO Z39.50 (ISO 23950)… The main features of Isearch included full text and field searching, relevance ranking, Boolean queries, and support for many document types such as HTML, mail folders, list digests, and text with SGML-style tags.

I had a link in my notes to a version of Isearch dated 2006. You can get that file here today (January 18, 2009). Nassib Nassar turns up as one of the people involved with this company. I had a pointer in my profile of this company to a technical paper about the company’s “grid” concept. You can locate this document here. Mr. Nassar’s blog here has not been updated since December 2005. The Renaissance Computing Institute lists Mr. Nassar on its Web site here.

I am inclined to move this company to my list of defunct search and content processing vendors. If anyone has information about the fate of Etymon, let me know.

Stephen Arnold, January 19, 2009

Unusual US Documents

January 19, 2009

A happy quack to the reader who called my attention to two sites. I haven’t checked the data on these two sites, and I don’t want to suggest the information is accurate. The first site is The Memory Hole here. The second site is Government Attic here. Both sites provide links and information about US government decisions, procedures, and activities. Government Attic’s information is indexed by Google. I’m not sure about Memory Hole at this time.

Stephen Arnold, January 19, 2009

Dot Net Caution

January 18, 2009

Here in the mine run off pond, we geese heard a rumor that Windows 7 has no Dot Net code. That sounded good to us. Now comes a disturbing news item on MSMobile.com, which if true adds another log to the Dot Net fire. The article “Warning to Developers: A Monkey with Its Eyes Closed Can Disassemble Microsoft .Net” here seems to be a bit harsh. The lead paragraph asserts that Microsoft inhabits its “own reality distortion field.” We thought the RDF was an Apple speciality. The most important part of the article is this snippet:

.Net is great in so many ways but for commercial apps? No way! Anybody can just look at your source code. A high end obfuscation will help a lot but any determined hacker will fix your code in less than a day. I know this from sad experience despite spending $1000s on anti-piracy and obfuscation tools. Unless you wish to make your code ‘open source’ then maybe give .Net a wide birth.

The conclusion to the article is pointed: “If you intend to develop commercial software for Windows Mobile, then forget Dot Net.” The geese will watch for more Dot Net intel to validate or invalidate this point.

Stephen Arnold, January 18, 2009

Ami Software

January 18, 2009

I have been updating my files. I was looking for search vendors who had dealings with the UK Ministry of Defence. That organization had some email trouble, and I was curious from which vendor the MoD was licensing software. My files contained a reference to Ami Software, a company based in France when I last looked at the system in 2007.

It’s been a couple of years since I looked at Ami Software (doing business as Albert in France), founded in 1999. The Web site for the company is here. The company has a site for OEM partners here. The search and content processing company positions itself as “enterprise intelligence in action.” The company says that its professionals “specialize in developing software solutions to access and manage online information, turning unstructured data, wherever it resides, and in whatever form into usable knowledge.”

File shot of the AMI interface.

The most recent news on the company’s Web site here dates from September 2008. That news story asserts growth in 2008 and identifies customer wins in engineering and trade publishing. In December 2007, the company released AMI Enterprise Intelligence Version 4.0. According to my file data, Version 3 of the system came out in late 2006. The release includes “knowledge modelling [sic], RSS publication, advance search capability, and [a] new analysis engine.” Other details of the system from my files said:

  • The core system was developed in C++
  • Functions are accessed via HTTP services in POST or GET via XML
  • The company developed a higher level scripting language called albScript
  • Higher level components are developed in albScript which is close to JavaScript
  • The system runs on Windows or Apache Web servers.
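For readers who have not worked with this style of interface, here is a minimal sketch of what an XML-over-HTTP POST call generally looks like. I do not have the Ami interface documentation, so everything specific in it is an assumption made for illustration: the endpoint URL, the element names (`request`, `query`, `maxHits`), and the function name `build_search_request` are hypothetical and are not taken from the Ami system. The sketch builds the request but does not send it.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_search_request(base_url, query_text, max_hits=10):
    """Build (but do not send) an HTTP POST request with an XML body.

    The XML element names here are placeholders, not a vendor schema.
    """
    root = ET.Element("request")
    ET.SubElement(root, "query").text = query_text
    ET.SubElement(root, "maxHits").text = str(max_hits)
    body = ET.tostring(root, encoding="utf-8")  # bytes, with XML declaration
    return urllib.request.Request(
        base_url,
        data=body,
        headers={"Content-Type": "application/xml"},
        method="POST",
    )


# Hypothetical endpoint; ".invalid" is a reserved, non-resolving domain.
req = build_search_request(
    "http://search.example.invalid/service", "enterprise search"
)
```

A caller would pass `req` to `urllib.request.urlopen` and parse the XML that comes back. A GET variant would put the same parameters in the query string instead of the body.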

The company’s Web site points out that the company’s total research and development investment was €23 million. Two investment companies have supported the company: France Ile Development and OTC Asset Management. In my notes, I found a reference to the company’s location in Switzerland, but I can’t determine if that’s accurate.

In France the company does business as Albert France SA. In the UK, it is Albert UK. In North America, the company works with Propelion Internet Solutions Inc.

The company identifies Alain Beauvieux as the president of the company, Eric Fourboul as general manager of products, Phillippe Albert as services director, Remy Carron as sales director in France, and Mike Alderton as UK sales manager.

Mr. Beauvieux worked at IBM and LexiQuest. His most recent blog posting here was in July 2008. That same month, Ami (Albert) announced two job openings. There’s a useful write up here about the company.

What’s peculiar is that information about the company seems to have tapered off by October 2008, and I can’t determine if the search and content processing company is still open for business. With the problems at such companies as Lycos and SurfRay on my mind, I am curious. If anyone has information about the status of this company, please, use the comments section of the Web log to post the information.

Stephen Arnold, January 18, 2009

Financial Waves Swamp SurfRay

January 17, 2009

Just a quick update on SurfRay, the search roll up in Copenhagen. SurfRay bought Speed of Mind, Ontolica, and Mondosoft. You can run a query on Beyond Search and read the previous posts about this company. One of my readers in Europe alerted me to a bankruptcy notice. A happy quack to that person. We did some checking and found a document in Bolagsverket’s Official Register. The announcements for January 16, 2009, contained this information:

[Image: SurfRay bankruptcy notice]

If your Swedish is rusty, the snippet says that SurfRay has been declared bankrupt at Solna court. Future announcements will be issued to some newspapers. I will keep chasing this story. I have no other details as of 3:36 Eastern on January 16, 2009.

Stephen Arnold, January 16, 2009

Internet and Moore’s Law

January 17, 2009

Physorg.com here published “Internet Growth Follows Moore’s Law Too”. The article reports that a Chinese research team “has discovered that Moore’s Law can also describe the growth of the Internet.” The key point in the write up for me was that “the Internet will double in size every 5.32 years.” I find a number like this interesting, but it does not match the data that my team has been gathering over the last few years. We focus on the enterprise, and our data suggest that digital information in an organization doubles every six to nine months. If these data are accurate, then the Internet is not growing as rapidly as digital information in organizations. Anyone have any other data? The Chinese estimate seems on the low side.
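To put the two doubling periods side by side, here is a small back-of-the-envelope sketch. It assumes simple exponential growth with a constant doubling period, which is my simplification and not something either data set guarantees. The function names are made up for this illustration.

```python
def annual_growth_factor(doubling_period_years):
    """Yearly multiplier implied by a constant doubling period."""
    return 2.0 ** (1.0 / doubling_period_years)


def growth_over(years, doubling_period_years):
    """Total multiplier after `years` at a constant doubling period."""
    return 2.0 ** (years / doubling_period_years)


internet_period = 5.32          # years, the figure quoted from the article
enterprise_periods = (0.5, 0.75)  # six and nine months, expressed in years

print("Internet, per year:", round(annual_growth_factor(internet_period), 3))
for period in enterprise_periods:
    print(
        "Enterprise, doubling every", period, "years:",
        "per year", round(annual_growth_factor(period), 2),
        "| over 5.32 years", round(growth_over(internet_period, period)),
    )
```

Under these assumptions, a 5.32 year doubling period works out to growth of roughly 14 percent a year, while a six month doubling period means a fourfold increase every year. Over the time the Internet is said to double once, enterprise content would grow by a factor in the hundreds or more.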

Stephen Arnold, January 17, 2009

Leapfish Launches

January 16, 2009

LeapFish is a metasearch engine. The company calls its system “a multi dimensional engine.” LeapFish Inc. is a privately held corporation headquartered out of CARR America Corporate Center in Pleasanton, California. The company’s metasearch technology uses a proprietary hyper threading technology. The Marketwatch.com story here said:

LeapFish pushes search to 2.0 and states “out with the search button.” LeapFish’s revolutionary new click free search interface gives life to a fast, fluid and dynamic search experience that extracts the variety of data from major online destinations such as Google, YouTube, eBay and others in a single search query. Consolidating a knowledge base of relevancy and variety from major online authorities, LeapFish effectively renders more comprehensive results than those returned by its providers.
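LeapFish describes its technology as proprietary, so the following is not its method. It is a minimal sketch of one generic way a metasearch system can combine results: take the ranked lists returned by several providers, interleave them round-robin, and drop duplicate URLs. The function name `merge_results` and the sample URLs are invented for the illustration.

```python
from itertools import zip_longest


def merge_results(ranked_lists):
    """Round-robin interleave ranked result lists, dropping duplicate URLs.

    Each inner list is one provider's results, best first.
    """
    seen = set()
    merged = []
    # zip_longest pads shorter lists with None so longer lists are not cut off.
    for tier in zip_longest(*ranked_lists):
        for url in tier:
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


provider_a = ["http://a.example/1", "http://shared.example/x"]
provider_b = ["http://shared.example/x", "http://b.example/2", "http://b.example/3"]
merged = merge_results([provider_a, provider_b])
```

Round-robin merging treats every provider as equally trustworthy. A production system would more likely weight providers or rescore the combined list, but the interleave-and-deduplicate step is the core idea.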

The addled geese at Beyond Search find metasearch systems useful. Our favorite, EZ2Find.com, has exited business. Unlike EZ2Find.com, LeapFish uses an uncluttered splash screen. The naked screen of Ixquick.com is more spare, but LeapFish strikes a good balance between a point-and-get-started page and the Google-type search box of Ixquick.com.

We liked the ability to limit the query to content type. LeapFish has done some work to tame the not-so-good search experience we have encountered on Google’s blogsearch service. The mouse action was a hair trigger when viewing results from a shopping query.

LeapFish is an interesting service, and we will add it to our list of systems to monitor. More information about the company is at http://www.leapfish.com/AboutUs.aspx.

Search in the Bartz Era at Yahoo

January 16, 2009

The Beyond Search geese have been honking speculatively today about Yahoo search in the post-floundering era. We decided that it was a miracle that Yahoo has been able to keep its revenues where they are and maintain a 20 percent share of the Web search market. Several of the Beyond Search goslings use Yahoo for mail, photo browsing, and bookmark surfing. Others don’t think too much of Yahoo for various reasons. These range from lousy performance over some wireless services to features that seem clunky compared to alternatives available from other vendors.

We read closely Rebecca Buckman’s “The Exacting Standards of Carol Bartz” and found the Forbes article interesting. You can read it here. Unlike some of the critical articles about Carol Bartz, Ms. Buckman’s piece focuses on Ms. Bartz’s accomplishments. One interesting parallel is that the “freewheeling culture” of Autodesk and the wild and crazy approach at Yahoo may share some similarities. Ms. Bartz made staff changes and “professionalized” some departments. Yahoo may benefit from this type of management.

Our Beyond Search discussion focused on search, specifically what we perceive as the “problem” with Yahoo search. In order to make Yahoo search more useful, Yahoo has to find a way to address such shortcomings as the spotty relevancy for Web queries that are not about popular topics. The search available for Yahoo shopping is not very useful. In fact, it is on a par with eBay’s current system, and that is quite disappointing. Even a convenience service such as finding currency conversion data becomes an exercise in navigating multiple pages. “Search without search” is something that Yahoo needs to master.

In order to remediate Yahoo search, we think that some serious engineering must be done and completed quickly. At lunch we ran several test queries. For example, one was “enterprise search”. The results were surprising. Here’s the display we saw:

[Image: Yahoo results for “enterprise search,” January 15]

We liked the search suggestions, but we found that the first four results were skewed to Microsoft. For example, there is the Microsoft paid ad in the blue box. That’s the second result. In the organic results, we saw a link to the Yahoo and IBM free search system, which is a boosted result. The Wikipedia result is okay. But the third and fourth results are for Microsoft search pages. The results are not “bad”; the results were just not what we expected. You can run your own queries and see how the Yahoo search results work for you.

A test shopping query was “discount quad core”. The system returned computer systems from brand name vendors. I think each of these systems is tagged with the word “discount”. These are not discount systems, however.

[Image: Yahoo results for “discount quad core”]

How can these search issues be fixed? Is tweaking enough? Will Yahoo’s many different search initiatives ultimately lead to a system that is “better” than Google’s in the eyes of the users?

Here’s the Beyond Search lunch time view:

  1. Yahoo has to work on relevance. Google has made a significant investment in technology to determine context and react to what other users find helpful. Yahoo seems to lag in these areas.
  2. In terms of mobile search, the Yahoo system requires menu navigation. Because of the clunkiness of the approach, it is difficult to determine if Yahoo is doing much more than dumping information into buckets and showing stories as those stories arrive.
  3. For shopping, Yahoo gets a user close to a product, but Yahoo makes it difficult to find a specific product. We don’t think eBay or Google have cracked the code on shopping search. Yahoo might be able to leapfrog some of the competitors with an innovative approach.

The problem with addressing all or some of these challenges is that it will take time to come up with a solution that is not a one-off, stand-alone island. Yahoo has not focused on search as part of the core fabric of the company. At Google, search and advertising are tough to separate. At Yahoo, search is one thing. Advertising is another. Yahoo, therefore, must think of ways to integrate so the two functions yield an advantage over Google.

Yahoo has the talent and the funds to address these issues. What Yahoo does not have, we concluded, is time. In fact, time may be Yahoo’s biggest single problem. Floundering can be rectified with time. Without time, Yahoo will remain a shadow of its former self. Even a deal with Microsoft can’t change that.

Meantime, the Google maintains its lead in search and advertising. A decade of search missteps cannot be fixed overnight. Ms. Bartz may have the expertise, but does she have the time? We quacked loudly, “We don’t think so.”

Stephen Arnold, January 16, 2009
