
Jargon Means Shields Up for Consultants

February 21, 2010

I just read “Computer Jargon Baffles Users, Hinders Security.” This is a Thomson Reuters news story, and I don’t know if the wild and crazy URL will work when you read this. Not my fault. Email Thomson Reuters, whose customer support crew is ready to help you.

The news story is one that runs every few months. The idea is that jargon is pretty much impossible for the average person to figure out. The argument in the Thomson Reuters’ story pivots on security, but the journalist could have picked on search, business intelligence, or any other common enterprise application. Jargon is a defense mechanism. Magic.

image

Source: http://s.bebo.com/app-image/7979726037/5411656627/PROFILE/i.quizzaz.com/img/q/u/08/04/08/Force_Field.jpg

For me, the key passage in the Thomson Reuters’ story was:

“The malicious and criminal use of cyberspace today is stunning in its scope and innovation,” said Dell Services President Peter Altabef. One problem is that computer “geeks” use jargon to cloak their work in scholarly mystique, resulting in a lack of clarity in everything from instruction manuals and systems design to professional training, the experts said. “If you don’t demystify security, people become anxious about it and don’t want to do it,” former U.S. Homeland Security Secretary Michael Chertoff told Reuters on the sidelines of the EastWest Institute security meeting in Brussels.

I had a conversation with a big wheel from a blue chip consulting firm. I really want to reveal which firm, but my legal eagle squawks when I provide certain information in this Web log. The guts of the conversation are easy to summarize.


Search Engine Convera Drifts Off

February 16, 2010

The journey was a long one, beginning with scanning marketing brochures in the 1990s. Convera has filed a certificate of dissolution. I think this means that Convera has moved from the search engine death watch to the list which contains Delphes, Entopia, and other firms.

convera splash

Convera splash page on February 15, 2010

You can read the official statement for a few more days on the PRNewswire site. The title of the announcement is / was, “Convera Corporation Files Certificate of Dissolution, Trading of Common Stock to Cease after February 8, 2010 Payment Date Set.” I am no attorney so maybe my lay understanding of “dissolution” is flawed, and Convera under another name will come roaring back. For the purposes of this round up of my thoughts, I am going to assume that Convera is comatose. I hope it bounces back with one of those miracles of search science. I am crossing my wings, even though each has a dusting of snow this morning. Harrod’s Creek has become a mid south version of Nord Kap.

For me, the key passage in the write up was:

Convera Corporation announced today that it filed its Certificate of Dissolution with the Delaware Secretary of State on February 8, 2010, in accordance with its previously announced plan of complete dissolution and liquidation.  As a result of such filing, the company has closed its stock transfer books and will discontinue recording transfers of its common stock, except by will, intestate succession or operation of law.  Accordingly, and as previously announced, trading of the company’s stock on the NASDAQ Stock Market will cease after the close of business on February 8, 2010.

My Overflight search archive suggested that Excalibur Technologies was around in the 1980s. The founder was Jim Dowe, who was interested in neural networks. The notion of pattern matching was a good one. The technology has been successfully exploited by a number of vendors ranging from Autonomy to Verity. Brainware’s approach to search owes a tip of its Prince Heinrich hat to the early content snow plowing at Excalibur. Excalibur used most of the buzzwords and catchphrases that bedevil me today, including “semantic technology.”

image

Sample of a category search on the RetrievalWare system. The idea is that you would click a category.

One of my former Booz, Allen & Hamilton colleagues made some dough by selling his ConQuest Software search-related technology to Excalibur Technologies. The reason was that the original Excalibur search system did not work too well. Excalibur, according to my Overflight archive, described itself as “leading provider of knowledge and media asset management solutions.”


Microsoft Fast on Linux and Unix Innovation

February 15, 2010

It’s Valentine’s Day. I feel quite a bit of affection for the system professionals who have licensed Fast Search ESP, and I hope each finds search love. I think there will be a “tough” element to this love. And like other types of love, there will be ups and downs. Microsoft practiced some “tough love” for licensees of the Linux and Unix versions of Fast Search & Transfer’s Enterprise Search Platform recently. I am in a discursive frame of mind, and I will share my opinion about the “tough love” for the Linux and Unix licensees of the 1997 technology that comprises some of Fast Search & Transfer’s system.

The not-too-surprising announcement that Microsoft would stop supporting Fast Search & Transfer’s Linux and Unix customers surprised some folks. I think a handful of resellers were delighted because customers with non-Windows versions of Fast Search cannot change horses in the middle of the Tigris River, as Alexander the Great discovered in 331 BCE. Some poobahs pointed out that open source search would become a hot ticket for Fast Search Linux and Unix licensees. Others took a more balanced view of figuring out whether to rip and replace or supplement the aging Fast Search system with one of the more specialized solutions now available; for example, Exalead’s system could be snapped in without much hassle, based on my research for Successful Enterprise Search Management, published by Galatea in the UK last year. (Martin White was my co-author.)

image

Source: http://www.zastavki.com/pictures/1024×768/2008/Saint_Valentines_Day_St.Valentine_004959_.jpg

What I found interesting is that the Microsoft Enterprise Search blog contained some information from Bjørn Olstad, CTO, FAST and Distinguished Engineer, Microsoft. The write up’s title is “Innovation on Linux and Unix,” and it appeared on February 4, 2010.

Mr. Olstad wrote:

When we announced the acquisition two years ago, we said that we were committed to cross-platform innovation—that we’d “continue to offer stand-alone versions of ESP that run on Linux and UNIX,” and that we would provide updates to these versions to address customer concerns and add new features.  Over the last two years, we’ve done just that.

The deal was consummated in April 2008. In October 2008, the Norwegian authorities seized some company information, but there has not been much news about the investigation into the pre-acquisition Fast Search & Transfer’s activities. In any event, it is now February 2010, so Microsoft has been operating Fast Search for the period between April 2008 and February 2010. That’s not quite two years, which is a nit, but software works when details are correct. What’s clear is that Fast Search and its Enterprise Search Platform or ESP is pared down and focused on the Windows platform.

I also noted this passage:

When we announced the acquisition two years ago, we said that we were committed to cross-platform innovation—that we’d “continue to offer stand-alone versions of ESP that run on Linux and UNIX,” and that we would provide updates to these versions to address customer concerns and add new features.  Over the last two years, we’ve done just that.


A Free Pass for Open Source Search?

February 11, 2010

Dateline: Harrod’s Creek, February 11, 2010

I read Gavin Clarke’s “Microsoft Drops Open Source Birthday Gift with Fast Lucidly Imaginative?” I think the point of the story, that the Microsoft announcement gives “a free pass” to “open source search providers like Lucid Imagination,” is interesting. However, I am not willing to accept the “free pass,” a variant of the “free lunch” in my opinion.

Here’s my view from the pleasant clime of snowy Harrod’s Creek.

First, in my opinion, most of the Fast Search & Transfer licensees bought into the “one size fits all” approach to search: facets, reports, access to structured and unstructured data, etc. As many of these licensees discovered, the cost of making Fast’s search technology deliver on the marketing PowerPoints was high. Furthermore, some like me learned how difficult it was for certain licensees to get the moving parts in sync quickly. Fast ESP consisted, prior to the Microsoft buy out, of keyword search, semantics from a team in Germany, third-party magic from companies like Lexalytics, home brew code from Norwegian wizards, and outright acquisitions for publishing and content management functionality. Wisely, many search vendors have learned to steer clear of the path that Fast Search & Transfer chopped through the sales wilderness. This means that orphaned Fast Search licensees may be looking at procurements that narrow the scope of search and content processing systems. In fact, there are only a handful of vendors who are now pitching the “kitchen sink” approach to search.

no free lunch copy copy

Source: http://www.graceforlife.com/uploaded_images/no_free_lunch-772769.jpg

Second, open source search solutions are not created equal. Some are tool kits; others are ready-to-run systems. Lucid Imagination has a good public relations presence in certain places; for example, San Francisco. For those who monitor the search space, there are some other open source vendors that may provide some options. I particularly like the open source version of Lucene available from Tesuji.eu. Ah, never heard of the outfit, right? I also find the FLAX system available from Lemur Consulting useful. I think the issues with Fast Search & Transfer are not going to be resolved by ringing up a single vendor and saying, “We’re ready to go with your open source solution.” The more prudent approach is going to be understanding what the differences among various open source search solutions are and then determining if an organization’s specific requirements match up to one of these firms’ service offerings. Open source, therefore, requires some work and I don’t think a knee jerk reaction or a sweeping statement that the Microsoft announcement will deliver a “free pass” is accurate.


Online Pricing: Disruption Is the Game

February 8, 2010

It’s Monday morning. The Super Bowl is over, but the world football ecosystem is unfazed. The same cannot be said of for-fee content. I want to point out two seemingly unrelated developments and link them to one of the keystones of doing business in an online, Web-centric world. I am working on a couple of oh-so-secret write ups, and I will make oblique references to research findings by the goslings here in Harrod’s Creek that will be more widely known in the spring.

image

When worlds collide. The boundary is the exciting spot in my opinion. Image source: http://www.sciencedaily.com/images/2008/01/080112152249-large.jpg

First, consider the plight of Google Books. Suddenly the Department of Justice is showing some moxie. That’s a good thing, but I think the reality of derailing Google Books is likely to have some interesting repercussions going forward. For now, the big story is that Google Books has become the poster child of Google being Google. You can get the received wisdom in the UK newspaper The Telegraph and its write up “Justice Department Criticises Google Books Settlement.” The glee is evident to me in this write up, but perhaps I am jaded and worn down by the approach certain publications take to Google. The company is essentially the first example of what will be a growing line up of firms that use technology to alter business processes. I will be talking about this in my NFAIS speech on March 1, 2010. I am the luncheon speaker, and I think some of those in the room will get indigestion. The reason is that Google comes from a domain that people within 20 years of my age of 65 don’t fully understand. The Telegraph doesn’t get it either, and I think this passage highlights that generational divide:

The ruling is a blow to Google and authors’ groups who had supported the search giant’s ambitious plan to create a vast online library of digitised books. The controversial Google Book Search project attracted fierce criticism from authors, who believed their rights were being eroded, while winning praise from other quarters for helping to widen access to classic, rare or useful works of literature.

Too bad the writer, a real journalist, omitted the word “goodie”. My hunch is that since national libraries have not shown any interest in creating digital collections, students and researchers will be doing their work the way John Milton and Andrew Marvell did. Great for those who have the time, money, and cursive writing skills. Not so great for those who need to sift through lots of content quickly. With library budgets shrinking and librarians forced to decide which books to keep, which to store, and which to trash, I think the failure of national libraries is evident. Google made a Googley and somewhat immature attempt to step into the breach and look what has resulted? A bureaucratic, legal eagle snarl. Books are an intellectual resource and I keep asking, “If not Google who?” Reed Elsevier? The British government? The National Library of China? A consortium of publishers? The answer is, in my opinion, now clear, “No one.” Maybe Google will keep going with this project. Hard to tell. It might be easier to shift gears, go directly to authors, and cut specific deals for their future work. In a decade or so, end of problem. Also, end of traditional publishing. If Google actually talked to me, I would offer this advice, “Go for it, dudes.”


Microsoft and Mikojo Trigger Semantic Winds across Search Landscape

January 28, 2010

Semantic technology is blowing across the search landscape again. The word “semantic” and its use in phrases like “semantic technology” has a certain trendiness. When I see the word, I think of smart software that understands information in the way a human does. I also think of computationally sluggish processes and the complexity of language, particularly in synthetic languages like English. Google has considerable investment in semantic technology, but the company wisely tucks it away within larger systems and avoids the technical battles that rage among different semantic technology factions. You can see Google’s semantic operations tucked within the Ramanathan Guha inventions disclosed in February 2007. Pay attention to the discussion of the system and method for “context”.

image

Gale force winds from semantic technology advocates. Image source: http://www.smh.com.au/ffximage/2008/11/08/paloma_wideweb__470x289,0.jpg

Microsoft’s Semantic Puff

Other companies are pushing the semantic shock troops forward. Yesterday I read Network World’s “Microsoft Talks Up Semantic Search Ambitions.” The article reminded me that Fast Search & Transfer SA offered some semantic functionality which I summarized in the 2006 version of the original Enterprise Search Report (the one with real beef, not tofu inside). Microsoft also purchased Powerset, a company that used some of Xerox PARC’s technology and its own wizardry to “understand” queries and create a rich index. The Network World story reported:

With semantic technologies, which also are being referred to as Web 3.0, computers have a greater understanding of relationships between different information, rather than just forwarding links based on keyword searches. The end game for semantic search is “better, faster, cheaper, essentially,” said Prevost, who came over to Microsoft in the company’s 2008 acquisition of search engine vendor Powerset. Prevost is still general manager of Powerset. Semantic capabilities get users more relevant information and help them accomplish tasks and make decisions, said Prevost.

The payoff is that software understands humans. Sounds good, but it does little to alter the startling dominance of Google in general Web search and the rocket like rise of social search systems like Facebook. In a social context humans tell “friends” about meaning or better yet offer an answer or a relevant link. No search required.

I reported about the complexities of configuring the enterprise search system that Microsoft offers for SharePoint in an earlier Web log post. The challenge is complexity and the time and money required to make a “smart” software system perform to an acceptable level in terms of throughput in content processing and for the user. Users often prefer to ask someone or just use what appears in the top of a search results list.


The Blank Spaces in Social Media

January 25, 2010

For the last 14 months I have written a monthly column for Information World Review. I don’t recycle that information in this Web log. In fact, I try to steer clear of repeating information within and across my monthly columns and this Web log. I thought I would have a dearth of information with the writing demands these place upon me and the equally addled goslings.

I was wrong.

On February 1, 2010, we are going to create a second Web log with the very hot title of SSN. I won’t reveal what it is about. I can say that it will NOT discuss the social security numbering system. I am going to operate the information service as a test for several months. If we hit a comfortable stride, then we will shift from a public beta test to a full-scale operation.

Yes, we will accept advertising, advertorials, and other marketing tie ups. Some of the conventions of Beyond Search and the ArnoldIT.com services will be linked to the new Web log. No, we have not worked out the details, but one of the team is going to grab hold of this angle and manage this aspect of the new information service.

The broad topic area will fit between real time search (my Information World Review column), my Google write ups (the KMWorld column), and my area of expertise (large scale online search and systems). We will have the exact positioning hammered out by Wednesday of next week with the first content live online a few days later.

A Real Editor

The editor for this Monday through Friday Web log will be Jessica Bratcher, a former newspaper editor. She continues to instruct me in how “real” journalists work. I will never learn because I am a sales person with few skills and not much energy.

She has assembled a team of goslings who will follow the conventions of the Web log world with a heck of a lot more journalistic acumen than I bring to the write ups in this Web log.

The Content

The Web log will feature some new approaches to content germane to online information.

First, each week there will be a dialog about a particular online issue of interest to business professionals. The idea is to take a topic and look at it from different viewpoints. In Beyond Search, there is a single point of view, and we want to explore topics from different angles. The trope will be a semi-Socratic dialog involving my partners in this new, free online information service. Even though different people will be involved, you will recognize the dialog from its new icons:

goose head tern head

Notice that both icons represent squawking and noisy birds. The idea is to have an edge and present information a person involved in business will find somewhat useful.

Second, there will be lists. A traditional Web log forces certain content into a stack with the most recent information at the top and the older information buried at the bottom of the pile. The new Web log will put certain information—such as lists and reference information—on pages that are static. We think you will find it easier to locate some of the special content we are gathering for this new information service.


Google and Its Security Woes

January 18, 2010

There are some practical issues that must be addressed when dealing with security. First, the people working on the security problem have to be vetted. This requires time and organization. Organizations in a hurry and not well organized are at greater risk than a plodding, more methodical outfit. Although troubling to some, the security people have to be subject to some type of monitoring as well. The idea is that layers of security methods and procedures are required. Again, this takes expertise and experience. Short cuts can increase risk.

Then when something bad happens, it is a good idea to look for indications that someone close to the matter is involved, intentionally or unintentionally. Some countries use clever methods to socially engineer an opportunity to exploit a weakness in security. I know that the idea of a team implies that everyone is going to run the game plan. Alas, that’s not always accurate.

In my experience, keeping an issue contained is a prudent first step. The idea that quick reaction or chatter helps may be an inaccurate one. Some outputs are necessary, but crazy talk is rarely helpful whether from pundits, poobahs, satraps, or azure chip consultants.

I was surprised to read several widely circulated news stories that provide some additional “information” or “disinformation” about the Google security matter. The word “attack” is attached to this issue, but I don’t know enough to be able to say whether this was an “attack” or one of those cute things that math club members perpetrate as a way to get attention, change grades for the football team, or transfer cafeteria money to a charity like Midnight Auto Supply.

image

The Great Wall of China was built for a reason. Some of those reasons exist for today’s Chinese governmental entities. Those who built the Great Wall were not concerned with the environmental or financial impact of the Great Wall. Priorities may be different in China than in other geographic areas or nation states. Image source: http://www.globusjourneys.com/Common/Images/Destinations/great-wall.jpg

That’s the problem with lots of information or lots of disinformation. There is uncertainty, what I call a “cloud of unknowing”.

Here’s what’s caught my attention. (Keep in mind that I have no solid opinion on this matter because I only know what flops into my newsreader and that information or disinformation is suspect by definition.)


Search Vendors Working the Content Food Chain

January 13, 2010

In the last six months, I have noticed that three companies are making an effort to respond to ZyLAB’s success in the end-to-end content processing sector. There has been some uninformed and misleading discussion of search and content processing companies’ shift to vertical market solutions. I think this view distorts what some vendors are doing; namely, when one company finds a way to make sales, the other vendors pile into the Volkswagen. This is not so much “imitation as flattery”. What is happening is that sales are tough to make. When a company finds an angle, the stampede is on. In a short period of time, an underserved sector in search and content processing has more people stomping around than Lady Gaga.

Let’s go back in history, a subject that most of the poobahs, azure chip consultants, and self appointed experts avoid. The idea that certain actions have surfaced before is no fun. Identifying a “new” trend is easier, particularly when the trend spotter’s “history” extends to his / her last Google query.

stirp copy

The Mobius strip is non-orientable, just like search solutions that provide end-to-end solutions. A path on a Mobius strip can be twice as long as the original strip of paper. That’s a good way for me to think about end-to-end search and content processing systems. Costs follow a similar trajectory as well.

In the dim mists of time, one of the first outfits to offer an end-to-end solution to content acquisition, indexing, and search was, believe it or not, Excalibur. The first demonstration I received of the Excalibur RetrievalWare technology included scanning, conversion of the scanned image’s text to ASCII, indexing of the ASCII for an image, and search. The information processed in that demonstration was a competitor’s marketing collateral. There were online search systems, but these were mostly small scale systems due to the brutal costs of indexing large domains of HTML. A number of companies were pushing forward with the idea of integrated scanning systems. Sure, in the 1990s you could buy a high end scanner and software. But in order to build a system that minimized the fiddly human touch, you had to build the missing components yourself. Excalibur hooked up with resellers of high end scanners from companies like Bell+Howell, Fujitsu, and others. The notion of taking a scanned image, performing optical character recognition on the page image in memory, and then indexing that ASCII was a relatively new method. UMI (a unit of Bell+Howell) had a sophisticated production process to do this work. Big outfits like Thomson were interested in this type of process because lots of information in the early 1990s was still in hard copy form. To make a long story short, the Excalibur engineers were among the first to create a commercial product that mostly worked, well, sort of. The indexing was an issue. Excalibur embarked on a journey that required enhancing the RetrievalWare product and generating ready-to-use controlled vocabularies for specific business sectors like defense and banking. As you may know, Excalibur’s original vision did not work so the company morphed into a search and content processing company with a focus on business intelligence. The firm renamed itself as Convera.

The origins of the company were mostly ignored as the Convera package of services chased government work and commercial accounts like Intel and the National Basketball Association (data center SaaS functions for the former and video searching for the hoopsters). When those changes did not work out too well, Convera refocused to become a for fee version of the free Google custom search engine. That did not work out too well either, and the company has been semi-dissolved.
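
For readers who want to see the scan, convert to text, index, and search chain from the Excalibur demonstration in miniature, here is a rough Python sketch. It is not how RetrievalWare worked. The `ocr_page` function is a hypothetical stand-in (a real system would call an OCR engine), the sample pages are invented, and the index is a plain keyword inverted index.

```python
import re
from collections import defaultdict


def ocr_page(page_image: str) -> str:
    """Hypothetical OCR step. In this sketch a 'page image' is already text."""
    return page_image


def tokenize(text: str) -> list:
    """Lowercase the text and split it into simple alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages: dict) -> dict:
    """Map each term to the set of page ids whose OCR text contains it."""
    index = defaultdict(set)
    for page_id, image in pages.items():
        for term in tokenize(ocr_page(image)):
            index[term].add(page_id)
    return index


def search(index: dict, query: str) -> set:
    """Return page ids that contain every term in the query (AND logic)."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Invented sample "scans" standing in for a box of marketing collateral.
pages = {
    "brochure-p1": "Pattern matching search for scanned documents",
    "brochure-p2": "Controlled vocabularies for defense and banking",
}
index = build_index(pages)
hits = search(index, "scanned documents")
```

The fiddly part in real life is everything this sketch waves away: scanner feeds, OCR errors, and the controlled vocabularies needed to make the index useful.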

Why’s this important?

First, the history shows that end-to-end processing is not new. As with many of the hot search innovations, I find the discoveries of the azure chip crowd a “been there, done that” experience. Processing paper and making it searchable is a basic way to approach certain persistent problems.

Second, the synopsis of the Excalibur trajectory makes clear that senior managers of search and content processing companies scramble, following well worn paths. The constant repositioning and restating of what a technology allegedly does is a characteristic of search and content processing.

Third, the shifts and jolts in the path of the Excalibur / Convera entity are predictable. The template is:

  1. Start with a problem
  2. Integrate
  3. Sell
  4. Engineer fixes on the fly
  5. Fail
  6. Identify a new problem
  7. Rinse, repeat.

What has popped out of my Overflight intel system is that law firms are now looking for a solution to a persistent information problem; that is, when a legal matter fires up, most search systems work just fine with content in electronic form. The hitch is that a great deal of paper is produced. If something exists in digital form and one law firm must provide that information to another law firm, some law firms convert the digital information to paper, slap on a code, and have FedEx deliver boxes of paper. The law firm receiving this paper no longer has the luxury of paying minions to grind through the paper. The new spin on the problem is that the law firm’s information technology people want to buy a hardware-software combination that allows a box of paper to be put in one end, with the steps between the hard copy and the searchable, electronic instance of the documents magically completed.

Well, that’s the idea. Some of the arabesques that vendors slap on this quite difficult problem include:

  1. Audit records so a law firm knows who looked at what when and for how long
  2. A billing method. Law firms want to do invoices, of course
  3. A single point solution so there is “one throat to choke”.

What the companies want is what Excalibur asserted it had almost 20 years ago.

ZyLAB, under the firm hand of Johann Scholtes (a former Dutch naval officer), has made inroads in this market sector. You can read an interview with him in the Search Wizards Speak series, so I won’t recycle that information in this write up.

Autonomy was quick to move to build out its end-to-end solutions for law firms and other clients with a paper and digital content problem. In fact, Autonomy just received an award for its end-to-end eDiscovery platform.

Brainware offers a similar system. That company, a couple of years ago, told me that it had to add staff to handle the demand for its scanning and search solution. Among the firm’s largest customers were law firms and, not surprisingly, the Federal government. You can read an interview with a Brainware executive (who is an attorney) in the Search Wizards Speak series.

I learned that Recommind has inked a deal with Daeja Image Systems for its various document processing software components. The idea is to be able to provide an end-to-end solution to law firms, government agencies, and other outfits that need a system that provides access to paper based content and digital content.

Let’s step back.

What this addled goose sees in these recent announcements is that the “new” is little more than a rediscovery that law firms have not yet cracked the back of the paper to digital job and been able to get a search system that provides access to the source material. Sure, there were solutions 20 years ago, but those solutions don’t meet a continuing need. Notice that this problem has been around for a long time, and I don’t think the present crop of solutions will solve the problem fully.


Lazarus, Azure Chip Consultants, and Search

January 8, 2010

A person called me today to tell me that a consulting firm is not accepting my statement “Search is dead”. Then I received a spam email that said, “Search is back.” I thought, “Yo, Lazarus. There be lots of dead search vendors out there. Example: Convera.”

Who reports that search has risen? An azure chip consultant! Here’s what raced through my addled goose brain as I pondered the call and the “search is back” T shirt slogan:

In 2006, I was sitting on a pile of research about the search market sector. The data I collected included:

  • Interviews with various procurement officers, search system managers, vendors, and financial analysts
  • My own profiles of about 36 vendors of enterprise search systems plus the automated content files I generate using the Overflight system. A small scale version is available as a demo on ArnoldIT.com
  • Information I had from my work as a systems engineering and technical advisor to several governments and their search system procurement teams
  • My own experience licensing, testing, and evaluating search systems for clients. (I started doing this work after we created The Point (Top 5% of the Internet) in 1993 and sold it to Lycos, a unit of CMGI. I figured I should look into what Lycos was doing so I could speak with authority about its differences from BRS/Search, InQuire, Dialog (RECON), and IBM STAIRS III. I had familiarity with most of these systems through various projects in my pre-Point life.)
  • My Google research funded by the now-defunct Bear Stearns outfit and a couple of other well heeled organizations.

What was clear in 2006 was the following:

First, most of the search system vendors shared quite a bit of similarity. Despite the marketing baloney, the key differentiators among the flagship systems in 2006 were minor. Examples range from their basic architecture to their use of stemming to the methods of updating indexes. There were innovators, and I pointed out these companies in my talks and various writings, including the three editions of the Enterprise Search Report I wrote before I fell ill in February 2007 and quit doing that big encyclopedia type publication. These similarities made it very clear to me that innovation for enterprise search was shifting from the plain old key word indexing of structured records available since the advent of RECON and STAIRS to a more freeform approach with generally lousy relevance.

image

Get information access wrong, and some folks may find a new career. Source: http://www.seeing-stars.com/Images/ScenesFromMovies/AmericanBeautyMrSmiley%28BIG%29.JPG

Second, the more innovative vendors were making an effort in 2006 to take a document and provide some sort of context for it. Without a human indexer to assign a classification code to a document that is about marketing but does not contain the word “marketing”, this was rocket science. But when I examined these systems, there were two basic approaches which are still around today. The first was to use statistical methods to put documents together and make inferences and the other was a variation on human indexing but without humans doing most of the work. The idea was that a word list would contain synonyms. There were promising demonstrations of software methods that could “read” a document, but these were piggy and of use only where money was no object.
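
To make the word list idea concrete, here is a minimal Python sketch of tagging a document as “marketing” even though the word never appears in it. The categories, the synonym lists, the `min_hits` threshold, and the sample memo are all invented for illustration; no vendor’s actual method is implied, and a real controlled vocabulary would be far larger and would handle phrases and stemming.

```python
import re

# Hypothetical word list: each category maps to terms treated as synonyms
# or close relatives of the category label.
CATEGORY_TERMS = {
    "marketing": {"marketing", "advertising", "campaign", "brand", "promotion"},
    "legal": {"legal", "litigation", "contract", "attorney", "discovery"},
}


def assign_categories(text: str, min_hits: int = 2) -> list:
    """Return categories whose word list matches at least min_hits distinct words."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(
        category
        for category, terms in CATEGORY_TERMS.items()
        if len(words & terms) >= min_hits
    )


# The memo never uses the word "marketing", yet two list terms match.
memo = "The spring campaign will refresh the brand across all regions."
tags = assign_categories(memo)
```

The threshold is the design choice to watch. Set it too low and every memo that mentions a contract becomes “legal”; set it too high and the human indexer comes back.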

Third, the Google approach, which used social methods (that is, a human clicking on a link), was evident but not migrating to the enterprise world. Google was new but to make their 2006 method hum, lots of clicks were needed. In the enterprise, most documents never get clicked, so the 2006 Google method was truly lousy. Google has made improvements, mostly by implementing the older search methods, not by pushing the envelope as it has been doing with its Web search and dataspace efforts.

Fourth, most of the search vendors were trying like Dickens to get out of a “one size fits all” approach to enterprise search. Companies making sales were focusing on a specific niche or problem and selling a package of search and content searching that solved one problem. The failure of the boil the ocean approach was evident because user satisfaction data from my research funded by a government agency and other clients revealed that about two thirds of the users of an enterprise search system were dissatisfied or very dissatisfied with that search system. The solution, then, was to focus. My exemplary case was the use of the Endeca technology to allow Fidelity UK sales professionals to increase their productivity with content pushed to them using the Endeca system. The idea was that a broker could click on a link and the search results were displayed. No searching required. ClearForest got in the game by analyzing the dealer warranty repair comments. Endeca and ClearForest were harbingers of focus. ClearForest is owned by Thomson Reuters and in the open source software game too.

When I wrote the article in Online Magazine for Barbara Quint, one of my favorite editors, I explained these points in more detail. But it was clear that the financial pressures on Convera, for example, and the difficulty some of the more promising vendors like Entopia were having made the thin edge of survival glint in my desk lamp’s light. Autonomy by 2006 had shifted from search and organic growth to inorganic growth fueled by acquisitions that were adjacent to search.

