HP and Its New IDOL Categorizer

January 1, 2014

I read “Analytics for Human Information: Optimize Information Categorization with HP IDOL.” I noticed that HP’s write-up did not cite the original source for the 1998 categorization technology. From my point of view, news about something developed 15 years ago and referenced in subsequent Autonomy collateral is not fresh. In fact, presenting the categorizer as something “amazing” suggests a superficial grasp of the history of IDOL technology, which dates from the late 1980s and early 1990s. It is fascinating how some “experts” in content processing reinvent the wheel and display their intellectual process in such an amusing way. Is it possible to fool oneself and others? Remarkable.

Update, January 1, 2014, 11 am Eastern:

Hewlett Packard is publicizing IDOL’s automatic categorization capability. As a point of fact, this function has been available for 15 years. Here’s a description from a 2001 Autonomy IDOL Server Technical Brief:

IDOL server can automatically categorize data with no requirement for manual input whatsoever. The flexibility of Autonomy’s Categorization feature allows you to precisely derive categories using concepts found within unstructured text. This ensures that all data is classified in the correct context with the utmost accuracy. Autonomy’s Categorization feature is a completely scalable solution capable of handling high volumes of information with extreme accuracy and total consistency. Rather than relying on rigid rule based category definitions such as Legacy Keyword and Boolean Operators, Autonomy’s infrastructure relies on an elegant pattern matching process based on concepts to categorize documents and automatically insert tag data sets, route content or alert users to highly relevant information pertinent to the users profile. This highly efficient process means that Autonomy is able to categorize upwards of four million documents in 24 hours per CPU instance, that’s approximately one document, every 25 milliseconds. Autonomy hooks into virtually all repositories and data formats respecting all security and access entitlements, delivering complete reliability. IDOL server accepts a category or piece of content and returns categories ranked by conceptual similarity. This determines for which categories the piece of content is most appropriate, so that the piece of content can subsequently be tagged, routed or filed accordingly.
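
A quick check of the throughput arithmetic in the brief, using only the two figures it gives (four million documents, 24 hours, one CPU instance). The variable names are mine, not Autonomy’s, and this is a back-of-the-envelope sketch, not a benchmark:

```python
# Sanity check of the throughput claim quoted from the 2001 brief.
# Inputs come from the quoted text; the names are illustrative only.
DOCS_PER_DAY = 4_000_000            # "upwards of four million documents in 24 hours"
SECONDS_PER_DAY = 24 * 60 * 60      # 86,400 seconds
MS_PER_DAY = SECONDS_PER_DAY * 1000

ms_per_doc = MS_PER_DAY / DOCS_PER_DAY            # time budget per document
docs_per_second = DOCS_PER_DAY / SECONDS_PER_DAY  # sustained rate per CPU instance
docs_per_day_at_25ms = MS_PER_DAY / 25            # what a flat 25 ms would allow

print(f"{ms_per_doc:.1f} ms per document")
print(f"{docs_per_second:.1f} documents per second")
print(f"{docs_per_day_at_25ms:,.0f} documents per day at 25 ms each")
```

Four million documents a day works out to 21.6 milliseconds per document, so the brief’s “every 25 milliseconds” is a loose rounding. At a flat 25 milliseconds the daily total would be about 3.46 million, a bit short of the four million claimed.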

Stephen E Arnold, January 1, 2014

Comments

One Response to “HP and Its New IDOL Categorizer”

  1. Paul T. Jackson on January 2nd, 2014 12:30 pm

    Of course some of the search people are advocating what used to be known as Portals…now named something else (I forget what the book said)…pretty much the same thing. I guess this means we won’t need taxonomists any longer, or taxonomy dictionaries. I wonder if that means it can tell the difference between drums as in percussion and drum parts (hardware, accessories, etc.). It would be a real discovery for some of the music instrument dealer sites whose search engines should give up and use Google.

    Back in the late 1990s, Autonomy, the British company (before it sold out), was supporting a program called Autonomy which would search the client computer and the net at the same time. It was pretty slow on the PC, and they stopped supporting it, which is a shame. Later someone had a plug-in for Outlook called Lookout that was able to search not only emails, but the entire client machine…and it was fast. Then it was also dumped.

    I wonder why these good things that seem to work keep getting buried and bumped from the scene. Maybe IDOL has come back and works better than what we have now.