IBM’s ICAwES Red Book Available

April 17, 2014

The article titled Building Enterprise Search Solutions Using IBM Content Analytics with Enterprise Search has IBM rolling out information about ICAwES. That excellent acronym stands for IBM® Content Analytics with Enterprise Search, as you may have guessed. It allows for customized synonym dictionaries for search, annotators, and the integration of diverse kinds of repositories. The abstract explains,

“With ICAwES enterprise search solutions, you can integrate fields from multiple content repositories to create a single, integrated user search experience. In addition, the enterprise search solutions can use fields and facets in various ways to create diverse views of your search result set, thus helping you identify the hidden meaning of your unstructured content. This IBM Redbooks® Solution Guide explains, from a high level, how to build enterprise search solutions using ICAwES.”

A red book is available through IBM Redbooks. It offers information on using the “text classification capability,” the “LanguageWare Resource Workbench” and “IBM Content Assessment.” It is aimed at IT architects and business users interested in expanding their usage and improving customer satisfaction and business operations. All of this is interesting information. The reference to the “billion dollar baby Watson” appears in the footer, but not in the explanation of ICAwES.

Chelsea Kerwin, April 17, 2014

Sponsored by, developer of Augmentext

ArnoldIT Video: Search Brands Video

April 15, 2014

Whatever happened to Convera and the other four companies comprising the Top Five in enterprise search: Autonomy, Endeca, Fast Search & Transfer and Verity? The video also mentions Exalead and ISYS Search Software. The wrap up to the video points to three open source enterprise search options. For those who want to be reminded of the Golden Age of enterprise search, check out the free, six minute video from Stephen E Arnold, publisher of Beyond Search. Mr. Arnold is converting some of his research into brief, hopefully entertaining and useful free videos. You can access this short search history lesson online. The next video in the series tackles the subject of buzzwords, argot, jargon, lingo, and verbal baloney. What vendor is the leader in the linguistic linguini competition? The video will be available before the end of April. In the meantime, take a walk down memory lane and learn how Cornelius Vanderbilt obtained needed information in the early 19th century.

Kenneth Toth, April 15, 2014

Highspot Earning Fans of Enterprise Search

April 9, 2014

Enterprise search is like the weather: it never stays the same for very long. Case in point: upstart Highspot is making some serious waves regarding money-producing content. We learned more from a recent eWeek story, “Highspot Brings Machine Learning to the Enterprise.”

According to the article, which quotes the company’s CEO, Robert Wahbe:

“’We spent a ton of time and money producing content, and what came clear to me was that people are not able to find the content they are looking for when they need it,’ Wahbe told eWEEK. Wahbe cited a Forrester study that says there is a knowledge gap where the failure rate for not finding the information users are looking for is 56 percent, while the process of looking for the information wastes up to 12 percent of users’ time.”

Clearly, they have their heads on straight. Thankfully, we are not the only ones noticing. No less than PC World praised Highspot recently, raving about how the company helps find all the meaty data that slips through the cracks for most people. We are big fans, and the world seems to be catching on that enterprise search has a new high water mark with Highspot.

Patrick Roland, April 09, 2014


RAVN: SOLR Search and Autonomy Services

March 22, 2014

My Overflight system flagged news about RAVN’s enterprise search and DocAuto, a company that “makes matter-centricity, email management, IDOL management, and other content management operations flexible, seamless, and secure.” I must admit I was not sure what DocAuto did. I have a fleeting recollection of learning about RAVN when I was at a very disorganized enterprise search conference in London in 2013. I don’t know if the conference was in a tizzy or whether the speakers were suffering from jet lag.

RAVN’s Web site asserts that the company delivers “the power of understanding.” I’m okay with tag lines. I am not exactly sure what “understanding” means in the RAVN context, but most outfits offering “enterprise search” use words that sound like they are full of freight. I ask questions like “What is understanding?” and chuckle as I listen to the marketer explain “understanding” to me. Most of these folks are not epistemologists, however.

RAVN’s Web site offers solutions for Big Data, the power of understanding, real time understanding, and knowledge management. I am not sure what any of these buzzwords means. I write a column for KMWorld, and, truth be told, I have absolutely no idea about the meaning of “knowledge” or, for that matter, “management.” I worked at Booz, Allen & Hamilton (at one time one of the world’s leading management consulting firms), and I never understood what “management” meant. I think it was a way to bill clients for 20-somethings to do outsourced work. Don’t hold me to this idea because at age 70, the past grows more hazy with each passing day.

The capabilities of RAVN include a knowledge graph, enterprise search, an expert locator, sentiment, and core. I clicked on the enterprise search link and learned:


The words explaining this diagram embraced “connecting to and unifying diverse content repositories.” I think that means “federated search.” RAVN “surfaces results in meaningful ways.” I am not sure what this means. RAVN search delivers relevance ranking, “enterprise scale content security,” enterprise search “scalability,” and “performance.”

The firm offers a power of understanding approach and provides a short video explaining how I can “harness the power of understanding.” The video replaces chaos with structure. The system learns the user’s interests. RAVN puts a user ahead of the competition. RAVN handles text, audio, video, and knowledge.


This manual work is not good.


The automatic RAVN system is good.

RAVN offers a core, a knowledge graph, and SharePoint support.

The company’s services include support for Autonomy IDOL, which appears to have influenced the bold assertions about RAVN’s own search system, and SOLR. My hunch is that RAVN will provide an open source solution with some connectors and software wrappers.

I will keep my eye on RAVN search. For now, the company is in buzzword marketing mode.

Stephen E Arnold, March 22, 2014

SearchBlox Offers Enterprise Search In Spirit of Endeca

February 17, 2014

The sponsored article titled Faceted Enterprise Search from Searchblox on Web Designer Depot promotes SearchBlox as a viable alternative to Google Mini or Search Appliance for enterprise search. The article provides screenshots to show the simplicity of setup in detail. The article explains,

“SearchBlox has crawlers that work for filesystems, websites, RSS feeds, and databases that work straight out-of-the-box. They can index both public and protected content, and can be set to crawl on a specified schedule so your users’ searches are always up to date.

The faceted search plugin that comes with SearchBlox is jQuery based, so it’s easy to integrate it into your website or application. Running WordPress? There’s a custom WP plugin for searching and indexing your WordPress site”

It sounds like the spirit of Endeca is still alive. Before SearchBlox can index and search the various file types, all the user must do is set folder paths or root URLs. SearchBlox promises quick, faceted search built on Apache Lucene. Users can manage everything through a web-based administrative console. SearchBlox allows for crawling third party websites, an indexing API, synonym searches and customizable stopwords. All of these capabilities make SearchBlox an interesting choice for enterprise search.
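
The faceted search idea described above can be illustrated with a minimal Python sketch. This is not SearchBlox code and it does not use Lucene. The sample documents, the field names, and the helpers `search`, `facet_counts`, and `filter_by_facet` are all hypothetical, chosen only to show how facet values are counted over a result set and then used to narrow it.

```python
from collections import Counter

# Hypothetical sample documents; the field names are illustrative only.
DOCS = [
    {"title": "Q1 sales report", "filetype": "pdf", "source": "filesystem"},
    {"title": "Pricing page", "filetype": "html", "source": "website"},
    {"title": "Q2 sales report", "filetype": "pdf", "source": "filesystem"},
    {"title": "Annual report archive", "filetype": "html", "source": "website"},
    {"title": "Release notes feed", "filetype": "xml", "source": "rss"},
]


def search(docs, term):
    """Return documents whose title contains the term (case-insensitive)."""
    term = term.lower()
    return [d for d in docs if term in d["title"].lower()]


def facet_counts(results, field):
    """Count how many results carry each value of the given field."""
    return Counter(d[field] for d in results)


def filter_by_facet(results, field, value):
    """Narrow a result set to the documents matching one facet value."""
    return [d for d in results if d[field] == value]


results = search(DOCS, "report")
print(facet_counts(results, "filetype"))   # counts shown next to each facet
print(filter_by_facet(results, "filetype", "html"))  # user clicks "html"
```

A production engine computes these counts inside the index rather than by scanning result lists, but the user-facing behavior is the same: each facet value shows a count, and selecting one narrows the results.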

Chelsea Kerwin, February 17, 2014


Google Puts Some Effort into the Google Search Appliance

February 12, 2014

Last I knew, the Google Search Appliance (GSA) had trimmed its product line, eliminated the impulse buy option for the Mini, and kept the price at the higher end of the appliance market.

I learned over the last two years that Google has placed more than 60,000 GSAs in organizations. I have no idea if the number is valid, but if it is, the GSA is one of the top dogs in enterprise search. I also heard that there was a small team working on the GSA and an even smaller team handling customer support. Google pushes functions to resellers who deal with the customers. Google outsources manufacturing of the GSA. Most important, Google seems to have an off-again, on-again interest in on-premises search. The future, as I understand it, is the cloud. The GSA is, in my opinion, an anachronism in the Nest, X Labs, and Android-Chrome world. But, hey, I have been wrong before. I once asserted that basic search should not be a challenge for most organizations. Wow, did I get that wrong! Jail time, lawsuits, and DARPA’s almost admission that search is not working notwithstanding.


The GSA has been around almost a decade. Version 7.2 is “a leader in the Gartner Enterprise Search MQ.” I certainly don’t doubt the word of an estimable azure chip consulting firm. No, no, no.

The new version, according to Google, delivers:

  • Metadata sorting. A function available in the 1983 version of Fulcrum Technologies’ system.
  • Language translation. A function available from Delphes in the 1990s.
  • A document preview function. iPhrase in 1999 delivered this feature.
  • Entity recognition. Verity implemented this function in the 1980s.
  • Dynamic navigation. Endeca rolled out this feature in 1998.

In my opinion, the GSA is catching up to innovations available for many years from other vendors. Comparing the EPI Thunderstone and MaxxCAT appliances to the GSA emphasizes that the GSA is not quite at parity with other products in the channel.

According to “Google Updates Enterprise Search Appliance Tool,”

“The GSA 7.2 update comes more than a year after the firm upgraded the GSA to version 7.0, and builds on the features included in that update. The most notable includes the ability to improve the way data can be indexed with key attributes, such as author name, or the date it was created.”

How much does a GSA cost? According to the US government’s price listing, a 36 month license for a GB 7007 is $69,296 for 500,000 documents. Have more documents? Pay for an upgrade. However, I can use a hosted service like Blossom Software to index my content for about $2,400 per month. I can use the low cost dtSearch solution for $160 per seat. I can download an open source solution and do it myself.

For an organization with 20 million documents to index, the cost of the GSA solution noses into HP Autonomy territory. Too rich for my blood, and I think that lower cost appliance vendors will see the Google Search Appliance as a lead generator.
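
For readers who want to check the arithmetic behind these figures, here is a rough Python sketch. The license price, term, and document limit come from the post. The linear scaling to 20 million documents is purely an assumption for illustration, because the post does not give Google’s actual upgrade pricing. The function names are hypothetical.

```python
# Figures quoted in the post.
GSA_LICENSE_USD = 69_296      # 36 month license
GSA_LICENSE_MONTHS = 36
GSA_DOC_LIMIT = 500_000       # documents covered by that license


def monthly_cost(total_usd, months):
    """Spread a fixed license fee evenly over its term."""
    return total_usd / months


def cost_per_document(total_usd, docs):
    """License fee divided by the number of documents it covers."""
    return total_usd / docs


def linear_estimate(total_usd, doc_limit, target_docs):
    """ASSUMPTION: price scales linearly with document count.

    Real upgrade tiers are not stated in the post and may differ.
    """
    return total_usd * (target_docs / doc_limit)


gsa_monthly = monthly_cost(GSA_LICENSE_USD, GSA_LICENSE_MONTHS)
gsa_per_doc = cost_per_document(GSA_LICENSE_USD, GSA_DOC_LIMIT)
gsa_20m = linear_estimate(GSA_LICENSE_USD, GSA_DOC_LIMIT, 20_000_000)

print(f"GSA per month:    ${gsa_monthly:,.2f}")
print(f"GSA per document: ${gsa_per_doc:.4f}")
print(f"GSA at 20M docs (linear assumption): ${gsa_20m:,.0f}")
```

Under the linear assumption, 20 million documents is 40 times the licensed limit, so the estimate is 40 times the quoted license fee over the same 36 months.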

I wonder if those azure chip consultants have licensed the GSA to handle their Intranet information retrieval tasks?

Stephen E Arnold, February 12, 2014

Search Application Perspectives For 2014

January 20, 2014

With 2014 well under way, search experts are trying to predict what will happen in enterprise search. Search Appliance World has an article that looks at the past and future of enterprise search called “The New Search Appliance Landscape: Reflections And Predictions With MaxxCAT.” Basic search commands that come in out-of-the-box systems are old school and do not provide the robust solution enterprise systems need.

Search appliances became enterprise users’ favorite toys and everyone had to have the Google Mini Search Appliance, but those days are gone. Other search developers, such as MaxxCAT, stepped up to the plate.

The article states:

“ ‘In 2013, we saw a lot of the fallout from that as customers realized they couldn’t replace their Google Mini appliance and went looking for viable alternatives that weren’t $30K. For us, this lead to a huge boost in sales of our entry level appliances and even some additional sales of our enterprise series appliances,’ MaxxCAT Director of Marketing & Sales Chris Whissen told Search Appliance World.”

The MaxxCAT developers are interested in exploring new markets their search appliance could expand into. The company is also big on customer service and ensuring that clients know they are valued. Its biggest endeavor, though, is offering MaxxCAT’s clients an efficient solution to their search problems and encouraging more competition in the search application market. Google is no longer the small player, but some of its solutions have grown too expensive for its former clients. New companies like MaxxCAT keep the market fresh and offer up new ideas.

Whitney Grace, January 20, 2014


SYL Semantics Resurfaces with News of Patent Bill

January 2, 2014

The article Patent Removal Regretted, But Search Firm Pushes On from ComputerWorld explores the consequences of the Patents Amendment Bill on SYL Enterprise Search in New Zealand. SYL distinguishes itself from most Enterprise Search companies by basing its work not on hype but on “access to relevant information.”

The article states:

“SYL’s platform is based on a dictionary of 580,000 English words, with records of associations among them, such as what words are synonyms and how the concepts they indicate are related; for example that Wellington is in New Zealand. Specialist dictionaries can be added to deal with particular business areas with their own vocabularies. Surveys indicate as much as 25 percent of an executive’s time can be consumed by searching for information”

SYL’s engine reduces time-wasting manual metadata creation by automatically generating metadata from associations among the words in a document. The clause in the New Zealand bill stating that a computer program does not qualify as a patentable invention would not affect the patent that SYL already holds on its techniques, but that has not stopped SYL CEO Sean Wilson from voicing his dissent. He suggests that the time and investment put into any invention would be wasted if it were impossible to patent and protect against imitation.
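
As a rough illustration of dictionary-driven metadata generation, here is a toy Python sketch. It is not SYL’s patented technique, whose details the article does not describe. The two small lookup tables and the `generate_metadata` helper are hypothetical stand-ins for the 580,000 word dictionary and its association records. Only the Wellington example comes from the article.

```python
import re

# Toy association records (hypothetical entries). The article mentions
# synonym records and relations such as "Wellington is in New Zealand".
SYNONYMS = {"car": "automobile", "auto": "automobile"}
LOCATED_IN = {"wellington": "New Zealand", "auckland": "New Zealand"}


def generate_metadata(text):
    """Derive simple metadata tags from the words found in a document."""
    words = re.findall(r"[a-z]+", text.lower())
    # Map each known synonym to its canonical concept.
    concepts = sorted({SYNONYMS[w] for w in words if w in SYNONYMS})
    # Record known places and the countries they are associated with.
    places = sorted({w.title() for w in words if w in LOCATED_IN})
    countries = sorted({LOCATED_IN[w] for w in words if w in LOCATED_IN})
    return {"concepts": concepts, "places": places, "countries": countries}


meta = generate_metadata("The Wellington office leases a car and an auto fleet.")
print(meta)
```

The document never mentions New Zealand, yet the generated metadata includes it, which is the kind of association a searcher could then filter on without anyone tagging the document by hand.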

Chelsea Kerwin, January 02, 2014


Enterprise Search Market Diversifies and Competition Increases

December 30, 2013

The article Enterprise Search Pie on HadoopSphere makes an interesting analogy between a pie heating up and enterprise search. The article claims to bear witness to the altering landscape of the search market. Some of the trends noted include more in-your-face pricing by conservative software, a rising interest in Solr and Lucene-based offerings, cloud-based setups and “key spike in the offerings basket.” Analytics for search and content also play a part in enterprise setups, especially for eDiscovery, e-commerce and decision and content management systems.

The article also explains how Cloudera Search is a part of this change:

“Cloudera Search has Apache Solr integrated with CDH, including Apache Lucene, Apache SolrCloud, Apache Flume, Apache Hadoop MapReduce & HDFS, and Apache Tika. Cloudera Search also includes integrations that make searching more scalable, easy to use, and optimized for both near-real-time and batch-oriented indexing. Cloudera has adapted the SolrCloud project and leveraged Apache Zookeeper to coordinate distributed processing… From a customer perspective, this is an exciting time as Hadoop distributions venture out in broader territory offering them easier data mining capabilities.”

The article also emphasizes IBM InfoSphere Data Explorer, once known as Vivisimo, which works with the BigInsights Hadoop distribution, and LucidWorks Search with MapR, which provides data mining capabilities by ingesting data into MapR through LucidWorks Search to make the data searchable. The article only imagines more “feature-rich” offerings in the future as competition and interest grow.

Chelsea Kerwin, December 30, 2013


HP Autonomy: Marketing Collateral from 2011

December 20, 2013

One of the ArnoldIT goslings called to my attention a 2011 PDF white paper with the title (I kid you not):

Human inFormation (sic): Cloud, pan enterprise search, automation, video search, audio search, discovery, infrastructure platform, Big Data, business process management, mobile search, OEMs, and advanced analytics.

I checked on December 19, 2013, and this PDF was still available online.

That covers a lot of ground even for HP with or without Autonomy. The analysis includes some “factoids”; for example:

  • Unstructured data represents 85% of all information, but structured information is growing at 22% CAGR.
  • Unstructured information is growing at 62% CAGR.
  • Users upload 35 hours of video every minute.
  • Unstructured data will grow to over 35 zettabytes by 2020.
  • Videos on YouTube were viewed 2 billion times per day, 20 times more than in 2006.

You get the idea. With lots of data, information is a problem. I need to pause a moment and catch my breath.

Well, “it’s not just about search.” Again, I must pause. One Mississippi, two Mississippi, and three Mississippi. Okay.

Fundamentally, the ability to understand meaning and automatically process information is all about distance, probabilities, relativeness (sic), definitions, slang, and more. It is an overwhelming and continually growing problem that requires advanced technology to solve.

One technique is to use structured data methods to solve the unstructured problem. (Wasn’t this the approach taken by Fulcrum Technologies, what, 25 or 30 years ago? I just read a profile of Fulcrum that suggested Fulcrum did this first and continues chugging along within the OpenText product lineup, which competes directly with HP in information archiving.)

HP points out, “People are Lazy.” More interesting is this observation, “People are stupid.” I thought about HP’s write off of billions after owning a company for a couple of years, but I assume that HP means “other people” are stupid, not HP people.
