Attivio and MC+A Combine Forces

April 7, 2018

Over the years, Attivio positioned itself as more than search. That type of shift has characterized many vendors anchored in search and retrieval. We noted that Attivio has “partnered” with MC+A, a search centric company. MC+A also forged a relationship with Coveo, another search and retrieval vendor with a history of repositioning.

We learned from “Attivio and MC+A Announce Partnership to Deliver Next-Generation Cognitive Search Solutions” at Markets Insider that:

“MC+A will resell Attivio’s platform, seamlessly integrate their enterprise-grade connectors into it, and provide SI services in the US market. ‘Partnering with MC+A extends our ability to address organizations’ needs for making all information available to employees and customers at the moment they need it,’ said Stephen Baker, CEO at Attivio. ‘This is particularly critical for companies looking to upgrade legacy search applications onto a modern, machine-learning based search and insight platform.’ …”

The story added:

“By combining self-learning technologies, such as natural language processing, machine learning, and information indexing, the Attivio platform is helping Fortune 500 enterprises leverage customer insight, surface upsell opportunities, and improve compliance productivity. MC+A has over 15 years of experience innovating with search and delivering customized search-based applications solutions to enterprises. MC+A has also developed a connector bridge solution that allows customers to leverage existing infrastructure to simplify the transition to the Attivio platform.”

Attivio was founded in 2007, and is headquartered in Newton, Massachusetts. The company’s client roster includes prominent organizations like UBS, Cisco, Citi, and DARPA. Attivio in its early days was similar in some ways to the Fast Search & Transfer technology once cleverly dubbed ESP. No, not extra sensory perception. ESP was the enterprise search platform.

Based in Chicago and founded in 2004, MC+A specializes in implementations of cognitive search and insight engine technology. A couple of years ago, MC+A was involved with Yippy, the former Vivisimo metasearch system. When IBM bought Vivisimo, the metasearch technology morphed into a Big Data component of Watson.

If this walk down memory lane suggests that vendors of proprietary systems have been working to find purchase on revenue mountain, there may be a reason. The big money, based on information available to Beyond Search, comes from integrating open source solutions like Lucene into comprehensive analytic systems.

In a nutshell, the rise of Lucene and Elastic has created opportunities for some companies which can deliver more comprehensive solutions than search and retrieval anchored in old-school solutions.

More than repositioning, jargon, and partnerships may be needed in today’s marketplace where “answers,” not laundry lists, are in demand. For mini profiles of vendors which are redefining information access and answering questions, follow the news stories in our new video news program DarkCyber. There’s a new program each week. Plus, you can get a sense of the new directions in information access by reading my 2015 book (still timely and very relevant) CyberOSINT: Next Generation Information Access.

Stephen E Arnold, April 7, 2018

Speeding Up Search: The Challenge of Multiple Bottlenecks

March 29, 2018

I read “Search at Scale Shows ~30,000X Speed Up.” I have been down this asphalt road before, many times in fact. The problem with search and retrieval is that numerous bottlenecks exist; for example, dealing with exceptions (content which the content processing system cannot manipulate).

Those who want relevant information or those who prefer superficial descriptions of search speed focus on a nice, easy-to-grasp metric; for example, how quickly do results display.

May I suggest you read the source document, work through the rat’s nest of acronyms, and swing your mental machete against the “metrics” in the write up?

Once you have taken these necessary steps, consider this statement from the write up:

These results suggest that we could use the high-quality matches of the RWMD to query — in sub-second time — at least 100 million documents using only a modest computational infrastructure.

The path to responsive search and retrieval is littered with multiple speed bumps. Hitting any one when going too fast can break the search low rider.

I wish to list some of the speed bumps which the write up does not adequately address or, in some cases, acknowledge:

  • Content flows are often in the terabit or petabit range for certain filtering and query operations. One hundred million documents won’t ring the bell.
  • Content normalization. This is the transform in ETL operations. Normalizing content takes some time, particularly when the historical on-disk content from multiple outputs and the real-time flows from systems such as Cisco Systems intercept devices are large. Please, think in terms of gigabytes per second and petabytes of archived data parked on servers in some countries’ government storage systems.
  • Populating an index structure with new items also consumes time. If an object is not in an index of some sort, it is tough to find.
  • Shaping the data set over time. Content has a weird property. It evolves. Lowly chat messages can contain a wide range of objects. Jump to today’s big light bulb which illuminates some blockchains’ ability to house executables, videos, off color images, etc.
  • Because IBM inevitably drags Watson to the party, keep in mind that Watson still requires humans to perform gorilla style grooming before it’s show time at the circus. Questions have to be considered. Content sources selected. The training wheels bolted to the bus. Then trials have to be launched. What good is a system which returns off point answers?
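The indexing speed bump can be made concrete with a toy example. What follows is a minimal sketch of an inverted index in Python, not the method used in the write up or in any product mentioned here. The names TinyIndex, add, and search are hypothetical and exist only for illustration. The sketch assumes whitespace tokenization and AND-style matching, which is far cruder than what a production system does.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; real pipelines normalize far more."""
    return text.lower().split()


class TinyIndex:
    """Hypothetical toy inverted index: term -> set of document ids."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        # Every new item costs work at index time, once per term.
        for term in tokenize(text):
            self.postings[term].add(doc_id)

    def search(self, query):
        # Return the ids of documents containing every query term.
        terms = tokenize(query)
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result


index = TinyIndex()
index.add("doc1", "Watson requires training before show time")

found_before = index.search("blockchain")   # empty: nothing indexed yet
index.add("doc2", "blockchain can house executables")
found_after = index.search("blockchain")    # now doc2 can be found

print(found_before, found_after)
```

The point of the sketch is the gap between the two searches. Until the add step runs for a new item, a query cannot return it, no matter how fast the query side is.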

I think you get the idea.

Read more

Hulbee: Enterprise Intranet Search System

February 26, 2018

I associated the Hulbee brand with the Web search system Swisscows. Like Exalead and Qwant, Swisscows provides a user friendly search system. Key in some words, and Swisscows delivers the results. I ran a query for the UK smart software system SherlockML and received these results:


The distinctive features of the system struck me as:

  • Privacy-centric. The queries are not retained by Swisscows.
  • The system filters to eliminate violent and pornographic videos. Overt queries return no results. Certain queries return results which might raise some users’ hackles.
  • Search results appear to come from Microsoft Bing, a “partnership.”
  • Tile search results, which are clickable rectangular blocks of related content. These are similar to the Ben Shneiderman inspired visualizations for presenting search results, but Swisscows cleans up the visuals and makes them more usable.
  • An icon which sends the user to the page in the results list. Most search engines display a hyperlink, which can be difficult to tap accurately on some mobile device display screens.
  • The key search term presented in a white block with “closeness” of other concepts and terms shown by proximity to the white block; for example, ASI Data is the developer of SherlockML and the company is based in the UK. However, ASI does not appear in the blocks. The idea is a useful one in my opinion, but some refinement may be warranted.

I learned from Telecompaper that Hulbee also offers an enterprise search system for Intranet content. The idea is that Hulbee, like Yippy and other search vendors, can be a replacement for the more than 55,000 orphaned Google Search Appliance customers. I often wonder how many of these GSAs are still in use because Google has never provided oodles of data about its misguided, overpriced, and odd ball “one size fits all” approach to what is a highly particularized problem.

Telecompaper reports:

Enterprise Search is flexible and scalable; in addition to internal use, it can also be used on the company’s website and external online shop. The advantage for companies is that they can tailor the search tool to suit their needs, without any external advertising included in the results. Customers can also choose Enterprise Search as a hosted service at Swisscom data centers or an on-premise service on their own servers.

One of the company’s promotional videos features, wait for it, Swiss cows, although I am not able to differentiate among cow nationalities:


It seems that “enterprise search for an Intranet” has bundled a number of other search and retrieval functions; for example, Web site search and eCommerce. In my experience, some enterprise search vendors have offered “Swiss Army knife solutions” in the past. The reality of commercial enterprises is that search and retrieval needs are idiosyncratic; for example, lawyers require systems that can be used for eDiscovery, engineers have to locate drawings and their associated products, marketers want to pinpoint versions of PowerPoints, marketing collateral, and email, etc.

If you want more information about Swisscows, navigate to this link. You can check out the personal appeal for a donation from the company’s founder at this Web page.

Give the system a look, please.

Stephen E Arnold, February

Everyone Should Know the Term Cognitive Computing

December 19, 2017

Cognitive computing is a term everyone in the AI world should already be familiar with. If not, it’s time for a crash course. This is the DNA of machine learning and it is a fascinating field, as we learned from a recent Information Age story, “RIP Enterprise Search –AI-Based Cognitive Insight is the Future.”

According to the story:

The future of search is linked directly to the emergence of cognitive computing, which will provide the framework for a new era of cognitive search. This recognizes intent and interest and provides structure to the content, capturing more accurately what is contained within the text.


Context is king, and the four key (NOTE: We only included the most important two) elements of context detection are as follows:


Who – which user is looking for information? What have they looked for previously and what are they likely to be interested in finding in future? Who the individual is key as to what results are delivered to them.
What – the nature of the information is also highly important. Search has moved on from structured or even unstructured text within documents and web pages. Users may be looking for information in any number of different forms, from data within databases and in formats ranging from video and audio, to images and data collected from the internet-of-things (IOT).

Who and what are incredibly important, but that might be putting the cart before the horse. First, we must convince CEOs how important AI is to their business…any business. Thankfully, folks like Huffington Post are already ahead of us and rallying the troops.

Patrick Roland, December 19, 2017


The Future Is Search. Hmmm

December 17, 2017

I read an unusual chunk of content marketing. Navigate to “In the Rush to Big Data, We Forgot about Search.” Who’s the “we”? I think the “we” are customers who are migrating to next generation information access systems. Lawyers have Relativity. Manufacturers have SAP and Dassault solutions. Folks without much faith in commercial search vendors have Elasticsearch or low-cost systems which deliver a list of results which match a query. The “we”, therefore, seems to refer to the Lucid Imagination outfit now doing business as Lucidworks.

The write up explains that “we need to look at search to be the glue that lets us find the data and analyze it together no matter where it lives.”

That sounds super.

I think there are companies delivering this type of service as they have been for a number of years.

The reason is that vendors who are anchored in search and retrieval like Lucidworks have been bypassed.

In Dark Cyber I write about a stealthy outfit called Blackdot. The company complements the Relativity eDiscovery platform. Sure, there’s a search function, but Relativity does analytics, clustering, and functions which fit the needs of those engaged in eDiscovery. Search is part of the game, which, for big cases, involves big data.

Blackdot enhances Relativity. You can learn about some of the functions of this company in the December 26, 2017, Dark Cyber video program.

So what?

The so what is that the services provided by Relativity and Blackdot deliver high value outputs which are immediately useful to analysts, investigators, lawyers, and others who use the integrated systems to solve problems.

A company which wants to deliver this type of service is likely to wade into high water and thrash for purchase. The reason is that building a solution from open source tools and home brew scripts is a tough job.

Specialists have been using open source and proprietary code to roll out information access solutions. Relativity is just one example. By the way, Relativity has been plugging away for more than a decade.

A column which makes a case for a customer to let a vendor of open source search build from ground zero a next generation information access solution is going to leave that vendor with a smile. However, once the solution fails to meet expectations, those smiles will turn to frowns.

Maybe that’s why Lucidworks has burned through one original founder, several presidents, and $59 million?

Search is a utility. It is not a headliner. Search works when it complements higher value functionality such as those delivered by Relativity and Blackdot or any of the other firms we track for our CyberOSINT research.

Search had its fling, but the glory days faded. When we look at the landscape of enterprise search or Big Data for that matter, we see winners. From our vantage point in Harrod’s Creek, the company leading the much smaller search parade is Elastic. Yep, it’s Lucene, but it has a following.

Guess who one of the followers is. Give up? Lucidworks. The technology is based on Lucene.

Selling consulting services is one thing. Selling search is another.

Today’s forward looking companies want next generation access, and they can get it from dozens of vendors. No starting from scratch. Sign a deal and begin processing data (big or small).

I highlighted this statement from the write up:

So if you move some of your data to SaaS solutions, move some of your data to PaaS solutions, move some of your data to IaaS solutions and across multiple vendors’ cloud platforms while maintaining some of your data behind the firewall—yeah, no one is going to find anything!

Sure. Solve problems. Don’t create them. One can search for solutions using a search engine. Let me know how that works out for your next big decision which you have to make in 10 seconds or less.

Stephen E Arnold, December 17, 2017

Elastic Remains Strategically Bouncy

November 10, 2017

Enterprise search remains a dull and rusty sword in the museum of enterprise applications. Frankly, other than wordsmithing with wild and crazy jargon, the technology for finding information in an organization works a bit like the blacksmith under the spreading chestnut tree.

The big news from my point of view has been the uptake in open source enterprise search software. The lead dog is Lucene. Even the much hyped free version of Fast Search technology pitched as Solr is built on Lucene.

Yep, there are proprietary solutions, but where are these folks? Outfits with search technology are capturing the hearts and minds of decision makers who want solutions to findability problems, not the high speed sleet of buzzwords like ontology, taxonomy, natural language processing, facets, semantics, yada, yada, yada.

I read an article, which I assume is true, because I believe everything I read on the Internet and in white papers. The write up is “Elastic Acquires SaaS Site Search Leader Swiftype.” Elastic is the result of a bold search experience called Compass. The champion of this defunct system was Shay Banon, who created Elasticsearch.

For many people, Elasticsearch and the for fee “extras” available from the company Elastic are Lucene. Disagree? Everyone is entitled to an opinion, gentle reader.

The write up informed me:

Elastic, the company behind Elasticsearch, and the Elastic Stack, the most widely-used collection of open source products for solving mission-critical use cases like search, logging, and analytics, today announced that it has acquired Swiftype, a San Francisco-based startup founded in 2012 and backed by Y Combinator and New Enterprise Associates (NEA). Swiftype is the creator of the popular SaaS-based Site Search and the recently introduced Enterprise Search products.

Swiftype used Elastic to capture some customers with its search solution. According to the write up, even Dr. Pepper found a pepper upper with Swiftype’s Elasticsearch based system.

Why’s this important? I jotted down three reasons as I was watching a group of confused deer trying to cross a busy highway. (Deer, like investors in enterprise search dream spinners, are confused by the movement of fast moving automobiles and loud pick up trucks.)

First, compare Elastic’s acquisition with Lucidworks’ purchase of an interface company. Elastic bought people, a solution, and customers. Interfaces are okay, but those who want to find information need a system that springs into action quickly and can be used to deal with real world information problems. Arts and crafts are important, but not as important as search that returns relevant results and performs useful functions like chopping log files into useful digital lumber.

Second, Elastic has been on a roll. We profiled the company for a wonky self appointed blue chip consulting firm years ago. The report went nowhere due to the managerial expertise of a self appointed search expert. See this link for details of this maven. In that report, my team of researchers verified that large companies were adopting Elasticsearch because those firms had the most to gain from an open source product which could be supported by third party engineers. Another plus was that the Elasticsearch product could be extended and amplified without the handcuffs of a proprietary search vendor’s license restrictions.

Third, Elasticsearch worked. Sure, it was a hassle to become familiar with the system. But if there were an issue, the Lucene community was usually available for advice and often for prompt fixes. Mr. Banon pushed innovations down the trail as well. It was clear five years ago and it is clear today that Elastic and Elasticsearch are the go to systems for some savvy people. Contrast that with the floundering of outfits flogging their search systems on LinkedIn or on vapid webinars about concepts.

Net net: Elastic is an outfit to watch. For most of Elastic’s competitors, watching is easy when one is driving a Model T behind the race leader in one of those zippy Hellcats with 700 horsepower.

Even blacksmiths take notice when this baby roars down the highway. And the deer? The deer run the other way.

Stephen E Arnold, November 10, 2017

Enterprise Search: Will Synthetic Hormones Produce a Revenue Winner?

October 27, 2017

One of my colleagues provided me with a copy of the 24 page report with the hefty title:

In Search for Insight 2017. Enterprise Search and Findability Survey. Insights from 2012-2017

I stumbled on the phrase “In Search for Insight 2017.”


The report combines survey data with observations about what’s going to make enterprise search great again. I use the word “again” because:

  • The buy up and sell out craziness which culminated with Microsoft’s buying Fast Search & Transfer in 2008 and Hewlett Packard’s purchase of Autonomy in 2011 marked the end of the old-school enterprise search vendors. As you may recall, Fast Search was the subject of a criminal investigation and the HP Autonomy deal continues to make its way through the legal system. You may perceive these two deals as barn burners. I see them as capstones for the era during which search was marketed as the solution to information problems in organizations.
  • The word “search” has become confusing and devalued. For most people, “search” means the Danny Sullivan search engine optimization systems and methods. For those with some experience in information science, “search” means locating relevant information. SEO erodes relevance; the less popular connotation of the word suggests answering a user’s question. Not surprisingly, jargon has been used for many years in an effort to explain that “enterprise search” is infused with taxonomies, ontologies, semantic technologies, clustering, discovery, natural language processing, and other verbal chrome trim to make search into a Next Big Thing again. From my point of view, search is a utility and a code word for spoofing Google so that an irrelevant page appears instead of the answer the user seeks.
  • The enterprise search landscape (the title of one of my monographs) has been bulldozed and reworked. The money in the old school precision and recall type of search comes from consulting. Search Technologies was acquired by Accenture to add services revenue to the management consulting firm’s repertoire of MBA fixes. What is left are companies offering “solutions” which require substantial engineering, consulting, and training services. The “engines,” in many cases, are open source systems which one can download without burdensome license fees. From my point of view, search boils down to picking an open source solution. If that doesn’t work, one can license a proprietary system wrapped around open source. If one wants a proprietary system, there are some available, but these are not likely to reach the lofty heights of the Fast Search or Autonomy IDOL systems in the salad days of enterprise search and its promises of a universal search system. The universal search outfit Google pulled out of enterprise search for a reason.
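For readers who have not bumped into “precision and recall,” here is a minimal sketch of how the two measures are usually computed for a single query. The function name precision_recall and the sample document ids are made up for illustration; they do not come from the report or from any vendor’s system. Precision is the share of returned items that are relevant. Recall is the share of relevant items that were returned.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: ids the system returned
    relevant:  ids a human judged relevant
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    # Guard against empty sets so the sketch never divides by zero.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: four results returned, three documents truly relevant.
p, r = precision_recall(["a", "b", "c", "d"], ["a", "c", "e"])
print(f"precision={p:.2f} recall={r:.2f}")
```

In this made-up case, two of the four returned items are relevant and two of the three relevant items were found. Pushing one number up tends to push the other down, which is one reason tuning a system consumes so many billable hours.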

I want to highlight five of the points in the 24 page write up. Please, register to get your own copy of this document.

Here are my five highlights. My comments are in italics after each quote from the document:

Read more

Enterprise Search: Still Floundering after All These Years

October 11, 2017

Enterprise search conferences once had pride of place. Enterprise search or “search” was the Big Data, artificial intelligence, and cyber intelligence solution from 1998 to 2007.

But by 2007, the fanciful claims of enterprise search vendors were perceived as “big hat, no cattle” posturing. Unable to generate sustainable revenues, the high profile enterprise search systems began looking for a buyer. Those who failed disappeared. Do you know where Convera, Delphes, Entopia, and Siderean are today? What’s the impact of Exalead on Dassault? Autonomy on Hewlett Packard Enterprise? Vivisimo on IBM?

Easy questions to ignore. Time marches on. Proprietary search cost a bundle to keep working. The “fix” to the development, enhancement, and bug fix problems was open source.

A solution emerged. Lucene. That brings us to the title of this blog post: “Enterprise Search: Still Floundering after All These Years.”

The money from license fees is insufficient to make enterprise search work in a good enough way. Open source search, which seems to be largely free of license fees, allows vendors to offer search and highly profitable services to the organizations who want or need an “enterprise search system.”

This means that a vendor who makes more money offering search services can be perceived as a problem to a venture funded company built on promises and tens of millions in venture capital.

The truth of this observation was revealed in an article written by or for Search Technologies, a unit of a Fancy Dan consulting firm. If I understand the Search Technologies’ write up, Lucidworks (né Lucid Words) told Search Technologies that it was not welcome at a conference designed to promote Solr.

Here’s what Search Technologies said in “Why Wasn’t Search Technologies at Lucene/Solr Revolution 2017?”

Lucene/Solr Revolution’s organizer, Lucidworks, informed us that we were no longer welcome to exhibit or speak at the event. Lucidworks considered us a company that:

  • Competes with their professional services group (maybe)
  • Is not likely to resell Lucidworks’ platform exclusively (we are vendor-agnostic, after all), and,
  • Has technology assets that compete with their Fusion platform (partially true)

I don’t care too much about venture funded outfits running conferences to make their “one true way” evident to the attendees. I don’t worry about a blue chip consulting firm’s ability to generate sales leads.


I find that some of enterprise search’s most problematic weaknesses have not been solved after 50 years of flailing. Examples include:

  • The cost of moving beyond “good enough” information access
  • Revealing that enterprise search systems are expensive to tune and shape to the needs of an organization
  • Developing solutions which keep indexes current and searches responsive
  • Seamlessly handling different types of content, including video, engineering drawings, and data tucked inside legacy systems
  • Keeping the majority of the users happy so bootleg search systems are not installed to meet departmental or operating unit needs.

The “search” problem is an illustration of innovation running out of gas. I have zero stake in Lucidworks, Search Technologies, or enterprise search. I am content to be an observer who points out that search vendors, their marketing, the consultants, and the conference organizers are their own worst enemy.

That’s why enterprise search imploded about a decade ago. Search today is pretty much “good enough.” Antidot, Lucene, Solr, dtSearch, X1, Fabasoft, Funnelback, et al. Each does “good enough” search in my opinion.

To make any system better takes consulting and engineering services. These deliver high margins. Users? Well, users want enterprise search to answer questions and work like Google. After 50 years of effort, no company has been able to meet the users’ needs.

That says more than two consulting firms trading digital jabs. What’s at stake is consulting revenue and proprietary fixes. Users? Yes, what about the users?

Stephen E Arnold, October 10, 2017

Palantir Settlement Makes Good Business Sense

October 11, 2017

Palantir claims it is focusing on work, not admitting its guilt over a labor dispute in a recent settlement. This is creating a divide in the industry about what exactly the settlement means. We first learned of the $1.66 million settlement in How To Zone’s story, “Palantir Settles Discrimination Complaint with U.S. Labor Agency.”

How did we get here? According to the story:

The Labor Department said in an administrative complaint last year that it conducted a review of Palantir’s hiring process beginning in 2010. The agency alleged that the company’s reliance on employee referrals resulted in bias against Asians. Contracts worth more than $370 million, including with the U.S. Defense Department, Treasury Department and other federal agencies, were in jeopardy if the Labor Department had found Palantir guilty of discrimination.

Serious accusations. But this settlement might not signal what you think it does. Palantir said in a statement:

We settled this matter, without any admission of liability, in order to focus on our work.

This might be the smartest action on their behalf. Consider what happened to SalesForce when they got wrapped up in a legal battle earlier this year. It not only slowed down their sales, but some experts feel the suit may have altered enterprise search for good.

Something tells us Palantir, with its rich government contracts, wants to simply put this behind them and not get caught in a legal web.

Patrick Roland, October 11, 2017

Are Vendors of Enterprise Search Distracted?

September 6, 2017

I read “To Have Good Ideas, Remember to Get Bored.” I noted this assertion in the write up:

the temptation of constant podcast listening, phone fiddling, and TV watching takes over. The fight to maintain some boredom never ends.

The idea is that distraction kills boredom. Without boredom, “people” do not get good ideas. Ergo when one notes lots of bad ideas, that may be the signal that distraction undermines innovative thinking.

I am not certain the statements in the write up and the accompanying TED talk (which bored me, by the way) are applicable across a population selected at random in Rwanda or rural Kentucky, but let’s assume the idea has value.

I look at enterprise search and I see the same old perpetual motion machines: Semantics, metatagging, context, yada yada.

Perhaps those involved in enterprise search system development are manifesting their distractedness. Instead of putting down the mobile and performing myriad displacement activities, are enterprise search system developers fresh out of ideas?

Something’s wrong. Analysts find search just peachy when relying on SAP, IBM Watson, Fabasoft, and the other systems available today.

I know I am bored, and I would postulate that those involved in next generation information access systems may want to cultivate a bit of boredom as well. Innovation may come about.

Example: As I was thinking about today’s me-too enterprise search systems, I was bored. I decided to begin work on a new book in my cyber intelligence series. How does eDiscovery for Investigators sound?


Stephen E Arnold, September 6, 2017
