Attensity Command Center Gives Clients Control

July 21, 2011

“Attensity Looks to Give Brands a Window into Social Media,” reports the Silicon Valley BizBlog. Attensity is touting its new Command Center software, which takes social media analysis a step further. It is designed to display real-time information continuously to its customers’ employees. What caught my eye was this passage:

The Attensity Command Center is basically a bank of monitors and the back end software to run the monitors. Using proprietary, patented text analysis algorithms, the platform categorizes incoming tweets by subject, sentiment, and geography, etc. The goal is to aggregate and visualize what’s being said online, so that the customers can know in real time how many people are talking about them and what they’re saying.

Writer Jon Xavier experienced a demo of the product and was suitably impressed. His only issue was that the passing tweets moved too fast to read. He noted that to make full use of the software, a company would have to dedicate a couple of employees to monitoring and acting on the information.


Nope, it is not virtual. Will social media augment this reality?

The interest in social media is fascinating. Once the Internet was for rocket scientists. Now the Internet is the place to stroll. A digital Las Ramblas. When gizmos are embedded in the human body, the Information Highway takes on an interesting shape. The metaphors used to describe the next big thing will be interesting. For now, Attensity touts control.

With this offering, Attensity amps up marketing in the ad sector. Will it be enough to make headway against the Google+ marketing cyclone?

Stephen E Arnold, July 21, 2011

Sponsored by, publishers of The New Landscape of Enterprise Search

Symantec Snaps Up Clearwell to Enter E Discovery Market

July 20, 2011

I do some odd jobs for Enterprise Technology Management. Among them is hosting podcasts on various topics. Last week we did a podcast with several luminaries in the e discovery market. E Discovery is a term used to describe the content and text processing required to figure out what is in unstructured content gathered in a legal matter. There doesn’t have to be a lawsuit to trigger a company’s running an e Discovery project, but unlike search, e Discovery beckons legal eagles.

We read the article “Symantec acquires Clearwell Systems for $390m.” Perhaps best known for its antivirus software, Symantec also offers an array of information management solutions. Clearwell Systems specializes in e-discovery tools, used in response to litigation and other legal/investigative matters.

Symantec gains much with the acquisition:

Symantec notes the acquisition will add archiving, backup and eDiscovery offerings to its existing offerings, enabling it to offer a broader set of information management capabilities to customers. The deal will help Symantec provide future product integration opportunities with Symantec backup and security, Symantec NetBackup, Data Loss Prevention and Data Insight, the company said.

This acquisition moves e-discovery to the cloud, while continuing the appliance approach.

On the podcast I learned:

  • There will be a push for more hosted services. Autonomy has done a good job with its Zantaz acquisition and its hosted services, so Symantec is going down a route that leads to a payoff.
  • The Clearwell approach will continue to feature its rapid deployment model. I associate the phrase “rocket docket,” which connotes speedy service, with Clearwell.
  • The Clearwell report and user audit functions will be expanded and enhanced. I saw a Clearwell report and watched an attorney pop it in an envelope for delivery to another attorney. The system impressed me because the report did not require any fiddling by the attorney. Good stuff.

Naturally, other new services are planned. Stay tuned.

Cynthia Murrell July 14, 2011

Exclusive Interview with Margie Hlava, Access Innovations

July 19, 2011

Access Innovations has been a leader in the indexing, thesaurus, and value-added content processing space for more than 30 years. The company has worked for most of the major commercial database publishers, the US government, and a number of professional societies.


See for more information about MAI and the firm’s other products and services.

When I worked at the database unit of the Courier-Journal & Louisville Times, we relied on Access Innovations for a number of services, including thesaurus guidance. The firm’s MAI system and its supporting products deliver what most of the newly-minted “discovery” systems need: indexing that is accurate, consistent, and makes it easy for a user to find the information needed to answer a research or consumer level question. What few realize about the systems and methods developed by the taxonomy experts at Access Innovations is the value of standards. Specifically, the Access Innovations approach generates an ANSI standard term list. Without getting bogged down in details, the notion of an ANSI compliant controlled term list embodies logical consistency and adherence to strict technical requirements. See the Z39.19 ANSI/NISO standard. Most of the 20 somethings hacking away at indexing fall far short of the quality of the Access Innovations implementations. Quality? Not in my book. Give me the Access Innovations (Data Harmony) approach.

Care to argue? I think you need to read the full interview with Margie Hlava in the Search Wizards Speak series. Then we can interact enthusiastically.

On a rare visit to Louisville, Kentucky, on July 15, 2011, I was able to talk with Ms. Hlava about the explosion of interest in high quality content tagging, the New Age word for indexing. Our conversation ranged from the roots of indexing to the future of systems which will be available from Access Innovations in the next few months.

Let me highlight three points from our conversation, interview, and enthusiastic discussion. (How often do I in rural Kentucky get to interact with one of the leading figures, if not the leading figure, in taxonomy development and smart, automated indexing? Answer: Not often enough.)

First, I asked how her firm fit into the landscape of search and retrieval.

She said:

I have always been fascinated with logic and the application of it to the search algorithms was a perfect match for my intellectual interests. When people have an information need, I believe there are three levels to the resources which will satisfy them. First, the person may just need a fact checked. For this they can use encyclopedia, dictionary etc. Second, the person needs what I call “discovery.” There is no simple factual answer and one needs to be created or inferred. This often leads to a research project and it is certainly the beginning point for research. Third, the person needs updating, what has happened since I last gathered all the information available. Ninety five percent of search is either number one or number two. These three levels are critical to answering properly the user questions and determining what kind of search will support their needs. Our focus is to change search to found.

Second, I probed why indexing is such a hot topic.

She said:

Indexing, which I define as the tagging of records with controlled vocabularies, is not new. Indexing has been around since before Cutter and Dewey. My hunch is that librarians in Ephesus put tags on scrolls thousands of years ago. What is different is that it is now widely recognized that search is better with the addition of controlled vocabularies. The use of classification systems, subject headings, thesauri and authority files certainly has been around for a long time. When we were just searching the abstract or a summary, the need was not as great because those content objects are often tightly written. The hard sciences went online first and STM [scientific, technical, medical] content is more likely to use the same terms worldwide for the same things. The coming online of social sciences, business information, popular literature and especially full text has made search overwhelming, inaccurate, and frustrating. I know that you have reported that more than half the users of an enterprise search system are dissatisfied with that system. I hear complaints about people struggling with Bing and Google.
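To make the definition concrete, here is a minimal Python sketch of tagging a record with a controlled vocabulary. It is an illustration only, not the Access Innovations or MAI method: the vocabulary entries, the `tag_record` function name, and the plain phrase-matching rule are all assumptions made for the example.

```python
import re

# Hypothetical controlled vocabulary: each preferred term maps to the
# variant phrasings (synonyms, spellings) that should resolve to it.
CONTROLLED_VOCABULARY = {
    "Electronic discovery": ["e-discovery", "ediscovery", "e discovery"],
    "Sentiment analysis": ["sentiment analysis", "opinion mining"],
    "Taxonomies": ["taxonomy", "taxonomies"],
}


def tag_record(text, vocabulary=CONTROLLED_VOCABULARY):
    """Return the sorted preferred terms whose variants appear in text."""
    normalized = text.lower()
    tags = set()
    for preferred, variants in vocabulary.items():
        for variant in variants:
            # Word boundaries keep a variant from matching inside a longer token.
            if re.search(r"\b" + re.escape(variant) + r"\b", normalized):
                tags.add(preferred)
                break
    return sorted(tags)


record = "Opinion mining vendors now sell eDiscovery add-ons."
print(tag_record(record))
```

The point of the sketch is that two writers who use different words ("opinion mining" versus "sentiment analysis") end up with the same tag, which is what lets a searcher find both records with one term.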

Third, I queried her about her firm’s approach, which I know to be anchored in personal service and obsessive attention to detail to ensure the client’s system delivers exactly what the client wants and needs.

She said:

The data processed by our systems are flexible and free to move. The data are portable. The format is flexible. The interfaces are tailored to the content via the DTD for the client’s data. We do not need to do special programming. Our clients can use our system and perform virtually all of the metadata tasks themselves through our systems’ administrative module. The user interface is intuitive. Of course, we would do the work for a client as well. We developed the software for our own needs and that includes needing to be up running and in production on a new project very quickly. Access Innovations does not get paid for down time. So our staff are trained. The application can be set up, fine tuned, deployed in production mode in two weeks or less. Some installations can take a bit longer. But as soon as we have a DTD, we can have the XML application up in two hours. We can create a taxonomy really quickly as well. So the benefits are fast, flexible, accurate, high quality, and fun!

You will want to read the complete interview with Ms. Hlava. Skip the pretend experts in indexing and taxonomy. The interview answers the question, “Where’s the beef in the taxonomy burger?”


Stephen E Arnold, July 19, 2011

It pains me to say it, but this is a freebie.

Inteltrax: Top Stories, July 11 to July 15

July 18, 2011

Inteltrax, the data fusion and business intelligence information service, captured three key stories germane to search this week, particularly the explosion of social media-oriented business intelligence.

Our week jumped off with a lengthy feature article, “Facebook Becoming Data Mining Powerhouse,” which draws a surprising correlation between the recent Supreme Court data mining ruling and how Facebook’s advertising arm might use this to tighten its impressive lead on the rest of the advertising world.

Facebook’s bite-sized rival, Twitter, also got a lot of column space this week. First came the article “Twitter Joins the Analytic Race,” which explored the micro-blog site’s recent purchase of the analytics house BackType and asked, “Why?”

Like an entertaining Twitter feed, the company popped up frequently over the week, next in the article “Mining Twitter Mountain.” This story focused on the numerous analytic apps and programs designed to pluck sentiment and cohesive data from millions of 140-character chunks of info.

It’s no secret that social media is producing more savory data for advertisers, investors and trend-spotters than ever thought possible. We were excited to see the social media companies themselves taking a role, but also intrigued about the analytic companies springing up around them, not unlike mining camps around an Old West gold strike. We’ll be watching these claims and others, rest assured.

Follow the Inteltrax news stream by visiting

Patrick Roland, Editor, Inteltrax.

Sponsored by, publishers of Stephen E Arnold’s new monograph, The New Landscape of Enterprise Search

MarkLogic, FAST, Categorical Affirmatives, and a Direction Change

July 5, 2011

I awakened this morning (July 4, 2011) to a marketing Fourth of July boom. I received one of those ever present LinkedIn updates putting a comment from the Enterprise Search Engine Professionals Group in front of me.


The MarkLogic positioning exploded on my awareness like a Fourth of July skyrocket’s burst.

Most of the comments on the LinkedIn group are ho hum. One hot topic has been Microsoft’s failure to put much effort in its blogs about Fast Search & Transfer’s technology. Snore. Microsoft put down $1.2 billion for Fast, made some marketing noises, and had a fellow named Mr. Treo-something talk to me about the “new” Fast Search system. Then search turned out to be more like a snap in but without the simplicity of a Web part. Microsoft moved on and search is there, but like Google’s shift to Android, search is not where the action is. I am not sure who “runs” the enterprise search unit at Microsoft. Lots of revolving door action is my impression of Microsoft’s management approach in the last year.

The noise died down and Fast has become another component in the sprawling Shanghai of code known as SharePoint 2010. Making Fast “fast” and tuning it to return results that don’t vary with each update has created a significant amount of business for Microsoft partners “certified” to work on Fast Search. Licensees of the Linux/Unix version of ESP are now like birds pushed from the nest by an impatient mother.

New MarkLogic Market Positioning?

Set Microsoft aside for a moment and look at this post from a MarkLogic professional who once worked at Fast Search and subsequently at Microsoft. I am not sure how to hyperlink to LinkedIn posts without generating a flood of blue and white screens begging for log in, sign up, and money. I will include a link, but you are on your own.

Here’s the alleged MarkLogic professional’s comment:

Many organizations are replacing FAST with MarkLogic. MarkLogic offers a scalable enterprise search engine with all the features of FAST plus more…


An XML engine with wrappers is now capable of “all” the Fast features. In my new monograph “The New Landscape of Enterprise Search”, I took some care to review information presented by Fast at CERN, the wizard lair in Europe, about Fast Search’s effort to rewrite Fast ESP, which was originally a Web search engine. The core was wrapped to convert Web search into enterprise search. This was neither quick nor particularly successful. Fast Search & Transfer ran into some tough financial waters, ended up the focus of a government investigation, and was quickly sold for a price that surprised me and the goslings in Harrod’s Creek.

You can get the details of the focus of the planned reinvention of the Fast system and the link to the source document at CERN which I reference in my Landscape study. A rewrite indicates that some functions were not performing in 2007 and 2008 in a manner that was acceptable to someone in Fast Search’s management. Then the acquisition took place. The Linux/Unix support was nuked. Fast under Microsoft’s wing has become a utility in the incredible assemblage of components that comprises SharePoint 2010. I track the SharePoint ecosystem in my information service. If you haven’t seen the content, you might want to check it out.


Big Data Inhabits New Space in the Virtual Market

June 20, 2011

We noticed this press release, “Big Data Mall Opens on the Informatica Marketplace,” which was picked up by GlobeNewswire.

Big data is the buzzword du jour to describe large amounts of structured and unstructured information. The idea is that there is so much data to process that traditional methods fall short of delivering useful results at a reasonable cost in the time available for a 30 something decision maker to fill his or her role as a “decider.”

Companies like Informatica are making this contemporary challenge a priority and continue to lead the way in terms of data management solutions. Concurrent with the release of the Informatica 9.1 Platform, consumers now have access to the recent addition to the Informatica Marketplace, the Big Data Mall.

The Marketplace allows both customer and vendor to collaborate in an effort to better manage the goals of modern commerce. The solutions arrived at are referred to within the Marketplace as Blocks. Specific Blocks are then collected into sections known as Malls. The release provides an explanation of this new section:

“The Big Data Mall is a focal point for the industry in addressing the challenges and opportunities in Big Data,” said Tony Young, chief information officer, Informatica. “The new Mall debuts with 40 Blocks from Informatica and other leading vendors that map to the three major technology trends that define Big Data – Big Transaction Data, Big Interaction Data and Big Data Processing. New Blocks will be added going forward, as more and more innovative solutions emerge from the industry around Big Data.”

Will big data become the next frontier for findability or will predictive analytics become the next big thing?

Micheal Cory, June 20, 2011

Sponsored by, the resource for enterprise search information and current news about data fusion

Inteltrax: Top Stories, June 10 to June 16 2011

June 20, 2011

For readers of Beyond Search who have an interest in data fusion and analytics, the editor of, our Web log tracking this market, provided us with a run down of last week’s top stories.—Stephen E Arnold

Inteltrax, the data fusion and business intelligence information service, captured four key stories germane to search this week.

First, “Analytics for Cities” points out the many ways companies like IBM are strengthening search for city governments to run smoother using business intelligence and analytics.

Second, “Don’t Forget India When Pushing Analytic Chips Toward China” takes an in-depth look at the burgeoning Chinese analytic and search market. However, those betting heavily on China are doing themselves a disservice by overlooking India.

Third, “South Africa Ready to Join Analytics Boom” shows that although some are declaring South Africa dead when it comes to using analytic search, a recent economic boom suggests otherwise.

Fourth, “The Rising Tide of Unstructured Data” warns that unstructured data is a growing threat to the analytics and search communities alike.

Clearly, search professionals are being transformed by developments in predictive analytics, whether it is as far away as Africa or China, in their own city or even in their own business’ mounting pile of info. These are subjects that affect our global business world on almost every level and deserve our attention.

Follow the Inteltrax news stream by visiting

Patrick Roland, Editor, Inteltrax June 20, 2011

Thanks to Digital Reasoning, a sponsor of Beyond Search

Will Technology Actually Revolutionize News Gathering

June 18, 2011

One of my for fee columns for July 2011 focuses on AOL One could make the case that is one of the efforts underway to revolutionize news.

Information, particularly news, is in one of those “best of times, worst of times” moments. Shocking event follows shocking story the way a ballpark wiener requires a white bread roll.

Some major formats, channels, and companies are failing. The content is not hot or not relevant. The price is too high for the perceived value or the hassle is too great for the payback.

We found’s “’What Really Happened?’: Using SwiftRiver to Help Confirm Newstips” thought provoking. The story discusses the current failings of news outlets and the increasing efforts to use technological innovations to usher in a new era of reporting. The piece highlights the use of SwiftRiver, described on its site as:

“… a free and open-source suite of tools for managing excessive amounts of real-time data. Our architecture allows users to mashup real-time data from disparate media channels (Twitter, Email, SMS, JSON, XML or RSS/Atom), structures it, then offers methods for using the output.”

Being someone who can easily lose hours poring over articles and posts from a host of media outlets, most of which originate beyond our borders, the drive the author speaks of to rise above the idiocy of modern news media resonated with me. I found this passage somewhat encouraging:

“Can we get a ‘people’s newswire’ based on eyewitness reports of newsworthy events? I believe we can – if we combine the automation of systems like Swiftriver, the data visualization possibilities of tools like Ushahidi, and the insight of trained reporters who can follow up on potential leads.”

We remain open minded. However, will technology replace the traditional approach to identifying a story, researching it, and then putting the write up through a process of nit-picking by colleagues and bosses? We don’t think so, but will it matter to those raised with smartphones and persistent distraction?

Stephen E Arnold, June 18, 2011


Attensity Marketing Themes Revealed

June 16, 2011

In my email was the Attensity newsletter, dated June 15, 2011. In addition to unaudited assertions like “the company’s most successful quarter to date” and the word “successful” undefined, there were some interesting hints about the company’s strategy.

First, in the letter from the CEO (Ian Bonner), the company has rolled out a Customer Command Center. The militaristic suggestion is fascinating. A number of search and content processing vendors offer dashboards, but the command center may be a fascinating new view of what text processing software is supposed to do.

Second, the company continues to emphasize the new release of the firm’s flagship, Attensity 6.0. You can get additional information about the system from a Web page with the title “BI Guys Watch Out, It’s Never Been This Easy.”

Third, Attensity continues to use webinars to drum up awareness and business. In what I find an interesting move, the webinar about “accuracy” now includes a companion white paper. You can read that document at this link. Registration appears to be simple once you provide the all important contact information. The one two punch of a webinar and a more traditional white paper may be one indication that hot new marketing methods require multiple payload delivery vehicles. I wanted to pick up on the “command center” metaphor.

Finally, a battle of assertions about sentiment appears to be escalating. I elected not to report about the misfires of one well known vendor of sentiment solutions. It seems that Attensity has picked up some vibrations and responded with “When Does Sentiment NOT Matter?” The idea is that sentiment is not an all purpose solution. I agree with Attensity. Perhaps some blogger or sentiment vendor will step up and rip the scrim from the reality of sentiment analysis.

Net net: the “command center” analogy strikes me as marking a step up in the marketing warfare for text analytics. One indicator will be the diffusion of the “command center” metaphor. Which competitor will be the first to embrace this Attensity-ism?

Stephen E Arnold, June 16, 2011


Recommind and Predictive Coding

June 15, 2011

The different winners of the Kentucky Derby, Preakness, and Belmont horse races cast some doubt on predictive analytics. But search and content processing is not a horse race. The results are going to be more reliable and accurate, or that is the assumption. One thing is 100 percent certain: A battle over the phrase “predictive coding” in the marketing of math that’s in quite a few textbooks is brewing.

First, you will want to read US 7,933,859, “Systems and Methods for Predictive Coding.” You can get your copy via the outstanding online service at The patent was a zippy one, filed on May 25, 2010, and granted on April 26, 2011.

There were quite a few write ups about the patent. We noted “Recommind Patents Predictive Coding” from Recommind’s Web site. The company has a Web site focused on predictive coding with the tag line “Out predict. Out perform.” A quote from a lawyer at WilmerHale announces, “This is a game changer in eDiscovery.”

Why a game changer? The answer, according to the news release, is:

Recommind’s Predictive Coding™ technology and workflow have transformed the legal industry by accelerating the most expensive phase of eDiscovery, document review. Traditional eDiscovery software relies on linear review, a tedious, expensive and error-prone process . . . . Predictive Coding uses machine learning to categorize and prioritize any document set faster, more accurately and more defensibly than contract attorneys, no matter how much data is involved.
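For readers who want a concrete picture of what “categorize and prioritize” can mean, here is a minimal Python sketch. It is not Recommind’s patented method. It swaps in a textbook naive Bayes style word log-odds score with add-one smoothing and equal class priors, and the labels, sample documents, and function names are all hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()


def train(labeled_docs):
    """Count word frequencies per label from (text, label) pairs."""
    counts = {"responsive": Counter(), "not_responsive": Counter()}
    for text, label in labeled_docs:
        counts[label].update(tokenize(text))
    return counts


def responsiveness_score(text, counts):
    """Sum of per-word log-odds that text is responsive (add-one smoothing)."""
    vocab = set(counts["responsive"]) | set(counts["not_responsive"])
    totals = {label: sum(c.values()) for label, c in counts.items()}
    score = 0.0
    for word in tokenize(text):
        p_resp = (counts["responsive"][word] + 1) / (totals["responsive"] + len(vocab))
        p_not = (counts["not_responsive"][word] + 1) / (totals["not_responsive"] + len(vocab))
        score += math.log(p_resp / p_not)
    return score


def prioritize(unreviewed, counts):
    """Order unreviewed documents so the likeliest responsive ones come first."""
    return sorted(unreviewed, key=lambda d: responsiveness_score(d, counts), reverse=True)


# A handful of attorney-reviewed documents (hypothetical) seed the model.
reviewed = [
    ("merger pricing agreement draft", "responsive"),
    ("pricing discussion with competitor", "responsive"),
    ("lunch menu for friday", "not_responsive"),
    ("office party friday", "not_responsive"),
]
counts = train(reviewed)
unreviewed = ["friday lunch plans", "competitor pricing memo"]
print(prioritize(unreviewed, counts))
```

The idea the sketch captures is the workflow, not the math: a small set of human-reviewed documents trains a scorer, and the scorer decides which of the remaining documents a reviewer should look at first instead of marching through the pile in linear order.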

Some push back was evident in “Predictive Coding War Breaks Out in US eDiscovery Sector.” The point in this write up is that other vendors have been offering predictive functions in the legal market.

Our recollection is that a number of other outfits dabble in this technological farm yard as well. You can read the interview with Google-funded Recorded Future and Digital Reasoning in my Search Wizards Speak series. I have noted in my talks that there seems to be some similarity between Recommind’s systems and methods and Autonomy’s, a company that is arguably one of the progenitors of probabilistic methods in the commercial search sector. Predecessors to Autonomy’s Integrated Data Operating Layer exist all the way back to math-crazed church men in ye merrie old England before steam engines really caught on. So, new? Well, that’s a matter for lawyers I surmise.

With the legal dust up between i2 Ltd. and Palantir, two laborers on the margins of the predictive farm yard, legal fires can consume forests of money in a flash. You can learn more about data fusion and predictive analytics in my Inteltrax information service. Navigate to

Stephen E Arnold, June 15, 2011

