AIIM Report on Content Analytics

March 30, 2010

A happy quack to the reader who sent me a link available from the Allyis Web site for the report “Content Analytics – Research Tools for Unstructured Content and Rich Media”. If you are trying to figure out what about 600 AIIM members think about the changing nature of information analysis, you will find this report useful. I flipped through the 20 pages of data from what strikes me as a somewhat biased sample of enterprise professionals. Your mileage may vary, of course. One quick example. In Figure 4: How would you rate your ability to research across the following content types on page 7, the respondents rate themselves as pretty good at searching customer support logs. The respondents are also confident of their ability to search “case files” and “litigation and legal reports.” My research suggests that these three areas are real problems in most organizations. I am not sure how this sample interprets their organizations’ capabilities, but I think something is wacky. How can, for example, a general business employee assess the ease with which litigation content can be researched? Lawyers are the folks who have the expertise. At any rate, another flashing yellow light is the indication that the respondents have a tough time searching for press articles and news along with collateral, brochures, and publications. This is pretty common content, and an outfit that can search “case files” should be able to locate a brochure. Well, maybe not?

There were three findings that I found interesting, but I am not ready to bet my bread crust on the solidity of the data.

First, Figure 14: What are your spending plans for the following areas in the next 12 months? The top dog is enterprise search – application. This should give some search vendors the idea to market to the AIIM membership.

Second, respondents, according to the Key Findings, can find information on the Web more easily than they can find information within their organization. This matches what Martin White and I reported in our 2009 study Successful Enterprise Search Management. It is clear that this finding underscores the wackiness in Figure 4, page 7.

Finally, the Conclusion, page 15 states:

The benefits of investment in Finance and ERP systems have only come to the fore with the increasing power of Business Intelligence (BI) reporting tools and the insight they provide for business managers. In the same way, the benefits of Content Management systems can be much more heavily leveraged by the use of Content Analytics tools.

I don’t really understand this paragraph. Finance has been stretched with the present economic climate. ERP is a clunker. Content management systems are often quite problematic. So what’s the analysis? How about cost overruns?

I tucked the study into my reference file. You may want to do the same. If the Allyis link goes dead, you can get the report directly from AIIM but you may have to join the association.

Stephen E Arnold, March 31, 2010

Like the report, a freebie.

Oracle and SAP Squabble, Opportunities for Search Vendors?

March 30, 2010

The world is changing in the enterprise. The first shift I noticed took place last year. Chief financial officers were pushing back on some procurements. I worked on one job for which the procurement team had a fixed amount of money for the year. Period. In the past, procurement teams could edge forward without having a specific amount of money to spend. The second shift is the number of companies that are bundling search inside of their applications whether the search function works particularly well or not. Good examples are bigger companies that buy smaller companies with eDiscovery or content management systems. Within the smaller companies’ software resides a search stub. It may come from open source or a low ball vendor or a grandfathered deal with a search giant. Regardless of the approach, the acquiring company takes the position that what it just bought is going to do the job.

A third trend jumped out of the article in Marketwatch tagged “Bully Oracle Taunts Smaller Rival SAP.” For me, the most interesting comment in the story was:

“They [SAP] have lost their way,” Ellison said, as he noted that it is difficult to make money selling to small and medium business. “If they don’t want to be No. 1, we sure do.” Granted, SAP had a tough year. In 2009, total revenue fell 8% and earnings per share fell 13%, including a restructuring charge. But Ellison is probably really annoyed about SAP’s recent statements that it plans to start doing more acquisitions. As the computer industry has begun to mature, Ellison has proclaimed Oracle the chief consolidator.

Forget the animosity. The point is that two companies are going to grow by acquiring other companies. Since neither outfit has a 21st century search and content processing solution, is this an opportunity for some search vendors to make their stakeholders happy and sell out to one of these aging giants? If you scan the outputs on the Overflight system, you will see that quite a few search vendors have zero marketing activities underway. These are black holes, which may be an indication of preoccupied management or a lack of cash. A few of these outfits have a solid customer base and could benefit from the discipline an acquisition often brings to the acquired company. My hunch is that there is some ripe fruit to pick. Will Oracle and SAP squabble over the apple trees?

Stephen E Arnold, March 31, 2010

A freebie for sure.

Open Source and Fragmentation

March 30, 2010

The open source freight train continues to chug along, and it sparks some interesting discussion. I have been pointing to write ups about the dust up between for-fee data management systems and the open source NoSQL options. My attention focused on the article “Exclusive: Android Froyo to Take a Serious Shot at Stemming Platform Fragmentation.” The particulars of the write up were that open source gets popular, attracts developers, and then fragments. The case discussed in the article is Android, Google’s open source platform used in mobile phones and other devices. The idea for Google is to get a footprint from which it can dig in its hiking boot for the assault on Mount Apple. For me, the most interesting comment was:

…we’re hearing that Google may be nearing the end of its breakneck development pace on Android’s core and shifting attention to apps and features. By the time we get to Froyo, the underlying platform — and the API that devs need to target — will be reaching legitimate maturity for the first time, which means we should have far fewer tasty treat-themed code names to worry about over the course of an average year. We like awesome new software as much as the next guy, but Google’s been moving so fast lately that they’ve created a near constant culture of obsolescence anxiety among the hardcore user base — and in turn, that leads to paralysis at the sales counter.

The larger issue, of course, is that this type of “paralysis” may create an opening for non open source vendors. In search, this might be a great sales angle. After all, who wants to deploy an enterprise search solution only to discover that the zigs zagged unexpectedly? Interesting issue and a potential problem for Google and other open source surfers.

Stephen E Arnold, March 31, 2010

Nope. A freebie. I wish I could say an open source outfit paid me, but I can’t. I leave that to the azure chip crowd.

Print Is Alive

March 30, 2010

I quite enjoyed “Opinion: Print Is Dying … Really?” The author is a learned individual named Graydon Carter. His premise is, as the title of the article makes clear, that print is not dying. Like me, Mr. Carter trots out an obligatory reference to Gutenberg, who is probably spinning in his grave at his recent celebrity status. There is also a reference to monks, a trope that I confess to riding to exhaustion when I get the chance. I don’t do much in the new media space because I like to read stuff in print. I am just finishing The Predictioneer’s Game: Using the Logic of Brazen Self-Interest to See and Shape the Future and a lightweight discussion of the wild and crazy Caravaggio, both on paper.

Mr. Carter said:

Americans have taken to inhaling their news in catch-as-catch-can fashion from whatever screens they happen to have at hand: televisions, computers, cell phones, even those little TV sets in elevators. But in this age of constant information availability, it’s important to take a step back every now and then — once a month sounds about right — to immerse ourselves in the stories that define our times. At Vanity Fair, our writers continue to do what they’ve always done, ferreting out everything there is to know about a given subject and then pulling it all together in a gripping, satisfying narrative. A good Vanity Fair story should have at least a couple of the following elements: access, narrative arc, friction and disclosure. A great one has at least three and a truly great one has all four. Additionally, our stable of world-class photographers continue to find creative, visually arresting ways to reveal truths about our subjects in images that will stand up to any thousand words you throw at them. The fact is that people still want great, well-told tales. We see it on, where our longer articles routinely top the Most Popular list. We see it in the fact that our print circulation (both newsstand and subscriptions) is emphatically up at a time when everyone tells us it is supposed to be down. [emphasis added]

This is indeed encouraging. I have had the good fortune to be interviewed by a reporter working on a Condé Nast publication’s story. I am not sure that calling me is going to advance the cause of “everything there is to know.” In fact, I thought the reporter whom I shall not name seemed particularly clueless about the subject of the article.

I am also not sure about the following items and their impact on the finances of Condé Nast:

  1. Staff cutbacks at some of the properties and selling off some properties so that content volume has been affected
  2. The costs of printing, mailing, distributing, and picking up unsold copies of magazines
  3. The costs of maintaining various electronic properties and improving the functions of some of those properties before they become technology laggards, not technology leaders
  4. Packaging print and online in such a way that advertisers get their messages to their targeted demographics
  5. Making investments in digital content pay off in a manner that allows the company to grow and generate margins that meet the needs of stakeholders.

In short, I don’t think print is dead. Those with money to buy a $30 hard copy book or pay upwards of $6.00 for a magazine are going to be around for a while. The problem is that the information consumption patterns and methods are changing. Print is a bit like writing cursive. Schools are dumping the teaching of handwriting. I suppose cursive is irrelevant just as print may be irrelevant to information consumers who are decades younger than I.

Stephen E Arnold, March 29, 2010

No pay for this write up. I will report this injustice to the Kentucky Division of Forestry. Save those trees. Buy a computing device with lead and mercury infused components.

HP: Ink or Information?

March 30, 2010

I have commented about Hewlett Packard’s push into content centric activities. The deal that made me sit up and take notice was the purchase of Exstream Software for more than $1.0 billion a year or two ago. There have been moves in other content areas, including records management. Seeking Alpha caught my attention with its article “Hewlett-Packard: Printers More Valuable Than PCs” and its nifty interactive graph. The main point of the write up is that HP, despite its efforts to make its bottom line as fat as a French-government-certified goose destined for paté, is a printer and ink/toner company. With much of its printing technology influenced by Canon, I wondered about HP’s innovation as well. For me, the most interesting comment in the write up was:

HP’s stock is more sensitive to changes in its printer business than changes in the notebook business. A 5% increase in HP’s printer market share will lead to a 4% upside to $53 Trefis [analyst outfit] price estimate for HP’s stock, while a 5% increase in HP’s notebook share will result in less than a 3% upside.
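For readers who want the arithmetic behind those percentages, here is a minimal sketch. It assumes the upside is applied as a simple percentage to the $53 estimate quoted above; the function name `price_with_upside` is my own invention for illustration and has nothing to do with however Trefis actually builds its model.

```python
def price_with_upside(base_price: float, upside_pct: float) -> float:
    """Apply a simple percentage upside to a base price (assumed linear)."""
    return round(base_price * (1 + upside_pct / 100), 2)


# The $53 price estimate quoted in the Seeking Alpha write up.
TREFIS_ESTIMATE = 53.00

# Printer scenario: a 4% upside on the estimate.
printer_case = price_with_upside(TREFIS_ESTIMATE, 4.0)

# Notebook scenario: the quote says "less than a 3% upside",
# so this figure is a ceiling, not a point estimate.
notebook_ceiling = price_with_upside(TREFIS_ESTIMATE, 3.0)

print(f"Printer scenario:  ${printer_case:.2f}")
print(f"Notebook scenario: under ${notebook_ceiling:.2f}")
```

On these assumptions the printer scenario works out to roughly $55.12 a share, and the notebook scenario stays below about $54.59.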

HP may be big but the company has to find a way to generate more bottom-line impact from its other businesses before the printer and ink business softens.

Stephen E Arnold, March 30, 2010

No one paid me to write this about HP.

NoSQL to Die in Train Wreck

March 30, 2010

“I Can’t Wait for NoSQL to Die” adds fuel to the SQL sucks and NoSQL is stupid fire fight. I want to make clear that the addled goose paddles in his pond filled with mine run off fluids and takes no strong position on either side in the battle. When the bullets fly, the goose submerges his head and hopes no stray slug makes him oven ready.

The idea is that SQL is pretty darned good and:

The idea is that object relational databases like MySQL and PostgreSQL have lapsed their useful lifetimes, and that document-based or schemaless databases are the wave of the future. Never mind of course that MySQL was the perfect solution to everything a few years ago when Ruby on Rails was flashing in the pan. Never mind that real businesses track all of their data in SQL databases that scale just fine. (For Silicon Valley readers, Wal-Mart is a real business, Twitter is not.)

The write up points out some notable flaws in the NoSQL solutions. Here’s an example and a good one in the goose’s opinion:

So you’ve magically changed your backend from MySQL to Cassandra. Stuff will just work now, right? Well, no. Did you know that Cassandra requires a restart when you change the column family definition? Yeah, the MySQL developers actually had to think out how ALTER TABLE works, but according to Cassandra, that’s a hard problem that has very little business value. Right.
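To make the ALTER TABLE point concrete, here is a small sketch of an in-place schema change on a table that already holds data. It uses SQLite from the Python standard library as a stand-in, which is my assumption for convenience; the quoted article is about MySQL and Cassandra, and nothing below says anything about how either of those systems behaves. The `orders` table and its columns are hypothetical.

```python
import sqlite3


def demo_alter_table() -> list:
    """Add a column to a populated table without recreating or reloading it."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL)")
    conn.execute("INSERT INTO orders (item) VALUES ('widget')")

    # Schema change on the open connection: existing rows pick up the default.
    conn.execute("ALTER TABLE orders ADD COLUMN status TEXT DEFAULT 'new'")

    # New rows can use the added column right away.
    conn.execute("INSERT INTO orders (item, status) VALUES ('gadget', 'shipped')")

    rows = conn.execute("SELECT id, item, status FROM orders ORDER BY id").fetchall()
    conn.close()
    return rows


rows = demo_alter_table()
for row in rows:
    print(row)
```

The row inserted before the change comes back with the default status, and the row inserted afterward carries its own value, all on the same open connection.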



Other SQL advocates to whom I have spoken have pointed out that even Google uses MySQL in its advertising system. Yes, even Google.

My view of this is that I want to start gathering these pro and con arguments. When emotions run high over a technical issue, I think there may be some interesting examples and possibly some financial information beneath the bluster.

Dr. Codd made a wonderful contribution to data management. My hunch is that with an efflorescence of non-Codd methods, perhaps some useful learnings will emerge. It takes years for an innovation to survive the tests imposed by the real world. When the shooting stops, SQL will remain a useful tool and we may have other useful tools to use to solve certain types of problems.

But the arguments and the verbal sharpshooting are a great deal of fun as long as no goose is killed. I don’t want to be dinner or paté just yet.

Stephen E Arnold, March 30, 2010

Nope. No one paid me to write about my interest in self preservation.

IBM and an Open Source Milestone

March 30, 2010

I think the buzz about open source is interesting. I had forgotten that IBM had grabbed the tailgate of the open source bandwagon years ago. “IBM Celebrates a Decade of Linux on Its System Z Mainframe” reminded me that IBM saw an opportunity to use open source to address some of the objections to the z/OS operating system and its interesting idiosyncrasies. The article said:

Since opening the mainframe to run popular open source Linux applications 10 years ago, there are today 3,150 Linux applications enabled for System Z and 70 per cent of the top one hundred global mainframe customers run Linux.  Rosamilia [IBM manager for mainframes] described the mainframe as middle-aged because “it’s come a long way and has a long way to go.” And it’s timeless. “We made compatible changes so that stuff that ran a long time ago still runs today and yet we’ve still invested in brand new architectures so that we can take advantage of lots of things,” said Rosamilia. He named two top drivers for Linux on System Z. The speed with which it allows new server provisioning, and the control IT administrators have to maintain over servers while lowering risks in the data centre environment. [Emphasis added]

I love IBM mainframes because there is not as much competition to keep these puppies alive and well. IBM loves them too, and I think these comments indicate that open source is more than chopping licensing fees. Open source makes it possible to use “middle-aged” and quite expensive hardware in today’s code-and-run world. STAIRS III, anyone?

Stephen E Arnold, March 30, 2010

A freebie. An unsponsored write up.

Oracle Text Visualization

March 30, 2010

Short honk: A happy quack to the reader and obvious Oracle aficionado for this tip. You can download the Java library for Oracle Text visualization directly from Oracle and at no charge. The information on the Oracle Web site is sparse:

This Java library for Oracle Text visualization is incorporated in the Oracle Text sample code application to visualize clusters, categories, and themes.

Act now.

Stephen E Arnold, March 30, 2010

No one paid me to pass along this link.

Google and the Fine Line between Content Access and Creation

March 29, 2010

Short honk: The addled goose lacks the intellectual and financial resources of the New York Times. So the write up “Not Creating Content. Just Protecting It” is spot on. I would point out, however, that a quick read of Google Version 2.0 and Google: The Digital Gutenberg suggests that Google has gone to some trouble to create systems and methods to process information objects and create other information objects.


Google can release one of its innovations with little incremental effort. Google’s technical systems and methods store considerable “potential energy”.

The “information factory” can slice and dice information about digital cameras and there are a number of patent applications that explain the virtues of this method and the clever “filling in the gaps” method. There are also outputs that mash up inputs and create new content objects. The Michael Jackson example which Googler Cyrus informed me I “made up in Photoshop” is a notable example. Let’s not confuse public relations and damage control with the functions resident within Google’s technical arsenal. I recall the buzz about Google becoming a $100 million media company but that seems to have been subject to some change in the last few months. What’s technically possible and what’s shaped to meet the needs of the present business landscape may be two quite different things in my opinion.

Stephen E Arnold, March 29, 2010

No one paid me to write this.

Autonomy Takes a Poke in the Ribs

March 29, 2010

I saw a news item on Yahoo a few minutes ago (March 28, 2010, 2 pm Eastern), and I wondered if the item were accurate. The news story “Shares of Autonomy Appear Overvalued” may be little more than the stuff flowing from a disgruntled analyst struggling with a wrinkled shirt. Potentially negative financial news becoming available on a Sunday is not the same as a negative story fired out before trading begins on a Monday. At any rate, the story reports that the financial publication Barron’s has done some tea leaf analysis and concluded that “the Britain-based software maker may now be overvalued and set for a pullback.” For me, the interesting comment was:

Given its modest returns on invested capital and “granting it some premium for improving returns, a multiple closer to its organic earnings growth could trim shares by roughly 20 percent,” the report said.

I will now try to chase down the Barron’s report and make sure I keep an eye on Autonomy’s response. I thought the company was taking other search and content processing companies to school. Even OpenText, which has had a somewhat similar approach to generating growth and buzz, has not impressed me as much as Autonomy.

Stephen E Arnold, March 29, 2010

No pay for this one. I will report no dough to the SEC, an outfit on the alert always for financial silliness like working for no money.

