Google and Screwups

February 24, 2010

PCWorld is certainly getting frisky. The story is “2010 Is Becoming the Year of Google Screwups.” The article, written by Robert X. Cringely, is going to get lots of clicks. Even the addled goose exercises goose judgment when writing about Googzilla. For example, I wanted to cover the Google mistress article, but I had a tough time figuring out how to hook search into the story.

Not PCWorld. For me, the most interesting point was:

So far, 2010 is shaping up to be the year Google discovered it had feet of clay — and those feet have been spending a lot of time in Google’s mouth.

The Screwups article provides a particularly useful discussion of Google and its handling of copyright violation claims. MBAs are going to love this write up.

In my view, the year is young.

Stephen E Arnold, February 24, 2010

No one paid me to write this. Because of the reference to copyright, I will alert the Copyright Office that I am working like a beaver chewing down potentially useful raw material for paper suitable for ink jet use.

Iron Mountain Snags Mimosa

February 24, 2010

I read in the Microsoft-centric publication “Iron Mountain Acquires Mimosa.” Iron Mountain began to grow when records management took off. The company has been riding high on the digital data glut. According to the write up:

Mimosa is the rapidly growing provider of premises-based e-mail and file-based content archival solutions called NearPoint. The Microsoft Gold Certified Partner, popular among Exchange and SharePoint shops, last month announced its 1,000th customer. Though it was founded in 2003, more than 300 of its NearPoint systems were sold last year. By acquiring Mimosa, Iron Mountain can offer premises-based archival and e-discovery it has not been able to offer before.

More information about Mimosa is at

Iron Mountain also owns Stratify, formerly Purple Yogi. Now with all that digital data, how will Iron Mountain customers search, retrieve, process, and manipulate those information objects? When I know the answer, I will let you know. My recollection is that Mimosa has a basic search system, but it is not the fire-breathing dragon that some vendors have in their menagerie.

Stephen E Arnold, February 24, 2010

No one paid me to write this. Because I reference a mineral, iron, I think I have to report free work to the USGS.

Digital Dorks: Maybe Lots of Them?

February 24, 2010

I was amused with the Wall Street Journal’s “free” article called “Nearly 20% of US Is Digitally Uncomfortable or Digitally Distant, FCC Says.” Some folks, according to the Federal Communications Commission, are not hip to online. If not hip to online, I wonder if these segments read books, subscribe to magazines, and frequent the local library to lap up information.

In the Commonwealth of Kentucky, the schools struggle to crank out students who can read. The low level of reading skill may be different where you live, but I find it disturbing. Also Harrod’s Creek is enlivened on some mornings with the sound of gun fire. Avid hunters pursue squirrel (tree rats in the local jargon), deer (rats on hooves), and geese (not sure if I am a rat or not) with enthusiasm. My network access is flaky and the throughput sucks. I have two high speed lines, so unless it snows or rains or is windy or is foggy, the data creeps through.

Why then would it be surprising that in Kentucky and similar states, there would be a large chunk of folks who are digitally different? I think that online for some folks means using an automatic teller machine for cash, not surfing LexisNexis.

Among the factoids in the write up that impressed me were:

  • The digitally distant make up 10 percent of the US population. Figure 320 million people. The distant folks amount to 32 million.
  • The digitally uncomfortable account for another 10 percent. That’s another 32 million people.

That means that 64 million people are not going to be into online like this addled goose. What happens when we consider the appetite for reading and the unemployment rate? What about the dependence on video games, Twitter, and TV for information? What about the loss of local bookstores? What about the decline in local radio and TV news coverage? Yikes!
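For readers who want to check the goose's arithmetic, here is a minimal sketch. It assumes the round 320 million population figure used above and the two 10 percent shares from the write up; the variable and function names are made up for illustration.

```python
# Back-of-the-envelope check of the segment sizes discussed above.
# Assumption: the round 320 million figure from this post, not a census count.
US_POPULATION = 320_000_000

# Shares in whole percent, as reported in the write up.
SEGMENT_PERCENT = {
    "digitally distant": 10,
    "digitally uncomfortable": 10,
}


def segment_headcounts(population, percents):
    """Return the number of people in each segment, using integer math."""
    return {name: population * pct // 100 for name, pct in percents.items()}


headcounts = segment_headcounts(US_POPULATION, SEGMENT_PERCENT)
total_offline = sum(headcounts.values())
total_percent = sum(SEGMENT_PERCENT.values())

for name, count in headcounts.items():
    print(f"{name}: {count:,}")
print(f"combined: {total_offline:,} ({total_percent} percent)")
```

Each segment works out to 32 million people, and the two together to 64 million, or 20 percent of the assumed population.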

For me, it means that knowledge workers comprise an important but probably very tiny segment of society. So when we talk about search and nifty gadgets like the forthcoming Apple iPad, I wonder if the spectacular growth numbers associated with online and tech-based services are sustainable.

Even more interesting a question is, “Will search systems shift from user generated queries to predictive pushing of information?” The idea is that some people may be too busy or simply unable to formulate a query for the information needed to perform a task. Why search? Just take what gets pushed to a person with a question.

Finally, with the information revolution going on for decades, what happens to the organizations who have not yet mastered electronic information? Into what category do these people and their organizations fit? Maybe we need a new segment for digital dorks? Just a question, not a real assertion.

Stephen E Arnold, February 24, 2010

No one paid me to think up the phrase digital dorks. I will report non payment to the Council on Literacy.

Xerox Legal Eagles Swarm at Google and Yahoo

February 24, 2010

Quite a surprise. I have not given Xerox much thought. True, about 11 years ago we had a job to hook one of the DocuTech scanners to the main DocuTech copy machine. Not too tough, but work is work. Since that time, I have not paid much attention to Xerox. I know about Xerox PARC’s history of innovation, of course. I do recall learning that the company has rolled out an information system for law firms, but I don’t think of Xerox as a document management or eDiscovery company. Xerox to me is a maker of photocopy machines, which makes clear why the headline “Xerox Files Patent Suit against Google, Yahoo” caught my attention. I thought, “What?”

The main idea is that Xerox has US6778979, “System for Automatically Generating Queries”. I think I met Greg Grefenstette at one time. The patent document’s abstract describes the invention this way:

A system generates a query using an entity extractor, a categorizer, a query generator, and a short run aspect vector. The entity extractor identifies a set of entities in selected document content for searching information related thereto using an information retrieval system. The categorizer defines an organized classification of document content with each class in the organization of content having associated therewith a classification label that corresponds to a category of information in the information retrieval system. The categorizer assigns the selected document content a classification label from the organized classification of content. A query generator formulates a query that restricts a search at the information retrieval system to the category of information in the information retrieval system identified by the assigned classification label. The short length aspect vector generator generates terms for further refining the query using context information surrounding the set of entities in the selected document content.
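To make the moving parts in that abstract concrete, here is a toy sketch of the pipeline shape: extract entities, assign a category label, then build a query restricted to that category and refined with nearby context words. This is emphatically not Xerox's patented method. The keyword lists, the capitalized-word heuristic, and every function name are assumptions invented for illustration.

```python
import re

# Hypothetical category labels and trigger words; nothing here comes from the patent.
CATEGORY_KEYWORDS = {
    "business": {"acquisition", "revenue", "customers"},
    "law": {"patent", "infringement", "damages"},
}
STOPWORDS = {"the", "a", "an", "of", "and", "for", "in", "on", "to",
             "is", "its", "against", "over"}


def tokenize(text):
    """Split text into alphabetic tokens."""
    return re.findall(r"[A-Za-z]+", text)


def extract_entities(text):
    """Toy entity extractor: capitalized non-stopword tokens, in first-seen order."""
    seen, entities = set(), []
    for tok in tokenize(text):
        if tok[0].isupper() and tok.lower() not in STOPWORDS and tok not in seen:
            seen.add(tok)
            entities.append(tok)
    return entities


def categorize(text):
    """Toy categorizer: pick the label whose keyword set overlaps the text most."""
    words = {t.lower() for t in tokenize(text)}
    scores = {label: len(words & kws) for label, kws in CATEGORY_KEYWORDS.items()}
    return max(scores, key=scores.get)


def aspect_terms(text, entities, window=2, limit=3):
    """Toy context terms: words within `window` tokens of any entity."""
    toks = tokenize(text)
    entity_set = set(entities)
    terms = []
    for i, tok in enumerate(toks):
        if tok in entity_set:
            for near in toks[max(0, i - window): i + window + 1]:
                low = near.lower()
                if near not in entity_set and low not in STOPWORDS and low not in terms:
                    terms.append(low)
    return terms[:limit]


def generate_query(text):
    """Combine the three steps into a category-restricted query description."""
    entities = extract_entities(text)
    return {
        "category": categorize(text),
        "entities": entities,
        "refine": aspect_terms(text, entities),
    }


sample = "Xerox files a patent infringement suit against Google and Yahoo over damages."
print(generate_query(sample))
```

A real system would swap each toy function for a trained extractor, a proper classifier, and a statistical term selector. The sketch only shows how the pieces hand off to one another.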

Xerox also asserts that Google infringed on US6236994, “Method and Apparatus for the Integration of Information and Knowledge.” That patent document’s abstract describes the invention this way:

The present invention is a method and apparatus for first integrating the operation of various independent software applications directed to the management of information within an enterprise. The system architecture is, however, an expandable architecture, with built-in knowledge integration features that facilitate the monitoring of information flow into, out of, and between the integrated information management applications so as to assimilate knowledge information and facilitate the control of such information. Also included are additional tools which, using the knowledge information enable the more efficient use of the knowledge within an enterprise, including the ability to develop a context for and visualization of such knowledge.

The TechWeb article reported:

Xerox is seeking treble damages because it claims the defending companies are aware of its patents and that their infringement is willful.

I know zero about the legal world. I do know big bucks when I read about this type of claim. What’s interesting is that Xerox seems happy to talk about the legal matter. According to the write up:

“We have been in dialog with Google and Yahoo for some time about licensing these patents, without reaching a resolution,” a Xerox spokesperson said in an e-mailed statement. “We believe we have no option but to file suit to properly protect our intellectual property.”

The economy may be struggling, but the lawyers involved in this may have a Veyron in the drive way by next spring.

Stephen E Arnold, February 24, 2010

No one paid me to write this. Unlike attorneys, I guess, I work without compensation. I have to report non payment to the USPTO. I hope that group’s online system someday includes more patent documents easily accessible via a search system that does not violate another party’s patent.

Reed Numbers

February 23, 2010

Short honk: I read “Reed Looks to IT Outsourcing to Cut Costs”. I wanted to know if there was any magic behind the company’s financial reports and I was curious about how outsourcing was going to impact existing staff. For information about Reed, click here.

Well, not much was new in the info department, but I did notice several items.

First, the big payday was the result of an acquisition. I noted this telling passage:

Reed saw underlying revenues and adjusted operating profits fall 4 percent and 15 percent, respectively.

Second, Reed is banking on legal and professional information to put the ball in the net. The only problem is Google’s plugging away in the legal information arena. Google has dumped legal info into Google Scholar, but there are some interesting links in Google’s US government index, and from what I hear on the goose pond, there is more legal info coming. Google gives away legal info, allowing advertisers to foot the bill. Reed will have to find a way to compete with Google’s subsidized business model.

Third, the fancy talk about more software and better customer support is interesting. These words may keep the Wall Street folks happy, but I think that this is an old cassette tape stuck in the boom box.

My view is that Reed will face increasing pressure going forward. Last time I checked, there were fewer and fewer attorneys willing to spend big bucks to tap into expensive legal databases. Libraries have some money but there is fierce competition for those dollars which means marketing and sales costs are going up and up.

Outsourcing is a fancy word for getting rid of staff and finding lower cost sources. Interesting but what happens to the “quality” and “value” of the information products? Fun to watch this info giant move forward in my opinion.

Stephen E Arnold, February 23, 2010

No one paid me to write this. Because I mentioned costs, I will report non payment to the Treasury Department. Money wizards there for sure.

Baffled about Real Journalists

February 23, 2010

I am not a journalist. I don’t even know how one becomes a “real” journalist. I learned when I read “Why We Don’t Trust Devil Mountain Software (and Neither Should You)” that big publishing companies don’t know either, assuming the information in the write up is accurate. I guess I should not be surprised. I learned last week that a person who writes about electronic information is officially an “expert” on electronic information. I suppose that means that if I were a crime reporter and I wrote about an alleged illegal activity, I would be qualified to talk about wrongdoing as an “expert”. I wonder how the professionals in law enforcement, military intelligence, and related disciplines feel about “real” journalists becoming experts by virtue of talking to people and reading news items? I suppose faux expertise and “real” journalists are products of the modern world. I find the footprints of these types of folks when I work to mop up after search and content processing disasters. There is a downside to the lack of information about complex subjects even at outfits who are supposed to “know” what’s “real” and what’s not. One thing is sure. This flap is great for the search engine optimization crowd.

Stephen E Arnold, February 23, 2010

No one paid me to write this. I do use a persona—namely, the addled goose—when I write this column. But I received no crumbs for this article. As a fowl, I will report this bedraggled condition to Fish & Wildlife. I wonder if the “real” journalist was paid for his dual roles?

Now That Is a Diagram

February 23, 2010

Quite an interesting diagram in the ZDNet story “Ten Emerging Enterprise 2.0 Technologies to Watch.”


No word on how to find information, unfortunately. Seems to be an opportunity. The list of the 10 technologies to watch is here. Care for a sample? Here are three I noted:

  1. Social CRM
  2. Social media workflow
  3. Next-generation unified communication

Azure chip consultants, “real” journalists from Devil whatever, and poobahs, get your spurs on!

Stephen E Arnold, February 23, 2010

Unpaid article. I will report this sad fact to the White House which is into 2.x technologies.

Vendor Lock In. Sorry, It Is Here to Stay

February 23, 2010

A happy quack to the reader who sent me a link to the Forbes Magazine story “The Future Of Enterprise Software.” The write up is one that says, “Yo, cloud computing is coming.” The suggestion is that vendors of enterprise software will no longer “lock in” a customer to a specific system.


A commercial company has a mission: make money. One makes money by getting customers and keeping them. In the good old days of the mainframe, there was multi-point lock in. Today, the lock in is somewhat more subtle, a bit like one of the country club prisons that some wrong doers enjoy.

For me this passage triggered a chuckle:

Collectively these developments allow corporations to take a step back from software lock-in the way they stepped back from hardware lock-in during the mainframe era. IBM’s mainframes suffered more because of a concern over vendor lock-in than because of inefficiency or performance. And data outsourcing ended abruptly in the client/server era largely because companies were disgusted with service provider lock-in from companies like EDS.

In my experience, here is a run down of what some vendors are now doing:

First, there is the low cost play. Quite a few companies are moving down this path. The idea is to get the customer hooked and then play the “security” or “fear” card. The result is that the company will stick with a particular vendor to get the added value the company delivers. This is a “devil you know” approach.

Second, there is the open source play. Predicated on reducing license fees, most of the open source vendors sell services. Most organizations have at best a couple of wizards who can deal with open source. The rest will pay fees to have the open source experts manage the shop. Hey, it’s still money and it works just like the old lock in model but it has a “green” feel—responsible, natural, lower cost, etc.

Third, there is the cloud play. The idea is that an organization cannot afford or find top notch IT people. The cloud vendor can, so shove the software “out there” and enjoy the reduced costs.

Human nature loathes certain types of change. Once a vendor lands a customer, the vendor manages the customer relationship. The goal is lock in.

In my goose pond, lock in is the name of the game. It has to be because the cost of customer acquisition is too high. The whole point of selling software, systems and services is lock in. Just my opinion, Mr. Forbes.

Stephen E Arnold, February 23, 2010

No one paid me to write this. Because I mentioned lock in, I will report non payment to the Government Accountability Office, an outfit that rides herd on vendors who work hard to lock in the Federal government. Some succeed pretty darn regularly.

Google Becomes a Bit Like Rome in 200 CE

February 23, 2010

Ouch. Google has “troubling blind spots.” You can read “Privacy, Complexity Seen as Google Blind Spots” and get the straight dope from a newspaper not too far from the Googleplex. For me the most interesting comment was:

“It’s this whole sense of hubris. They get to a certain size and think, ‘We don’t have to care,’ ” he said. “They are just roaming around, not defining themselves, and allowing their actions to be interpreted by whomever.”

If on the money, the Google may have a more troubling decline than the Roman Empire. At Google speed, the collapse may take months, not centuries. Will it thrive or dive? Exciting.

Stephen E Arnold, February 22, 2010

I was not paid to write this item. Because I reference the fall of Rome, my disclosure goes straight to the Library of Congress, where history survives.

Autonomy and OpenText: Which Strategy Has Stronger Legs?

February 22, 2010

Two companies in the search and content processing space pepper me with announcements of their acquisitions. I learned today, first via my Overflight service, and then from a gentle reader who sent me a link to the TechCrunch story “Open Text Buys Up Content Analysis Startup Nstein Technologies For $34 Million” that OpenText has acquired another content processing company. OpenText has a pickup truck bed filled with search and content management properties. These range from the long-in-the-tooth BRS Search to the even more aged BASIS system. RedDot and Vignette have given OpenText quite a few content management customers. Not surprisingly, when organic growth is tough to achieve, acquisitions make good sense.

Keep in mind that Nstein itself is a roll up. The company has a clump of technologies that it has used to build its business. I think of Nstein as a mini Ling-Temco-Vought but without the financial lift that Jim Ling was able to infuse into his collection of properties.

What about Autonomy?

Autonomy has also been an aggressive buyer of companies. The firm’s most recent purchases have included Zantaz and Interwoven. Unlike OpenText, Autonomy makes an effort to explain and convert customers to the IDOL (integrated data operating layer) platform. Autonomy has been quick to exploit its acquisitions’ strengths. For example, Zantaz helped propel Autonomy into hosted services. And the Interwoven product became the foundation for Autonomy’s social functionality.

Which is the strategy with the stronger legs? My view is that Autonomy has made more strategic acquisitions and better tactical use of its acquisitions’ technical capabilities.

Will Nstein propel OpenText into the hot areas that Autonomy has tapped? Right off the top of my head, I don’t think so. Here’s why:

  1. OpenText has a lot of products that perform essentially similar or duplicative functions; for example, why does an OpenText engineer have to support two quite interesting code bases to deliver content management? Why not have one platform?
  2. The acquisitions that OpenText has made often bring some legacy issues. For example, Vignette has powered some high profile sites, but for some customers Vignette has also proven to be expensive and difficult to operate. Even RedDot customers have had to wait for updates that address some performance problems. I can’t say that every Zantaz customer was a happy camper, but on the whole Autonomy has done a better job of picking companies with what strike me as fewer legacy issues.
  3. Autonomy’s buys have opened new markets. An example is Autonomy’s push into video indexing and its increasing presence in areas such as email archiving, fraud detection, and social media marketing services.

A couple of observations about Nstein. First, I would not call the company a start up. The company opened its doors as a search and content processing company. Then it morphed into a vertical vendor supporting traditional publishers. I was supposed to get a briefing in London in December 2009 but the Nstein people did not keep the appointment. I assume the details of this deal made it impossible to tell me that I did not have to cool my heels for one hour while the Nstein booth staff tried to find their boss. The Newssift service, as I have noted, has not been a home run for Nstein and the other vendors involved in this project. Endeca and Lexalytics contributed to the service, which when I last checked had been taken down. The Financial Times, like other traditional publishers, has found the winning formula for online success elusive. Finally, Nstein, according to my Overflight search file, opened for business in 2000. That’s not a start up to me.

The goslings here in Harrods Creek hope the acquisition works wonders for OpenText. My hunch is that a number of companies will be gunning for OpenText’s customers. My other hunch is that the investors in Nstein are happy campers despite Canada’s loss to the US hockey team last night. Autonomy has stronger legs in my opinion.

Stephen E Arnold, February 22, 2010

No one paid me to write this. Since I mention Vignette, I will disclose non payment to the GSA, an outfit with some knowledge of Canadian vendors to the US government.
