Visualization: Interesting and Mostly Misleading
May 7, 2009
I fund the EVvie award for excellence in search system analysis. This year’s number two winner was Martin Baumgartel for his paper “Advanced Visualization of Search Results: More Risks, or More Chances?”. You can read the full text here and a brief interview with him here. You will want to read Mr. Baumgartel’s paper and then the Fast Company article called “Is Information Visualization the Next Frontier for Design?” here by Michael Cannell.
These two write ups point out the difference between a researcher’s approach to the role of visualization and a journalist’s approach to the subject. In a nutshell, visualization works in certain situations. In most information retrieval applications, visualization is not a benefit.
The Fast Company approach hits on the eye candy value of visualization. The title wisely references design. Visualization as illustration can enhance a page of text. Visualization as an aid to information analysis may not deliver the value desired.
Which side of the fence is right for your search system? Selective use of visualization or eye candy? The addled goose favors selectivity. Most visualizations of data distract me, but your response may be different. Functionality, not arts and crafts, appeals to the addled goose.
Stephen Arnold, May 7, 2009
Yahoo Snags a Google Guru
May 7, 2009
A happy quack to the reader in Europe who alerted me to Yahoo’s hiring a Google guru: Dr. Yoelle Maarek, previously head of one of Google’s Haifa R&D centers. After a restructuring, Dr. Maarek found herself demoted. She then jumped to Yahoo Israel in Haifa. The move doesn’t require a change in her commute. Yahoo is in the same building as Google; just a couple of floors separate the two facilities. The original story appeared in Globes here.
Stephen Arnold, May 7, 2009
Search 2010: Five Game Changers
May 7, 2009
Editor’s Note: This is the outline of Stephen Arnold’s comments at the “debate” session of the Boye 09 Conference in Philadelphia, Pennsylvania, on May 6, 2009. The actual talk will be informal, and these notes are part of the preparation for that talk.
Introduction
Thank you for inviting me to share my ideas with you. I remember that WC Fields had a love hate relationship with Philadelphia. Approaching the Curtis Building, where we are meeting, I realized that much of the old way of doing business has changed. I don’t have time to dig too deeply into the many content challenges organizations face. If the publisher of the Saturday Evening Post were with us this afternoon, I think Mr. Curtis would have a difficult time explaining why his successful business was marginalized; that is, pushed aside, made into an artifact like the Liberty Bell down the street.
I have been asked to do a “Search 2010” talk twice this year. Predicting the future in today’s troubled economic environment is difficult. Nevertheless, I want to identify five trends in the next 20 minutes. I will try to take a position on each trend to challenge the panelists’ thinking and stimulate questions from you in the audience.
Let’s dive right in. Here are the five trends:
- Darwinism and search
- Real time search
- Google’s enterprise push
- Microsoft’s enterprise search
- Open source
I want to comment on each, offer a couple of examples, and try to come at these subjects in a way that highlights what my research for Google: The Digital Gutenberg revealed as substantive actions in search.
Search and Darwin
The search sector is in a terrible position. The term “search” has been devalued. Few people know what the word means, yet most people say, “I am pretty good at search.” That confidence is an illusion. The search sector is a tough nut to crack. Well known companies such as Mondosoft and Ontolica found themselves purchased by an entrepreneur. That company restructured, and now the “old” Mondosoft has been reincarnated, but it is not clear that the new owners will make a success of the business. Delphes, a specialist vendor in Québec, failed. Attensity orchestrated a roll up with two German firms to become more of a force in marketing. A promising system in the Netherlands called Teezir was closed when I visited the office in November 2009. I frequently hear rumors about search vendors who are chasing funding, but I don’t want to mention the names of some of these well known firms in this forum. Not long ago, the high profile Endeca sought support in the form of investments from Intel and SAP’s venture arm. At Oracle, the Secure Enterprise Search 10g product has largely disappeared. The strong survive, which means big players like Google and Microsoft are going to be fighting for the available revenue.
Real Time Search
What is it? The first thing to say is that real time search is a terrible phrase. Riches await the person who crafts a more appropriate buzzword. The notion is that messages from a service like Twitter fly around in their 140 character glory. The Twitter search system at http://search.twitter.com or the developers who use the Twitter API make it easy to find or see information. A good example is the service at http://www.twitturly.com or http://www.tweetmeme.com. You look at Tweets (the name for Twitter messages) and you scan the listings on these services. Real time search blends geospatial and mobile operations. Push, not key word search, complements scanning a list of suggested hits. The mode of user interaction is not keyword search. This is an important distinction.
“Search” means look at or scan. “Search” does not mean type key words and hunt through a results list. It is possible to send a Tweet to everyone on Twitter or to those who follow you and ask a question. You may get an answer, but the point is that the word “search” does not explain the value of this type of system for business intelligence or marketing, for example. If you run a search with the keyword of a company like Google or Yahoo, you can get information which may or may not be accurate or useful. You will see what’s happening “now”, which is the meaning of “real time”.
Microsoft and the Twitter Imperative
May 7, 2009
I found Nicholas Carlson’s “Microsoft Must Buy Twitter” here an interesting analysis. My thought was that deal makers would have a day at the State Fair if Apple, Google, and Microsoft began a bidding war over Twitter.com. Mr. Carlson offers five reasons why Microsoft has a Twitter imperative. I can’t reproduce the five points here, but I can comment on two of them and invite you to navigate to Mr. Carlson’s article to get the full story.
Mr. Carlson suggests that Microsoft can’t make its dream of attending Google’s funeral a reality with Yahoo as its principal weapon. Microsoft needs the T bomb; that is, the Twitter user base, buzz, and monetization opportunity. I find the idea intriguing, but Microsoft has not made any progress in closing the gap between itself and Google in Web search. Now the GOOG is aiming at Microsoft’s enterprise business. Twitter could be, in my opinion, an expensive distraction that leaves Microsoft vulnerable in a business sector it can ill afford to see slip downhill.
Twitter, Mr. Carlson implies, will get more expensive. So, buy now and save. Twitter is definitely hot at this moment. The challenge for a company like Microsoft is to acquire something hot and then prevent it from getting cold. Hot properties in the hands of big, slow moving entities often lose their zippiness. My hunch is that if Microsoft owned Twitter, Twitter would be surpassed by another real time messaging service and quickly.
Microsoft may buy Twitter. That opens the door for another Twitter and Microsoft is poorer and more vulnerable as a result. Microsoft needs to leapfrog to stay where it is. An acquisition won’t do the job in my opinion.
Stephen Arnold, May 7, 2009
Inside Microsoft Search
May 7, 2009
The Register ran an intriguing article called “Microsoft’s New Search – Built on Open-Source” here by Cade Metz. The article stated:
In July of last year, Microsoft acquired Powerset, a San Francisco startup intent on bringing natural language processing to web search. And like the original Hotmail, the startup’s semantic search engine leans heavily on open source code.
Mr. Metz asserted that:
Powerset generates its search index via Hadoop, the same open-source distributed computing platform that juices Yahoo!’s search engine. Based on Google’s MapReduce distributed computing platform and GFS file system, Hadoop was originally developed by open-source maven Doug Cutting, now on the Yahoo! payroll. But it was Powerset that originated Hadoop’s HBase project, an effort to mimic Google’s famous distributed storage system, BigTable.
You will want to read the original story to get the full analysis. I want to highlight a scintillating sentence: “And it’s [Hadoop] the bastard child of the Google Chocolate Factory.”
My thoughts swirled when I read this write up. I recalled hearing that open source had been used in the Fast Search & Transfer system too. I don’t know what to think about this article. Quite a challenging story.
Stephen Arnold, May 7, 2009
Open Text Vignette: Canadian Omelet
May 6, 2009
A happy quack to the reader who alerted me to this big buck ($300 million plus) deal for Open Text to purchase the financially challenged content management vendor Vignette. You can read the Canadian press take on the announcement here. This story is hosted by Google, so it may disappear after a short time. I recall seeing a story by Matt Asay in August 2008 here that the two companies were talking. Well, the deal appears to be done. Open Text is a vendor with a collection of search systems, including Tim Bray’s SGML engine, Information Dimension’s BASIS, the mainframe centric BRS search, the Fulcrum system, and some built in query systems that came with acquisitions. Vignette, on the other hand, is a complex and expensive content platform. The company has some who love it and some like me who think that it is a pretty messy bowl of clam linguini. The question is, “What will Open Text do with Vignette?” Autonomy snagged Interwoven, snagged some up sell prospects, and fattened its eDiscovery calf. Open Text has systems that can manage content. Can Open Text manage the money losing Vignette? Autonomy in my opinion is pressuring Open Text. Open Text now has to manage the Vignette system and marshal its forces against the aggressive Autonomy. Joining me on the skepticism skateboard is ZDNet’s Stephen Powers. He wrote “Can Open Text Turn the Page on Vignette’s Recent History?” here. He wrote:
The other interesting question raised by this announcement: what to do about the Vignette brand? The press release states that Vignette will be run as a wholly-owned subsidiary. But will Open Text continue to invest in what some argue is a damaged brand? Or will they eventually go through a rebranding, as they did with their other ECM acquisitions, and retire the purple logo? Time will tell.
Mr. Powers is gentle. I think the complexity of the Vignette system, its money losing 2008, and the push back some of its licensees have directed at the company are significant problems. Does Open Text have the management skill and the resources to deal with integration, support, and product stability issues? Will Open Text customers grasp the difference between Open Text’s overlapping product lines?
My hunch is that this deal is viewed by Autonomy as an opportunity, not a barrier.
Stephen Arnold, May 6, 2009
Microsoft and Disruption in Search
May 6, 2009
Ina Fried has written an interesting article called “Ballmer: We Need to Be More Disruptive in Search” here. Disruption and Microsoft are not two words I associate. Google and disruption, in my opinion, have the stickiness of peanut butter and jelly. Ms. Fried wrote an article that presented a number of statements made by Mr. Ballmer in a talk at Stanford University, home of the Gates center and the Paul Allen network facility. Oh, Stanford was the stomping ground for Messrs. Brin and Page. What I liked about Ms. Fried’s write up was her pulling out verbal bullets. I don’t want to cite the statements attributed to Mr. Ballmer in their entirety, but I want to highlight two remarks:
First, “Ballmer said Microsoft can’t afford to outspend Google in the search business or participate in each facet of the business.” For all the talk about Microsoft’s billions, the company has accepted the reality of the cost and time barriers that Google has erected. Unable to spend to catch up or, better yet, leapfrog Google, Microsoft appears to admit defeat. I got some pushback when I asserted last year at a conference in San Jose that Google had won in search. Well, now Mr. Ballmer seems to be reaching the same conclusion.
Second, this alleged statement: “We can experiment with new business models. We have less to lose than the market leader does.” The remark gave me pause. I am not sure that Microsoft has “less to lose”. Microsoft has multiple revenue streams but only a couple produce the cash surplus that keeps shareholders somewhat happy. Google has one revenue stream, advertising. Missteps for either company can have a very big downside. Microsoft is a $65 billion plus company. Google is a paltry $20 billion. In terms of financial downside, Microsoft seems to me to have more at risk. If Google makes headway in the enterprise and with Google Apps, Microsoft may face a Rubicon in the data world. Stopping or moving forward could be equally risky.
Check out the original.
Stephen Arnold, May 6, 2009
CMS Experts and Vendors May Be Floundering
May 6, 2009
I had a very unsettling conversation with a young man who recently set up his own content management consulting firm. I met him when I arrived to register for the Boye 09 conference in Philadelphia. I won’t reveal his name or his consulting firm. I do want to highlight three comments he made when we spoke yesterday afternoon and offer a comment about the implications for CMS. When I read “There Was Much Noise about the Closure of Tripod, Sites.Google, Geocities”, I realized the changing of the guard was as much about the failure of CMS as about new ways to tame the bull of electronic information, particularly in an organization.
My three questions:
First, the individual said that he had worked for an integration company that had been hit by the financial downturn. The integrator had little choice: reduce staff or shut its doors forever. The company provided a range of technical and management services to publishing companies wanting to better manage their content. With the rumors of cut backs at some of the US based information consulting companies along with the reduction in force at Capgemini in India I wrote about here, this news was not surprising. It did indicate that technology advisors are not indispensable. Everyone, it seems, is dispensable, sort of like Kleenex. I am not a people person, but even I could sense that the individual with whom I was speaking was shocked at the change in his employer’s fortunes.
The question that raced through my mind this morning was, “Why should people who work for small service firms be surprised when the top brass has to reduce costs quickly?” I find it difficult to escape the economic news. Perhaps those in service companies in technology fields perceive themselves as insulated from the difficulties the auto industry faces, for instance?
Second, the individual told me that he decided to become a consultant and explore new opportunities. I think this is an excellent strategy. My concern this morning arose from my realization that this young person did not have the benefit of doing hard time at one of the blue chip consulting companies. Second and third tier consulting companies use bright people, but if those people don’t learn the basics of building a client base, marketing expertise, and pricing to win jobs while making a profit, the risk of failure goes up. Even those used to the safety of the Bain or Boston Consulting safety net can and do fail when setting off on their own without a logo that people recognize on their business card.
My mind asked this question, “What type of training or educational experiences are needed to get a bright young person into a consulting business with adequate knowledge to deal with the rigors of this profession?” I attended a Booz, Allen “charm school”, but I was fortunate. I think this young man needs that type of experience.
Third, a young person entering consulting has to have skills that cause people to part with their money. As I thought about this person’s description of his background, I thought it sounded good. After all, most organizations have big content problems.
This morning, however, I realized that the young man was using Harvard MBA speak to explain his expertise. The notions of “best practices”, “project management,” and “strategy” are ones that are quite difficult to deliver in a successful, profitable way to skeptical clients.
Now what’s this have to do with content management?
I think that as information gains more prominence as a strategic asset, CMS systems and consultants are getting into increasingly hot water. A software package that organizes organizational writing is useful, but it is not a system that creates information that is a strategic asset.
Judging from the comments in these sessions, many CMS experts and attendees are trying to keep their heads above water. CMS costs are rising. Information is increasingly difficult to manage. The top guns in an organization want information to pay dividends. CMS is on the firing line with no bullet proof vest or much in the way of ammunition to defend itself against irate users and cost watching financial officers. Open source solutions like Drupal may be one path to explore, but I think the boundaries between information value and CMS may swamp this sector and some of the leading players.
In short, CMS, like enterprise search, seems to be a troubled software sector.
Stephen Arnold, May 6, 2009
Google: A Victim of Success
May 6, 2009
If you are a person who sees Google as a victim, you may enjoy James B. Stewart’s essay “Few Match Google. Does that Make It a Monopoly?” here. This link may go dead at any time, but the hard copy version of this story appeared in the May 5, 2009, Wall Street Journal. The author argued that Google has been successful. As a result, Google should not be singled out for regulatory or any other action that would impede the company. For me, the most interesting comment in the write up was:
Google is indisputably a victim of its own success. Its market share of Internet search has continued to rise steadily, encompassing roughly two-thirds of total searches. At 76%, its share of search advertising is even higher, thanks to Google’s technological prowess at matching ads to people’s search queries. Given the accompanying high profit margins on this lucrative business, Google displays the telltale characteristics of a monopolist: high, even dominant market share, with high profits and pricing power that are evidence of high barriers to entry for competitors.
This is the MBA argument, and we know how well MBAs handled the financial industry.
The problem, in my opinion, is not regulatory. The problem is that Google is a new type of industrial enterprise. I argue that it is not a single firm. Google is something like the invention of the printing press. The company is transformational. Regulators have trouble with known entities such as utilities and pharmaceutical firms. The SEC was clueless when it came to Bernie Madoff’s alleged Ponzi scheme. Regulators are likely to be at sea without a life raft when it comes to understanding the “digital Gutenberg”. Should we trust MBAs? Should we trust Google? Ideas?
Stephen Arnold, May 6, 2009
Patricians, Cesspools and Rubber Boots
May 6, 2009
I became interested in ancient technology by accident. I ignored history, particularly ancient history, in college. Too fuzzy. A few years ago on a tour with some friends we were in a ruin somewhere in Turkey. I looked down and saw an exposed clay pipe. I asked the tour guide what the pipe was, and she replied, “The wealthy citizens had running water.” The ruins dated from the 800 BCE period, and I assumed that the folks who lived there used the equivalent of outhouses. I was wrong. Someone had figured out how to make pipes and install indoor plumbing.
There were two classes of residents. Some had indoor plumbing and lived the good life. Others had a less good life. Patricians have used plumbing as one way to distinguish themselves from people like me. My ancestors had to use outdoor plumbing.
The image of the cesspool, then, is one that makes clear that there are two classes of people – an upper class and a lower class. When I read about metaphors invoking cesspools, I think about the class distinctions that are evident in the ruins of ancient cities. Those cities were much like the modern one in which I live. Order and disorder collide, and institutions make an attempt to prevent chaos from dominating simple activities like driving to work or shopping.
I thought about this dividing line when I read Jim Spanfeller’s “What Google Can Do to Make the Web Less of a Cesspool” here. The article makes some interesting points, yet I was troubled by the assertion that a commercial enterprise can and should assume responsibility for information. A commercial enterprise has, by definition, finite resources, and the clean up is likely to be a large job. Information is created by many, which leads to the implicit idea that a larger entity should become the janitor. If not Google, who? Well, one candidate is “the government” or perhaps a group of really smart and good people who will act as the government’s agent. Now we are back to the plumbing in the ancient city. As long as the patricians can keep the mess from their premises, life improves.
Mr. Spanfeller wrote:
At Forbes.com, we have estimated that Google makes roughly $60 million a year directing folks to our site. And by the way, 40 percent of those dollars are derived from the search terms of Forbes, Forbes.com or Forbes Magazine—simple navigation. Seems like a very nice chunk of change for simply being there. In the end, in attempting to “do no evil,” Google has done exactly that. I say this not just as someone running a content site but also as an end user. If this inequity of support continues along these lines, we will see a continuing destruction of our journalistic enterprises—enterprises that are one of the core building blocks of our democracy. Last year, while addressing the magazine publishers and editors of the MPA at the Google Campus, Eric Schmidt suggested that the web was a “cesspool” and that it was up to the major journalistic brands to clean it up. Well Eric, in a great many ways, Google has helped to create that cesspool, and as such I would hope that it can be part of the solution.
The idea that free flowing information can be cleaned up is an interesting one. I don’t think the job will be easy. I don’t think patricians from the dead tree world or from the online world are up to the task. We have a new context in which digital information is not a cause, but a consequence of our present way of existing.
The cesspool arguments are tempests in a chamber pot. Once the information flows outside the boundaries of restricted and tightly controlled channels, the information cannot be put back into those old containers. Get your boots on may be a better way to approach the situation.
Large flows of information, cesspools, and a digital mess are the characteristics of this time and place. The combined efforts of Forbes and Google will have little substantive impact. In fact, neither of these companies and not even the governments of Australia and China can contain the information flows, but these nation states keep trying to shut Pandora’s box.
The patricians want the good old days, but those days are gone. What’s interesting is that the newer modes of communication may be permanently outside the span of control of the Google and, it seems to me, existing mechanisms for control of human behavior. The fix is to abandon computers, electricity, and all things digital.
Mr. Spanfeller’s argument and his plea for Google to do its part are like the scholars’ reading of ancient texts: interesting, maybe intellectually satisfying, but markers of an era lost to history. Rubber boots, on the other hand, are practical, work reasonably well, and put the responsibility in the hands of an individual, not in fantasyland. Perhaps I will tweet that boot idea?
Stephen Arnold, May 6, 2009