Google and Capillary Action

July 2, 2008

I think it was Dr. Snow’s Biology 101 class in 1962 when I had to perform an experiment related to capillary action. Capillary action, as I recall, is the ability of a substance to draw another substance into it. My experiment involved a beaker of some foul smelling substance, a chunk of a mop, and a scale. I had to calculate how quickly the stinky stuff moved from the beaker into the mop. I did the experiment, got an A, and continued through life indifferent to this fundamental physical principle so essential to life.

InfoWorld, a great online publication compared to its last days as a failing print publication, has an important essay “Can Google Apps Move Up Market?” The author is Tom Kaneshige, and he does a good job of explaining that Google Apps, while not quite toy applications, are likely to face some resistance in organizations. The most important observation in his write up for me was:

Although Google Apps may carve out niches, it’s unlikely that basic applications in the cloud will play a major role in the way giants of industry conduct business. Imagine sensitive business documents being shared in the cloud without comprehensive enterprise controls. Not only is Google Apps not ready … companies aren’t either.

I don’t want to dispute the InfoWorld essay. I agree with most of its points.

However, I think one important observation may be germane. Google is working like a little beaver to get developers to create software for Google. Google is dating Salesforce.com. There’s the Android initiative. There’s the Google partner ecosystem cranking out scripts via the OneBox API. There’s the mapping crowd extending Google’s ubiquitous geospatial footprint. Developers are a longer term investment, but over a two or three year span, Google’s jejune developer program will have an impact.

Also, Google, as you probably are aware, is chomping on the wooden doors at colleges and universities. I was surprised when I met a person from Arizona State University who said to me in April 2008, “Google is all over the campus. It’s Gmail. It’s Google Calendar. It’s all Google all the time.” ASU is not alone. The GOOG has its snout into more than 300 major academic institutions. One deal is for 1.5 million students someplace in Australia that I wrote about here.

Google’s approach to the enterprise is a variant of capillary action. As these seemingly uncoordinated activities take place, time–not technology or aggressive salesmanship–will deliver for Google. Google is betting that as its most avid developers mature and its college users enter the work force, these folks will pull Google along. Why beat your head against a concrete wall as Mr. Ballmer did in one of his famous motivational presentations? Why not let capillary action pull Google Apps, the Google Search Appliance, and Google data management services into organizations? It’s easier and doesn’t create YouTube.com video moments.

Stephen Arnold, July 2, 2008

Answering Questions: Holy Grail or Wholly Frustrating

July 2, 2008

The cat is out of the bag. Microsoft has acquired Powerset for $100 million. You can read the official announcement here. The most important part of the announcement to me was:

We know today that roughly a third of searches don’t get answered on the first search and first click…These problems exist because search engines today primarily match words in a search to words on a webpage [sic]. We can solve these problems by working to understand the intent behind each search and the concepts and meaning embedded in a webpage [sic]. Doing so, we can innovate in the quality of the search results, in the flexibility with which searchers can phrase their queries, and in the search user experience. We will use knowledge extracted from webpages [sic] to improve the result descriptions and provide new tools to help customers search better.

I agree. The problem is that delivering on these results is akin to an archaeologist finding the Holy Grail. In my experience, delivering “answers” and “better results” can be wholly frustrating. Don’t believe me? Just take a look at what happened to AskJeeves.com or any of the other semantic / natural language search systems. In fact, doubt is not evident in the dozens of posts about this topic on Techmeme.com this morning.

So, I’m going to offer a different view. I think the same problems will haunt Microsoft as it works to integrate Powerset technology into its various Live.com offerings.

Answering Questions: Circa 1996

In the mid 1990s, Ask Jeeves differentiated itself from the search leaders with its ability to answer questions. Well, some questions. The system worked for this query which I dredged from my files:

What’s the weather in Chicago, Illinois?

At the time, the approach was billed as natural language processing. Google does not maintain comprehensive historical records in its public-facing index. But you can find some information about the original system here or in the Wikipedia entry here.

How did a start up in the mid-1990s answer a user’s questions online? Computers were slow by today’s standards and expensive. Programming was time consuming. There were no tools comparable to Python or Web services. Bandwidth was expensive, and modems chugged along south of 56 kilobits per second, eagerly slowing down in the course of a dial up session.

jeeves 1997

I have no inside knowledge about AskJeeves.com’s technology, but over the years, I have pieced together some information that allows me to characterize how AskJeeves.com delivered NLP (natural language processing) magic.

Humans.

AskJeeves.com compiled a list of frequently asked questions. Humans wrote answers. Programmers put data into database tables. Scripts parsed the user’s query and matched it to the answers in the tables. The real magic, from my point of view, was that AskJeeves.com updated the weather table, so when the system received my query “What is the weather in Chicago, Illinois?”, the system would pull the data from the weather table and display an answer. The system also showed links to weather sites in case the answer part was incorrect or not what the user wanted.

Over time, AskJeeves.com monitored what questions users asked and added these to the system.

What happened when the system received a query that could not be matched to a canned answer in a data table? The system picked the closest question to what the user asked and displayed that answer. So a question such as “What is the square of aleph zero plus N?” generated an answer along the lines “The Cubs won the pennant in 1918?” or some equally crazy answer.
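To make the mechanics concrete, here is a minimal Python sketch of a canned question system of the kind described above. I have no inside knowledge of the real code, so everything here is an assumption made for illustration: the table contents, the function names, and the word overlap scoring are all hypothetical. The sketch shows a human-written answer table, a separately refreshed weather table, and a matcher with no "I don't know" option.

```python
import re

# Hypothetical weather table that a background job would refresh.
WEATHER_TABLE = {"chicago illinois": "72 F and sunny"}

# Hypothetical answer table: canned question -> function producing the answer.
# Humans write the questions and answers; the weather answer reads live data.
ANSWER_TABLE = {
    "what is the weather in chicago illinois":
        lambda: "Chicago, Illinois: " + WEATHER_TABLE["chicago illinois"],
    "who won the pennant in 1918":
        lambda: "The Cubs won the pennant in 1918.",
}


def normalize(query: str) -> str:
    """Lowercase, expand "what's", strip punctuation, collapse spaces."""
    text = query.lower().replace("\u2019", "'").replace("what's", "what is")
    text = re.sub(r"[^a-z0-9 ]", " ", text)
    return " ".join(text.split())


def similarity(a: str, b: str) -> float:
    """Word overlap (Jaccard) between two normalized questions."""
    words_a, words_b = set(a.split()), set(b.split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def answer(query: str) -> str:
    """Return the answer for the closest canned question.

    There is deliberately no minimum score, so an unrelated query
    still gets the "closest" canned answer.
    """
    q = normalize(query)
    best = max(ANSWER_TABLE, key=lambda canned: similarity(q, canned))
    return ANSWER_TABLE[best]()


if __name__ == "__main__":
    print(answer("What's the weather in Chicago, Illinois?"))
    print(answer("What is the square of aleph zero plus N?"))
```

In this toy version the aleph zero question comes back with the Chicago weather, because the only words it shares with any canned question are "what is the". A minimum score threshold would let the system decline to answer, but then a human has to write yet another canned question, which is where the expense comes in.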

AskJeeves.com discovered several facts about its approach to natural language processing:

  1. Humans were expensive. AskJeeves.com burned cash. The company tried to apply its canned question answering system to customer support and ended up part of the Barry Diller empire. Humans can answer questions, but the expense of paying humans to craft templates, create answer tables, and code the system was too high then and remains cash hungry today.
  2. Humans asked questions but did not really mean what they asked. Humans are perverse. A question like “What’s a good bar in San Francisco?” can go off the rails in many ways. For example, what type of bar does the user require? Biker, rock, blue collar? What’s San Francisco? Mission, Sunset, or Powell Street? The problem with answering questions, then, is that humans often have a tough time formulating the right question.
  3. Information changes. The answer today may not be the answer tomorrow. A system, therefore, has to have some way of knowing what the “right” answer is in the moment. As it turns out, the notion of “real time”–that is, accurate information at this moment–is an interesting challenge. In terms of stock prices, the “now quote” costs money. The quote from yesterday’s closing bell is free. Not only is it tricky to keep the index fresh, to have current information may impose additional costs.

This mini-case sheds light on two challenges in natural language processing.

Read more

ZDNet Says, Powerset Won’t Change the Search Equation

July 2, 2008

Larry Dignan has another good essay, “Microsoft’s Search Plan: It’s about Semantics and Possibly for Naught”. You can read the full essay here. Mr. Dignan believes that Microsoft gets some smart people and maybe a boost. He concludes:

However, Microsoft can reinvent search, but it’s still running up a natural Google monopoly. The analogy here is Windows: Microsoft didn’t have the best operating system on the planet. It just had the best positioned one. In search, the tables are turned in Google’s favor. I don’t see how Powerset will change that equation.

He is correct and diplomatic. My view is that semantic technology may help Microsoft with certain narrow functions. But applying the Powerset technology across the 12 billion Web pages that Microsoft says it has indexed will take some clever engineering. Semantic technology has to operate on the source content and figure out what the heck the user means. Google uses short cuts even though it has some serious semantic brainpower at the Googleplex. It is not just technology; it is plumbing that can be scaled economically and operated with tight cost controls.

Microsoft has money, but I am not sure it has enough time. The Google keeps lumbering forward. Microsoft has to find a way to jump over Google and take the high ground. Catching up won’t work. This is the calculus of Microsoft’s search challenge.

Stephen Arnold, July 2, 2008

Search: An Old Taxi with a Faux Cow Hide Interior

July 2, 2008

The last time I was in a big city I hailed a taxi. What a clunker. It smelled of fast food, incense, and hot plastic. One fender was dented and the curb side door would not open. The window would not go down. “She dead,” smiled the driver. The interior of the taxi had a set of blinking lights popular at holiday times. The taxi was a mess, but the faux cow interior was unusual. The lights were working.

cow interior

Thanks to ABC Australia for the photo. The original is here. http://www.abc.net.au/news/newsitems/200610/s1770336.htm

I have been clicking and scanning the opinions about the Microsoft Powerset deal. Scanning the links at Congoo.com, Megite.com, and Techmeme.com will take a long time. I have been a slacker, clicking at random and looking for some substantive news.

Why is search like a lousy taxi with a useless faux cow hide interior?

My thought for this evening is that search is string matching. The other functions are ways to:

  • Make it easier for a busy person who does not have time or the desire to read a traditional document; that is, a multi page report.
  • Show the user what is available and push the user toward that information. The user, who doesn’t want to make this effort, will let the software do the work.
  • Support a user who is not too swift when it comes to thinking about abstract digital data.
  • Reduce the time a user spends fumbling for information.
  • Put training wheels on a worker who forgets work processes the way I forget where I put my automobile keys five minutes ago.

What’s happening is that key word search, string matching, and its kissing cousin Boolean are the lousy taxi. Good enough but not too pleasant.

The cow interior for search consists of these types of enhancements:

  • Assisted navigation, a fancy term for Use For and See also references
  • Clustering, putting like things together in a folder or under a heading
  • Discovery, an interface that provides an overview of information
  • Semantic search, a system that figures out what you mean when you type a two word query
  • Natural language processing, a term that now means answering a question, assuming that someone takes the time to think up a question and type it into a search box
  • Dashboards, a report that has panels or containers, each containing different information. Some dashboards look like speedometers with text; others can be quite fanciful.
  • Access to metadata about what person in an organization gets the most email about a specific technical issue. This type of monitoring and analysis is now called social search because surveillance is not politically correct in many circles.

You get the idea.

Possible impacts

Let’s consider the consequences.

First, enterprise search is complicated. Today I spoke with an enthusiastic and young professional. The call touched upon creating a plan for enterprise search. Like most organizations, this outfit has three separate enterprise search systems. None work all that well, so the phone rings. This is a common situation, and I am not too optimistic that enterprise search will work very well when there are competing factions each with a favorite search engine to support. Adding whizzy new functionality adds to the cost and complexity, and I am not convinced users want to do much more than find the needed information and move on to another task.

Read more

Noetix Search

July 1, 2008

Noetix Search, based in Redmond, has announced its Noetix Search system. The system supports Oracle and PeopleSoft. The idea is that Noetix generates reports, not laundry lists of results, when querying Oracle tables and PeopleSoft information stores. The company says:

Noetix Search is a web-based search application that enables Noetix users to quickly identify [sic]  the right Noetix views for their reports via a familiar, intuitive user interface. Using their favorite browser (including Internet Explorer, Firefox, and Safari), users enter a search term and are able to browse logical classifications of the most accurate results available for their specific data request.

Noetix Search takes a template-based approach to reports. The idea is that users can sort and filter, explore related information, and search across view names, view column names, table names, and table column names. Noetix Search works as a cloud-based service and integrates with third-party query tools, including Microsoft Excel. The full text of the announcement of Noetix Search is here.

Noetix has landed customers eager to make Oracle data more accessible. Customers include Cummins, Motorola, Starbucks, Toshiba, Welch’s, and Visa, among others. The company was founded in 2000 and has funding from Polaris Venture Partners and Sigma Partners.

My thought is that the word “search” may not be the appropriate one to describe this particular application. You can obtain more information about the company at its Web site http://www.noetix.com.

Stephen Arnold, July 1, 2008

Business Week: Microsoft’s Search Moves Analyzed

July 1, 2008

Catherine Holahan’s “Microsoft’s Plan B for Search” popped into my news reader this morning. The interesting essay–almost a business school-type write up–appears on the Business Week Web site here. I think the story will appear in some form in the hard copy magazine, but I read the online version this morning, July 1, 2008.

Ms. Holahan looks at the alleged Powerset buy out by Microsoft. The “Plan B” is acquiring additional search technologies in the aftermath of Redmond’s failed Yahoo deal. Her analysis is closely reasoned, so it is difficult for me to summarize the argument.

I did find one point of particular interest; that is:

Rather than focus on creating one consumer-facing site capable of answering any query, like Google has, Microsoft has split its search engine into specific categories—a comparison-shopping engine, Microsoft Live Cashback; a travel search engine, Farecast; and a health-specific search engine, health.live.com. Today, semantic search engines do best with such category-specific searches, which help them to scan a smaller set of pages in detail. Scanning the entire Web in that much detail is difficult to do quickly.

Business Week has done a good job of explaining that Microsoft has a more fractionalized approach to search than Google. Keep in mind, however, that Google is not a single piece of digital cloth. There are different search mechanisms in operation at Google; specifically, the search system used for Google Base differs from the search system used for the search box on Google.com. The Google Search Appliance is also moving in its own direction.

In general, I applaud Ms. Holahan for identifying the different initiatives within Microsoft. She has also identified two other interesting “semantic engines”. The first is Hakia, a company that offers a “Compare Hakia” function here and the Berggi Search for mobile devices other than the BlackBerry or iPhone, which limits the market somewhat. Hakia is working to generate the type of buzz that Powerset’s team found so effortless. She also mentions Expert System, based in Modena, Italy, and founded in 1989. The firm has beefed up its US presence with a new president and a more focused public relations campaign. You can learn more about Expert System here. Expert System has gained some traction for its software components in the mobile search market and has a lower profile in North America than in Italy.

Observations

  1. The buzz about semantic search is gaining pitch and volume. My view is that semantic search is not an end in itself; it is a component of a search system. Vendors of semantic search are likely to find warmer welcomes as utilities or refinement functions within larger constellations of information retrieval methods. I guess I don’t buy the notion of “semantic search”.
  2. The key difference between Google and Microsoft boils down to the fact that Google has been working on its infrastructure for a decade. Without a honking big super computer, semantic technology is tough to implement on [a] large amounts of content and [b] content that changes frequently. The well known problem of updating indexes becomes quite challenging.
  3. Fragmented search is not necessarily a bad thing. But when there are many different search systems, costs become a problem quickly. Each system requires its own technology, engineers, and infrastructure. Google–while not homogeneous–avoided the “pushcart full of junk” approach taken by Yahoo. Microsoft, with its purchase of Fast Search & Transfer, may be unconsciously following the Yahoo model. Google’s approach of greater, not less, search homogeneity is the lower cost path. I was surprised Business Week’s B-school analysis of Microsoft’s Plan B ignored cost as a factor. Cost is a very big deal in search, which is the reason search vendors crash and burn. There’s no money to buy fuel.

Agree? Disagree? Use the comments section of this Web log to inform me of my intellectual short comings.

Stephen Arnold, July 1, 2008

Microsoft Data Centers: Spending Billions

July 1, 2008

GigaOM has another scoop. You can learn about Microsoft’s next-generation data centers in this exclusive video or read a summary of the main points. Both items are available on the GigaOM.com Web site here. The information is quite dense, and I won’t try to summarize it. Navigate to GigaOM and watch the video. The key point for me is this statement:

Microsoft is taking the design of servers into its own hands. “We are doing some unique things in the mother board designs, server designs, and because we are Microsoft, operating systems.”

The quote in the quote is from Microsoft’s corporate VP of global foundation services, Debra Chrapaty. I have a diagram of Microsoft’s data center design circa 2006. If I can find it, I will post it and offer some observations. My research conducted in 2007 for a financial institution indicated that the “old data centers” posed some challenges for Microsoft; namely:

  1. Caching was needed to make Web search fast. The need for expensive edge services contributed to Microsoft’s purchase of Savvis.
  2. Microsoft was using a range of techniques to move data among data centers including some peer to peer technology and a bit of Linux.
  3. SQL Server access was a bit of a bottleneck, so a mechanism was set up to minimize direct access to SQL Server data tables.

The new changes may make Microsoft more competitive. But whatever Microsoft does, the company has to leap frog Google. The Sergey and Larry team has been working on data centers and infrastructure for a decade. Time may be running out for Microsoft to bound over the still growing Googzilla.

Stephen Arnold, July 1, 2008

IBM Search: Circling Back

July 1, 2008

I learned that an engineer named Michael Moran worked on IBM’s public facing search system for many years. You can read about this person’s contributions here. (Click this link quickly. The Yahoo news disappears faster than Yahooligans resign.) Mr. Moran has left IBM to join Converseon, a social media company. I hope there was no connection between his departure and my critique of IBM’s Web search system, Planetwide. But it is pretty terrible. Because Mr. Moran will not join Converseon until September 2008, he has time to tweak Planetwide and IBM’s e-commerce sub system as well.

To be fair to Big Blue, I dived back into the Web site. This time I focused on buying something. Providing an e-commerce function seems a reasonable expectation. Plus I own NetFinity 5500 servers, and I sometimes need parts.

Let’s take a look. You can look at these tiny WordPress processed screen shots or navigate to the e-commerce splash page and run this sample query.

Finding the Store Front

On the www.ibm.com splash page is a tab labeled “Shop For”. So far, so good. I click the tab and the drop down bar displays my choices.

ecom_choices

I decide to shop for a workstation. Years ago I owned a ZPro workstation, and it was a workhorse. The case fell apart, but the guts kept on ticking for years.

Here’s the page for workstations. Remember. I want to buy something.

ecom_worksplash-02

Instead of an Amazon or eBay like listing, I see a picture of a workstation. Okay, I click on the smaller workstations. The system shows me more text. Here is the product information for the Unix workstations that I wanted to buy, but I am now getting frustrated. Where are the products? Dell Computer in its darkest days with its sluggish e-commerce search system does better than this. Amazon, despite the baloney promoting the Kindle and showing me crazy recommendations, lets me get to products. Not IBM. The pages look alike. In fact, I am not sure that the display has changed. I like consistency, but I also like to see products.

ecom-page 3

I wade through the text in the center column under the picture and I click on IBM Intellistation POWER 265 Express. I get this screen:

ecom_description 04

More choices and more text. I scroll to the bottom of the page and I get a list of features. I am convinced. I scroll back to the top of the page where the “Browse and Buy” button is. I click it, and finally I get some bite-sized information and a price in red no less.

ecom prices 05

Read more

Content Management Vendors: We Do Social Stuff Too

June 30, 2008

After a wonderful flight with exceptional service from caring airline employees, I had to read this headline twice to make sure I wasn’t in some state of delirium. The headline is “Content Management Software Vendors Eye Social Networking”. The essay is authored by Larry Dignan, and he does a great job of catching my attention. This headline and essay are keepers.

The key point to me is:

In other words, social networking will become a generic enterprise feature at some point. These CMS players can develop their own community suites (and hire staff that understands the social types), acquire white label networks or just hang back.

The trigger for this story is a consultant report. I can’t recall which firm stuffed full of pundits came up with this observation, but I think there is some truth to content management vendors’ chasing the rainbow of social search, social content, social chit chat, and social anything.

The reason is not far to seek. CMS is a faux application that often doesn’t work very well and always costs a lot more than the customer anticipated. I used to write about CMS applications, but after I had to do some clean up in two Federal agencies when systems went off the rails, I just stopped paying attention to the vendors in this software sector.

Content management is software that tries to convert companies that don’t know much about publishing into publishers. As part of the deal, employees who are not skilled writers will get some help to become more information literate. The CMS then tries to keep track of versions, enforce security, output Web pages, and perform levitation.

Why not include social functions? Social software is as much a part of CMS as any other software function. If you can’t make a system better, just make it bigger, more complex, and more trendy. The reason enterprise publishing systems are gaining traction is a result of the opportunities CMS has created with their over-hyped assertions.

Enterprise search is disappointing. CMS is disappointing as well. Instead of delivering a solution that works, just add social features. Sounds like the enterprise software industry is up

Agree? Disagree?

Stephen Arnold, June 30, 2008

The Jab-Google Bandwagon Rolls On

June 30, 2008

Phil Wainewright, whose writing I enjoy, wrote “Google’s Culture Not Fit for Enterprise Apps.” The essay appeared on June 30, 2008. You can find the full text here. Xooglers have been picking on the search dominator, and I posted a link to a story that I thought might be a spoof here. Apparently it wasn’t a spoof, based on some email feedback I received this weekend, but I remain skeptical. What I am sure about is that criticism about Google seems to be on the uptick, and I am not sure why. The company has been consistent in its behavior for years. The biggest change is the company’s increased “transparency”. Googlers are everywhere: at conferences, in the news, on Web casts. Everywhere I look, there’s the GOOG.

Now, to Mr. Wainewright. He is actually picking up on the theme that Xooglers–that is, employees who cash out, quit, or get fired–are revealing that Mother Google has some idiosyncrasies. The key paragraph for me in Mr. Wainewright’s well-written essay was:

It’s a damning indictment, and one that casts a long shadow over Google’s attempts to replace Microsoft’s pre-eminence in the office collaboration software market with its Google Apps suite. As a disruptive competitor, it doesn’t have to match Microsoft Office feature-for-feature. But if it really is unreliable and buggy as Solyanik claims — and the current outage of Feedburner’s Web analytics service lends further weight to this view — then Google doesn’t even make the grade as a business-class SaaS provider.

Let me offer three observations.

First, demographics will help Google. As Google’s push in the educational institutions increases, future graduates will be comfortable with Google and its characteristics. When these graduates enter the work force, I think some of them will continue using Google or take steps to get Google products and services into the organization. I am not sure quality will have much to do with this sell through. I think habit, loyalty, and the notion that Google is pretty good will have some impact. Ergo, the short term and today’s expectations don’t matter so much.

Second, existing enterprise applications are clunky, disappointing, and costly to deploy, customize and maintain. I think that in a deteriorating economy, Google’s approach or that of its surrogates like Salesforce.com will be good enough. If the price is right, Google has a great opportunity to be pulled into an organization. Sure, traditional outfits and information technology departments may balk. But when money is tight, Google can cut a great deal, and maybe some of those “old fashioned IT professionals” could be rationalized. The systems associated with that crowd may strike some youthful chief financial officers as problems, not solutions.

Third, the competitors in the enterprise space are struggling. Oracle is boosting prices. Microsoft is betting the farm on a polymorphic software solution that is really complicated. (If you have not seen the SharePoint placemat, take a gander. You can find it here.) IBM is a consulting firm with loyal customers who so far have been content to write huge checks for solutions, but in a lousy quarter, IBM could face some pressure from upstarts like Google and its partners.

In short, Google can be baffling. I think that as people learn more about Google, more warts and blotches will become visible. Nevertheless, the GOOG is following its own path. By definition, those who are not Googley cannot be expected to understand how the company works, what it is doing, and when it will take certain actions. The “why” is clear: To make money. The “how” is a baffler, but I think the approach the firm is taking is interesting and more disruptive than many think.

Agree? Disagree? Let me know.

Stephen Arnold, June 30, 2008

Update: July 1, 2008, 9:50 pm Eastern: A round up of Google’s woes is here.
