Google Forms: A Data Snout for a Bigger Creature

April 12, 2008

Navigate to Google’s Webmaster Central Blog. Scan the posting written by two wizards whom you probably don’t know, Alon Halevy (senior wizard) and Jayant Madhavan (slightly less senior wizard). Here’s what you will be told in well-chosen, Googley prose:

In the past few months we have been exploring some HTML forms to try to discover new web pages and URLs that we otherwise couldn’t find and index for users who search on Google. Specifically, when we encounter a <FORM> element on a high-quality site, we might choose to do a small number of queries using the form. For text boxes, our computers automatically choose words from the site that has the form; for select menus, check boxes, and radio buttons on the form, we choose from among the values of the HTML. Having chosen the values for each input, we generate and then try to crawl URLs that correspond to a possible query a user may have made. If we ascertain that the Web page resulting from our query is valid, interesting, and includes content not in our index, we may include it in our index much as we would include any other web page.
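To make the mechanics concrete, here is a minimal Python sketch of the URL-generation step the Googlers describe. It is my illustration, not Google's code. The form URL, the field names, and the candidate values are all hypothetical, and the sketch assumes the form submits with GET so that each query maps to a plain URL.

```python
from itertools import islice, product
from urllib.parse import urlencode


def candidate_urls(action, inputs, limit=5):
    """Build up to `limit` GET URLs, one per combination of input values.

    `action` is the form's target URL; `inputs` maps each field name to the
    values to try (words taken from the site for text boxes, the listed
    options for select menus, check boxes, and radio buttons).
    """
    names = list(inputs)
    combos = product(*(inputs[name] for name in names))
    # Cap the number of queries: the post says "a small number".
    return [
        f"{action}?{urlencode(dict(zip(names, combo)))}"
        for combo in islice(combos, limit)
    ]


# Hypothetical form on a hypothetical site.
form_inputs = {
    "q": ["flights", "fares"],    # text box: words found on the site
    "cabin": ["coach", "first"],  # select menu: values from the HTML
}
for url in candidate_urls("http://example.com/search", form_inputs, limit=3):
    print(url)
```

The cap matters. Without it, a form with a few drop down boxes multiplies into thousands of URLs, which is presumably why Google talks about a small number of queries on high-quality sites.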

The idea is that dynamic content does not usually appear in an index. On the public Internet, this type of content is useful to me. For example, when I want to take a Southwest flight, I have to fill in some annoying Southwest forms, fiddle with drop down boxes, and figure out exactly which fare is likely to let me sit in one of the “choice” seats by boarding first. Wouldn’t it be great to be able to run a query on Google, see the flights aggregated, and from that master list jump to the order form? Dynamic content is now becoming more common.

I heard from one wizard at a conference in London that dynamic content is now more than half of the content appearing on the Web. The shift from static to dynamic is, therefore, a fundamental change in the way Web plumbing works, from Web log content management systems to the sprawling craziness of Amazon.com.

[Figure: PSE diagram]

A diagram from Dr. Guha’s patent applications with the Context Server shown in relation to the other parts of the PSE. This is a figure from Google Version 2.0: The Calculating Predator, published by Infonortics, Ltd., Tetbury, Glou. in July 2007. Infonortics holds the copyright to this study and its contents.


Microsoft Windows: The Report of My Death Was an Exaggeration

April 11, 2008

The Gartner Group, the publicly-traded consultancy, made headlines with its interesting assertion that Microsoft Windows will collapse. For Gartner, as for other blue-chip consulting firms, generating buzz is good business. I know because I worked at one of the bluest-chip firms in the world, Booz, Allen & Hamilton, a quarter century ago.

At a Gartner symposium, Gartner pundits asserted that Microsoft is big, fumbling, and “overburdened”. Therefore–and this is the part I admired–Windows is “collapsing”. My former boss at Booz, Allen would have swizzled the words, but that’s the difference between a blue-chip and a bluest-chip consulting firm.

Please, read the Computerworld story before it disappears from the public Web site. Also, scan the essay at Read Write Web. Both of these summaries provide useful information about the Gartner pundits’ remarks.

The thought that crossed my mind was that a large number of companies in the technology business are floundering. IBM is a baffler with $96 billion in revenues. Microsoft took advantage of IBM’s skepticism about personal computers, and IBM teetered on the precipice until a cookie expert taught the elephant to dance. IBM’s still with us, still pretty confusing to customers and competitors, still in the game.

Hewlett-Packard made what may be the most spectacular non-decision in the history of computing. HP owned AltaVista.com and orphaned it. Along came the Google, which hired Jeffrey Dean and a cast of former AltaVista.com wizards. Messrs. Brin and Page–courtesy of HP–had a once-in-a-lifetime opportunity and were savvy enough to seize it. HP floundered, discovered ink, nuked Ms. Fiorina, and now the company is digesting its $1.2 billion acquisition of Exstream Software. HP is an ink and printing company enabled by technology. HP is in the $100 billion in revenue territory.

For Microsoft to blow a 95 percent share of desktop / notebook operating systems and applications in 24 to 36 months is a big job. If Microsoft tried to make these customers go away, I don’t think the company could do it. My father is in his mid 80s. He has one PC which runs XP. He will never upgrade, and he has minimal trouble. True, he has access to free technical support in the form of my visits. Most of the small businesses with which I am familiar aren’t likely to make any big jump to Macs (too expensive for today’s budget) or to Linux (not for the average bear).

Every year, I get at least one call about IBM mainframes, DEC 20s, and AS/400s. I’m amazed at how much of this hardware is still in use. Microsoft has problems, but what company doesn’t today? Did you use to work at the fifth largest bank in the US? Well, Bear Stearns is history. Software on lots of computers doesn’t collapse in the same way as a bank with other bankers wanting some cash pronto.

Kudos to Gartner for getting more media coverage than the AOL-Yahoo and Google-Yahoo tie ups. Some Microsofties will be annoyed. But there will be plenty of Windows users in 2010 or 2011 when the meltdown, implosion, or collapse occurs. Bet you a bowl of burgoo that Gartner’s wizards will still be using Word and PowerPoint to crank out their prognostications.

Stephen Arnold, April 11, 2008

The Importance of Being First

April 11, 2008

Alex Moskalyuk’s Web log contained a posting on April 10, 2008, that asserted “68 percent of search engine users click on the first page of results.” The story appeared in his Web log on Ziff-Davis’ ZDNet.com site. These data can be tough to find after a few days. Please, access the story and capture the data, which are from iProspect, a unit of the Aegis Group.

I am skeptical of usage data from Internet consultancies and search engine optimization companies. With that caveat in mind, the iProspect data reveal a significant trend in search system user behavior. Specifically, over time–if the data are accurate–users click on the first page of results only. The chart below illustrates this trend:

[Chart: PageClickData]

The top line is climbing, and it means that almost half of the users on Web search systems click on the first page of results. No real surprise, I suppose. The two other lines underscore the fact that fewer and fewer users are working through laundry lists of results. If these data are accurate, information on any page other than the first is not likely to get reviewed by a user.

What’s this mean for enterprise search (sometimes called Intranet search or behind-the-firewall search)? Users won’t spend much time looking for information if it is not slapped in front of their face. Key word search in organizations is generally a push cart filled with items that may or may not be pertinent to the employee’s query. If consumer behavior carries over to enterprise searchers, any system that takes a query such as “Acme proposal” and generates lists of results is going to be annoying.

Enterprise search system users need information to do their jobs, so the laundry list is almost a cinch to be more work than hunting for the needed information in other ways.

The iProspect data have another hook for me. As more young people enter the work force, Web behaviors are going to color their expectations of online search in their employer’s organization. Faced with laundry lists when Google and Microsoft personalize results, using probabilities to deliver a best guess about what’s needed by a particular person, traditional search systems in an enterprise are going to attract fewer and fewer enthusiastic users.

With reports of deep-seated dissatisfaction with traditional enterprise search and content processing systems becoming more widely known, Mr. Moskalyuk’s Web log has provided another chunk of suggestive, interesting data. More details about enterprise search are needed, but in the search business, we have to take what the vendors provide. Like it or not.

Stephen Arnold, April 11, 2008

Enterprise Search: Disappointing and Annoying Users

April 11, 2008

Sinequa Shines a Bright Light in Enterprise Search’s Darkest Corner

Sinequa published the results of a survey of 200 users of enterprise search (sometimes called Intranet search or behind-the-firewall search). Users are dissatisfied and work in “information grave yards.”

Enterprise search is different from Web search. A Web search system such as those available from Google.com or Yahoo.com indexes content on the public Internet. Enterprise search indexes information that an organization has on its own servers. The differences boil down to search technology itself. What works on billions of Web pages is not well suited for the content on an organization’s servers. Sure, there are some gross similarities, but security, the diverse nature of the content, and the existence of many versions of certain documents require special functions generally not implemented in a public search system such as Microsoft’s search.live.com. Microsoft, in an effort to get technology suitable for enterprise search, is paying $1.2 billion for Fast Search & Transfer, a company that has asserted leadership in enterprise search. Fast Search’s executives have 1.2 billion reasons to make that claim.

[Image: Frustrated searcher]

The enterprise sector is an intensely competitive business sector. The definition of the word search itself is fluid and subject to many different shades of meaning. In the last five years, key word retrieval–typing one or two words into a search box–has expanded to embrace point-and-click interfaces. Here’s a point-and-click interface available from Yahoo. Notice the search box. But the most important parts of the Web page are those that contain suggestions, hot links, and information directly germane to a user. In days gone by, this would have been called a portal. Not today. Now these assisted navigation interfaces are called search.

These new interfaces have a large number of moving parts. You can see some of the plumbing in the illustrations accompanying the Entopia and Sagemaker business case analyses published on this Web log. Enterprise search means hugely complex systems that are a cross between a digital Swiss Army knife and a computerized information factory. This combination of very specific tools and a huge, sprawling technical infrastructure makes many of these systems expensive and complicated.

Until the Sinequa study, which corroborated the findings in my new study Beyond Search: What to Do When Your Enterprise Search System Doesn’t Work, few people were aware of the growing dissatisfaction with enterprise search systems. The analyst reports from well-known pundits and the marketing collateral of competitors rarely talk about dissatisfied users. Vendors like Autonomy and Google may snipe at one another, but none of the vendors hints that its systems, or even its competitors’ systems, irritate users, create dissatisfaction, and force employees to find their own search solutions by ordering a Google Search Appliance and saying, “My department will index its own information, thank you.”


ArnoldIT.com Headquarters

April 10, 2008

ArnoldIT.com is delighted to announce that it has moved to new headquarters in Harrod’s Creek, Kentucky.

In response to two questions about the location of Harrod’s Creek, ArnoldIT.com has released a photograph of its spacious, state-of-the-art offices.

Harrod’s Creek is one of North America’s high-technology centers. Our staff filters enterprise search news to separate the goose feathers from the giblets. Contact ArnoldIT.com by writing sa at arnoldit.com.

[Photograph: ArnoldIT.com headquarters]

Stephen Arnold, April 10, 2008

Google App Engine: Googzilla’s Slow, Small Baby Steps

April 10, 2008

Google is reasonably transparent–if you know the angle from which to observe the company. Viewed head on, Google sells only ads, generates billions. On one hand, the company races forward with betas, new engineering initiatives, and acquisitions. On the other hand, Google goes ever so slowly in converting its technological advantages into cold, hard cash.

Wall Street looks at the billions in ad revenue. Notes the $400 million from enterprise sales. End of story. Most Google pundits follow the same track. Its Google Search Appliance is not given much respect by the 150 companies competing in the search and content processing market. The scores of products available are quirks or footnotes to the larger story of Google’s ad revenue.

Gartner and the GOOG: Is Google Failing in the Enterprise?

April 10, 2008

The Ziff Davis / eWeek story stopped me in my tracks. Chris Boulton, a fine, fine journalist, wrote a story the ZD editors called “Gartner: Google Doesn’t Understand the Enterprise”. (Read this story before it disappears from the eWeek Web site.) The hook for the piece is a Gartner professional’s assertion that:

Google Apps is like a “fog rolling into the harbor,” permeating businesses quite possibly at the expense of Microsoft and IBM.

Allegedly Gartner pundit Tom Austin asserted that

“Clients are calling us about GAPE [Google Apps Premier Edition],” Austin said. “They will use it as a bat to beat Microsoft or IBM to make them lower the cost of their software.”

The remarks appear to originate in a talk at the Gartner Symposium ITxpo on April 9, 2008. The most telling part of this article, if Mr. Boulton heard correctly, is:

In a line of reasoning echoing Microsoft Chairman Bill Gates’ claims that Google doesn’t understand businesses’ needs, Austin said that Google doesn’t understand the enterprise. It is not that the company can’t, he said, it is that Google doesn’t care to understand the enterprise. For example, while Microsoft and IBM offer customers five-year roadmaps under non-disclosure agreements, Google’s roadmap is one day at a time.

If true, Gartner must know a great deal more about Google’s enterprise success than I do. My sources tell me that Google is struggling to stay on top of the wave of success with its map and geo-spatial services. Google is reacting to customer requests, at least in the US government sector from what I hear from those familiar with canvas cubes in Washington, DC. My research about enterprise search revenues indicates that Google now has more than 9,000 licensees of its Google Search Appliance. This product generated somewhere around $350 to $400 million in calendar 2007 and is growing at double digit rates. The various applications, enhanced email, and messaging functions are pulling inquiries as well. In short, the Google is disrupting the traditional enterprise market on several fronts. Google lets customers pull Google to them. Google doesn’t push for sales like most enterprise software vendors.

My hunch is that Google’s “fog-like” behavior translates to sour grapes because Google is somewhat reluctant to shovel cash into the maw of the high-end IT consultancies for guidance. Google’s reliance on “pull” tactics is a challenge for some traditional consulting firms like Booz, Allen & Hamilton, where I worked. Google has plenty of wizards and gurus on staff. If a pundit is Googley, that consultant will probably work for Google. This is a difficult concept for some for-hire experts to accept. But that’s just my interpretation of the matter.

I think Mr. Boulton got the story right. Could it be that Gartner doesn’t understand Google?

Stephen Arnold, April 10, 2008

Absolutes and Electronic Information

April 9, 2008

I find the research for my work fascinating. Periodically I root through some of the PDFs and PowerPoints used in my public talks.

Information in 2001

Today, while consolidating some information from a soon-to-be-retired NetFinity 5500, I came across a presentation I made to the legal information giant, Lexis Nexis, in 2001.

The presentation sure didn’t win me any buddies in this $1 billion a year unit of the Euro-giant Reed Information. Reed, like the Thomson Corporation, maintains a low profile. Most people are unaware of what these two professional publishing companies do for a living, and I am not going to tell you that. You will have to figure it out for yourself.

My talk was given at some golf resort, and I don’t golf. I sat on my tail feather and waited to deliver my talk, which I titled “Information Professionals and In-Phase Services”. The main idea behind the talk was that anyone who used information for a living (lawyers, consultants, intelligence officers, and financial analysts) wanted current information in the context of their work.

The idea of stopping one thing to go ferret out a missing piece of information is growing long in the tooth. No, “long in the tooth” is too gentle even seven years after I wrote this presentation. Stupid, ill-advised, crazy, dumb: these are much more appropriate words. In 2000, it was obvious, based on my research, that savvy users of information wanted information from one screen or dashboard. Furthermore, that information should be [a] comprehensive, [b] current or fresh, and [c] in a form that allowed it to be cut-and-pasted or recycled without annoying manual reformatting.

I used this quote from Emily Dickinson to catch the crowd’s attention: “The truth must dazzle gradually / Or every man be blind…” No one knew what the heck I was talking about. To help the audience along, I used this chart from Forbes Magazine, October 2, 2000:

[Chart: absolutes, Forbes Magazine, October 2, 2000]

The point of this study is that humans–more than two thirds of them in 2000–want fixed points in their lives. The notions of change, flux, transformation made people uncomfortable. The chart did little to win my audience’s confidence in my talk because I then told the group, “Absolutes are rarely found when we talk about electronic information.”


Google Metaphors of the Day: DeathStar and Skynet

April 9, 2008

John Murrell, SiliconValley.com, used two interesting metaphors in his “Google Opens Cloud to Crowd” story about Google’s new hosted services. Most Google watchers tip toe around comparisons of Google (the happy company) to any thing dark and sinister. Mr. Murrell writes:

Now that it has built up its computing infrastructure to a size somewhere between the Death Star and Skynet, Google really hates to see any of that power going untapped. To spread the wealth, Google announced late Monday the beta launch of App Engine, a set of tools and services that will let Web developers run their own applications on Google’s platform, avoiding all that troublesome back end maintenance.

If you have been outside the flows of popular culture, the DeathStar is a planet-sized ship armed with sufficient fire power to destroy a planet. Skynet, on the other hand, is the self-aware computing system that is waging war with annoying humans in the Terminator films.

Stephen Arnold, April 9, 2008

Sinequa Reveals Users of Search Are Annoyed

April 9, 2008

— Enterprise Search’s Problems Exposed in a New Survey —

Sinequa (a business intelligence and search systems vendor based in Paris) has taken a bold and refreshingly candid step.

The company reported results of a survey of 200 office workers. The results run counter to the marketing blather about the efficacy of enterprise search, sometimes called behind-the-firewall search or Intranet search.

Competitors in the enterprise search space like Autonomy, Endeca, Fast Search & Transfer, and dozens of up-and-coming vendors rarely reveal data about installations gone wrong. Sinequa has exposed the massive failure of enterprise search by revealing the results of its study of 200 professionals using enterprise search at work.

These data suggest that nearly 60 percent of the workers participating in the survey said the tools their employers provide for search are “poor”. Almost 80 percent of the respondents reported that they would benefit from having access to timely information from across their organization.

The report in IWR (Information World Review, a unit of the VNU Network) said:

Despite 88 per cent of respondents’ employers investing in an intranet, none had looked at how to maximise the information on the intranet to help drive efficiencies in performance. In fact, just eight per cent of respondents have a tool allowing them to search information across the company using key search terms.

The lack of a search system that works has even broader implications:

In the past month alone 46 per cent of respondents said that they could count up to ten occasions when not having access to the right information had impaired their performance. 16 per cent could recall ten or more times when this had happened. A further 40 per cent said that finding the information to support the development of a business critical document takes two to three hours on average, with a quarter stating it could take three hours or more.

It’s easy to calculate the cost of a flawed search system. Take an average hourly wage–say, $50 / hour. This simple exercise converts search into more than an intellectual exercise. Search gone wrong costs a great deal of money.
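Here is the back-of-the-envelope arithmetic as a short Python sketch. The $50 / hour wage is my number from above, and the two to three hours per business-critical document comes from the survey. The four documents a month and the 200-person head count are assumptions I made up for illustration; plug in your own.

```python
def monthly_search_cost(hourly_wage, hours_per_document, documents_per_month, employees):
    """Return the monthly wage cost of time spent hunting for information."""
    return hourly_wage * hours_per_document * documents_per_month * employees


# $50 / hour and two to three hours per document come from the text above.
# Four documents a month and 200 employees are illustrative assumptions.
low = monthly_search_cost(50, 2, 4, 200)
high = monthly_search_cost(50, 3, 4, 200)
print(f"${low:,} to ${high:,} per month")
```

With those assumptions, the sketch prints $80,000 to $120,000 per month. The figure covers wages only; it ignores the cost of decisions made with the wrong information.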

A spokesperson for Sinequa is reported by VNU to have said that “businesses are seriously missing the trick”. Another VNU report of the Sinequa survey revealed:

Employees are struggling to find even basic information, which impacts their productivity on a day-to-day basis… The gap between what staff can do as consumers and what they can do as employees is causing employee frustration as well as limiting the value of corporate information.

These data back up the findings reported in Beyond Search: What to Do When Your Enterprise Search System Doesn’t Work. That study reported results of a 2007 research project which found that two-thirds of the users of an enterprise search system were dissatisfied with the performance of their search tools.

Sinequa and Beyond Search reports of user frustration and annoyance seem out of step–particularly when viewed in the rosy light of the overwhelmingly upbeat reports of search success reported at such high-profile, search-centric industry events as Fast Forward (Microsoft – Fast Search & Transfer), Enterprise Search Summit (Information Today, Inc.), and the International Online Show (Incisive Media).

The clash of industry reports of success and the rumblings of user dissatisfaction raise serious questions about the reality of enterprise search systems in actual use. Search pundits and consultants rarely talk about systems that go off track. If the Sinequa data are accurate and if the data in the new Gilbane study about search are on track–the high-flying enterprise search sector may be running short of fuel.

This Web log will feature an interview with the founder of Sinequa on April 21, 2008, as part of ArnoldIT.com’s “Search Wizards Speak” feature.

Stephen Arnold, April 9, 2008
