The Gilbane Lecture: Google Wave as One Environmental Factor

July 14, 2009

Author’s note: In early June 2009, I gave a talk to about 50 attendees of the Gilbane content management systems conference in San Francisco. When I tried to locate the room in which I was to speak, the sign-in team could not find me on the program. After a round of 30-something “we’re sure we’re right” responses, the organizer of the session located me and got me to the room about five minutes late. No worries, because the Microsoft speaker was revved and ready.

When my turn came, I fired through my briefing in 20 minutes and plopped down, expecting no response from the audience. Whenever I talk about the Google, I am greeted with either blank stares or gentle snores. I was surprised because I did get several questions. I may have to start arriving late and recycling more old content. Seems to be a winning formula.

This post is a summary of my comments. I will hit the highlights. If you want more information about this topic, you can get it by searching this Web log for the word “Wave”, buying the IDC report No. 213562 Sue Feldman and I did last September, or buying a copy of Google: The Digital Gutenberg. If you want to grouse about my lack of detail, spare me. This is a free Web log that serves a specific purpose for me. If you are not familiar with my editorial policy, take a moment to get up to speed. Keep in mind I am not a journalist, don’t pretend to be one, and don’t want to be included in the occupational category.

Here we go with my original manuscript, written in UltraEdit, from which I gave my talk on June 5, 2009, in San Francisco:

For the last two years, I have been concluding my Google briefings with a picture of a big wave. I showed the wave smashing a skin cancer victim, throwing surfer dude and surfboard high into the air. I showed the surfer dude riding inside the “tube”. I showed pictures of waves smashing stuff. I quite like the pictures of tsunami waves crushing fancy resorts, sending people in sherbet-colored shirts and beach wear running for their lives.

Yep, wave.

Now Google has made public why I use the wave images to explain one of the important capabilities Google is developing. Today, I want to review some features of what makes the wave possible. Keep in mind that the wave is a consequence of deeper geophysical forces. Google operates at this deeper level, and most people find themselves dealing with the visible manifestations of the company’s technical physics.

Source: http://www.toocharger.com/fiches/graphique/surf/38525.htm

This is important for enterprise search for three reasons. First, search is a commodity and no one, not even I, finds key word queries useful. More sophisticated information retrieval methods are needed on the “surface” and in the deeper physics of the information factory. Second, Google is good at glacial movement. People see incremental actions that are separated in time and conceptual space. Then these coalesce and the competitors say, “Wow, where did that come from?” Google Wave, the present media darling, is a superficial development that combines a number of Google technologies. It is not the deep geophysical force, however. Third, Google has a Stalin-era type of planning horizon. Think in terms of five years, and you have the timeline on which to plot Google developments. Wave, in fact, is more than three years old if you start when Google bought a company called Transformics, older if you dig into the background of the Transformics technology and some other components Google snagged in the last five years. Keep that time thing in mind.

First, key word search is at a dead end. I have been one of the most vocal critics of key word search and variants of that approach. When someone says, “Key word search is what we need,” I reply, “Search is dead.” In my mind, I add, “So is your future in this organization.” I keep my parenthetical comment to myself.

Users need information access, not a puzzle to solve in order to open the information lock box. In fact, we have now entered the era of “data anticipation”, a phrase I borrowed from SAS, the statistics outfit. We have to view search in terms of social analytics because human interactions provide important metadata not otherwise obtainable by search, semantic, or linguistic technology. I will give you an example of this to make this type of metadata crystal clear.

You work at Enron. You get an email about creating a false transaction. You don’t take action but you forward the email to your boss and then ignore the issue. When Enron collapses, the “fact” that you knew and did nothing, both when you first learned of it and afterward, is used to make a case that you abetted fraud. You say, “I sent the email to my boss.” From your prison cell, you keep telling your attorney the same thing. Doesn’t matter. The metadata about what you did to that piece of information through time put your tail feathers in a cell with a biker convicted of third degree murder and a prior for aggravated assault.

Got it?

Metadata about what humans do with information is where the solution to the Gordian knots of understanding information may be found.

Second, Google Wave is not new. The idea for the Wave surfaced at Bell Labs in the early 1990s. In fact, the patents for some of the technology which influenced Google Wave are held by Alcatel-Lucent, but the brains behind those inventions are now at Google. The idea is to create a plastic bag. I think of Wave as a big Ziploc freezer bag. You put stuff in the bag and you can close it up. The stuff doesn’t get lost and you can see through the sides of the plastic bag, count what’s in there, and if the contents are alive, like a high school kid’s bug collection, you can watch the stuff move around.

Now think of a software plastic bag with a couple of interesting features. First, you can take snapshots of the bag at any point in time. This is very useful if you want to compare what was in the bag a month ago with what’s in the bag now. You can also keep track of what the stuff in the bag says and does. Think of this as one of those live video podcasts that are popular. You just capture everything. You can use math to count actions and determine relationships.
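The software baggie can be sketched in a few lines of Python. This is a toy illustration of the idea only, not how Google builds Wave. The class name `Bag`, its methods, and the sample users and items are all hypothetical names made up for this sketch.

```python
import copy
from collections import Counter


class Bag:
    """Toy container: holds items, logs every action, and can be snapshotted."""

    def __init__(self):
        self.items = {}       # item name -> content
        self.log = []         # (user, action, item name), in the order they happened
        self.snapshots = []   # deep copies of self.items taken over time

    def put(self, user, name, content):
        # Record whether this is a new item or a change to an existing one.
        action = "edit" if name in self.items else "add"
        self.items[name] = content
        self.log.append((user, action, name))

    def snapshot(self):
        # Freeze the current contents and return the snapshot's index.
        self.snapshots.append(copy.deepcopy(self.items))
        return len(self.snapshots) - 1

    def diff(self, index):
        """Names of items added or changed since the snapshot at `index`."""
        old = self.snapshots[index]
        return sorted(n for n, c in self.items.items() if old.get(n) != c)

    def actions_by_user(self):
        # The "use math to count actions" step: a simple tally per user.
        return Counter(user for user, _, _ in self.log)


bag = Bag()
bag.put("ann", "memo", "draft 1")
then = bag.snapshot()
bag.put("bob", "memo", "draft 2")
bag.put("bob", "chart", "Q2 numbers")
changed = bag.diff(then)         # what differs from the earlier snapshot
counts = bag.actions_by_user()   # who did how much
```

The action log is the part that matters for retrieval: the contents say what is in the bag, and the log says who touched what and in which order.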

That’s basically what Wave does, but Google uses quite a few well-chosen words to resonate with developers. “Social” is a good Google word. “Collaboration” is a better Google word. “Communications” is one of the best Google words. So that’s Wave. You put stuff in a digital baggie and keep it together, share it, change it, and Google pays attention to these actions.

Payoff: users have stuff in one place which is useful for those who are knowledge workers. Gold nuggets: Google gets metadata, which is very useful for information retrieval.

Third, Wave runs on the Google computer. Now there’s a big light bulb that pundits and mavens have in their cubicles. That illumination allows these experts and azure chip consultants (my term for those out of work and trying to make a living as an expert) to realize that Google’s data centers are like multifunction computer semiconductors on a big scale. Wave just runs on the Google computer. It uses the services of that computer’s operating system. The programmers pretty much hook functions together, and Google has software that generates some original code automatically. This leaves more time for Googlers to do more important work that software, for now, can’t do too well.

Part of this plumbing is something called the Programmable Search Engine. I have written about this invention by the former IBM Almaden researcher, Ramanathan Guha. You will want to chase down my report for Bear Stearns, published in 2007, buy Google Version 2.0, or read the five Guha patent applications to understand how “context” works because it plays a part in Wave and Fusion, both Google services you can fiddle with today.

Fourth, Wave is a single dataspace that Google users of the service can control. You can learn about dataspaces by reading the discussion of the technology in my Gilbane report called “Beyond Search”, Google: The Digital Gutenberg, or the IDC report No. 213562 I did last year with Sue Feldman. A dataspace is a technology that consists of systems and subsystems to create, manage, manipulate, analyze, combine, and exploit the plastic baggies of stuff I mentioned earlier.

In a nutshell, a dataspace consists of information objects, information about what users do and have done to those objects, data about the system resources required to manipulate those objects, and indexes that support new types of queries. The new queries allow a user to determine “where” something came from and obtain a “score” that indicates how confident a user or process can be in a particular outcome, event, or object. Dataspaces have multiple dimensions and are not relational databases, although a dataspace can export information so that it can be further manipulated in a MySQL database. The dataspace is the great leap forward in information retrieval. Right now, Google “owns” that space. So innovations in search are interesting but simply not game changers the way I see dataspaces altering the information retrieval landscape.
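To make the “where” and “score” queries concrete, here is a minimal Python sketch. It is an assumption-laden toy, not Google’s dataspace design. The names `InfoObject`, `Dataspace`, `where`, and `score` are hypothetical, and the scoring rule (the share of confirm and dispute actions that are confirmations) is invented purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class InfoObject:
    name: str
    source: str                                   # where the object came from
    actions: list = field(default_factory=list)   # (user, verb) pairs over time


class Dataspace:
    """Toy dataspace: objects plus a record of what users did to them."""

    def __init__(self):
        self.objects = {}

    def add(self, name, source):
        self.objects[name] = InfoObject(name, source)

    def record(self, name, user, verb):
        self.objects[name].actions.append((user, verb))

    def where(self, name):
        # The "where did this come from?" query.
        return self.objects[name].source

    def score(self, name):
        """Invented confidence rule: confirmations / (confirmations + disputes)."""
        verbs = [v for _, v in self.objects[name].actions]
        confirms = verbs.count("confirm")
        disputes = verbs.count("dispute")
        total = confirms + disputes
        return confirms / total if total else 0.5  # 0.5 means no evidence either way


ds = Dataspace()
ds.add("q2-forecast", source="finance-feed")
ds.record("q2-forecast", "ann", "confirm")
ds.record("q2-forecast", "bob", "confirm")
ds.record("q2-forecast", "cy", "dispute")
ds.record("q2-forecast", "cy", "forward")   # logged, but ignored by this scoring rule
origin = ds.where("q2-forecast")
confidence = ds.score("q2-forecast")
```

Neither answer comes from the text of the object itself. Both come from the metadata kept alongside it, which is the difference between a dataspace query and a key word query.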

To conclude, let me make three observations. I pay a dollar for questions because most people upon hearing me find that I come at Google so differently that no one knows what the heck I am talking about. Waves, dataspaces, and leaps forward—crazy stuff, right? Impossible, right?

Observation 1: when you look at Google what do you see? This is a question you want to research. Ask an SEO expert or a Mad Avenue type and Google is an ad system. Ask a publisher; you get a very different answer. My point: make an effort to understand the Google.

Observation 2: when you get in the water, do you want to surf or be crushed? Think about this carefully. The Wave is here. Where are you?

Observation 3: traditional search is a non starter. The future of search is not search. The future of search will be defined in a dataspace.

Stephen Arnold, July 14, 2009

Written by Stephen E. Arnold · Filed Under Conferences, Database, Feature, Google, Online (general), Real time search, Search, Technology, Text analytics, Text processing


Stephen E. Arnold monitors search, content processing, text mining and related topics from his high-tech nerve center in rural Kentucky. He tries to winnow the goose feathers from the giblets. He works with colleagues worldwide to make this Web log useful to those who want to go "beyond search". Contact him at sa [at] arnoldit.com. His Web site with additional information about search is arnoldit.com.