Google and Privacy: Usage Data Model

July 31, 2008

I’ve been sitting on the sidelines watching the Google privacy articles, posts, and arguments. The Smoking Gun’s essay here hooked me. I want to flag the comment that caught my attention:

Arguing that technology has ensured that “complete privacy does not exist,” Google contends that a Pennsylvania family has no legal grounds to sue the search giant for publishing photos of their home on its popular “Street View” mapping feature.

WebProNews’s David Utter also has a useful comment about the problem. His July 31, 2008, article “Company Responds to Street View Photo Lawsuit” here picks up the theme of the aggrieved party “as being out of touch with reality.” Mr. Utter reminded me of Scott McNealy’s comment “You already have zero privacy. Get over it.”

If you are interested in privacy, you may want to look up US20060224583, “Systems and Methods for Analyzing a User’s Web History.” I mention this invention in my KMWorld feature “Cloud Computing and the Issue of Privacy”, pp. 14 ff in the July/August issue.

Here’s the abstract for this invention by Andrew Fikes, Jeff Korn, Oren Zamir and Christine Irani:

A user’s prior searching and browsing activities are recorded for subsequent use. A user may examine the user’s prior searching and browsing activities in a number of different ways, including indications of the user’s prior activities related to advertisements. A set of search results may be modified in accordance with the user’s historical activities. The user’s activities may be examined to identify a set of preferred locations. The user’s set of activities may be shared with one or more other users. The set of preferred locations presented to the user may be enhanced to include the preferred locations of one or more other users. A user’s browsing activities may be monitored from one or more different client devices or client application. A user’s browsing volume may be graphically displayed.

If you have not made a connection among the geographical data, the usage data, and the information a user or cluster of users examines, you may want to read this document. Remember, I don’t want to imply that Google is using the technology disclosed in a patent document. I do think these documents provide a glimpse inside the engineering “factory” at Google.

Stephen Arnold, July 31, 2008

Google: Universal Search on Mobile Devices

July 31, 2008

My earlier post here about Google in South Africa contained a reference to universal search on mobile devices. I had two incoming messages asking about this functionality. One person asserted that universal search on a mobile device was not possible and that the South Africa source I cited was out to lunch. To offer some additional information, I would like to direct everyone’s attention to US20080183699, “Blending Mobile Search Results.” This patent document discloses an invention by Ning Hu and Vida U. Ha. You can snag a copy at the wonderful USPTO here. The abstract for this invention is:

Methods, systems, and apparatus, including computer program products, for blending mobile search results. A method includes receiving a search query and multiple search results. The search results each satisfy the search query and have a respective search result quality score. The search results include generic and mobile search results. The generic and mobile search results each identify a generic and mobile resource, respectively. The search result quality scores include mobile and generic search result quality scores for the mobile and generic search results, respectively. The mobile search result quality scores and the generic search result quality scores were generated according to different scoring formulas. Based on one or more terms in the search query, the search query is classified as a mobile query. As a consequence, one or more search result quality scores are modified to improve the sorting of search results that include both mobile and generic search results.
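
The blending step in this abstract can be sketched in a few lines. This is my own minimal illustration, not Google's method: the term list, the `Result` class, the `blend` function, and the 1.5 boost are all assumptions made up for the example. The patent document does not disclose which terms mark a query as mobile or how much a score is adjusted.

```python
from dataclasses import dataclass

# Hypothetical term list. The abstract does not say which terms
# cause a query to be classified as a mobile query.
MOBILE_TERMS = {"ringtone", "ringtones", "wallpaper", "sms", "mobile"}


@dataclass
class Result:
    url: str
    score: float     # quality score from the formula for this result type
    is_mobile: bool  # True for a mobile resource, False for a generic one


def is_mobile_query(query: str) -> bool:
    """Classify the query as mobile if any of its terms is in the assumed list."""
    return any(term in MOBILE_TERMS for term in query.lower().split())


def blend(query: str, results: list[Result], mobile_boost: float = 1.5) -> list[Result]:
    """Sort mixed results by score, boosting mobile results for mobile queries."""
    boost = mobile_boost if is_mobile_query(query) else 1.0
    adjusted = [
        Result(r.url, r.score * boost if r.is_mobile else r.score, r.is_mobile)
        for r in results
    ]
    return sorted(adjusted, key=lambda r: r.score, reverse=True)
```

With these assumed numbers, a mobile result scored 0.6 moves ahead of a generic result scored 0.8 for the query “free ringtones”, and stays behind it for a query such as “weather”.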

My reading of this patent document suggests that Google indeed has some Universal Search tools on its digital workbench.

Stephen Arnold, July 31, 2008

Google: South Africa Market Share

July 31, 2008

MoneyWeb reported on July 31, 2008 about “Google’s Search Dominance.” You can read Rudolph Muller’s article here. The points about Google that I found interesting were:

  • Google’s South African office is headed up by a former Novell wizard, Stafford Masie
  • Google traffic dwarfs that of Ananzi and Aardvark. “Ananzi currently attracts 221,436 unique monthly visitors, down from 314,132”, reports Mr. Muller. Aardvark “received 88 774 unique monthly visitors, down from 106 102 during the same period in 2007.”
  • “Mobile remains the leading telecommunications medium in the country,” Mr. Muller reports. Google offers universal search for mobile in South Africa.
  • is popular in South Africa.

Africa is quickly becoming the next “big thing”. Google appears to be poised for growth.

Stephen Arnold, July 31, 2008

Autonomy Bites into the Juicy BlackBerry

July 31, 2008

Autonomy has rolled out software and services for BlackBerry email. The two-pronged product/service makes it possible to archive email, which is proliferating despite the advent of Twitter-like mini-messages. In addition, Autonomy has added a dollop of the Zantaz eDiscovery functionality to the new service. Autonomy has lashed to the new product/service the filters that can handle more than 1,000 formats. These include multimedia, images, BlackBerry’s proprietary device-to-device messages, and, of course, text. You can read more about the service here. One interesting point is that Autonomy is using the descriptive phrase “infrastructure software for the enterprise” for its wide array of products, services, and technologies.

Stephen Arnold, July 31, 2008

More about Mobile Search

July 31, 2008

On July 30, 2008, Ryan Spoon wrote “Sergey Brin: iPhone Users Conduct 30x More Mobile Searches (and Other Fascinating Stats)” on the Ryan Spoon Web log here. The title of the article is a bit off center. The increase in mobile searches is information that I have known for a couple of months. Mr. Spoon does nail two pieces of information that I found most suggestive:

  1. The soon-to-be really interesting Pandora service “had 350,000 downloads in the iPhone’s first week”
  2. Mr. Spoon makes the point that “every additional iPhone search is opportunity for Google… not for Apple”.

I thought partners were in win-win relationships. Maybe not if Mr. Spoon is correct.

Stephen Arnold, July 31, 2008

SharePoint Technical Library for a USB Drive

July 31, 2008

A happy quack to a-foton for this link to the Office SharePoint Server 2007 Technical Library in Compiled Help format. I don’t know about you, but I have a tough time remembering the weird names and methods required to move beyond the defaults in SharePoint Server. Click here for the file direct from MSDN. A link to this material also appears on Patrick Butler Monterde’s Web log here.

Stephen Arnold, July 31, 2008

Stanford TAP: Google Cool that Trails Cuil

July 31, 2008

In the period from 2000 to 2002, Dr. Ramanathan Guha, with the help of various colleagues and students at Stanford, built a demonstration project called TAP. You can download a Power Point presentation here. I verified this link on July 30, 2008. Frankly I was surprised that this useful document was still available.

TAP was a multi-organization research effort. Participants included IBM, Stanford, and Carnegie Mellon University.

Why am I writing about information that is at least six years old? The ideas set forth in the Power Point were not feasible when Dr. Guha formulated them. Today, the computational power of multi core processors coupled with attractive price-performance ratios for storage makes the demos from 2002 possible in 2008.

TAP was a project set up to unify islands of XML from disparate Web services. TAP also brushed against automatic augmentation of human-generated Web content. Working with Dr. Guha was Rob McCool, one of the developers of the common gateway interface. Mr. McCool worked at Yahoo, and he may still be at that company. Were he to leave Yahoo, he may want to join some of his former colleagues at Google or a similar company.

Now back to 2002.

One of TAP’s ambitious goals was to “make the Web a giant distributed database.” The reason for this effort was to bring “the Internet to programs”. The Web, however, is messy. One problem is that “different sites have different names for the same thing.” TAP wanted to develop a system and method for “descriptions, not editors, to choreograph the integration.”

The payoff for this effort, according to Dr. Guha and Mr. McCool, is that “good infrastructures have waves of applications.” I think this is a very important point for two reasons:

  1. The infrastructure makes the semantic functions possible and then the infrastructure supports “waves of applications”.
  2. The outputs of the system described are new combinations of information, different ways to slice data, and new types of queries, particularly those related to time.

Here’s a screen shot of TAP augmenting a query run on Google.

augmented search results

The augmented results appear to the left of the results list. These are sometimes described as “facets” or “assisted navigation hot links”. I find this type of enhancement quite useful. I can and do scan result lists. I find overviews of the retrieved information and other information in the system helpful. When well executed, these augmentations are significant time savers.

Keep in mind that when this TAP work up was done, Dr. Guha did not work at Google. Mr. McCool was employed at Stanford. Yet the demo platform was Google. I also find it interesting that the presentation emphasizes this point: “We need [an] infrastructure layer for semantics.”

Let me conclude with three questions:

  1. Google was not directly mentioned as participating in this project, yet the augmented results were implemented using Google’s plumbing. Why is this?
  2. The notion of fueling waves of applications seems somewhat descriptive of Google’s current approach to enhancing its system. Are semantic functions one enabler of Google’s newer applications?
  3. When will Google implement these enhanced features of its interface? As recently as yesterday, the interface was described as more up to date than Google. Google had functionality in 2002 or shortly thereafter that moves beyond what showed today.

Let me close with a final question. What’s Google waiting for?

Stephen Arnold, July 31, 2008

Useful Interface Enhancements

July 31, 2008

The subject of this write up is one of the search companies tapping Yahoo’s search index. The company has introduced some useful interface changes. I will be digging into this system in future write ups, but I want to call your attention to one of the innovations I found useful. (My first write up is here.)

Navigate to here. Enter your query. You will see a result screen that looks like my query for “fractal frameworks”.


The three major changes shown in this screenshot are:

  1. Entities appear in the tinted area above the graphic. My test queries suggested to me that the system was identifying the most important entities in the result set.
  2. A top ranked link with selected images. Each image is a hot link. I could tell quickly that the top ranked document included the type of technical diagram that I typically want to review.
  3. A selected list of other entities and concepts.


Useful SharePoint Info, Useless Presentation

July 30, 2008

A happy quack to J. Peter Bruzzese for his “Desperately Seeking Enterprise Search” which appeared in the July 30, 2008, InfoWorld Web log. You can read the story here. For me the most useful part of the write up was this passage:

Although the MOSS and Search offerings are still available and current, Microsoft has moved on with offers like Search Server 2008 Express and Search Server 2008. From a feature comparison perspective, MOSS 2007 still wins out despite the lack of streamlined installation; it more than makes up for that with such features as People and Expertise Searching, Business Data Catalog, and SharePoint Productivity Infrastructure.

One useful part of the write up is the inclusion of links about SharePoint in its various incarnations. These comparisons and descriptions can be tough to find on the InfoWorld Web site. I recommend that you snag these links and tuck them away for future reference.

Now, to the presentation. Mr. Bruzzese just writes the articles; some other group sets up the InfoWorld Web log. Here’s what you will encounter if you try to print the page: partial printing and two blank pages. Pretty annoying.

There are some workarounds that involve browser extensions, but here’s a workaround that doesn’t require installing any:

  1. View the source
  2. Scroll to the beginning text for the story; that is, “There’s a new search player…”
  3. Copy the text of the story plus any tags to this point in the story: “I’d like to know your opinion.”
  4. Paste the text into an HTML editor or even a blank Word document
  5. Save the file.
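
If you would rather not scroll through the source by hand, steps 2 and 3 can be approximated with a few lines of Python. This is a rough sketch under my own assumptions: the function name `extract_story` is made up, and you must supply the page source yourself, for example by saving it from the browser's view-source window.

```python
def extract_story(page_source: str, start_marker: str, end_marker: str) -> str:
    """Return the slice of page_source from start_marker through end_marker.

    Tags between the two markers are kept, as in step 3 above.
    """
    start = page_source.find(start_marker)
    if start == -1:
        raise ValueError("start marker not found")
    # Search for the end marker only after the start of the story.
    end = page_source.find(end_marker, start)
    if end == -1:
        raise ValueError("end marker not found")
    return page_source[start:end + len(end_marker)]
```

Pass the saved source as `page_source`, the opening words of the story as `start_marker`, and the closing sentence as `end_marker`. Write the returned text to an .html file and print that file instead of the original page.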

InfoWorld is so eager to sell that it uses a pop up before you see this story. This is called a “prestitial”, which I dismiss instantly. Then it dumps into the page with the useful information lots and lots of ad baloney, which I also ignore.

You can go back and edit out the embedded calls within Mr. Bruzzese’s quite useful write up. So only Mr. Bruzzese gets the happy quack. The Beyond Search addled goose is winging toward InfoWorld’s Web wizard’s automobile to deposit an avian memento on the vehicle’s waxed fender.

Too bad a good story was made hugely annoying to me by a presentation that is more confused than this addled goose.

Stephen Arnold, July 30, 2008

Google ADAS: Shuffling toward Multi-Tenancy

July 30, 2008

I have been puzzled by Google’s interest in cloud computing and seemingly sluggish response to growing market demand. Amazon, for instance, has rolled out a mini-fleet of Web services. Despite some hiccups, Amazon has landed some major accounts, and it has insinuated itself into a number of promising start ups. The “deals” are not disclosed, but outfits like Powerset (recently purchased by Microsoft) were using some of Amazon’s cloud services to deliver the Wikipedia demo if my sources are correct. is another baffler. Google demonstrates obvious affection for There are “hooks” from into Google. Google dons its short skirts and picks up its pom poms when asked about the promise of Google is a really attractive cheerleader, despite the horn rimmed glasses with tape on the nose piece.

Now HP, Intel, and Yahoo (my goodness) have teamed to form a quasi-academic cloud computing service. The new Three Musketeers’ plans look a bit like the Google-IBM parallel processing education initiative. This initiative also drags along some cloud computing functionality, but its focus is aimed at training programmers and getting the US computer science programs to shift back into high gear. Why go to MIT or Stanford when Moscow State University, the University of Waterloo, and various institutions in China, France, and Germany, are cranking out folks with sharper and deeper technical skills?

On July 29, 2008, the struggling and often confused USPTO published a patent assigned to Google wonderfully titled “Method and System for Assured Denotation of Application Semantics”. The inventor is Google’s Ulfar Erlingsson (yep, Icelandic), who is not a household name even among the in crowd in Mountain View, California. Google likes to keep certain wizards deep in the digital wonderland. The best security is the cloak of anonymity Google provides for some of its 19,000 geniuses.

Mr. Erlingsson’s invention appears as US7406542. The patent was filed in March 2003. The date is important because it provides a peg on which to hang Google’s initial interest in the ADAS “sharing” technology. Google, based on my research, requires anywhere from six months to 18 months to build up the research momentum to draft a patent document. Doing some crude calendar flipping, Google engineers were poking into ADAS as early as 2002, maybe earlier. The point is that unlike the five Guha patent documents of February 2007, the ADAS invention has been in the Google lab quite a bit longer than the programmable search engine invention was.

What’s ADAS do?

Well, ADAS, based on the reading of this addled goose, appears to address one tiny slice of the multi-tenant tar ball. Multi-tenant is a fancy term that means one application can be used by many different clients at one time. Now, what’s tricky is that each client can have many different users. Implicit in multi-tenant operations, then, are three tasks:

  • Keeping each user seeing and interacting with only what that user is supposed to see and interact with
  • Preventing the application from tripping over itself as many different clients’ users bang on the application
  • Managing the complicated system in an efficient way because traditional messaging and management tools can be too verbose and inefficient.
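
To make the first point concrete, here is a toy sketch of tenant isolation. It is not the ADAS mechanism, which the patent describes in very different terms. The `TenantStore` class and its methods are hypothetical names I made up to show the general idea: one application instance, many clients, and each user restricted to the records of that user's own client.

```python
from collections import defaultdict
from typing import Optional


class TenantStore:
    """Toy in-memory store that partitions records by tenant (client)."""

    def __init__(self) -> None:
        self._records = defaultdict(dict)  # tenant id -> {key: value}
        self._users = {}                   # user id -> tenant id

    def add_user(self, user_id: str, tenant_id: str) -> None:
        """Register a user as belonging to one tenant."""
        self._users[user_id] = tenant_id

    def put(self, user_id: str, key: str, value: str) -> None:
        """Store a value in the partition of the user's tenant."""
        self._records[self._tenant_of(user_id)][key] = value

    def get(self, user_id: str, key: str) -> Optional[str]:
        """Read a value, but only from the partition of the user's tenant."""
        return self._records[self._tenant_of(user_id)].get(key)

    def _tenant_of(self, user_id: str) -> str:
        if user_id not in self._users:
            raise PermissionError(f"unknown user: {user_id}")
        return self._users[user_id]
```

Two users of the same client see each other's records. A user of a different client gets nothing back for the same key, and an unregistered user is refused outright.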

Will Google use this technology? Is Google now using a chunk of this invention in another system; for example, the “ig” or individualized Google service? I will keep looking for information to put this interesting invention into a more solid context.

If you have ideas or insights about ADAS, please, use the comments section of this Web log to share them.

Stephen Arnold, July 30, 2008
