The Google Treadmill System

November 12, 2009

The Google is not in the gym business. The company’s legal eagles find ways of converting wizard whimsy into patents. The tokenspace suite of patent documents does not excite the “Sergey and Larry eat pizza” style of Google watcher. For those who want to get a glimpse of the nuts and bolts in Google’s data management system, check out the treadmill invention by ace Googler, Jeffrey Dean. He had help, of course. The Google likes teams, small teams, but teams nevertheless. Here’s how the invention is described in US7,617,226, “Document Treadmilling System and Method for Updating Documents in a Document Repository and Recovering Storage Space from Invalidated Documents.”

A tokenspace repository stores documents as a sequence of tokens. The tokenspace repository, as well as the inverted index for the tokenspace repository, uses a data structure that has a first end and a second end and allows for insertions at the second end and deletions from the front end. A document in the tokenspace repository is updated by inserting the updated version into the repository at the second end and invalidating the earlier version. Invalidated documents are not deleted immediately; they are identified in a garbage collection list for later garbage collection. The tokenspace repository is treadmilled to shift invalidated documents to the front end, at which point they may be deleted and their storage space recovered.
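
For readers who think in code, here is a toy Python sketch of the behavior the abstract describes. It is my own illustration, not Google's implementation. The names Slot, Treadmill, put, and treadmill are made up for the example, and so is the assumption that "treadmilling" means re-inserting still-valid documents at the back so that invalidated ones arrive at the front. The inverted index, compression, and scale are left out entirely.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(eq=False)  # identity-based equality and hashing, so slots can sit in a set
class Slot:
    doc_id: str
    tokens: list
    valid: bool = True


class Treadmill:
    """Hypothetical toy model: insert at the back, delete only from the front."""

    def __init__(self):
        self.slots = deque()   # front = left end, back = right end
        self.live = {}         # doc_id -> slot holding the current version
        self.garbage = set()   # invalidated slots awaiting collection

    def put(self, doc_id, tokens):
        """Insert or update a document; an update never rewrites in place."""
        old = self.live.get(doc_id)
        if old is not None:
            old.valid = False          # invalidate, but do not delete yet
            self.garbage.add(old)
        slot = Slot(doc_id, list(tokens))
        self.slots.append(slot)        # new version goes in at the back
        self.live[doc_id] = slot

    def treadmill(self):
        """Rotate until every invalidated slot has reached the front and
        been dropped. Returns the number of tokens reclaimed."""
        reclaimed = 0
        while self.garbage:
            slot = self.slots.popleft()
            if slot.valid:
                self.slots.append(slot)    # still valid: move to the back
            else:
                self.garbage.discard(slot)
                reclaimed += len(slot.tokens)
        return reclaimed


repo = Treadmill()
repo.put("a", ["x"])
repo.put("b", ["y", "z"])
repo.put("b", ["y"])                       # update: the old "b" is invalidated
print(repo.treadmill())                    # 2 tokens reclaimed
print([s.doc_id for s in repo.slots])      # ['b', 'a']
```

Note the trade in this sketch: an update costs one append and one flag flip, and the expensive cleanup is deferred to a later pass that only ever removes from the front.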

There are some interesting innovations in this patent document. Manual steps to reclaim storage space are not the main focus. The big idea is that a digital treadmill allows the Google to perform some magic for content updates. The tokenspace is a nifty idea, but the Google has added the endless chain notion. Oh, and there is scale, compression, and access associated with the invention. You can locate the document at http://www.uspto.gov. In my opinion, the tokenspace technology is pretty important. Ah, what’s a tokenspace you ask? Sorry, not in the blog, gentle reader.

Stephen Arnold, November 11, 2009

I don’t think my AdSense check this month was intended for me to write a short blog post calling attention to a system and method that Google would prefer to remain off the radar. Report me to the USPTO. That outfit pushed the info via RSS to me. So, a freebie.
