Google Extends Government Indexing
June 18, 2010
Google has a better index of US government content than either the government or the vendors who are beavering away on this treasure trove. Now Google has added another chunk of content to its system. You can benefit from these data, but I would assert that Google’s MOMA Intranet may make even better use of the information. How? Just ask your local Googler for a demo.
The US Patent and Trademark Office (USPTO) is entering into a two-year, no-cost agreement with Google to make bulk electronic patent and trademark public data available. In this arrangement, the USPTO provides the data and Google hosts it for the public.
Research Buzz reported in its post, “Google Teaming Up With USPTO To Make Patent and Trademark Data Available,” that the estimated size of this data store will be about ten terabytes. This not-so-humble chunk of data will include patent grants and applications, trademark applications, and patent and trademark assignments, with more data (like trademark file histories) to come in the future.
Google noted that it is only hosting the data provided by the USPTO; it isn’t altering or changing it in any way. It should also be noted that this bulk hosting is provided as zip files. It appears that Google wants you to download the data to your own machines before you start analyzing it.
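For readers who do pull one of these archives down, here is a minimal sketch of a first look at what is inside, using only Python's standard `zipfile` module. The file name `patent_grants_sample.zip` and the helper `summarize_archive` are my own placeholders, not anything Google or the USPTO publishes, and the layout of the real archives may differ.

```python
import zipfile
from pathlib import Path


def summarize_archive(zip_path):
    """Return (member name, uncompressed size in bytes) for each file in a zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return [
            (info.filename, info.file_size)
            for info in archive.infolist()
            if not info.is_dir()  # skip directory entries, keep real files
        ]


if __name__ == "__main__":
    # Hypothetical file name; substitute whichever archive you downloaded.
    local_copy = Path("patent_grants_sample.zip")
    if local_copy.exists():
        for name, size in summarize_archive(local_copy):
            print(f"{name}\t{size} bytes")
    else:
        print(f"{local_copy} not found; download an archive first.")
```

Listing the members this way reads only the archive's directory, so it is a cheap check before committing disk space to unpacking a multi-gigabyte file.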
Skeptical geese might ask, “Why not crunch that content with the Guha / Halevy methods?” I think making the data available with the benefit of semantic processing would be slightly more useful than a big zip file.
Melody K. Smith, June 18, 2010
Freebie