Adieu, Google Code Search

February 6, 2012

Russ Cox, a former Google intern, is getting a little misty-eyed over Google’s retirement of Code Search. Back in 2006, Cox helped build the application, which searched open source code across the Web. To mark the occasion, he has posted “Regular Expression Matching with a Trigram Index or How Google Code Search Worked.”

The write-up goes into some detail about how Code Search worked, with sections on indexed word search, indexed regular expression search, and implementation. Check out the article for specifics. Cox summarizes:

Despite all their apparent syntactic complexity, regular expressions in the mathematical sense of the term can always be reduced to the few cases. . . considered above. This underlying simplicity makes it possible to implement efficient search algorithms like the ones in the first three articles in this series. The analysis above, which converts a regular expression into a trigram query, is the heart of the indexed matcher, and it is made possible by the same simplicity.
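For readers who want a feel for the idea, here is a minimal sketch of a trigram index in Python. It is not Cox's implementation: it handles only a literal search string rather than a full regular expression, and the function names (`trigrams`, `build_index`, `search_literal`) are hypothetical, chosen for illustration. The index narrows the search to documents containing every trigram of the query, and a plain substring check then confirms each candidate.

```python
from collections import defaultdict


def trigrams(text):
    """Return the set of all 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


def build_index(docs):
    """Map each trigram to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for gram in trigrams(text):
            index[gram].add(doc_id)
    return index


def search_literal(index, docs, literal):
    """Find documents containing literal, using the index as a filter."""
    grams = trigrams(literal)
    if not grams:
        # Queries shorter than three characters have no trigrams,
        # so every document has to be checked.
        candidates = set(docs)
    else:
        # A match must contain every trigram of the query (an AND query).
        candidates = set.intersection(*(index.get(g, set()) for g in grams))
    # The index can yield false positives, so confirm each candidate.
    return sorted(d for d in candidates if literal in docs[d])


docs = {
    "a.c": "int main(void)",
    "b.py": "def main(): pass",
    "c.txt": "nothing here",
}
index = build_index(docs)
print(search_literal(index, docs, "main("))
```

A regular expression needs more work than this, since alternation and repetition turn the query into a mix of AND and OR over trigrams, which is the analysis Cox's article describes.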

Hmm, we thought trigrams were patented by Brainware. Interesting use of this technology.

Though Google Code Search is no more, Cox recommends investigating some standalone programs for localized indexed regular expression searches, like the one found here.

Cynthia Murrell, February 8, 2012


Comments

One Response to “Adieu, Google Code Search”

  1. Thomas Feldtmose on February 7th, 2012 1:43 pm

    As an alternative to GCS, try http://www.symbolhound.com/codesearch
