Is Google Chasing Dessert and Ignoring the Main Course?

January 7, 2011

We love the Google in Harrod’s Creek. The Street View picture of our office is now a bush. Our listing is in “review” and has been for months. The goose finds these actions amusing.

“Google’s Decreasingly Useful, Spam-Filled Web Search” keeps an earlier write up’s points alive despite the gingerbread. (You can read the source of the Marco.org information at this link.) Among the points, the subject of “spam” is the most interesting in our opinion.

One person’s spam may be another person’s dinner on a cruise ship. Our view is that a Google query is a useful adjunct to other research actions.

Is Google increasingly becoming an outsider for certain types of online research?

For example, yesterday we had to quickly dig up some information from our Overflight archive about a “relaxed SQL” search vendor. Here’s what we did to locate the items of information:

First, we ran the general query on Exalead’s search at www.exalead.com/search. This index is not distorted by advertisements and contains more than 10 billion pages. We also use the Exalead engine for Overflight. We then did the query on Blekko.com (www.blekko.com) and plucked specific results before navigating to Web sites. Yep, old-fashioned pre-retrieval vetting. Still works at ArnoldIT.

Second, we ran queries for the company’s founder, who is in indexes under several spelling variants. We think spelling variants are quite interesting, particularly when the vendor is involved in licensing technology to what seem to be “dating” or “meeting people” services. The systems we used were:

Cluuz.com at www.cluuz.com. This appears to be a Yahoo BOSS service implemented on content in the Bing.com/Yahoo.com index
The Google News Archive at http://news.google.com/archivesearch, using the advanced search functions to get the string variants
Icerocket at http://www.icerocket.com/
Collecta at http://www.collecta.com

Third, we did our patent searching using my favorite site, the USPTO at www.uspto.gov.

Notice that we did not use the general Google Web index. There were four reasons:

Relevancy, unless the advanced search features are used for the query, is focused on the person looking for Lady Gaga, not “relaxed SQL”
The date of documents is important to us, and we find figuring out the date of an item and the freshness of the Google index a bit of a challenge and, frankly, not worth the effort
The automatic truncation and spelling correction functions override what’s stipulated in certain situations. When looking for proper name variants, I don’t want automatic anything. I want to see what I typed in the search query string
The 32 billion Web pages, the ads, and the other stuff jammed into a Google results display are mental clutter for me. I now avoid trying to figure out what’s what by using other services.

How did we do? We learned from the outfit asking us to perform the research that we surfaced information that directly supported what the company developing “relaxed SQL” was saying in briefings.

Mission accomplished using Google as one component in a secondary process. That’s quite a change from our original dependence on Google in 2002.

My hunch is that Google is nearly perfect and the change in our Web search method is a result of mental degradation here in Harrod’s Creek. If you are dependent on Google, good for you.

Stephen E Arnold, January 7, 2011

Freebie

Written by Stephen E. Arnold · Filed Under Editorial opinion, Google, News, Online (general), Search

