Cpedia Previews the Future of Content Assembly

April 13, 2010

Two services anticipate some interesting future search methods. One is Kosmix; the other is Cuil.com, the much-maligned search service from Anna Patterson (former Googler) and Tom Costello (former IBMer). The folks behind Cuil.com have released Cpedia. According to GigaOM’s “Cuil Failed at Search, Now Fails to Copy Wikipedia”:

Cpedia launched last week with a blog post from Cuil co-founder and former IBM staffer Tom Costello, who described a meeting he had with Sun Microsystems co-founder Bill Joy when Costello and his wife Anna Patterson (a former Googler) were trying to raise money for Cuil. Joy told Costello that people didn’t need a new search engine that just returned a list of results, they needed something that would write an article based on a search. A note on Cpedia topic pages reads: “We find everything on the Web about your topic, remove all the duplication and put the information on one page.”
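
The note quoted above describes three steps: gather text about a topic, remove the duplication, and put what remains on one page. A minimal Python sketch of the de-duplication and assembly steps follows. It is not Cuil's method, which has not been published in detail. The function names `normalize` and `assemble`, the sentence-splitting rule, and the sample snippets are all assumptions made for illustration.

```python
import re


def normalize(sentence: str) -> str:
    """Lowercase and drop punctuation so near-identical copies compare equal."""
    return " ".join(re.findall(r"[a-z0-9]+", sentence.lower()))


def assemble(snippets: list[str]) -> str:
    """Merge snippets into one page, keeping the first copy of each sentence.

    Hypothetical helper: a real system would need far more robust sentence
    splitting and fuzzy duplicate detection than this exact-match check.
    """
    seen = set()
    kept = []
    for snippet in snippets:
        # Naive split: break after ., ! or ? when followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", snippet.strip()):
            key = normalize(sentence)
            if key and key not in seen:
                seen.add(key)
                kept.append(sentence)
    return " ".join(kept)


# Illustrative input only; these snippets are made up for the example.
page = assemble([
    "Julius Caesar was a Roman general. He crossed the Rubicon in 49 BC.",
    "Julius Caesar was a Roman general! He was assassinated in 44 BC.",
])
print(page)
```

The sketch keeps sentences in the order they arrive, which hints at why automatically assembled pages can read as disjointed: dropping duplicates does nothing to make the surviving sentences flow as an article.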

I have documented a couple of Google patent documents that describe somewhat similar ideas, although the Google systems and methods are tailored to the Google platform’s specific requirements for scale, cross processing, and optimizing performance among Google’s many different “flavors” of servers.

My view of Cpedia is somewhat less harsh than this statement in the GigaOM publication:

Unfortunately, being new and different doesn’t necessarily mean that it is either good or useful. Other users who have tried it out describe it as “sentence after sentence of automated nonsense,” and Tumblr and Instapaper developer Marco Arment says that “if this feature is meant to become a serious product, I truly feel bad for them.”

My view is:

Conceptual slicing and dicing is a particularly interesting content processing problem. The Cuil method does yield some unusual outputs, but for topics like “Julius Caesar”, I found the results in line with outputs from other systems we have reviewed. One can argue that the Cuil method does not produce outputs in line with what a college-educated person might assemble after scanning six or seven sources, but the Cpedia results were in the ballpark compared to some of the wackiness we have seen in the past.
The computational load for this type of processing is quite high. Even so, our tests showed that high-frequency queries, such as prominent topics and major historical figures, returned results quickly.
The inclusion of real-time results struck me as one step toward providing the much-needed context for information pulled from Twitter and Facebook. Too often, real-time items are disembodied and make little or no sense. Maybe the Cuil.com approach is not the perfect answer, but I find the inclusion of real-time results within a content-centric context an improvement over a Collecta box showing items in a stream. (See http://ssnblog.com for an example of the Collecta stream.)

Our tests of Cuil.com continue, and we find that the service has been improving. Cpedia keeps the ball rolling.

Stephen E Arnold, April 13, 2010

Unsponsored post.

Written by Stephen E. Arnold · Filed Under News, Online (general), Search, Technology, Text processing

