DataparkSearch, Free Full-Featured Web Search Engine

May 24, 2010

Newslookup.com is quite the feat of news-search engineering. It is the first search engine to arrange search results by media type (television, radio, Internet, etc.) and category, display separate document parts, and effectively use metadata to crawl the Internet to provide a “snapshot look of news websites throughout the world.” It is powered by a free, open-source search system called DataparkSearch, whose origins trace back to 1998 and Russian programmer Maxim Zakharov.

Now in version 4, DataparkSearch boasts an impressive set of features, including indexing of all (X)HTML file types as well as MP3 and GIF files; support for HTTP(S) and FTP URL schemes; vast language support; authentication and cookie support with session IDs in URLs; and a wide array of sorting, categorizing, and relevancy models to return specific results quickly. All of this is backed by various database systems, notably SQL databases and ODBC connections.

Sochi’s Internet, a portal and search engine for the Russian city hosting the 2014 Winter Olympics, uses the DataparkSearch engine to deliver hotel, job, and real estate data for the city and surrounding area. The CGI front-end seen on the site serves the data collected by the “indexer,” described as a mechanism that “walks over hypertext references and stores found words and new references into the database.” The same mechanism allows for “fuzzy search,” which corrects for spelling errors and matches different word forms.
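To make the indexer idea concrete, here is a rough Python sketch of a process that walks hypertext references, stores found words, and queues new references. It is not DataparkSearch's code (which is written in C and backed by a database); the in-memory PAGES table, the example.test URLs, and every function name are hypothetical, and the "fuzzy" lookup is plain approximate string matching from the standard library, not the engine's own spelling or word-form handling.

```python
import difflib
import re
from collections import deque
from html.parser import HTMLParser

# Hypothetical stand-in for the web: URL -> HTML. A real indexer fetches pages.
PAGES = {
    "http://example.test/": '<a href="http://example.test/hotels">Hotels</a> in Sochi',
    "http://example.test/hotels": '<p>Hotel listings</p> <a href="http://example.test/">home</a>',
}


class LinkAndTextParser(HTMLParser):
    """Collects href targets of <a> tags and lowercase words from text nodes."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        self.words.extend(re.findall(r"[a-z]+", data.lower()))


def build_index(start_url, pages):
    """Breadth-first walk over links; returns a word -> set-of-URLs mapping."""
    index = {}
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        html = pages.get(url)
        if html is None:  # reference to a page we cannot "fetch"
            continue
        parser = LinkAndTextParser()
        parser.feed(html)
        parser.close()
        for word in parser.words:  # store found words
            index.setdefault(word, set()).add(url)
        for link in parser.links:  # store new references
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index


def fuzzy_lookup(index, term, cutoff=0.8):
    """Returns URLs for indexed words that closely resemble the query term."""
    matches = difflib.get_close_matches(term.lower(), list(index), n=3, cutoff=cutoff)
    urls = set()
    for match in matches:
        urls |= index[match]
    return urls


INDEX = build_index("http://example.test/", PAGES)
```

Calling fuzzy_lookup(INDEX, "hotles") would still find the page that mentions "hotels," which is the kind of tolerance for misspellings the article describes. A production engine would persist the index in a database instead of a dictionary.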

DataparkSearch is available through its own Web site or via Google Code, where it has quite a busy activity log. Coded in C, the software is supported on a plethora of UNIX-like operating systems including FreeBSD and Red Hat Linux. Frequency dictionaries, synonym lists, and other helpful files can be found in multiple languages on the website as well. Support for the search engine can be found through its wiki, forum, and Google Group.

Samuel Hartman, May 20, 2010

Freebie.
