Technology from Harrod's Creek
ez2Find: Search Morphs to Global Metasearch
"What is the future of search?" This question gets asked and answered at conferences of all types, with topics ranging from the murky knowledge management field to the slightly less woolly world of electronic commerce.
A better question is, "Where is the future of search?" Silicon Alley or Silicon Valley? What about Montgiscard in La Haute Garonne? Search programmers can enjoy long lunches under the clear Mediterranean sky and code a global metasearch engine before heading home for the evening meal. Sound surreal? It sounds very real to Luigi and Maite Castagna, the brother-and-sister team that owns Holomedia, a company based in Montgiscard, France.
Metasearch refers to a process which takes a user's query and sends that query to multiple search engines. Most North American Internet users find Google and Yahoo! wonderful search tools. For Luigi and Maite Castagna, metasearch is the next wave in Internet information retrieval.
Many of the well-known metasearch engines have become little more than lists of results with sponsored links competing for the user's click. The Internet veteran may know about Metacrawler and Dogpile. Metacrawler uses the tag line, "Search the search engines." Dogpile, with its cute pal, Arfie, provides a similar service. The principal difference between these two services is that Dogpile lists hits by source search engine, whereas Metacrawler combines the hits into one master list. InfoSpace acquired Webcrawler.com and funded the new, improved Excite.com, creating a mini-conglomerate that has helped keep metasearch technology in a gloomy cubbyhole.
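The two display styles are easy to picture in a few lines of Python. This is only an illustrative sketch: the engine names, the URLs, and the functions group_by_source and merge_into_master_list are hypothetical, and dropping repeated URLs from the master list is an assumption. Nothing here describes how Dogpile or Metacrawler are actually built.

```python
# Hypothetical hits as (source engine, URL) pairs, in arrival order.
HITS = [
    ("EngineA", "http://example.com/a"),
    ("EngineB", "http://example.com/b"),
    ("EngineA", "http://example.com/b"),
    ("EngineB", "http://example.com/c"),
]


def group_by_source(hits):
    """Return one list of URLs per source engine."""
    grouped = {}
    for engine, url in hits:
        grouped.setdefault(engine, []).append(url)
    return grouped


def merge_into_master_list(hits):
    """Return one combined list, each URL once, in first-seen order.

    Removing repeats is an assumption made for this sketch.
    """
    seen = set()
    merged = []
    for _engine, url in hits:
        if url not in seen:
            seen.add(url)
            merged.append(url)
    return merged
```

The first function keeps every engine's list separate; the second gives the reader a single list to scan.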
All about Metasearch
Metasearch looks at data in its native form, indexes it, and then displays the hits in a Web page. Metasearch tools, unlike other search methods, are not stymied by the many forms that data takes. Different search engines create indexes of these content types, such as HTML, digitized video or audio, and XML files. In order to get one answer, it makes sense to use technology that eliminates the need for many individual searches.
"Metasearch is one of the promising technologies for information retrieval," says Mr. Castagna, developer of ez2Find, a next-generation metasearch service.
Holomedia is not alone in "reinventing" metasearch. Vivisimo has captured a loyal following with its clustering technology. KartOO's map-like interface put a new spin on the visual display of results.
"KartOO is a metasearch engine with visual display interfaces," says a company spokesperson. "When you click on OK, KartOO launches the query to a set of search engines, gathers the results, compiles them and represents them in a series of interactive maps through a proprietary algorithm."
Users love KartOO's stunning interface to an integrated list of hits.
Surfboard.NL snapped up Ixquick, a metasearch engine built on clever ranking and relevancy algorithms derived from index trading. Ixquick passes the user's query to major search engines, then calculates each hit's relevance and displays a single ranked list of hits.
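Ixquick's actual relevance calculation is not public. As a stand-in, the sketch below merges several engines' lists with a simple reciprocal-rank score: a hit earns 1/position from each engine that returns it, and the totals decide the final order. The function rank_hits, the engine names, and the sample URLs are all hypothetical.

```python
from collections import defaultdict


def rank_hits(results_by_engine):
    """Merge per-engine result lists into one ranked list of URLs.

    Each URL scores 1/position for every engine that returned it.
    This scoring rule is an assumption, not Ixquick's algorithm.
    """
    scores = defaultdict(float)
    for urls in results_by_engine.values():
        for position, url in enumerate(urls, start=1):
            scores[url] += 1.0 / position
    # Highest total first; ties fall back to alphabetical order.
    return sorted(scores, key=lambda url: (-scores[url], url))


results = {
    "EngineA": ["x.example", "y.example", "z.example"],
    "EngineB": ["y.example", "x.example"],
    "EngineC": ["y.example", "w.example"],
}
ranked = rank_hits(results)
```

A hit that several engines place near the top rises above a hit that only one engine likes, which is the general intuition behind a single ranked list.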
Mention metasearch to research professionals, and their response may be "Copernic or Bull's Eye. Which do you use?" Both of these products are "fat clients." The user installs software on a computer and launches queries to multiple search engines. As the results flow in, the software on the user's machine eliminates duplicates, highlights search terms, and displays a relevance-ranked list of hits. In addition, the fat clients provide enhanced features for power searchers. The results can be formatted for printing or output as a professionally-designed Web page.
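Two of those client-side chores, removing duplicates and highlighting search terms, can be sketched briefly. The functions dedupe and highlight below are hypothetical and are not taken from Copernic or Bull's Eye. The duplicate test (lower-casing the URL and ignoring a trailing slash) is deliberately crude, and the asterisk markers are an arbitrary choice for plain-text output.

```python
import re


def dedupe(hits):
    """Keep the first hit for each URL, using a crude normalization."""
    seen = set()
    unique = []
    for hit in hits:
        key = hit["url"].lower().rstrip("/")
        if key not in seen:
            seen.add(key)
            unique.append(hit)
    return unique


def highlight(text, terms):
    """Wrap each occurrence of a search term in asterisks, ignoring case."""
    terms = [term for term in terms if term]
    if not terms:
        return text
    pattern = re.compile("|".join(re.escape(term) for term in terms),
                         re.IGNORECASE)
    return pattern.sub(lambda match: "*" + match.group(0) + "*", text)
```

A real client would need a more careful URL comparison, since lower-casing a path can merge pages that are in fact different.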
A variant of metasearch surfaces in IBM's "virtual database." The idea is that software from Metamatrix, Inc., connects disparate sources of data. One query is passed against the index of that virtual database, and the results are displayed as if the information came from one comprehensive, structured repository.
"Metasearch was interesting to us, so we just started coding, and we think there will be many new innovations from us and others working in this area," says Ms. Castagna. "We are now ez2Find. We started with a name that had the letters 'www.' People wrote us and said, we like your service, but the name is too hard to type. We listen to our customers, and now we are ez2Find," says Mr. Castagna.
Montgiscard still lags behind Silicon Valley as a technology hotbed, but it is close to the action.
"We are one of a few technology companies in Montgiscard," says Mr. Castagna. "There are about two thousand people here, and we are not too far from Toulouse, which is quite strong in technology. Montgiscard is also close to Montpellier, one of the fastest growing high-technology centers in France."
The firm's technology focuses squarely on the Web browser.
"Our approach is Web-based," says Ms. Castagna. "We don't like the idea of downloading and installing something if the function can be done via a browser. Also we like the idea of a free service. When we started the company in 1997, we survived the e-crack because we had some fresh ideas about generating revenue from people who wanted to reach our users. We plan to stay with the free service. So we sell some advertisements and some listings, but we keep the advertisements out of the hits."
According to the Castagnas, ez2Find's traffic is surging, although the U.S. services are not reporting ez2Find as one of the Hot 100 sites in the first quarter of 2003.
"Most of our search partners are very happy to be included in our meta search," says Mr. Castagna. "They get the benefit of our growing traffic and they get free advertising and, of course, the links are going back to them."
Ms. Castagna adds, "For big engines the problem is a little bit different, some of them are happy since the search is counted for them and others prefer to sell their results. What we see is that there are many database engines waiting to take the place of the big ones. For ez2Find, the results are more important than the engine source."
The first time a user sees the ez2Find splash page, the service looks like a splash page on a corporate Intranet. Come back a second time, and ez2Find automatically personalizes the view of its content by displaying the information in one of 48 languages, including Eesti (spoken in Estonia) and Suomi (spoken in Finland), with more languages coming. The site uses the visitor's IP address to determine the version of the site to display.
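How ez2Find maps an IP address to a site version is not disclosed. One plausible approach, shown here purely as an assumption, is a lookup table of network blocks and language codes. The table NETWORK_LANGUAGES, the function language_for, and the fallback to English are hypothetical, and the address blocks are ranges reserved for documentation rather than real national allocations.

```python
import ipaddress

# Hypothetical lookup table: network block -> language code.
# These blocks are reserved for documentation, not real allocations.
NETWORK_LANGUAGES = [
    (ipaddress.ip_network("192.0.2.0/24"), "et"),     # Eesti
    (ipaddress.ip_network("198.51.100.0/24"), "fi"),  # Suomi
    (ipaddress.ip_network("203.0.113.0/24"), "fr"),   # French
]
DEFAULT_LANGUAGE = "en"  # assumed fallback for unknown visitors


def language_for(ip_text):
    """Return the language code for a visitor's IP address."""
    address = ipaddress.ip_address(ip_text)
    for network, language in NETWORK_LANGUAGES:
        if address in network:
            return language
    return DEFAULT_LANGUAGE
```

A visitor from an address the table does not cover simply gets the default version of the site.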
The site leapfrogs the services offered by the fat client providers, such as Intelliseek, and the metasearch services of the traditional commercial database companies. Anyone remember Dialog's '411' search? As part of the personalization, the ez2Find system serves a page with local stock quotes, news, weather, Yellow Pages, and a language-specific search. "With all of this happening pretty fast even when traffic is at its peak," says Maite Castagna.
In addition to a search box, ez2Find offers a Google-like news window. News, however, is integrated into the page along with an Open Directory-based taxonomy.
"One click provides the user with access to over 3.8 million sites listed in 460,000 categories," Ms. Castagna says.
Google does not integrate disparate content in its approach. A Google user must run separate queries to get the results of the Google spider and listings in the Google directory. The Castagnas take a similar approach but at a much more finely-grained level.
"We search the major search engines and one-thousand different databases, topic based or language based," says Mr. Castagna, who is also a PHP code enthusiast. "Instead of having a huge database like Google, we prefer to search in the databases in real time. Our approach makes sure the content is always fresh. Some people refer to our approach as 'searching the deep Web or the invisible Web.' To call a specific metasearch, a user just types downloads, images, recipes, MP3, maps, and so on. Most common words will work. Answers come fast, and we can easily scale our framework very economically."
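The keyword shortcut Mr. Castagna describes can be pictured as a small routing step in front of the search itself. The sketch below is a guess at the idea, not ez2Find's code: the trigger words come from his list, but the set SPECIALIZED, the function route_query, and the rule that the first trigger word wins are all assumptions.

```python
# Trigger words taken from the examples in the article; the routing
# logic itself is hypothetical.
SPECIALIZED = {"downloads", "images", "recipes", "mp3", "maps"}


def route_query(query):
    """Return (search type, remaining query terms).

    The first trigger word found selects a specialized metasearch and
    is removed from the query; otherwise the general search is used.
    """
    words = query.split()
    for index, word in enumerate(words):
        if word.lower() in SPECIALIZED:
            rest = words[:index] + words[index + 1:]
            return word.lower(), " ".join(rest)
    return "general", " ".join(words)
```

Under these assumptions, a query such as "toulouse maps" would be sent to the maps sources with "toulouse" as the search term.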
The technology behind the site is a secret, which is to be expected for a small, privately-held company.
Mr. Castagna says, "We use some tools that we created in house for our needs, and some tools that were available. We have modified many of these. The code base is a mix of different technologies programmed in C++, PHP and Perl depending on the job to do. We are very practical and focus on our users' needs, not a particular way to create a solution."
The site tries to make the process of searching easy for anyone, regardless of their search expertise.
"Metasearching for acronyms or translation is not very useful, so in ez2Find we show appropriate search boxes instead of a regular meta search," Ms. Castagna says. "Our users tell us, 'This is clear and easy.'"
"We have also a crawler to gather and feed an internal database of cities, movies, artists, misspelled words, among other information. We use this data to try to understand the query and serve the best possible results or databases to search," continues Mr. Castagna.
The Castagnas do not use the terms "artificial intelligence," "smart agents," or other buzzwords.
"We focus on the user and getting an answer to a question. The fancy words are not so important to us," says Ms. Castagna.
The site accepts advertising and supports pay-for-placement listings. These are integrated with the user's search results. Over a series of test searches in French and English, ez2Find offers relevant results and no obviously misplaced pay-for-placement advertisements.
Says Mr. Castagna, "We don't sell ads directly. We use an advertising agency and an internal database from affiliate networks. That's where we get the money. Like some other sites, we make sure these banners are geo- and keyword-targeted. A search for auctions will not give the same banners as a search for mobile phones, and the same search from Australia will not display the same banners as the same search from the United Kingdom. We think this is one of our advantages as the Internet becomes more global."
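Geo- and keyword-targeting amounts to filtering an inventory of banners on two conditions at once. The following sketch assumes a tiny hypothetical inventory; the BANNERS list, its field names, and the function pick_banners are illustrations, not ez2Find's ad system.

```python
# Hypothetical banner inventory: each banner targets one country code
# and one keyword phrase.
BANNERS = [
    {"id": "auction-au", "country": "AU", "keyword": "auctions"},
    {"id": "auction-uk", "country": "GB", "keyword": "auctions"},
    {"id": "phones-uk", "country": "GB", "keyword": "mobile phones"},
]


def pick_banners(query, country):
    """Return banner ids matching both the visitor's country and the query."""
    text = query.lower()
    return [
        banner["id"]
        for banner in BANNERS
        if banner["country"] == country and banner["keyword"] in text
    ]
```

With this inventory, the same "auctions" query draws a different banner in Australia than in the United Kingdom, and a query with no matching keyword draws none.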
The company is placing more emphasis on its news coverage, which is already remarkably broad.
"We are working on a global news metasearch," says Mr. Castagna. "You will be able to have global coverage from the major online newspapers from all over the world, retrieved from the newspapers themselves. And, we don't want to have any editor. Our software will do the smart work."
The company is not interested in the corporate or governmental Intranet market at this time.
Corporate use of metasearch technology is not new. Verity Corp. has long offered metasearch of disparate content domains. But ez2Find has been designed to fit a very lightweight footprint and comes equipped with technology that reduces much of the setup and tweaking that some portal metasearch technology requires. Companies and governmental organizations recognizing that legacy systems cannot be fixed with the 'rip and replace' approach need to break down barriers around some content. The future belongs to software, not massive amounts of money to rebuild or recode certain systems.
Trends in Metasearch
In addition to the appeal of metasearch technology to organizations, there are four broader trends in metasearch.
The first is vertical or niche metasearch. An example may be found in Dataware's Query Server, which dates from the 1990s. Believe it or not, Query Server, which is now part of Open Text, is still available in a 'general' version and a 'federal' version. The system lacks user controls and requires updating, but it allows a commercial customer to build an index of internal and external content. This content can be queried from a single search screen. Metasearch is now getting the attention of information aggregators. Gale Research offers metasearch as a way to bind together silos of Gale data. In the commercial database world, Ebsco, OCLC, ProQuest Information and Learning, Elsevier Science, Inc., and other library and electronic publishing companies are gravitating to the National Information Standards Organization (NISO) to guarantee that content from these companies can be queried from a single search box.
The second trend is the "fat client" approach. The leaders in this segment of metasearch are Intelliseek's Bull's Eye and Copernic. The user must install a software application that performs updates automatically, provides special result format routines, and offers canned topic-oriented searches. "Fat client" is one solution for integrating different types of data into a single results display. A leader in this niche is a low-profile advanced development shop doing business as Global InfoTek, Inc. Though much of the firm's work is classified, the company's public Web site provides a tantalizing glimpse of the power of fat client metasearch.
The third trend is the innovation in Web-based metasearch. Ixquick, WiseNut, and Teoma push the perimeter of metasearch. Each has captured a loyal following among specialist researchers. The overall usage of Web metasearch engines, according to postings on various discussion groups such as Webmasterworld.com, is difficult to pin down. In a quest to pump up traffic, metasearch Web engines innovate with vigor. For example, Ixquick has recently introduced a beta of the service that offers topic (and subtopic) clustering.
The fourth trend is the direction that the Castagnas are taking with ez2Find. The company's technology does much more than recycling hits from Google, FAST Search & Retrieval, and Overture. ez2Find includes distinct types of metasearch technology tuned to specific content domains such as music, which is, Mr. Castagna admits, "my biggest weakness. I love all music, all the time, all the types." The ez2Find approach embeds localized content with specific search functions.
"We are going to add more country specific services in 2003," says Mr. Castagna, who reached over to turn up the sound blasting from his state-of-the-art audio system.
The future of metasearch appears rosier now than at any time in the last three years. Metasearch technology seems more practical and easier to implement than reengineering an organization's computer system on a fancy new BEA Systems' WebLogic or Oracle framework. Software that can intelligently index information from a range of servers and application software is likely to find more acceptance than the software reengineering approach that leads to spectacular software flops like those associated with customer relationship management or integrated business systems.
Metasearch can add value to the search results displayed by a single search engine. In addition to clustering, the metasearch engine can combine results, delete duplication, and place greater weight on the newer material. Most users are woefully ignorant of the time delay between indexing crawls of content, and metasearch is one way to keep the displayed hits fresh.
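Combining results, dropping duplicates, and favoring newer material can all happen in one pass. The sketch below is one way to do it under stated assumptions: each hit carries a relevance number and a publication date, and a hit's score is halved for every 30 days of age. The function names, the field names, and the 30-day half-life are hypothetical choices, not a description of any particular engine.

```python
from datetime import date


def freshness_weight(published, today, half_life_days=30):
    """Return a weight that halves for every half_life_days of age."""
    age_days = max((today - published).days, 0)
    return 0.5 ** (age_days / half_life_days)


def combine(hit_lists, today):
    """Merge hit lists, keep each URL's best score, newest-favoring order."""
    best = {}
    for hits in hit_lists:
        for hit in hits:
            score = hit["relevance"] * freshness_weight(hit["published"], today)
            if hit["url"] not in best or score > best[hit["url"]]:
                best[hit["url"]] = score
    # Highest score first; ties fall back to alphabetical order.
    return sorted(best, key=lambda url: (-best[url], url))
```

With a decay of this kind, a moderately relevant page published today can outrank a highly relevant page that is two months old.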
Metasearch also offers a way to build what ez2Find calls 'local' views of data. The ez2Find software, like products from personalization providers such as E.piphany, builds individual slices of content.
The Castagnas' software provides some of the functionality associated with the most sophisticated portal toolkits, but in a lower-cost way.
"When I hear music, I get good ideas," Mr. Castagna says.
Montgiscard's software wizards are going to turn up the volume in the months ahead.