Cazoodle: Semantic Search

April 3, 2009

A happy quack to the reader who sent me a link to Euwyn’s “Cazoodle – Semantic Data-aware Search” here. Developed by Chambana wizards, Cazoodle “looks to create semantic data-aware search for various verticals, starting with apartments, events, and shopping (electronics, for the most part).” Euwyn makes clear that Cazoodle is a vertical search engine; that is, the content focuses on a specific topic such as apartments. Cazoodle said:

[It is] a startup company from the University of Illinois at Urbana-Champaign (UIUC), aims to enable “data-aware” search– to access the vast amount of structured information beyond the reach of current search engines. The company is co-founded by Prof. Kevin C. Chang and his research team of graduate and undergraduate students, with the support of the University and technology transfer from the MetaQuerier research at UIUC. Cazoodle is located at EnterpriseWorks, an incubator facility of the University, on the Research Park of UIUC in Champaign, Illinois.

The company seems to be going in the same direction as Classifieds.com, a Web start up that I found quite interesting. Cazoodle delivers a “semantic data-aware search.” I ran a query for an apartment in Urbana, where I worked on my PhD many years ago. The Cazoodle results looked like this:

[Image: Cazoodle search results]

The service looks interesting, demonstrating that dataspaces can be useful. I detected a few Google influences as well. Click here to try the beta search.

Stephen Arnold, April 3, 2009

Searching for People

March 14, 2009

I ran across a useful summary of sources of information about people. The write up, by JR Raphael, is “People Search Engines: The Newest Web Privacy Threat” here. Mr. Raphael runs through some vertical search systems, providing tips to get useful results. The write up about Spokeo was useful. He mentioned one site with which I was not familiar, Rapleaf. His conclusion reminds the reader to be aware of what information is available. I downloaded and saved the story. Unfortunately, the publisher–an outfit called PCAdvisor–cluttered the pages with pop ups and annoying advertisements, which made it a chore to read a useful article. I don’t think PCAdvisor is going to win me as a loyal reader with baloney getting in the way of the sirloin in its write ups. Too bad.

Stephen Arnold, March 14, 2009

Microsoft: One More Search System

March 9, 2009

TechCrunch’s information technology edition reported here that Microsoft has inked a deal with ZoomInfo.com. You can read the story “ZoomInfo Scores Deal With Microsoft To Integrate Search Into CRM” here. ZoomInfo’s technology extracts and generates information about people and companies. The TechCrunch description of ZoomInfo is here. ZoomInfo is a business search system and content generation engine. The resulting data makes ZoomInfo.com an excellent example of a vertical search engine.

What I found interesting about this tie up was:

  1. The deal is an admission that Microsoft’s CRM products lack a search and retrieval system that meets the needs of Dynamics’ users. I have been critical of the search functions provided with Dynamics and this deal certifies the validity of my analysis.
  2. Microsoft’s own technology should be capable, with love and attention, of delivering ZoomInfo functionality. The fact that Microsoft’s own engineers cannot use Microsoft scripting tools, content access, and data management tools to perform ZoomInfo.com’s functions tells me quite a bit about the utility of those scripting tools, Microsoft’s data management, and the difficulty of creating a commercial grade solution with those products.
  3. The deal makes evident that Microsoft’s existing search technology such as Fast ESP cannot deliver the type of mash ups that customers want. Fast ESP demonstrates its Market Track report generation system yet Microsoft itself has elected to use a third party solution. After spending $1.2 billion for Fast Search & Transfer, this bypassing of Fast ESP illuminates what I see as some of the cracks in Microsoft’s existing search products.

Maybe this type of deal won’t make waves in big ponds. But here in the mine run off pond, the ZoomInfo.com tie up is great for ZoomInfo.com and its investors. For this addled goose, a quiet honk of satisfaction.

Stephen Arnold, March 9, 2009

MyRoar: NLP Financial Information Centric Service

March 6, 2009

A happy quack to the reader who alerted me to MyRoar.com. This is a vertical search service that relies on natural language processing. I did some sleuthing and learned that François Schiettecatte joined the company earlier this year. Mr. Schiettecatte has a distinguished track record in search, natural language processing, and content processing. French by birth, he went to university in the UK and has lived and worked in the US for many years. Here’s what the company says about MyRoar.com:

In today’s current political and economic environment people have never had more questions. MyRoar helps people sort through the hype to find just the answers they are looking for. Extraneous information is eliminated, while saving hours of time or abandonment of search. We provide a fun new interface that keeps users up to date on current news, which helps them formulate the best questions to ask. MyRoar is a Natural Language Processing Question Answering Search Engine. Using integrated technologies we are able to offer high precision allowing users to ask questions relating to finance and news. MyRoar integrates proprietary Question Answer matching techniques with the best English NLP tools that span the globe.

You can use the system here. The system performed quite well on my test queries; for example, “What are the current financials for Parker Hannifin?” returned two results with the data I wanted. I will try to get Mr. Schiettecatte to participate in the Search Wizards Speak interview series. Give the system a whirl.

Stephen Arnold, March 6, 2009

Ask.com: Vertical Search Push

February 8, 2009

The harsh world of Web search seems to have ground down Ask.com even further. Search Engine Watch’s “Ask.com Parent IAC Sees Disappointing Revenues, Plans Vertical Search Strategy” here tells the tale. You can read the financial details yourself. For me, the most interesting comment concerned the strategy intended to turn the sea of red ink into a salmon fishery:

Instead of attempting to take on Google head-on, Ask.com will follow a vertical search strategy, which kicked off last month with deal where Ask will power the search experience on NASCAR.com, provide a NASCAR toolbar, and sponsor a car. IAC plans to roll out from 8 to 10 similar relationships this year.

Yep, the search engine of NASCAR will seek “similar relationships”. One hopes that Ask.com tries to locate a relationship not experiencing sponsor defection and declining attendance. When I was at Ziff, one of the Ziffers was involved with the original AskJeeves.com site. Since its founding more than a decade ago, Ask.com has never been useful for my type of research. Maybe this vertical search approach will work? Vertical search is sort of a hassle for me. I prefer to go to one place and get results. Running the same query on different “vertical” systems means I have to federate the results. Nope, I want the system to do the grunt work.

Stephen Arnold, February 8, 2009

Deep Web Technologies’ Vertical Search for Business Information

January 13, 2009

In the early 1990s, Verity was the dominant enterprise search system. IBM’s confused approach to STAIRS and the complexity of STAIRS derivatives created a market opportunity. Verity took it. Verity’s founders have continued to innovate in search. I was delighted to speak with Abe Lederman (that interview is here) and learn about the innovations his company has made. Deep Web Technologies (DWT) tames the tangled world of US government scientific information. You can explore the Science.gov site here. Now, Mr. Lederman and his team have turned their attention to the needs of the person looking for substantive business information. The company’s new business search system–Biznar–débuted in October 2008.

DWT has identified about 60 business oriented Web sites and federates these sources in near real time. In addition to this core list, Biznar takes a user’s query and retrieves results from other Web indexing services. The system then blends the results, producing a results list that is designed to answer business questions. On this select source list are such publications as:

  • Business Week
  • Money Magazine
  • Motley Fool
  • US Patent & Trademark Office
  • Wall Street Journal.
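
The federate-then-blend approach described above, in which one query goes to several sources and the answers are merged into a single list, can be sketched roughly in Python. This is only an illustration under assumptions: the `Result` fields, the source names, the raw scores, and the choice of dividing by each source's top score are hypothetical and are not taken from DWT's system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    title: str
    url: str
    source: str
    score: float  # raw relevance score reported by the source (assumed field)


def blend(results_by_source):
    """Merge per-source result lists into one ranked list.

    Each score is divided by its source's top score so that sources using
    different scales can be compared. A URL returned by several sources
    keeps only its best normalized score.
    """
    best = {}
    for results in results_by_source.values():
        top = max((r.score for r in results), default=0.0)
        for r in results:
            normalized = r.score / top if top > 0 else 0.0
            current = best.get(r.url)
            if current is None or normalized > current[0]:
                best[r.url] = (normalized, r)
    # Highest score first; ties are broken by URL so the order is stable.
    ranked = sorted(best.values(), key=lambda pair: (-pair[0], pair[1].url))
    return [(round(score, 3), r) for score, r in ranked]


if __name__ == "__main__":
    # Hypothetical sources and scores, for illustration only.
    demo = {
        "source_a": [
            Result("Bankruptcy liability basics", "http://example.com/a", "source_a", 8.0),
            Result("Chapter 11 overview", "http://example.com/b", "source_a", 4.0),
        ],
        "source_b": [
            Result("Chapter 11 overview", "http://example.com/b", "source_b", 0.9),
            Result("Director liability", "http://example.com/c", "source_b", 0.45),
        ],
    }
    for score, r in blend(demo):
        print(score, r.source, r.title)
```

A production federated system would also have to handle slow or failing sources and smarter duplicate detection; the sketch covers only the merge step.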

Sample Query

Let’s look at a test query. I used Biznar to obtain information about “bankruptcy liability”. The system generated a result list with 1,706 entries. I ran the same query on Google.com, which returned a result list containing more than 9,400,000 results. Obviously no human could examine a fraction of these 9,400,000 results. Google advertises that it is good by virtue of indexing a lot of content. Biznar focuses on a meaningful result set of 1,700 items.

But for most people, 1,700 items are too many. Biznar makes it easy to navigate the results. Look at the results page below:

[Image: Biznar results page]

You see a two column display. The larger column presents a traditional results list with several useful enhancements:

  1. You see a star rating that provides an indication of the importance of the result for this specific query.
  2. The source is displayed for each item; for example, Google Blog Search, Google Scholar, the New York Times, etc.
  3. The link includes a snippet of the content in the document that matches the query.


Fast API Search: Good Stuff

December 25, 2008

If you have a touch of nerd DNA, you will want to navigate to GotAPI’s Fast API Search. You can find the service here. If you run a query and don’t find what you are looking for, you can click here and contribute to the service. This vertical search service is a product of LogPerspective, Inc. in Massachusetts. A happy quack to the creators of this service.

Stephen Arnold, December 25, 2008

SharePoint: ChooseChicago

December 18, 2008

I scanned the MSDN Web log postings and saw this headline: “SharePoint Web Sites in Government.” My first reaction was that the author Jamesbr had compiled a list of public facing Web sites running on Microsoft’s fascinating SharePoint content management, collaboration, search, and Swiss Army Knife software. No joy. Mr. Jamesbr pointed to another person’s list which was a trifle thin. You can check out this official WSS tally here. Don’t let the WSS fool you. The sites are SharePoint, and there are 432 of them as of December 16, 2008. I navigated to the featured site, ChooseChicago.com. My broadband connection was having a bad hair day. It took 10 seconds for the base page to render and I had to hit the escape key after 30 seconds to stop the page from trying to locate a missing resource. Sigh. Because this was a featured site that impressed Jamesbr, I did some exploring. First, I navigated to the ChooseChicago.com site and saw this on December 16, 2008:

[Image: ChooseChicago.com splash page]

The search box is located at the top right hand corner of the page and also at the bottom right hand corner. But the search system was a tad sluggish. After entering my query “Chinese”, the system cranked for 20 seconds before returning the results list:

[Image: ChooseChicago.com results list]


LogRhythm: Analysis and Search of Log Files

December 17, 2008

A couple of years ago I visited a very big US government agency. I asked about log files. I learned that these were often deleted without review. The reason was, as I recall, that log files were too big. Okay, that told me quite a bit about the US government’s interest in log files. Had this big government agency had access to LogRhythm, maybe those log files would have been reviewed. LogRhythm (a variant of logarithm, get it?) is a special purpose content processing system with a search component. You can read the MarketWatch news item here. The company’s system can automate monitoring, analysis and alerting for internal or external threats. The company has added what it calls “intelligent IT search.” The software classifies content and adds metadata to log entries. One use of the system is to query logs for an audit event; that is, modifications to access authentication privileges linked to a user’s network log in. I think this means that an organization fires a guy or gal. LogRhythm makes it easy to find out if said guy or gal has taken an action that the organization deems inappropriate. The metadata generated from log files includes consistent date and time stamping, prioritization of events, and context tags to distinguish a harmless file transfer from one that goes to an external IP address from a secure source within the organization. If you are struggling with log file analysis, LogRhythm may be able to help. More information is available here.
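
To make the metadata idea concrete, here is a minimal Python sketch of enriching log entries with a consistent timestamp, a priority, and a context tag, and then querying them for one user. Everything specific in it is an assumption made for illustration: the raw line layout, the field names, the tag names, and the priority rule are hypothetical and do not reflect LogRhythm's product or log formats.

```python
import ipaddress
from datetime import datetime, timezone


def enrich(line):
    """Turn one raw log line into a dict with added metadata.

    Assumed raw layout (hypothetical):
    "<epoch seconds> <user> <action> <source ip> <destination ip>"
    """
    epoch, user, action, src, dst = line.split()
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    # Context tag: a private (internal) source sending to a non-private address.
    external = src_ip.is_private and not dst_ip.is_private
    return {
        # Consistent date and time stamping: always ISO 8601 in UTC.
        "timestamp": datetime.fromtimestamp(int(epoch), tz=timezone.utc).isoformat(),
        "user": user,
        "action": action,
        "tag": "external_destination" if external else "internal_destination",
        # Simple illustrative rule: outbound file transfers get high priority.
        "priority": "high" if external and action == "file_transfer" else "low",
    }


def audit(lines, user):
    """Return the enriched high-priority events for one user."""
    events = (enrich(line) for line in lines)
    return [e for e in events if e["user"] == user and e["priority"] == "high"]


if __name__ == "__main__":
    # Made-up sample entries, not real log data.
    sample = [
        "1229472000 jdoe file_transfer 10.0.0.5 8.8.8.8",
        "1229472060 jdoe file_transfer 10.0.0.5 10.0.0.9",
        "1229472120 asmith file_transfer 10.0.0.7 8.8.8.8",
    ]
    for event in audit(sample, "jdoe"):
        print(event)
```

The point of the sketch is that once every entry carries the same fields, the "what did this person do" question becomes a simple filter rather than a manual read through raw files.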

Overflight: Pop Up Summaries Implemented

December 17, 2008

Overflight Google provides a dashboard of Google’s Web log posts. You can access the service here. Just select a Google category. You can hover over a Google blog post title, and the Overflight system displays a snippet of the Web post. Google coordinates its public announcements with its Web log posts. One feature of the pop up is that it will identify the article as a unique post or a cross post. The Web logs are grouped into five categories:

We are planning another vertical Overflight. Watch for the announcement. You can search the full text of Google’s Web log posts using the Exalead search system. In head to head comparisons, ArnoldIT.com found that Exalead does a better job of adding metadata such as dates and entity extraction. Run a query on Overflight via Exalead and then run the same query on Google’s Blogsearch. Judge for yourself.

Stephen Arnold, December 17, 2008
