
Social Search: Nay Sayers Eat Twitter Pie

May 27, 2009

My comments will be carried along on the flow of Twitter commentary today. This post is to remind me that at the end of May 2009, the Google era (lots of older Web content) has ended and the Twitter or real time search era has arrived. Granted, monetization, stability, maturity, and consumerization have not yet climbed on the real time search bandwagon. But I think these fellow travelers are stumbling toward the rocket pad.

Two articles mark this search shift. Sure, I know I need more data, but I want to outline some ideas here. I am not (in case you haven’t noticed) a real journalist. Save the carping for the folks who used to have jobs and are now trying to make a living with Web logs.

The first article is Michael Arrington’s “Topsy Search Launches: Retweets Are the New Currency of the Web” here. The key point for me was not the particular service. What hooked me were these two comments in the article:

  1. “Topsy is just a search engine. That has a fundamentally new way of finding good results: Twitter users.” This is a very prescient statement.
  2. “Influence is gained when others retweet links you’ve sent out. And when you retweet others, you lose a little Influence. So the more people retweet you, the more Influence you gain. So, yes, retweets are the new currency on the Web.”

My thoughts on these two statements are:

  • Topsy may not be the winner in this sector. The idea, however, is very good.
  • The time interval between major shifts in determining relevance is now likely to decrease. Since Google’s entrance, there hasn’t been much competition for the Mountain View crowd. The GOOG will have to adapt or face the prospect of becoming another Microsoft or Yahoo.
  • Now that Topsy is available, others will grab this notion and apply it to various content domains. Think federated retweeting across a range of services. The federated search systems have to raise the level of their game.
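
The retweet-as-currency idea quoted above can be sketched as a toy ledger. This is only an illustration: the article gives no numbers, so the gain and cost weights below are assumptions, and nothing here claims to be Topsy's actual formula.

```python
from collections import defaultdict

# Hypothetical weights: the quoted article gives no figures.
GAIN_PER_RETWEET_RECEIVED = 1.0
COST_PER_RETWEET_SENT = 0.1


def influence_scores(retweets):
    """Compute a toy influence score from (retweeter, original_author) pairs."""
    scores = defaultdict(float)
    for retweeter, author in retweets:
        scores[author] += GAIN_PER_RETWEET_RECEIVED  # being retweeted earns influence
        scores[retweeter] -= COST_PER_RETWEET_SENT   # retweeting spends a little
    return dict(scores)


def rank_links(shared_links, scores):
    """Order links by the summed influence of the users who shared them."""
    totals = defaultdict(float)
    for user, url in shared_links:
        totals[url] += scores.get(user, 0.0)
    return sorted(totals, key=totals.get, reverse=True)
```

Feeding `rank_links` a list of `(user, url)` pairs puts links shared by heavily retweeted users first, which is the "Twitter users as the relevance signal" notion in miniature.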

The second article was Steve Rubel’s “Visits to Twitter Search Soar, Indicating Social Search Has Arrived” here. I don’t have much to add to Mr. Rubel’s write up. The key point for me was:

I think there’s something fundamentally new that’s going on here: more technically savvy users (and one would assume this includes journalists) are searching Twitter for information. Presumably this is in a tiny way eroding searches from Google. Mark Cuban, for example, is one who is getting more traffic to his blog from Twitter and Facebook than Google.

For the purposes of this addled goose, the era of Googzilla seems to be in danger of drawing to a close. The Googlers will be out in force at their developers’ conference this week. I will be interested to see if the company will have an answer to the social search and real time search activity. With Google’s billions, it might be easier for the company to just buy tomorrow’s winners in real time search. Honk.

Stephen Arnold, May 27, 2009

Perfect Search and Adhere Solutions: Google Extender

May 12, 2009

I learned from my son (founder of Adhere Solutions) that his team and Perfect Search have a new product available, OBX. I was impressed with the OBX and the way in which the two companies explained their innovation.

The One Box Extender (OBX) allows users to search databases quickly and cost-effectively within the Google Search Appliance.

The One Box Extender (OBX) will extend the Google Search Appliance to enable organizations to search their database content with blistering query speeds – all delivered seamlessly through the Google Search Appliance’s OneBox interface.

Presently, Google Search Appliance users search their database content by sending the query through the OneBox Connector to retrieve results from different systems. This approach places query load on the database(s), and slows down the speed of the search for the end users. The Perfect Search One Box Extender (OBX) for the Google Search Appliance enables rapid search of Oracle, Microsoft SQL, DB2, MySQL, and any other SQL compliant database without placing any additional load on these systems. The OBX integrates within the same Search Engine Results Page for database search through the Google Search Appliance’s OneBox API.
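
The difference between passing each query through to the database and searching a pre-built index can be sketched in a few lines. This is a minimal illustration, not Perfect Search's implementation: the in-memory SQLite database stands in for a production system, and the `parts` table, its columns, and both function names are hypothetical.

```python
import sqlite3
from collections import defaultdict


def build_index(conn, table, id_col, text_col):
    """Read the table once and build an inverted index: term -> set of row ids."""
    index = defaultdict(set)
    # Table and column names are interpolated directly; use trusted input only.
    for row_id, text in conn.execute(f"SELECT {id_col}, {text_col} FROM {table}"):
        for term in text.lower().split():
            index[term].add(row_id)
    return index


def search_index(index, query):
    """Answer an AND query from the index alone; the database is not touched."""
    terms = query.lower().split()
    if not terms:
        return set()
    hits = set(index.get(terms[0], set()))
    for term in terms[1:]:
        hits &= index.get(term, set())
    return hits


# Stand-in for a production database (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (id INTEGER PRIMARY KEY, description TEXT)")
conn.executemany(
    "INSERT INTO parts (description) VALUES (?)",
    [("steel hex bolt",), ("brass hex nut",), ("steel washer",)],
)

index = build_index(conn, "parts", "id", "description")
conn.close()  # later queries are served without a database connection
matches = search_index(index, "steel hex")
```

The database is read once at indexing time. After that, query traffic lands on the index, which is the "no additional load" claim in its simplest form. The trade-off is freshness: the index has to be rebuilt or updated when the rows change.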

image

Perfect Search and Adhere Solutions… enabling hyper federation.

Traditionally, enterprise search solutions are expensive and can be challenging to implement. The Google Search Appliance with the Perfect Search OBX provides a cost effective, appliance-based solution to index valuable database content. Many current Google Search Appliance users leverage Google’s OneBox connectors as a way to avoid indexing database content purely for cost reasons. Now, these organizations can index their database content, increase speed and relevancy, and remove load from their databases, all at a low cost.

Features of the One Box Extender include:

  • Integrates the power of database search with the Google Search Appliance OneBox
  • Provides connectivity to Oracle, Microsoft, DB2, MySQL, and other JDBC databases
  • Can be used to search Microsoft Exchange email records
  • Can index millions, or even billions of database records at a fixed cost
  • Removes load on existing database systems
  • Provides better results than traditional SQL queries
  • Results appear within the Google Search Engine Result Page instantly
  • Much lower cost than traditional enterprise search software approaches
  • Complies with database security policies
  • Customizable database displays.

“Adhere Solutions was founded by a team of search industry veterans with the vision of extending the capabilities of the Google Search Appliance and meeting the demand for associated professional services. We provide Google Enterprise customers with support throughout installation and configuration as well as applications built exclusively for the Google Search Appliance,” said Erik Arnold, director, Adhere Solutions. “Through the partnership with Perfect Search we will be able to offer Google Search Appliance customers the ability to search indexed databases without a massive spike in costs.”

“We are thrilled to be able to partner with such an outstanding organization as Adhere Solutions,” states Tim Stay, CEO of Perfect Search Corporation. “They have deep expertise providing robust solutions utilizing Google’s applications for the enterprise.  With their guidance, we have been able to integrate the speed and capacity of the Perfect Search indexing and search engine to the breadth and functionality of the Google Search Appliance.”

“The OBX extends the functionality of Google’s strong suite of enterprise applications to large content repositories such as massive databases and email archives,” states George Watanabe, VP of Business Development at Perfect Search. “Historically, searching these very large data sets have been very expensive, but today, Perfect Search and Adhere Solutions are providing a cost-effective search solution that works seamlessly through the Google OneBox interface.”

Adhere Solutions is a Google Enterprise Partner providing products and services that help organizations accelerate their adoption of Google technologies and cloud computing. Adhere Solutions’ team of consultants help customers leverage Google’s Enterprise Search products, Google Maps, and Google Apps to improve access to information, productivity, and collaboration.

Perfect Search Corporation is a software innovation company that specializes in development of search solutions, focusing on speed, scalability, stability, and savings. A total of eight patents have been applied for around the developing technology. The suite of search products is available on multiple platforms, from small mobile devices, to single servers, to large server farms. For more information, contact Perfect Search at www.perfectsearchcorp.com or +1.801.437.1100.

When I spoke with Perfect Search and got a description of the OBX, I concluded that Perfect Search and Adhere had moved beyond basic mash up and into a new territory.  The phrase that was used to describe this product was “hyper federation.” This was the first time I heard this description, and I think that Perfect Search and Adhere have broken new ground and have a way to explain what their engineers have accomplished.

Stephen Arnold, May 12, 2009

Composite Software

April 12, 2009

I was asked about data virtualization last week. As I worked on a short report for the client, I reminded myself about Composite Software, a company with “data virtualization” as a tagline on its Web site. You can read about the company here. Quick take: the firm’s technology performs federation. Instead of duplicating data in a repository, Composite Software “uses data where it lives.” If you are a Cognos or BMS customer, you may have some Composite technology chugging away within those business intelligence systems. The company opened for business in 2002 and has found a customer base in financial services, military systems, and pharmaceuticals.

The angle that Composite Software takes is “four times faster and one quarter the cost.” The “faster” refers to getting data where it resides and as those data are refreshed. Repository approaches introduce latency. Keep in mind that no system is latency free, but Composite’s approach minimizes latency associated with more traditional approaches. The “cost” refers to the money saved by eliminating the administrative and storage costs of a replication approach.

The technology makes use of a server that handles querying and federating. The user interacts with the Composite server and sees a single-view of the available data. The system can operate as an enabling process for other enterprise applications, or it can be used as a business intelligence system. In my files, I located this diagram that shows a high level view of Composite’s technology acting as a data services layer:

image

A more detailed system schematic appears in the company’s datasheet, “Composite Information Server 4.6,” here. A 2009 explanation of the Composite virtualization process is also available from the same page as the information server document.

The system includes a visual programming tool. The interface makes it easy to point and click through SQL query build up. I found the graphic touch for joins useful but a bit small for my aging eyeballs.

screen shot

If you are a fan of mashups, Composite makes it possible to juxtapose analyzed data from diverse sources. The company makes available a white paper, written by Bloor Research, that provides a useful round up of some of the key players in the data discovery and data federation sector. You have to register before you can download the document. Start the registration process here.

Keep in mind that this sector does not include search and content processing companies. Nevertheless, Composite offers a proven method for pulling scattered, structured data together into one view.
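
The "uses data where it lives" idea can be sketched as a view function that queries each source at request time and joins the answers in memory, with no copy kept in a repository. This is a rough illustration under stated assumptions, not Composite's server: the two in-memory SQLite databases stand in for independent systems, and the `customers` and `invoices` schemas are hypothetical.

```python
import sqlite3


def make_source(schema, insert_sql, rows):
    """Create an in-memory database standing in for one independent system."""
    conn = sqlite3.connect(":memory:")
    conn.execute(schema)
    conn.executemany(insert_sql, rows)
    return conn


# Two hypothetical systems that know nothing about each other.
crm = make_source(
    "CREATE TABLE customers (id INTEGER, name TEXT)",
    "INSERT INTO customers VALUES (?, ?)",
    [(1, "Acme"), (2, "Globex")],
)
billing = make_source(
    "CREATE TABLE invoices (customer_id INTEGER, amount REAL)",
    "INSERT INTO invoices VALUES (?, ?)",
    [(1, 250.0), (1, 100.0), (2, 75.0)],
)


def customer_totals():
    """Build a single view by querying each source live and joining in memory."""
    names = dict(crm.execute("SELECT id, name FROM customers"))
    totals = billing.execute(
        "SELECT customer_id, SUM(amount) FROM invoices "
        "GROUP BY customer_id ORDER BY customer_id"
    )
    return [(names[cid], total) for cid, total in totals]
```

Because nothing is replicated, a new invoice written to `billing` shows up the next time `customer_totals()` runs. The cost is that every call pays the round trip to each source, which is why no federated system is latency free.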

Stephen Arnold, April 12, 2009

Journalists Struggle with Web Logs

March 30, 2009

Gina M. Chen asked, “What do you think?” at the foot of her essay “Is Blogging Journalism”. You can read her write up here. My answer is, “Nope. Web logs are a variant of plain old communications.” Before I defend my assertion, let’s look at the guts of her essay: “fear of change” creates the challenge. She asserted that blogging is a medium.

image

Web logs are not causing traditional media companies to collapse. Other, more substantive factors are eroding their foundations. Forget fear. Think data termites.

Okay, I can’t push back too much on these points, which strike me as tame and somewhat obvious. I also understand the fear part mostly because my brushes with traditional publishers continue to leave them puzzled and me clueless.

The issue to me is mostly fueled by money. Here’s why:

Read more

Harry Collier, Infonortics, Exclusive Interview

March 2, 2009

Editor’s Note: I spoke with Harry Collier on February 27, 2009, about the Boston Search Engine Meeting. The conference, more than a decade into in-depth explorations of search and content processing, is one of the most substantive search and content processing programs. The speakers have come from a range of information retrieval disciplines. The conference organizing committee has attracted speakers from the commercial and research sectors. Sales pitches and recycled product reviews are discouraged. Substantive presentations remain the backbone of the program. Conferences about search, search engine optimization, and Intranet search have proliferated in the last decade. Some of these shows focus on the “soft” topics in search and wrap the talks with golf outings and buzzwords. The attendee learns about “platinum sponsors” and can choose from sales pitches disguised as substantive presentations. The Infonortics search conference has remained sharply focused and content centric. One attendee told me last year, “I have to think about what I have learned. A number of speakers were quite happy to include equations in their talks.” Yep, equations. Facts. Thought provoking presentations. I still recall the tough questions posed to Larry Page (Google) after his talk at the 1999 conference. He argued that truncation was not necessary and several in attendance did not agree with him. Google has since implemented truncation. Financial pressures have forced some organizers to cancel some of their 2009 information centric shows; for example, Gartner, the Magazine Publishers Association, and the Newspaper Publishers Association, to name three. Infonortics continues to thrive with its reputation for delivering content plus an opportunity to meet some of the most influential individuals in the information retrieval business. You can learn more about Infonortics here. The full text of the interview with Mr. Collier, who resides in the Cotswolds with an office in Tetbury, Glou., appears below:

Why did you start the Search Engine Meeting? How does it differ from other search and SEO conferences?

The Search Engine Meeting grew out of a successful ASIDIC meeting held in Albuquerque in March 1994. The program was organized by Everett Brenner and, to everyone’s surprise, that meeting attracted record numbers of attendees. Ev was enthusiastic about continuing the meeting idea, and when Ev was enthusiastic he soon had you on board. So Infonortics agreed to take up the Search Engine Meeting concept and we did two meetings in Bath in England in 1997 and 1998, then moved thereafter to Boston (with an excursion to San Francisco in 2002 and to The Netherlands in 2004). Ev set the tone of the meetings: we wanted serious talks on serious search domain challenges. The first meeting in Bath already featured top speakers from organizations such as WebCrawler, Lycos, InfoSeek, IBM, PLS, Autonomy, Semio, Excalibur, NIST/TREC and Claritech. And ever since we have tried to avoid areas such as SEO and product puffs and to keep to the path of meaty, research talks for either search engine developers, or those in an enterprise environment charged with implementing search technology. The meetings tread a line between academic research meetings (lots of equations) and popular search engine optimization meetings (lots of commercial exhibits).

boston copy

Pictured from the left: Anne Girard, Harry Collier, and Joan Brenner, wife of Ev Brenner. Each year the best presentation at the conference is recognized with the Evvie, an award named in honor of her husband, and chair of the first conference in 1997.

There’s a great deal of confusion about the meaning of the word “search.” What’s the scope of the definition for this year’s program?

Yes, “Search” is a meaty term. When you step back, searching, looking for things, seeking, hoping to find, hunting, etc are basic activities for human beings — be it seeking peace, searching for true love, trying to find an appropriate carburetor for an old vehicle, or whatever. We tend now to have a fairly catholic definition of what we include in a Search Engine Meeting. Search — and the problems of search — remains central, but we are also interested in areas such as data or text mining (extracting sense from masses of data) as well as visualization and analysis (making search results understandable and useful). We feel the center of attention is moving away from “can I retrieve all the data?” to that of “how can I find help in making sense out of all the data I am retrieving?”

Over the years, your conference has featured big companies like Autonomy, start ups like Google in 1999, and experts from very specialized fields such as Dr. David Evans and Dr. Liz Liddy. What pulls speakers to this conference?

We tend to get some of the good speakers, and most past and current luminaries have mounted the speakers’ podium of the Search Engine Meeting at one time or another. These people see us as a serious meeting where they will meet high quality professional search people. It’s a meeting without too much razzmatazz; we only have a small, informal exhibition, no real sponsorship, and we try to downplay the commercialized side of the search world. So we attract a certain class of person, and these people like finding each other at a smaller, more boutique-type meeting. We select good-quality venues (which is one reason we have stayed with the Fairmont Copley Plaza in Boston for many years), we finance and offer good lunches and a mixer cocktail, and we select meeting rooms that are ideal for an event of 150 or so people. It all helps networking and making contacts.

What people should attend this conference? Is it for scientists, entrepreneurs, marketing people?

Our attendees usually break down into around 50 percent people working in the search engine field, and 50 percent those charged with implementing enterprise search. Because of Infonortics’ international background, we have a pretty high international attendance compared with most meetings in the United States: many Europeans, Koreans and Asians. I’ve already used the word “serious”, but this is how I would characterize our typical attendee. They take lots of notes; they listen; they ask interesting questions. We don’t get many academics; Ev Brenner was always scandalized that not one person from MIT had ever attended the meeting in Boston. (That has not changed up until now).

You have the reputation for delivering a content rich program. Who assisted you with the program this year? What are the credentials of these advisor colleagues?

I like to work with people I know, with people who have a good track record. So ever since the first Infonortics Search Engine Meeting in 1997 we have relied upon the advice of people such as you, David Evans (who spoke at the very first Bath meeting), Liz Liddy (Syracuse University) and Susan Feldman (IDC). And over the past nine years or so my close associate, Anne Girard, has provided non-stop research and intelligence as to what is topical, who is up-and-coming, who can talk on what. These five people are steeped in the past, present and future of the whole world of search and information retrieval and bring a welcome sense of perspective to what we do. And, until his much lamented death in January 2006, Ev Brenner was a pillar of strength, tough-minded and with a 45 year track record in the information retrieval area.

Where can readers get more information about the conference?

The Infonortics Web site (www.infonortics.eu) provides one-click access to the Search Engine Meeting section, with details of the current program, access to pdf versions of presentations from previous years, conference booking form and details, the hotel booking form, etc.

Stephen Arnold, March 2, 2009

US Government’s Federation Challenge

February 1, 2009

I don’t think too much about the US government’s information technology challenges. Been there. Done that. I read Wired Magazine’s “Every Military Net Accessed at Once Thanks to OB1” here. US central command has 14 networks. Instead of running one query on each of the individual systems, an authorized war fighter can now look to a day when a single computer can provide results from more than a dozen separate systems. Quite progressive. OB1 stands for one box, one wire. No word when the system will be available. Oh, don’t tell anyone at central command that Science.gov has been delivering federated search for more than five years. Also, keep it a secret that USA.gov (formerly FirstGov.gov) has been delivering federated search for even longer. Too much information could overload the warfighters. Zip those lips.

Stephen Arnold, February 1, 2009

Melzoo: Googzilla Killer or Googzilla Snack

January 13, 2009

A happy quack to the reader who sent me the link to the Melzoo.com Web search site. I poked around and located on VNUnet an article providing an overview of the service. You can read “MelZoo Takes on Google with Split Screen Search” here. The system is a metasearch engine like Ixquick.com and Vivisimo’s Clusty.com. The metasearch technology is not the hook for Melzoo. The company generates an image of the Web site. I first saw this type of preview when I reviewed Girafa.com for my column in Information World Review five, maybe six years ago. Melzoo asserts here:

This preview feature has an enormous impact on the ‘quality of traffic’ delivered to advertisers: the traditional search engines are offering typically only text as a teaser. Chances are that users who enjoy the luxury of a detailed thumbnail preview, will be a lot more selective in visiting the sites they are interested in. This results in a higher effectiveness of use. The chances of “conversion” (i.e. from hit to buy) is currently estimated 5 times higher than with traditional search engines.

I think the vertical metasearch available from Deep Web Technologies is more useful for my work. You can see one of the DWT vertical federated search systems here.

The VNU write up made me sit up and take notice with this assertion about Melzoo.com:

“MelZoo has improved the experience of browsing the Internet in a totally different way. For years people have used an old technique - text only - to browse the web. MelZoo has revolutionized the way users will browse the web,” said MelZoo chief executive Alex De Backer. “In addition MelZoo is a welcome novelty for the advertisers, as it offers higher quality visitors at a lower cost.”

There are some issues associated with metasearch. These include latency, being blocked, or having to pay the source of the hits for the privilege of using its results. I will keep my eye on Melzoo.com.
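
Two of those issues, latency and being blocked, can be sketched with a fan-out that gives every source a fixed time budget. This is a minimal sketch under stated assumptions: the three source functions are stand-ins invented for the example, and a real metasearch engine would make HTTP calls where these return canned strings.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def fast_source(query):
    return [f"fast result for {query}"]


def slow_source(query):
    time.sleep(0.6)  # simulates a source that exceeds the latency budget
    return [f"slow result for {query}"]


def blocked_source(query):
    raise PermissionError("metasearch client blocked")  # simulates a refusal


def metasearch(query, sources, budget_seconds=0.2):
    """Query every source concurrently; keep only answers that arrive in time."""
    results, skipped = [], {}
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        futures = {pool.submit(fn, query): name for name, fn in sources.items()}
        done, not_done = wait(list(futures), timeout=budget_seconds)
        for future in done:
            name = futures[future]
            if future.exception() is not None:
                skipped[name] = f"error: {future.exception()}"
            else:
                results.extend(future.result())
        for future in not_done:
            skipped[futures[future]] = "timed out"
    # Leaving the "with" block still waits for slow workers to finish;
    # their late answers are simply ignored.
    return results, skipped
```

Calling `metasearch("query", {"fast": fast_source, "slow": slow_source, "blocked": blocked_source})` returns the fast hits plus a record of which sources were dropped and why. The third issue, paying a source for its results, is a business problem that no timeout fixes.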

Stephen Arnold, January 12, 2009

Deep Web Technologies’ Vertical Search for Business Information

January 13, 2009

In the early 1990s, Verity was the dominant enterprise search system. IBM’s confused approach to STAIRS and the complexity of STAIRS derivatives created a market opportunity. Verity took it. Verity’s founders have continued to innovate in search. I was delighted to speak with Abe Lederman (that interview is here) and learn about the innovations his company has made. Deep Web Technologies (DWT) tames the tangled world of US government scientific information. You can explore the Science.gov site here. Now, Mr. Lederman and his team have turned their attention to the needs of the person looking for substantive business information. The company’s new business search system–Biznar–débuted in October 2008.

DWT has identified about 60 business oriented Web sites and federates these sources in near real time. Beyond this core list, Biznar takes a user’s query and retrieves results from other Web indexing services. The system then blends the results, producing a results list that is designed to answer business questions. On this select source list are such publications as:

  • Business Week
  • Money Magazine
  • Motley Fool
  • US Patent & Trademark Office
  • Wall Street Journal.

Sample Query

Let’s look at a test query. I used Biznar to obtain information about “bankruptcy liability”. The system generated a result list with 1,706 entries. I ran the same query on Google.com, which returned a result list containing more than 9,400,000 results. Obviously no human could examine a fraction of these 9,400,000 results. Google advertises that it is good by virtue of indexing a lot of content. Biznar focuses on a meaningful result set of 1,700 items.

But for most people, 1,700 items are too many. Biznar makes it easy to navigate the results. Look at the results page below:

clip_image001

You see a two column display. The larger column presents a traditional results list with several useful enhancements:

  1. You see a star rating that provides an indication of the importance of the result for this specific query
  2. The source is displayed for each item; for example, Google Blog Search, Google Scholar, the New York Times, etc.
  3. The link includes a snippet of the content in the document that matches the query.
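
Blending ranked lists from several sources while keeping track of where each hit came from can be sketched with reciprocal rank fusion. To be clear, this is a generic merging technique swapped in for illustration. The post does not say how Biznar blends or rates results, and the source names and the constant `k=60` are assumptions.

```python
from collections import defaultdict


def blend(result_lists, k=60):
    """Merge ranked lists with reciprocal rank fusion: score = sum of 1 / (k + rank)."""
    scores = defaultdict(float)
    sources = defaultdict(list)
    for source, urls in result_lists.items():
        for rank, url in enumerate(urls, start=1):
            scores[url] += 1.0 / (k + rank)
            sources[url].append(source)  # remember which source supplied the hit
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores, key=lambda url: (-scores[url], url))
    return [(url, sources[url]) for url in ranked]
```

A document that appears in more than one source list accumulates score from each, so it rises above documents that only one source returned. The per-item source list is what a display like the one described above would show next to each result.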

Read more

Federate Net Weaver and SharePoint

December 28, 2008

The new year approaches, and you have SAP Net Weaver and Microsoft SharePoint. You want to spend a few minutes making it possible to run one query and retrieve results from each system. Trivial? You bet. In case some of the steps are a tad uncertain, you will want to peruse the SAP white paper here. The title of this useful document is “Federated Search between SAP Net Weaver Enterprise Search and Microsoft Search Server 2008 Using Open Search and SSO.” The authors are SAP wizards Andre Fischer, Pedro Arrontes, and Holger Brucheit. The 15 page document is SAP centric, and the key is to use SAP’s Open Search interface. The paper assumes you know how this middleware and its method work. If you are fuzzy on Open Search particulars, the white paper provides links to other documents in the SAP technical library. If you want to jump right in, fire up Net Weaver and use the built in templates to specify where the data are and their format. The white paper assumes that you will be using SAP’s security and access control system, which might be incorrect if SAP plays a secondary role in your organization. The information for configuring SharePoint walks through the specific graphical interface settings to use and, thankfully, includes the scripts needed to make SharePoint play nice with Net Weaver. If you work through the white paper and your federating doesn’t federate, SAP has included some troubleshooting tips. Enjoy.
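
For readers fuzzy on the Open Search mechanics, the core of the approach is a URL template with placeholders such as `{searchTerms}` and a feed of results that comes back. The sketch below follows the OpenSearch 1.1 placeholder convention, including the `?` suffix for optional parameters. It does not reproduce the white paper's settings, and the two helper function names are invented for the example.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import quote


def fill_template(template, search_terms, **params):
    """Substitute OpenSearch-style {name} / {name?} placeholders in a URL template."""
    values = {"searchTerms": search_terms, **params}

    def replace(match):
        name, optional = match.group(1), match.group(2)
        if name in values:
            return quote(str(values[name]), safe="")
        if optional:
            return ""  # optional parameters may be left empty
        raise KeyError(f"missing required parameter: {name}")

    return re.sub(r"\{([A-Za-z:]+)(\??)\}", replace, template)


def parse_rss(xml_text):
    """Extract (title, link) pairs from an RSS 2.0 result feed."""
    root = ET.fromstring(xml_text)
    return [(item.findtext("title"), item.findtext("link")) for item in root.iter("item")]
```

Each system exposes such a template, and the federating side fills it in, fetches the feed, and lists the items next to its own results. Fetching is left out here because it depends on the single sign-on setup the white paper covers.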

Stephen Arnold, December 28, 2008

Google Needs Ideas, Says the Independent

December 5, 2008

After the London Times muffed the ball with its Microsoft Yahoo deal, the Independent is reporting that Google needs ideas. You can read the full text of the story here. According to “Google Staff Searching for Fresh Ideas” by Steve Foley, Google’s engineers have to spend less time on personal projects and more time finding ways to squeeze more juice from the Google data farm. For me the most important comment reminded me how dependent on advertising Google remains after such high profile revenue initiatives as enterprise search sales. Here’s what Mr. Foley said:

Google’s ad revenues are still growing at a rate of one-third a year, but just three years ago they were doubling annually, and analysts are forecasting the online advertising market will be little better than flat this year. Google makes effectively all its money selling ads next to its search results.

Do I believe that Google is indeed making big moves to remain in front of the tidal wave of financial misfortune? Nope. Not for a New York minute. I think Google is taking advantage of the present economic crisis to chop away some of the excesses of Google’s early hubris and unusual business decisions. The GOOG is in a much better position than some other companies. Google has many ways to turn on new revenue streams, and it has a significant lead in infrastructure. I think Google is like one of those mixed martial arts fight clubs. The weaker members of the team and some distractions are being eliminated.

Stephen Arnold, December 5, 2008
