Silobreaker Update

August 25, 2009

I was exploring usage patterns via Alexa. I wanted to see how Silobreaker, a service developed by some savvy Scandinavians, was performing against the brand name business intelligence companies. Silobreaker is one of the next generation information services that processes a range of content, automatically indexing and filtering the stream, and making the information available in “dossiers”. A number of companies have attempted to deliver usable “at a glance” services. Silobreaker has been one of the systems I have relied upon for a number of client engagements.

I compared the daily reach of LexisNexis (a unit of the Anglo Dutch outfit Reed Elsevier), Factiva (originally a Reuters Dow Jones “joint” effort in content and value added indexing now rolled back into the Dow Jones mothership), Ebsco (the online arm of the E.B. Stephens Co. subscription agency), and Dialog (a unit of the privately held database roll up company Cambridge Scientific Abstracts / ProQuest and some investors). Keep in mind that Silobreaker is a next generation system and I was comparing it to the online equivalent of the Smithsonian’s computer exhibit with the Univac and IBM key punch machine sitting side by side:

silo usage

Silobreaker is the blue line, which is chugging right along despite the challenging financial climate. I ran the same query on Compete.com, and that data showed LexisNexis with a growth uptick and more traffic in June 2009. Your mileage may vary. These types of traffic estimates are indicative, not definitive. But Silobreaker is performing and growing. One could ask, “Why aren’t the big names showing stronger buzz?”

silo splash

A better question may be, “Why haven’t the museum pieces performed?” I think there are three reasons. First, the commercial online services have not been able to bridge the gap between their older technical roots and the new technologies. When I poked under the hood in Silobreaker’s UK facility, I was impressed with the company’s use of next generation Web services technology. I challenged the R&D team regarding performance, and I was shown a clever architecture that delivers better performance than the museum piece services against which Silobreaker competes. I am quick to admit that performance and scaling remain problems for most online content processing companies, but I came away convinced that Silobreaker’s engineering was among the best I had examined in the real time content sector.

Second, I think the museum pieces – I could mention any of the services against which I compared Silobreaker – have yet to figure out how to deal with the gap between the old business model for online and the newer business models that exist. My hunch is that the museum pieces are reluctant to move quickly to embrace some new approaches because of the fear of [a] cannibalization of their for fee revenues from a handful of deep pocket customers like law firms and government agencies and [b] looking silly when their next generation efforts are compared to newer, slicker services from Yfrog.com, Collecta.com, Surchur.com, and, of course, Silobreaker.com.

Third, I think the established content processing companies are not in step with what users want. For example, when I visit the Dialog Web site here, I don’t have a way to get a relationship map. I like nifty methods of providing me with an overview of information. Who has the time or patience to handcraft a Boolean query and then pay money whether the dataset contains useful information or not? I just won’t play that “pay us to learn there is a null set” game any more. Here’s the Dialog splash page. Not too useful to me because it is brochureware, almost a 1998 approach to an online service. The search function only returns hits from the site itself. There is no compelling reason for me to dig deeper into this service. I don’t want a dialog; I want answers. What’s a ProQuest? Even the name leaves me puzzled.

the dialog page

I wanted to make sure that I was not too harsh on the established “players” in the commercial content processing sector. I tracked down Mats Bjore, one of the founders of Silobreaker. I interviewed him as part of my Search Wizards Speak series in 2008, and you may find that information helpful in understanding the new concepts in the Silobreaker service.

What are some of the changes that have taken place since we spoke in June 2008?

Mats Bjore: There are several new things and plenty more in the pipeline. The layout and design of Silobreaker.com have been redesigned to improve usability; we have added an Energy section to provide a more vertically focused service around both fossil fuels and alternative energy; we have released Widgets and an API that enable anyone to embed Silobreaker functionality in their own web sites; and we have improved our enterprise software to offer corporate and government customers “local” customizable Silobreaker installations, as well as a technical platform for publishers who’d like to “silobreak” their existing or new offerings with our technology. Industry-wise, the recent statements by media moguls like Rupert Murdoch make it clear that the big guys want to monetize their information. The problem is that charging for information does not solve the problem of a professional already drowning in information. This is like trying to charge a man who has fallen overboard for water instead of offering a life jacket. Wrong solution. The marginal loss of losing a few news sources is really minimal for the reader, as there are thousands to choose from anyways, so unless you are a “must-have” publication, I think you’ll find out very quickly that reader loyalty can be fickle or short-lived or both. Add to that the fact that news reporting itself has changed dramatically. Blogs and other types of social media are already favoured over many newspapers, and we saw Twitter’s role during the election demonstrations in Iran. Citizen journalism of that kind (immediate, straight from the action, and free) is extremely powerful. But whether old or new media, Silobreaker remains focused on providing sense-making tools.

What is it going to be, free information or for fee information?

Mats Bjore: I think there will be free, for fee, and blended information just like Starbuck’s coffee. The differentiators will be “smart software” like Silobreaker and some of the Google technology I have heard you describe. However, the future is not just lots of results. The services that generate value for the user will have multiple ways to make money. License fees, customization, and special processing services, to name just three, will differentiate what I can find on your Web log and what I can get from a Silobreaker “report”.

What can the museum pieces like Dialog and Ebsco do to get out of their present financial swamp?

Mats Bjore: That is a tough question. I also run a management consultancy, so let me put on my consultant hat for a moment. If I were Reed Elsevier, Dow Jones/Factiva, Dialog, Ebsco or owned a large publishing house, I would have to realize that I must think outside the box. It is clear that these organizations define technology in a way that is different from many of the hot new information companies. Big information companies still define technology in terms of printing, publishing or other traditional processes. The newer companies define technology in terms of solving a user’s problem. The quick fix, therefore, ought to be to start working with new technology firms and see how they can add value for these big dragons today, not tomorrow.

What does Silobreaker offer a museum piece company?

Mats Bjore: The Silobreaker platform delivers access and answers without traditional searching. Users can spot what is hot and relevant. I would seriously look at solutions such as Silobreaker as a front end to reach new customers, capture revenues from ad-sponsored free content, and let a wider audience click through to premium content. Most of us are unaware of the premium content that is out there, since the legacy contractual types only reach big companies and organizations. I am surprised that Google, Microsoft, and Yahoo have not moved more aggressively to deliver more than a laundry list of results with some pictures.

Is the US intelligence community moving more purposefully with access and analysis?

Mats Bjore: The interest in open source is rising. However, there is quite a bit of inertia when it comes to having one set of smart software pull information from multiple sources. I think there is a significant opportunity to improve the use of information with smart software like Silobreaker’s.

Stephen Arnold, August 25, 2009

Convera and the Bureau of National Affairs

August 22, 2009

A happy quack to the reader who sent me a Tweet that pointed to the International HR Decision Support Network: The Global Solution for HR Professionals. You can locate the Web site at ihrsearch.bna.com. The Web site identifies the search system for the site as Convera’s. Convera has morphed or been absorbed into another company. This “absorption” struck me as somewhat ironic because the Convera Web site carries a 2008 white paper by a consulting outfit called Outsell. You can read that Convera was named by Outsell as a rising star for 2008. Wow! I ran a query for executive compensation in “the Americas” and these results appeared:

bna convera

The most recent result was dated August 14, 2009. Today is August 21, 2009. It appears to me that the Convera Web indexing service continues to operate. I was curious about the traffic to this site. I pulled this Alexa report which suggests that the daily “reach” of the site is almost zero percent.

alexa bna

Compete.com had no profile for the site.

I think that the human resources field is one of considerable interest. My recollection is that BNA has had an online HR service for many years. I could not locate much information about the Human Resource Information Network that originally was based in Indianapolis.

Convera appears to be providing search results to BNA, and BNA has an appetite for an online HR information service. The combination, however, seems to be a weak magnet for traffic. Vertical search may have some opportunities. Will Convera and BNA be able to capitalize on them?

But with such modest traffic I wonder why the service is still online. Anyone have any insights?

Stephen Arnold, August 21, 2009

Svizzer Update: Swiss Search

April 1, 2009

I looked at the Svizzer desktop search tool in 2006. Yesterday (March 30, 2009), one of my two or three readers in Europe asked me, “What’s become of Svizzer?” I had to admit that I had not thought about the Swiss company for two years. In 2006, Svizzer was a Microsoft desktop search system created, according to my yellowing notes, by G10 Software AG. The company created “an information cock pit for everything”, I wrote. Categorical affirmatives trouble me, and the notion of a cock pit with “everything” strikes me as a quick way to confuse the pilot and probably increase the chances of an error, particularly in a combat situation. Under fire, “less is more” usually works better than floods of data.

At one time, the company asserted (my translation from the French), “Microsoft does not have adequate solutions at the moment, and there is no suitable solution on the market in sight for the near future.” This statement was true in 2006, and in my opinion, it is true as I write this on March 31, 2009. My notes identify Alexander Rossner, Peter Biewald, and Dieter Eschebach as key executives.


The Svizzer interface from Version 3.5 in 2007.

The 2006 version of the software included a Copernic like metasearch feature. I noted that it could remove ads from the results, but since I have not tested the system in two years, I don’t know if today’s ad technology will be caught by the system. The last download link I had was for Softpedia here. Give it a try.

The idea, I noted, was that Svizzer provides “a single point of search access to site search, enterprise search, desktop search, Web log search, news search, and Web search”. The system, according to my notes, performed a Vivisimo like aggregation and deduplication function.

The license fees for the system begin at about $100 per user. A customer can buy an annual service plan that includes support, upgrades, etc. I noted that the software included an advertising function, but I was not sure how well that would work in some more traditional organizations. I thought that an administrator could use the ad function to put specific information in front of users, a version of hit boosting.

The company is privately held, according to my notes. There were about 15 employees. I estimated the firm’s revenues in the $2.0 million per year range.

The company was European centric, and I noted, had plans to expand within continental Europe. The company’s Web site at www.g10.ch is off line. The last active date I have in my files is June 2008.

I am not certain the company is still viable. The last Web log entry here is dated March 2007. If a reader has more information, please, use the comments section on this Web log to update the information.

Stephen Arnold, March 31, 2009

Francisco Corella, Pomcor, an Exclusive Interview

February 11, 2009

Another speaker on the program at Infonortics’ Boston Search Engine Meeting agreed to be interviewed by Harry Collier, the founder of the premier search and content processing event. Francisco Corella is one of the senior managers of Pomcor. The company’s Noflail search system leverages open source and Yahoo’s BOSS (Build your Own Search Service). Navigate to the Infonortics.com Web site and sign up for the conference today. In Boston, you can meet Mr. Corella and other innovators in information retrieval.

The full text of the interview appears below:

Will you describe briefly your company and its search technology?

Pomcor is dedicated to Web technology innovation.  In the area of search we have created Noflail Search, a search interface that runs on the Flex platform.  Search results are currently obtained from the Yahoo BOSS API, but this may change in the future.   Noflail Search helps the user solve tough search problems by prefetching the results of related queries, and supporting the simultaneous browsing of the result sets of multiple queries.  It sounds complicated, but new users find the interface familiar and comfortable from the start.  Noflail Search also lets users save useful queries—yes, queries, not results.  This is akin to bookmarking the queries, but a lot more practical.

What are the three major challenges you see in search / content processing in 2009?

First challenge: what I call the indexable unit problem.  A Web page is often not the desired indexable unit.  If you want to cook sardines with triple sec (after reading Thurber) and issue a query [sardines “triple sec”] you will find pages that have a recipe with sardines and a recipe with triple sec.  If there is a page with a recipe that uses both sardines and triple sec, it may be buried too deep for you to find.  In this case the desired indexable unit is the recipe, not the page.  Other indexable units: articles in a catalog, messages in an email archive, blog entries, news.  There are ad-hoc solutions for blog entries and news, but no general-purpose solutions.
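The sardines example can be made concrete with a toy index. The following is a minimal Python sketch, not anything Pomcor or a search engine actually runs; the page name, the recipe texts, and the helper functions are all invented for illustration. It compares treating the whole page as the indexable unit with treating each recipe as its own unit.

```python
# Toy corpus (hypothetical data): one Web page that holds two separate recipes.
pages = {
    "cookbook.html": [
        "grilled sardines with lemon and olive oil",
        "margarita with tequila, triple sec and lime",
    ],
}


def matches(text, terms):
    """Return True when every query term occurs somewhere in the text."""
    return all(term in text for term in terms)


def search_pages(pages, terms):
    """Page-level index: the whole page is the indexable unit."""
    return [url for url, units in pages.items()
            if matches(" ".join(units), terms)]


def search_units(pages, terms):
    """Unit-level index: each recipe on a page is indexed separately."""
    return [(url, position)
            for url, units in pages.items()
            for position, unit in enumerate(units)
            if matches(unit, terms)]


query = ["sardines", "triple sec"]
page_hits = search_pages(pages, query)  # the page matches, yet no recipe uses both
unit_hits = search_units(pages, query)  # empty: no single recipe matches
```

With the page as the unit, the query produces a hit that is a false positive for the cook. With the recipe as the unit, the query correctly produces nothing.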

Second challenge: what I call the deep API problem.  Several search engines offer public Web APIs that enable search mashups.  Yahoo, in particular, encourages developers to reorder search results and merge results from different sources.  But no search API provides more than the first 1000 results from any result set, and you cannot reorder a set if you only have a tiny subset of its elements.  What’s needed is a deep API that lets you build your own index from crawler raw data or by combining multiple sources.
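Why a capped API defeats reordering can be shown with a small Python sketch. Everything here is an assumption made for illustration: the 5,000 synthetic documents, the two scoring fields, and the deliberately worst-case setup in which the developer's preferred ordering is the reverse of the engine's. Only the 1,000 result cap comes from the interview.

```python
def top_k(results, k, key):
    """Return the k best results under the given scoring function."""
    return sorted(results, key=key, reverse=True)[:k]


# Synthetic result set: the engine likes low ids, our custom score likes high ids.
full_set = [{"id": i, "engine_score": 5000 - i, "my_score": i}
            for i in range(5000)]

API_CAP = 1000  # the API only exposes the engine's first 1000 results

visible = top_k(full_set, API_CAP, key=lambda r: r["engine_score"])

# Reordering what the API exposes versus reordering the complete set.
reranked_visible = top_k(visible, 10, key=lambda r: r["my_score"])
reranked_full = top_k(full_set, 10, key=lambda r: r["my_score"])

overlap = ({r["id"] for r in reranked_visible}
           & {r["id"] for r in reranked_full})
```

In this worst case the two top ten lists share no documents at all. Real rankings are rarely this adversarial, but the sketch shows that a mashup can only reshuffle what the engine already chose to expose.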

Third challenge: incorporate semantic technology into mainstream search engines.

With search processing decades old, what have been the principal  barriers to resolving these challenges in the past?

The three challenges have not been resolved for different reasons. Indexable units require a new standard to specify the units within a page, and a restructuring of the search engines; hence a lot of inertia stands in the way of a solution.  The need for a deep API is new and not widely recognized yet.  And semantics are inherently difficult.

What is your approach to problem solving in search and content processing? Do you focus on smarter software, better content processing, improved interfaces, or some other specific area?

Noflail Search is a substantial improvement on the traditional search interface.  Nothing more, nothing less.  It may be surprising that such an improvement is coming now, after search engines have been in existence for so many years.  Part of the reason for this may be that Google has a quasi-monopoly in Web search, and monopolies tend to stifle innovation.  Our innovations are a direct result of the appearance of public Web APIs, which lower the barrier to entry and foster innovation.

With the rapid change in the business climate, how will the increasing financial pressure on information technology affect search / content processing?

The crisis may have both negative and positive effects on search innovation.  Financial pressure causes consolidation, which reduces innovation.  But the urge to reduce cost could also lead to the development of an ecosystem where different players solve different pieces of the search puzzle.  Some could specialize in crawler software, some in index construction, some in user interface improvements, some in various aspects of semantics, some in various vertical markets.

A technological ecosystem materialized in the 80’s for the PC industry, and resulted in amazing cost reduction.  Will this happen again for search?  Today we are seeing mixed signals.  We see reasons for hope in the emergence of many alternative search engines, and the release by Microsoft of Live Search API 2.0 with support for revenue sharing. On the other hand, Amazon recently dropped Alexa, and Yahoo is now changing the rules of the game for Yahoo BOSS, reneging on its promise of free API access with revenue sharing.

Multi core processors provide significant performance boosts. But search / content processing often faces bottlenecks and latency in indexing and query processing. What’s your view on the performance of your system or systems with which you are familiar? Is performance a non issue?

Noflail Search is computationally demanding.  When the user issues a query, Noflail Search precomputes the result sets of up to seven related queries in addition to the result set of the original query, and prefetches the first page of each result set.  If the query has no results (which may easily happen in a search restricted to a particular Web site), it determines the most specific subqueries (queries with fewer terms) that do produce results; this requires traversing the entire subgraph of subqueries with zero results and its boundary, computing the results set of each node.  All this is perfectly feasible and actually takes very little real time.
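The zero-result analysis can be sketched in Python. This is a sequential illustration under stated assumptions, not Pomcor's algorithm: the function name `most_specific_subqueries`, the `count_results` callback, and the two toy documents are hypothetical, and a real client would issue the lookups against a search API rather than an in-memory list. The sketch walks downward from the full query, dropping one term at a time from queries that return nothing, and keeps the largest subqueries that do return results.

```python
def most_specific_subqueries(terms, count_results):
    """Return the maximal subsets of `terms` for which count_results(...) > 0."""
    frontier = {frozenset(terms)}
    found = []
    while frontier:
        next_frontier = set()
        for query in frontier:
            # Skip queries that are strictly less specific than a known hit.
            if any(query < hit for hit in found):
                continue
            if count_results(query) > 0:
                found.append(query)
            elif len(query) > 1:
                # Zero results: try every subquery with one term removed.
                for term in query:
                    next_frontier.add(query - {term})
        frontier = next_frontier
    return found


# Toy "index": each document is a set of words (hypothetical data).
documents = [{"sardines", "lemon"}, {"triple", "sec", "lime"}]


def count_results(query):
    """Count documents that contain every term of the query."""
    return sum(1 for doc in documents if query <= doc)


hits = most_specific_subqueries(["sardines", "triple", "sec"], count_results)
```

For the three-term query, which matches no document, the sketch reports the subqueries {triple, sec} and {sardines}. Each level of the walk consists of independent lookups, which is what makes the approach a candidate for running concurrently.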

How do we do it? 

Since Noflail Search is built on the Flex platform, the code runs on the Flash plug-in in the user’s computer and obtains search results directly from the Yahoo BOSS API.  Furthermore, the code exploits the inherent parallelism of any Web API.  Related queries are all run simultaneously.  And the algorithm for traversing the zero-result subgraph is carefully designed to maximize concurrency.
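Running related queries simultaneously can be illustrated with a short Python sketch. Noflail Search itself runs in the Flash plug-in, so this is only an analogy: `fetch_first_page` is a hypothetical stand-in that fabricates results instead of calling a real search API, and the function names are invented. The limit of seven related queries comes from the interview.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_first_page(query):
    """Stand-in for a Web search API call; returns made-up results."""
    return [f"{query} result {n}" for n in range(1, 4)]


def prefetch(original, related, max_related=7):
    """Fetch the first page for the original query and its related queries in parallel."""
    queries = [original] + list(related)[:max_related]
    with ThreadPoolExecutor(max_workers=len(queries)) as pool:
        # pool.map returns results in the same order as the input queries.
        pages = list(pool.map(fetch_first_page, queries))
    return dict(zip(queries, pages))


results = prefetch("enterprise search",
                   ["enterprise search vendors", "site search"])
```

Because each request spends most of its time waiting on the network, issuing them together means the total wait is close to that of the slowest single request rather than the sum of all of them.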

Yahoo, however, has just announced that they will be charging fees for API queries instead of sharing ad revenue.  If we continue to use Yahoo BOSS, it may not be economically feasible to prefetch the results of related queries or analyze zero results as we do now. Thus, although performance is a non-issue technically, demands of computational power have financial implications.

As you look forward, what are some new features / issues that you think will become more important in 2009?

Obviously we think that the new user interface features in Noflail Search are important and hope they’ll become widely used in 2009.  We have of course filed patent applications on the new features, but we are very willing to license the inventions to others. As for a breakthrough over the next 36 months, as a consumer of search, I very much hope that the indexable unit problem will be solved.  This would increase search accuracy and make life easier for everybody.

Where can I find more information about your products, services, and research?

Noflail Search is available at http://noflail.com/, and white papers on the new features can be found in the Search Technology page (http://www.pomcor.com/search_technology.html) of the Pomcor Web site (http://www.pomcor.com/).

Harry Collier, Infonortics Ltd., February 11, 2009

Documenting the Obvious: The Google Generation

January 31, 2009

Google is 10 years old. Who cares? The company now represents the “out there” intellect and YouTube.com content package for lots of people. What’s obvious? The article “Is Technology Producing a Decline in Critical Thinking and Analysis” here in Science Daily confirmed my perception of the trophy generation’s preferred method of learning: watch a video. I prefer books and when I can find a person willing to discuss concepts, I will give that approach a whirl. The study reported by Science Daily documents how the Google Generation sucks in video, news crawls, learning from video games, and other methods I find annoying. Little wonder that a procurement team with an average age of 30 wants a “just like Google interface,” memos that are less than one page, and analyses that can be converted to a couple of PowerPoint slides. Alexander Pope pointed out the danger of a “little learning”. I wonder what he would have thought about financial VPs, newly hired corporate executives, and venture capital wizards who exist in a cloud of unknowing, fed with a diet of information without provenance captured on the digital equivalent of animated 3×5 inch note cards.

Stephen Arnold, January 31, 2009

Beyond Search, Public Relations, and News

January 22, 2009

One of my neighbors has been trying out her .60 GE M134 Predator. I don’t think she had depleted uranium bullets, and I haven’t seen squirrels or my neighbors’ dogs since her fusillade. The publisher of my forthcoming Google monograph sent me a joke germane to the weird world I inhabit with this Web log. The disclaimer and editorial policy for the Web log is here in case you are not familiar with the addled goose’s approach.

First, the joke, translated by my demanding, time obsessed editor whom I have known for more than 25 years. We’re still not pals, which provides some insight into my inherent likeability. He’s a gem, of course.

Joke: Consultants, PR People, SEO Mavens Embraced

A shepherd was guarding his flock in the middle of the countryside when, in a cloud of dust, he spied a Range Rover coming towards him. The driver — a young man in an Armani suit, Gucci shoes, Ray Ban sunglasses and Hermés tie — lowered his window and said to the shepherd: “If I can tell you exactly how many sheep there are in your flock, will you give me one of them?”

The shepherd looked at the young man and replied: “Certainly”.

The man parked his car, fired up his laptop computer, connected it to his mobile phone, surfed the Web to the NASA page, communicated with a satellite navigation system, surveyed the region, opened a database and thirty or so Excel sheets with complex formulae. Finally, he printed off a detailed report of around ten pages on his miniature printer and told the shepherd: “You have exactly 1,586 sheep in your flock”.

“That is accurate,” said the shepherd. “As agreed, please take one”.

He watched the young man make his choice and install the animal in the back of his car, then he added: “If I can guess your profession correctly, may I have my animal back?” “Why not?” replied the young man.

“You are a high-powered information consultant”, said the shepherd. “Absolutely correct,” came the reply. “How did you know that?”

“It’s easy. You arrive without having been asked. You want to be paid to answer a question to which I already knew the answer and, quite evidently, you know nothing at all about my business. Now, please give me back my dog”. (Translated by Harry Collier, Infonortics, Ltd. 2009)

If you didn’t “get” the joke, here’s a visual aid, needed because 20 percent of the US doesn’t read at the high school level and most professionals under the age of 30 (what I call the trophy generation) are often happier with a YouTube.com video than a verbal challenge-response approach to a topic.

image tyson

Tip: the sheep is white. The dog, who is ArnoldIT.com’s marketing consultant, is the caramel color animal. Both have similar ears, which may confuse some of the trophy generation who think I am a publishing company.


Nexplore: Another Google Challenger

January 14, 2009

Nexplore Search here is a Web search system with some interesting functions. A reader alerted me to the firm’s sharp increase in Web traffic. I had looked at the system last year, and I wanted to revisit the Web search company’s service.

The company said:

It starts with Nexplore Search Redefined a visually engaging user friendly, interactive multi-media interface makes navigation effortless and drill down obsolete.

The company indexes 50 billion Web pages. According to the company here, its system:

redefines the search experience. A visually engaging, user-friendly, multi-media interface makes navigation effortless and drill down obsolete. Computer intelligence combined with human community fosters greater relevancy — in both search results and ad displays. Intuitive refinement tools and advanced personalization features make search faster, easier and more enjoyable for everyone — from Web newbies to average users to accomplished surfers.

My test queries returned useful results. For example, the query “enterprise search” returned links to Vivisimo, Coveo, and Endeca as “sponsored results”, which is okay. The first hit, somewhat surprisingly, was to the Microsoft.com enterprise search page here, not to the Fast Search page here. The Fast Search page seems a bit spare these days, so Nexplore seems to have indexed the Microsoft page as the number one enterprise search hit. I find this surprising, but I don’t have a good enough feel for what Nexplore is doing to determine relevancy.

nexplore

Nexplore results for the query “enterprise search”.

The interface provides hot links to suggested or related queries, a feature Nexplore calls “Pop Search”. The system includes a link to a “Wiki Search”, which is okay, but the number two result in the hit list is a Wikipedia link. The sponsored results contained a surprise. There was a direct link to Ontolica, a unit of Surf Ray. Surf Ray has been the subject of considerable speculation. In fact, if you run a query for “Surf Ray” from this page on the Beyond Search Web log, you can follow the conversation about the company’s various managerial and financial ills. Obviously someone paid to put the Ontolica ad on the Nexplore results page, so this cannot be an error. Some of the firms in the Sponsor Results were equally interesting; for example, I don’t think too much about Abbrevity, Accenture, or EMC as big players in the enterprise search sector. But someone is paying to reach eyeballs for the query “enterprise search”. Two results struck me as peculiar in the main results list. The first is the inclusion of the Enterprise Search Summit 2009. I heard the show attracted 60 paying customers, so the owner of the show must be working overtime to pump up the search engine optimization to get the program to appear among vendors of search systems. The second anomaly is the exclusion of Google and its Google Search Appliance. Odd. Google has more than 16,000 licensees of its enterprise search appliance, which puts it on an equal footing or slightly ahead of Autonomy, another company not in the results list.

One useful touch is that the results for a news search are run against the query in the query box. No annoying retyping required. The video link did not return a direct link to any videos on the Google Channel. The majority of the videos came from Blinkx, a company touting itself as the largest index of video content. The exclusion of Google may be due to Google, not Nexplore, however.

The image search in response to the query “enterprise search” was not useful. The illustrations did not include the images that I know are available on the Web sites of the leading vendors. For example, the Google search appliance pages include screen shots. Similar images may be found on the Web sites of Autonomy, Coveo, and Endeca, to name just three companies who make visual content available for potential buyers. The inclusion of the defunct Enterprise Search Report was an anomaly. More recent reports such as the Gilbane Beyond Search study and the Galatea Successful Enterprise Search Management were not included on the first page of the results. The image search for this test query was not useful to me. The blog search was not useful either. The majority of the links were not directly about enterprise search. Presumably, the Nexplore indexing system does not handle synonyms for “enterprise search” at this stage of the content processing subsystem’s development. I will monitor this function going forward. A similar statement may be made about enterprise search podcasts. The inclusion of enterprise networking in the results set requires me to listen to a podcast to determine if the information would be of interest to me. My hunch is that “enterprise search” as a podcast subject is too narrow to be of much indexing traction.

The company offers several search related services:

  1. MyCircle–an application agnostic social computing platform
  2. AdCircle–Ad creation and management tool
  3. HitLabel–contests, prizes, and tools for aspiring music stars

The company’s president and founder is Edward Mandel, and Dion Hinchcliffe is the chief technical officer. In 2004, Mr. Mandel was a finalist for the Ernst & Young Entrepreneur of the Year Award. Prior to Positive Software Systems, Mandel ran a successful technology consulting firm, IIT Consulting. Mr. Hinchcliffe served as president and chief technology officer of Alexandria, Virginia-based Hinchcliffe & Company, a premier Enterprise Web 2.0 consulting and advisory firm.

The company has added a former Microsoft vice president (Rowland Hanson) to the firm’s advisory board.

Nexplore has stated that the company is attracting more than five million unique monthly visitors and that the search system ranks in the top 5,000 internationally ranked Web sites, based on Alexa data. You can read the news story here. The company is publicly traded under the symbol NXPC. Ask your broker to pull the data from the “Pink Sheet” listings. You can read the company’s 2008 financial news release here. I scanned the information on the three page document. Several points jumped out at me:

  • The company describes itself as “a development stage company”. I interpreted this phrase to mean that the firm will be seeking additional funding.
  • The company’s net losses through June 2008 were about $17 million. Most of this money probably went to the investment in the system and software.
  • Through June 30, 2008, the company generated almost $700,000 in revenues. The next financial statement will make it easier to determine how the present economic environment is affecting this company.

The $64 question is, “Is Nexplore the next Google?” If you want to bet on Nexplore, contact the company here. I will add this search system to my watch list.

Stephen Arnold, January 13, 2009

Interview Exclusive: Exalead’s New US Chief Executive Officer

January 5, 2009

On January 2, 2009, I spoke with Paul Doscher, the newly appointed chief executive officer for Exalead, the Paris-based information access company. I received a preview of Exalead technology in November 2008, and I will summarize some of my impressions in a short white paper on my ArnoldIT.com Web site in the next few days.

The full text of my interview with Mr. Doscher appears below:

Why are you expanding in the US market? What’s your background?

Exalead has seen tremendous growth in Europe over the past few years and unlike some of our competitors, our clients are with us for the long haul. We enjoy 100% customer referenceability in Europe. The US represents a significant growth engine for Exalead and we believe we are in a unique position not just to grow our US business – but to help redefine the information access industry.

I have been in the computer software space for 30 years starting in sales and sales management eventually leading to my most recent role as CEO. I have worked in companies such as Oracle, Business Objects and VMware. Before becoming CEO of Exalead, Inc I was CEO of JasperSoft, the leading open source business intelligence company.

What is the major content processing problem your system solves?

This is a new era in information access. In business, valuable information is increasingly stored in silos – dozens of various locations and data formats – that are hard to retrieve in a way that provides necessary context to the end user. Exalead CloudView has been designed to make sense of the structured and unstructured data found both internally behind the firewall and from external sources. Exalead offers quick-to-implement information access solutions that help workers, partners and customers make better, faster and more accurate business decisions.

What is the basis of your firm’s technical approach?

Exalead provides a highly scalable and manageable information access platform built on open standards. Exalead transforms raw data, whatever its nature, into actionable intelligence through best of breed indexing, extraction and classification technologies.

Can you give me an example of your system in action? You don’t have to mention a company name, but I am interested in what the problem was and what your system delivered to the customer?

Exalead is moving beyond what people generally think of when they think about enterprise search. I’ll give you two examples – one that discusses an innovative use case of searching structured data. The second discusses unstructured data.

First is an example of our dealing with structured data. GEFCO, a €3.5 billion company, ranks among Europe’s leading transport and logistics firms. They are using Exalead to track their vehicles. GEFCO’s new “Track and Trace” application is built upon Exalead’s flagship platform that offers powerful search functionality and can provide up-to-the-minute information from an extremely large data set. Integrated into GEFCO’s Internet portal Gefconet, Track and Trace allows GEFCO staff, partners and customers to locate the exact position of vehicles, track their progress and optimize transport schedules in real time.

Second is a project where we search and make sense of unstructured data. Our engineers at Exalead built an unreleased project called Restminer – a site aimed at helping find restaurants in a large city like New York City. What we do here is interesting. Restminer gives the user useful, structured information extracted from the unstructured web including dedicated press, blog posts, restaurant reviews, directories – with relevant tips coming from different sources.

Exalead is a French-owned company. What’s the customer footprint? As you look forward, what is your goal for the footprint in 2009?

At the end of 2008, we had around 190 customers across multiple vertical markets including on-line media/publishing, social networking, the public sector, on-line directories, financial services and telecommunications. We are looking for 50% growth in our customer base in 2009.

The Exalead software was quite solid? What are the benefits your system delivers to a typical enterprise customer? Is it search or another type of solution?

Exalead provides information access and search solutions in basically three market segments: OEM, B2C and B2B.

In the OEM [original equipment manufacturing] market, software companies have realized what a powerful embedded search platform can bring to their own solution. ISVs [independent software vendors] enrich their functional capabilities by introducing new sources of content and more powerful access retrieval into their core applications.

In the B2C space, consumer web sites such as our customer RightMove in the UK are finding that a highly scalable information access solution can save on hardware costs and make their visitors’ experience much better (see www.rightmove.co.uk). Globally, we are seeing sites use our cutting edge semantic mash-up technologies to bring search results from video, audio and text, such as http://virgilio.alice.it/ in Italy.

For our B2B customers, we are seeing companies implement real-time search across multiple data repositories. Any search platform tied to mission critical business applications has to be flexible, scalable and fast. Exalead’s product is used in various mission critical implementations, including tracking and tracing trucks, operational reporting and large scale document searches.

I recall hearing that your firm has patented technology? Can you provide me with a snapshot of this invention? What’s the patent application number? How many patents does your firm have? What are the key features of the Exalead CloudView system?

Exalead has a significant number of patents granted and pending both in the US and EU relating to the areas of intelligent searching, indexing, keyword extraction and other aspects of the search technology. For example, US Patent 7152064 was issued to Exalead in 2006, providing for improved unified search results – allowing for end users to more easily navigate and refine complex search results.

Our explosive growth continues to drive innovation and functionality into our products – we continue to submit for new patents as our product expands.

In the OEM sector, Autonomy seems to be the giant with its OEM deals with BEA and the Verity OEM deals. Some of the Verity deals date from the late 1980s. How do you see Exalead fitting into this sector?

There is always a place for innovation. We are confident in our capabilities and how they can meet the growing demands of OEMs.

We are beginning to see customers move away from our competitors’ legacy OEM solutions. We provide an easy to implement, scalable and manageable solution. Also, we see growing demand for our simpler licensing model – which makes life much easier for our customers.

Exalead OEM has the same rich features as our other product platforms such as Enterprise Search Edition and the 360 Edition. No matter how huge the volume of information processed by the OEM application, Exalead CloudView provides an easy to implement SOA architecture. OEM customers build applications that search their own system’s content – as well as content from any other sources that can be relevant. OEMs can dramatically increase their product functionality and differentiation by adding search of external Web sites, external knowledge bases and building in new hybrid services using our developer kit.

There’s quite a bit of turmoil in search. In fact, the last few weeks Alexa (an Amazon company) closed its web search unit and Lycos Europe (which purchased software from my partner and me in the mid 1990s) said it would close up shop. What’s that mean for Exalead going forward?

Our web search engine is available at www.exalead.com/search. Based on CloudView, it provides Internet users with an innovative way of discovering results and content from the Web’s 8 billion+ pages. Web search has always been a real world lab to test our technologies and user features – some of which, like facial recognition, have been implemented on Exalead well in advance of their use on other major search sites. But, more than this, we consider the Web as a key source of information – competitive intelligence, partner information, customer information, legal documents, external database providers, blogs, etc. There is more and more key information on the web that enterprises need to manage effectively. Exalead Web search is key in the overall Exalead strategy – and the functionality on our Internet search site will continue to drive innovation in our information access platform.

One trend in enterprise content processing is the shift from results lists to answers. Among the companies in this sector are Relegence (a Time Warner company), Connotate (privately held but backed by Goldman Sachs), and Attivio (a company describing itself as delivering active intelligence). Each of these firms is really in the search business but positioning search as “intelligence”. What’s your take on the changing face of search in an organization?

If making information instantly available for decisions is intelligence, we definitely are working in the information intelligence business. Our approach is driven by customer demand for TCO and ROI – we bring real value to businesses looking to make better, faster decisions. For example, at our customer GEFCO, structured data is available in real time for staff and customers so transportation cycles can be adjusted in real time – significantly improving their bottom line.

As the economic crisis deepens, we continue to see our partners such as Capgemini, Logica, and Sogeti come up with new, exciting solutions for Exalead CloudView for their customers.

Google has been a disruptive force in search. In one major US government agency, different Google resellers have placed search appliances, often at $400,000 a unit. No single person realized that there were more than $6 million worth of devices. As a result, the project to “fix” search means that Google is the default search system. What are the challenges and opportunities Google presents to Exalead? What about the challenges and opportunities Microsoft presents with its strong grip on the desktop and a growing presence in servers?

Ironically, former Google and Microsoft customers fuel much of our sales funnel – so we appreciate and benefit from everyone’s niche in this marketplace.

Google raised end-user expectations about what web search can achieve – it brought a new level of simplicity, relevancy and interactivity. But as we’ve seen as more Google Enterprise Search customers move to Exalead – bringing this functionality to enterprises is a different matter altogether.

Google Enterprise Search has technical and functional limits in terms of scalability, security compliance, the ability to search structured and unstructured data and the ability to provide all the necessary context to make a search relevant. Enterprises know that information access means more than a flat list of results – which is driving more companies to look at Exalead.

Microsoft and its acquisition of FAST Search & Transfer brought many opportunities to us as well. For example, we’ve seen a growing number of companies who use Linux or other non-Microsoft operating systems look for a new partner instead of Microsoft.

Mobile search is slowly making headway. Some of the push has been because of the iPhone and Google’s report that queries on an iPhone are higher than from users with other brands of smart phones. What does Exalead provide for mobile search?

Exalead is actively working with mobile companies and telcos in a number of ways. We launched an iPhone search (www.exalead.com/iphone) in Europe. We are also working with mobile companies to help connect mobile devices to PCs and help accelerate access to mobile content. We will announce more of this functionality in 2009.

The economic climate is quite weak. How is Exalead adjusting to this global problem? I have heard that you have built out a US office with more than two dozen people? Is that correct?

We met all of our aggressive sales numbers in 2008 – in large part because our technologies provide our customers a high return on their investment. We unleash new levels of information access and allow better, faster decision-making. So far, it appears the appetite for our offerings is growing in this economic climate.

What are the three major trends you see with regards to search and content processing in 2009?

The biggest trend we see in 2009 is that search will become a development platform. Open product platforms like Exalead will become a platform for new, unexpected solutions by 3rd party vendors.

Other big trends in 2009 will be a continuation of what we’ve seen over the past few years: smarter context around search results and better searching of rich content including audio and video.

Can you hint at what’s coming in 2009 in terms of features in the CloudView system?

The launch of Exalead CloudView 360 later this year will be a game changer for the industry. Exalead CloudView 360 will have functionality that will transform heterogeneous corporate data into contextualized building blocks of business information that can be directly searched and queried – and allow for an explosion of new applications to be built on top of the platform.

Stephen Arnold, January 5, 2009

Information 2009: Challenges and Trends

December 4, 2008

Before I was once again sent back to Kentucky by President Bush’s appointees, I recall sitting in a meeting when an administration official said, “We don’t know what we don’t know.” When we think about search, content processing, assisted navigation, and text mining, that catchphrase rings true.

Successes

But we are learning how to deliver some notable successes. Let me begin by highlighting several.

Paginas Amarillas is the leading online business directory in Colombia. The company has built a new system using technology from a search and content processing company called Intelligenx. Similar success stories can be identified for Autonomy, Coveo, Exalead, and ISYS Search Software. Exalead has deployed a successful logistics information system which has made customers’ and employees’ information lives easier. According to my sources, the company’s chief financial officer is pleased as well because certain time consuming tasks have been accelerated, which reduces operating costs. Autonomy has enjoyed similar success at the US Department of Energy.

Newcomers such as Attivio and Perfect Search also have satisfied customers. Open source companies can also point to notable successes; for example, Lemur Consulting’s use of Flax for a popular UK home furnishing Web site. In Web search, how many of you use Google? I can conclude that most of you are reasonably satisfied with ad-supported Web search.

Progress Evident

These companies underscore the progress that has been made in search and content processing. But there are some significant challenges. Let me mention several which trouble me.

These range from legal inquiries into financial improprieties at Fast Search & Transfer, now part of Microsoft, to open Web squabbles about the financial stability of a Danish company which owns Mondosoft, Ontolica, and Speed of Mind. Other companies have shut their doors; for example, Alexa Web search, Delphes, and Lycos Europe. One vendor in Los Angeles has had to slash its staff to three employees and take steps to sell the firm’s intellectual property, which rightly concerns some of the company’s clients.

User Concerns

Another warning may be found in the results from surveys such as the one I conducted for a US government agency in 2007 that found dissatisfaction with existing search systems in the 65 percent range. AIIM, a US trade group, reported slightly lower levels of dissatisfaction. Jane McConnell’s recently released study in Paris reports data in line with my findings. We need to be mindful that user expectations are changing in two different ways.

First, most people today know how to search with Google and get useful information most of the time. The fact that Google is search for upwards of 65 percent of North American users and almost 75 percent of European Union users means that Google is the search system by which users measure other types of information access. Google’s influence has been essentially unchecked by meaningful competition for 10 years. In my Web log, I have invested some time in describing Microsoft’s cloud computing initiatives from 1999 to the present day.

For me and maybe many of you, Google has become an environmental factor, and it is disrupting, possibly warping, many information spaces, including search, content processing, data management, applications like word processing, mapping, and others.

time-space-warping

Microsoft is working to counter Google, and its strategy is a combination of software and low adoption costs. I believe that Microsoft’s SharePoint has become the dominant content management, collaboration, and search platform with 100 million licenses in organizations. SharePoint, however, is not well understood. It is technically complex and a work in progress. Anyone who asserts that SharePoint is simple or easy is misrepresenting the system. Here’s a diagram from a Microsoft Certified Gold vendor in New Zealand. Simple this is not.

sharepoint-vendor-diagram


Exalead: Voice to Text

November 3, 2008

A happy quack to the stylish Parisian who alerted me to Exalead’s voice to text demonstration. To use the service, navigate to http://labs.exalead.com or click here. I entered several test queries and looked at the quality of the ASCII. I was impressed. I was able to get useful hits on my trusty query ‘bush and iraq’. My Google queries worked well too. Keep in mind that the system has processed a chunk of audio and video. The voices in the files are converted, indexed, and made searchable. One nifty feature is that if a video contains several references to the query term, an icon on the play bar allowed me to jump from relevant comment to relevant comment. No more serial listening to talking heads. Two happy quacks for the Exalead engineers who worked on this demo. Several other nice touches warrant highlighting:

  1. The system can parse a query such as ‘show me videos about iraq’
  2. Entities are automatically extracted and displayed in a side bar for assisted navigation
  3. A tab allows you to limit your query to audio, video, video on demand, or the entire suite of content.
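The jump-from-mention-to-mention behavior on the play bar is easy to picture in code. What follows is a minimal sketch, not Exalead's implementation, which I have not seen. It assumes the speech recognizer emits a time offset with every recognized word. The names `build_time_index` and `next_mention` and the sample data are hypothetical, made up for illustration.

```python
from collections import defaultdict


def build_time_index(timed_words):
    """Map each lowercased word to the sorted offsets (in seconds) where it is spoken."""
    index = defaultdict(list)
    for seconds, word in timed_words:
        index[word.lower()].append(seconds)
    for offsets in index.values():
        offsets.sort()
    return index


def next_mention(index, term, current_seconds):
    """Return the first offset of `term` after the playback position, or None."""
    for offset in index.get(term.lower(), []):
        if offset > current_seconds:
            return offset
    return None


# Hypothetical recognizer output: (seconds into the clip, recognized word)
timed_words = [(3.2, "Iraq"), (10.5, "Bush"), (47.0, "iraq"), (92.4, "Iraq")]
index = build_time_index(timed_words)
```

With an index like this, each click on the play bar icon asks for the next offset after the current position, so the viewer skips the stretches where the query term is never spoken.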

For me, the most useful feature was the ability to click the ‘text’ link and see the transcribed text of the news show. Here’s a snippet of the machine converted and transcribed text:

the big apple behind the turntable strolling down the house makes tonight in chicago is craig alexander find your way to the bone bloomer whom you’ve gone and only together since the first of the year the brian james van by achieving their goal of crafting and plain old b. s. rock and roll the show tonight is that the hurricane in kansas city that’s a for tonight’s live music on the east coast air midwest for a look at what’s gone down monday night in the south boston that soars southern music reporter john spellman

My recommendation to Exalead is to start processing more content. I would love to have a transcript of the Google lecture series. A collection of security podcasts would be really useful. I don’t like to listen to 50 minutes of lousy audio to find one or two useful chunks of information.

I usually try to remind the French that folks from Kentucky know how to cook chicken correctly. None of that coq au vin stuff. We use lard and whatever is growing behind the compost heap. But in this case, I won’t make any reference to cuisine. I will just say, “Voice to text… well done.”

Stephen Arnold, November 3, 2008 from somewhere in Europe
