Autonomy’s Lynch: Governance Market

January 25, 2009

I have been confused by the words used by search wizards to describe what their systems do. As each wave of search and content processing technology crashes against the corporate beach head, the buyers surf, watch, or flee. The waves recede and after a while another wave builds and heads towards the beach head. An endless cycle think I. Governance, short for risk, is now an official buzzword.

CIO Magazine (UK version) ran an interview with Sir Michael Lynch, senior wizard at Autonomy, the Cambridge-based information systems company. I have praised Autonomy for two core attributes that other vendors cannot quite duplicate. First, a sense of what the market wants to buy or hear. Second, the ability to close deals in business sectors where other competitors have either failed or have a smaller presence. Perhaps it is drinking the water of the River Cam? Whatever the reason, Autonomy has been outflanking outfits like the scrutinized Fast Search & Transfer, dozens of newcomers, and some high-profile players like Endeca.

In Martin Veitch’s interview with Sir Michael here, I noticed several interesting comments. Let me highlight two so you have to read the original and not rely upon me as your Kentucky intellect (heaven help you!).

Point one: content management is a discipline that’s going to change. Autonomy, I opine, wants to lead the charge. Interwoven provides the helmet and the lance. Autonomy provides the knights, the battle gear, and the charger named IDOL. CMS has been one of those quasi software inventions that start out small and then multiply like gerbils. You have lots of gerbils, but they are not too useful in my opinion. Then the CMS consultants–almost always members of the trophy generation who think of content as a Web page–run up their bills. Autonomy, as I understand Sir Michael, wants to gallop into the CMS vendors and clear the field.

Point two: worries about multiple products and services that do similar functions are not the issue. The focus is on growing revenue. Autonomy will pick its friends and then make sales. Over time, the duplication of products and services will sort itself out. That strategy seems to have worked with IDOL and Verity’s K2.

The more interesting question to me is, “Which search and content processing vendor will challenge Autonomy in this new sector?” Any suggestions?

Stephen Arnold, January 25, 2009

Semantico CEO Interviewed

January 19, 2009

Richard Padley is an engaging information, content and search wizard. I visited with his technical team at the Semantico exhibit at the Online London show. Unlike the US search conferences, the team working on the London show for Incisive does a much better job of attracting interesting speakers and exhibitors. Semantico works with organizations producing content. Mr. Padley helps these organizations make better use of staff time and prepare content so that it can be quickly and economically repurposed. I followed up with him in early January 2009 and managed to convince him to participate in a Search Wizards Speak interview. You can find the full text of that interview here.

One of the intriguing comments Mr. Padley made was:

Because of the sort of company we are, it’s less about technology invention for us than about adaptation and selection. Open source is very important to us, for instance, and we’re strong believers in using the best tool for the job. In the past 18 months we’ve started using the Mark Logic database to build the publishing platforms we deliver – which is a very innovative platform. It allows us to put different kinds of content together and query them in ways that allow publishers to build new kinds of products, while still respecting the different sizes and shapes that content comes in. ‘Content in context’ is a phrase I hate, but it does provide a way of getting a handle on this for the layman: it’s about providing small snippets of contextualized information within the workflow. Say you’re a vet – we can build a platform that can integrate reference information about veterinary medicine with available drugs and your own patient records, and so on.

What’s notable is that MarkLogic has carved out an interesting niche providing a flexible, robust data management system. Now an ecosystem is beginning to form around that MarkLogic platform. Search, while a key function in Semantico’s arsenal, is secondary to solving information problems. As companies like Microsoft chase the search dream with questionable technologies that are getting long in the tooth, Semantico is helping create a new approach to content processing. Note: you can read the interview with Mark Logic’s top executive here.

You can get more information about Semantico here.

Stephen Arnold, January 19, 2009

Xsearch CEO Norbert Weitkämper Interviewed

January 12, 2009

Weitkämper Technology–based in Staffelsee, Germany–is a search and content processing vendor with a low profile in North America. The firm offers its multi-source search suite that incorporates proprietary technology to deliver fast content and query processing. The company’s XSEARCH package is customizable to focus on the client’s specific need. It offers nine components: Clustering Engine, Suggest, DidYouMean, Summarizer, Linguistic Engine, Federated Search, Facet Navigator, Entity Extractor and Intelligent Classifier.

Norbert Weitkämper, an industrial engineer, was dissatisfied with the search results available from commercial products. He developed Xsearch after working in electronic publishing. He told Search Wizards Speak:

As we are specialized on search for more than a decade our package is very well tuned; not only for speed but also for content for example. We will combine our new HitEngine with our established technologies like Linguistic, Did-You-Mean, clustering, synonyms and ontologies, or our personal ranking mechanisms. They are already released, we just have to melt them together.

He added:

For the complex roman languages our linguistic engine with its morphologic analysis is a big advantage, because algorithmic approaches like Bayesian or Porter, which are doing a good job for English, are a miserable failure.

On the subject of semantic analysis, Mr. Weitkämper said:

Semantic analysis is much more difficult for European languages than for English. We are already able to integrate thesauri or ontologies. I have not seen any system yet which meets the requirements for semantic analysis – at least when you have a closer look into the system. But storing information in a quick and accessible way is even more important for this approach, as you have to consider much more than only keywords and positions. So I can imagine that our optimized index structure may help also in this field to achieve adequate results in an acceptable amount of time.

More information about the company is available at its Web site, http://www.weitkamper.com. The full text of the interview with Mr. Weitkämper is at http://www.arnoldit.com/search-wizards-speak/xsearch.html.

Stephen Arnold, January 12, 2009

Search Pioneer Upshifts: Interview with Mike Weiner

January 6, 2009

In the 1980s I relied on a very fast search system for my personal computer. The program was Gopher from Microlytics. In the late 1990s, I met the founder of Gopher and tracked his interest in linguistic-centric search systems. I lost track of Mike Weiner, former president of Microlytics, but we spoke on the telephone a day or two ago. You can get information about Technology Innovations here. I captured his comments in an interview which is now available on the ArnoldIT.com Search Wizards Speak sub site here.

Two comments in my conversation with Mr. Weiner struck a chord with me. Let me highlight these in this brief news item about the interview.

First, search has grown beyond the desktop. Mr. Weiner said in response to a question about desktop search:

…the desktop of today and tomorrow are connected to the “world.” So there can be very clever background processing done on your behalf that can leverage off the information you access and the information you create. The question will be, what’s useful and important to you, and can the system fetch, or generate, this, for you, and in an efficient form you can cognitively benefit from. One of the next potentials for incredible retrieval will be intelligent “information extraction.”

Second, Mr. Weiner’s new interests pivot on innovation. Technology Innovations holds patents on different facets of electronic paper or “epaper”. About the future of epaper, Mr. Weiner said:

I see epaper heavily used in educational publications, where children and learners have questions, need definitions, etc. You may see a speller and thesaurus, and translation technology coming bundled on books with electronic chips in them.

If you are interested in search and publishing in the 21st century, you will find the Mike Weiner interview interesting.

Stephen Arnold, January 6, 2009

New Conference Pushes beyond Search

January 5, 2009

After watching some of the traditional search and content processing conferences fall on their swords, muffins, and self-assurance in 2008, I have rejiggled my conference plans for 2009. One new venue that caught my attention is The Rockley Group’s event in Palm Springs, California, January 29-30, 2009. You can get more information about the program here. The event organizer is Ann Rockley, who is one of the people emphasizing the importance of intelligent content.

Ann Rockley, The Rockley Group

I spoke with Ms. Rockley on January 2, 2009. The text of that conversation appears below:

Why is another conference needed?

Admittedly there are a lot of conferences around for people to attend, but not one that focuses specifically on the topic of Intelligent Content. My background is content management, structured content and XML. There are lots of conferences that focus mainly on the technology, others that focus on the content vehicle or channel (e.g., web) and others that focus on XML. The technology oriented conferences often lose sight of the content; who it’s for, how can we most effectively create it and most importantly how can we optimize it for our customers. The content channel oriented conferences e.g. Web, focus only on the vehicle and forget that content is not just about the way we distribute it; content should be optimized for each channel yet at the same time it should be possible to repurpose and reconfigure the content for multiple channels. And XML conferences tend to be highly technical, focusing on the code and the applications and not on how we can optimize our content using XML so that we can manipulate it and transform it much the way we do data. So this conference is all about the CONTENT! Identifying how we can most effectively create it so that we can manipulate it, transform it and deliver it in a multitude of ways personalized for a particular audience is an area of focus sadly lacking in many conferences.

With topics like Web 2.0 and Social Search I am at a loss to know what will be covered. What are the issues your conference will address?

Web 2.0 is about social networking and sharing of content and media and it has had a tremendous influence on content. Organizations have huge volumes of content stuck in static web pages or files and they have a growing volume of content stuck, and sometimes lost in the masses of content being accumulated in wikis, blogs, etc. How can organizations integrate their content, share their content and make it most useful to their customers and readers without a lot of additional work? How do we combine the best of Web 2.0 with the best of traditional content practices? Organizations don’t have the time, resources or budget to do all the things we need and want to do for our customers, but if we create our content intelligently in the first place (structure it, tag it, store it) we can increase our ability to do so much more and increase our ability to effectively meet our customers’ needs. This conference was specifically designed to answer those questions.

Intelligent Content provides a venue for sharing information on such topics as:

  • Personalization (structured content, metadata and XQuery)
  • Intelligent publishing (dynamic multichannel delivery)
  • Hybrid content strategies (integrating Web 2.0 content with traditional customer content)
  • Dynamic messaging/personalized marketing
  • Increasing findability
  • Content/Information Management

Most attendees complain about two things: The quality of the presentations and the need for better networking with other attendees. How are you addressing these issues?

We are doing things a little differently. All the speakers have been assigned a mentor for review of their outline, drafts and final materials. We are working with them closely to ensure that the presentations are top notch and we have asked them all to summarize their information in Best Practices and Tips. In addition, Intelligent Content was designed to be a small intimate conference with lots of opportunities to network. We will have a luncheon with tables focused on special interests and we are arranging “Birds of a Feather” dinners where like-minded people can get together over a great meal and chat, have fun and network. We also have a number of panels which are designed to work interactively with the audience. And to increase the feeling of intimacy we have not chosen to hold the conference in a traditional “big box” hotel, rather we have chosen a boutique hotel, the Parker Palm Springs (http://www.starwoodhotels.com/lemeridien/property/overview/index.html?propertyID=1911), a hotel favored by Hollywood stars from the 1930s. It is a very cool hotel with lots of character that encourages people to have fun while they interact and network.

What will you offer attendees?

The two day conference includes 16 sessions, 2 panels, breakfast, lunch and snacks. It also includes organized networking sessions both for lunch and dinner, and opportunities to ask the Experts key questions. And the conference isn’t over when it is over, we are setting up a Community of Practice including a blog, discussion forum, and webinars to continue to share and network so that every attendee will have an instant ongoing network.

I enjoy small group sessions and simple things like going to dinner with a group of people whom I don’t know. Will you include activities to help attendees make connections?

Absolutely. We deliberately designed the conference to be a small intimate learning experience so people weren’t lost in the crowd and we have specifically created a number of luncheon and dinner networking experiences.

How can I register to attend? What is the URL for more information?

The conference information can be found at www.intelligentcontent2009.com. Contact info@intelligentcontent2009.com if you have questions. Note that the conference hotel is really a vacation destination so we can only hold the rooms at the special rate for a limited time and that expires January 12th so act quickly. And we’ve extended the early bird registration to Jan. 12 as well. If you have any other questions you can contact us at moreinfo@rockley.com.

Stephen Arnold, January 5, 2009

Interview Exclusive: Exalead’s New US Chief Executive Officer

January 5, 2009

On January 2, 2009, I spoke with Paul Doscher, the newly appointed chief executive officer for Exalead, the Paris-based information access company. I received a preview of Exalead technology in November 2008, and I will summarize some of my impressions in a short white paper on my ArnoldIT.com Web site in the next few days.

The full text of my interview with Mr. Doscher appears below:

Why are you expanding in the US market? What’s your background?

Exalead has seen tremendous growth in Europe over the past few years and unlike some of our competitors, our clients are with us for the long haul. We enjoy 100% customer referenceability in Europe. The US represents a significant growth engine for Exalead and we believe we are in a unique position not just to grow our US business – but to help redefine the information access industry.

I have been in the computer software space for 30 years starting in sales and sales management eventually leading to my most recent role as CEO. I have worked in companies such as Oracle, Business Objects and VMware. Before becoming CEO of Exalead, Inc I was CEO of JasperSoft, the leading open source business intelligence company.

What is the major content processing problem your system solves?

This is a new era in information access. In business, valuable information is increasingly stored in silos – dozens of various locations and data formats – that are hard to retrieve in a way that provides necessary context to the end user. Exalead CloudView has been designed to make sense of the structured and unstructured data found both internally behind the firewall and from external sources. Exalead offers quick-to-implement information access solutions that help workers, partners and customers make better, faster and more accurate business decisions.

What is the basis of your firm’s technical approach?

Exalead provides a highly scalable and manageable information access platform built on open standards. Exalead transforms raw data, whatever its nature, into actionable intelligence through best of breed indexing, extraction and classification technologies.

Can you give me an example of your system in action? You don’t have to mention a company name, but I am interested in what the problem was and what your system delivered to the customer?

Exalead is moving beyond what people generally think of when they think about enterprise search. I’ll give you two examples – one that discusses an innovative use case of searching structured data. The second discusses unstructured data.

First is an example of our dealing with structured data. GEFCO, a €3.5 billion company, ranks among Europe’s leading transport and logistics firms. They are using Exalead to track their vehicles. GEFCO’s new “Track and Trace” application is built upon Exalead’s flagship platform that offers powerful search functionality and can provide up-to-the-minute information from an extremely large data set. Integrated into GEFCO’s Internet portal Gefconet, Track and Trace allows GEFCO staff, partners and customers to locate the exact position of vehicles, track their progress and optimize transport schedules in real time.

Second is a project where we search and make sense of unstructured data. Our engineers at Exalead built an unreleased project called Restminer – a site aimed at helping find restaurants in a large city like New York City. What we do here is interesting. Restminer gives the user useful, structured information extracted from the unstructured web including dedicated press, blog posts, restaurant reviews, directories – with relevant tips coming from different sources.

Exalead is a French-owned company. What’s the customer footprint? As you look forward what is your goal for the footprint in 2009?

At the end of 2008, we had around 190 customers across multiple vertical markets including on-line media/publishing, social networking, the public sector, on-line directories, financial services and telecommunications. We are looking for 50% growth in our customer base in 2009.

The Exalead software was quite solid. What are the benefits your system delivers to a typical enterprise customer? Is it search or another type of solution?

Exalead provides information access and search solutions in basically three market segments: OEM, B2C and B2B.

In the OEM [original equipment manufacturing] market, software companies have realized what a powerful embedded search platform can bring to their own solution. ISVs [independent software vendors] enrich their functional capabilities by introducing new sources of content and more powerful access retrieval into their core applications.

In the B2C space, consumer web sites such as our customer RightMove in the UK are finding that a highly scalable information access solution can save on hardware costs and make their visitors’ experience much better (for www.rightmove.co.uk). Globally, we are seeing sites use our cutting edge semantic mash-up technologies to bring search results from video, audio and text, such as http://virgilio.alice.it/ in Italy.

For our B2B customers, we are seeing companies implement real-time search across multiple data repositories. Any search platform tied to mission critical business applications has to be flexible, scalable and fast. Exalead’s product is used in various mission critical implementations, including tracking and tracing trucks, operational reporting and large scale document searches.

I recall hearing that your firm has patented technology. Can you provide me with a snapshot of this invention? What’s the patent application number? How many patents does your firm have? What are the key features of the Exalead CloudView system?

Exalead has a significant number of patents granted and pending both in the US and EU relating to the areas of intelligent searching, indexing, keyword extraction and other aspects of the search technology. For example, US Patent 7152064 was issued to Exalead in 2006, providing for improved unified search results – allowing for end users to more easily navigate and refine complex search results.

Our explosive growth continues to drive innovation and functionality into our products – we continue to submit for new patents as our product expands.

In the OEM sector, Autonomy seems to be the giant with its OEM deals with BEA and the Verity OEM deals. Some of the Verity deals date from the late 1980s. How do you see Exalead fitting into this sector?

There is always a place for innovation. We are confident in our capabilities and how they can meet the growing demands of OEMs.

We are beginning to see customers move away from our competitor’s legacy OEM solutions. We provide an easy to implement, scalable and manageable solution. Also, we see growing demand for our simpler licensing model – which makes life much easier for our customers.

Exalead OEM has all the same rich features as our other product platforms such as Enterprise Search Edition and the 360 Edition. No matter how huge the volume of information processed by the OEM application, Exalead CloudView provides an easy to implement SOA architecture. OEM customers build applications that search their own system’s content – as well as from any kind of other sources that can be relevant. OEMs can dramatically increase their product functionality and differentiation by adding search of external Web sites, external knowledge bases and building in new hybrid services using our developer kit.

There’s quite a bit of turmoil in search. In fact, the last few weeks Alexa (an Amazon company) closed its web search unit and Lycos Europe (which purchased software from my partner and me in the mid 1990s) said it would close up shop. What’s that mean for Exalead going forward?

Our web search engine is available at www.exalead.com/search. Based on CloudView, it provides Internet users with an innovative way of discovering results and content from the Web’s 8 billion+ pages. Web search has always been a real world lab to test our technologies and user features – some of which, like facial recognition, have been implemented on Exalead well in advance of their use on other major search sites. But, more than this, we consider the Web as a key source of information – competitive intelligence, partner information, customer information, legal documents, external database providers, blogs, etc. There is more and more key information on the web that enterprises need to manage effectively. Exalead Web search is key in the overall Exalead strategy – and the functionality on our Internet search site will continue to drive innovation in our information access platform.

One trend in enterprise content processing is the shift from results lists to answers. Among the companies in this sector are Relegence (a Time Warner company), Connotate (privately held but backed by Goldman Sachs), and Attivio (a company describing itself as delivering active intelligence). Each of these firms is really in the search business but positioning search as “intelligence”. What’s your take on the changing face of search in an organization?

If making information instantly available for decisions is intelligence, we definitely are working in the information intelligence business. Our approach is driven by customer demand for TCO and ROI – we bring real value to businesses looking to make better, faster decisions. For example, at our customer GEFCO, structured data is available in real time for staff and customers so transportation cycles can be adjusted in real time – significantly improving their bottom line.

As the economic crisis deepens, we continue to see our partners such as Capgemini, Logica, and Sogeti come up with new, exciting solutions for Exalead CloudView for their customers.

Google has been a disruptive force in search. In one major US government agency, different Google resellers have placed search appliances, often at $400,000 a unit. No single person realized that there were more than $6 million worth of devices. As a result, the project to “fix” search means that Google is the default search system. What are the challenges and opportunities Google presents to Exalead? What about the challenges and opportunities Microsoft presents with its strong grip on the desktop and a growing presence in servers?

Ironically, former Google and Microsoft customers fuel much of our sales funnel – so we appreciate and benefit from everyone’s niche in this marketplace.

Google raised end-user expectations about what web search can achieve – it brought a new level of simplicity, relevancy and interactivity. But as we’ve seen as more Google Enterprise Search customers move to Exalead – bringing this functionality to enterprises is a different matter altogether.

Google Enterprise Search has technical and functional limits in terms of scalability, security compliance, the ability to search structured and unstructured data and the ability to provide all the necessary context to make a search relevant. Enterprises know that information access means more than a flat list of results – which is driving more companies to look at Exalead.

Microsoft and its acquisition of FAST Search & Transfer brought many opportunities to us as well. For example, we’ve seen a growing number of companies who use Linux or other non-Microsoft operating systems look for a new partner instead of Microsoft.

Mobile search is slowly making headway. Some of the push has been because of the iPhone and Google’s report that queries on an iPhone are higher than from users with other brands of smart phones. What does Exalead provide for mobile search?

Exalead is actively working with mobile companies and telcos in a number of ways. We launched an iPhone search www.exalead.com/iphone in Europe. We are also working with mobile companies to help connect mobile devices to PCs and help accelerate access to mobile content. We will announce more of this functionality in 2009.

The economic climate is quite weak. How is Exalead adjusting to this global problem? I have heard that you have built out a US office with more than two dozen people? Is that correct?

We met all of our aggressive sales numbers in 2008 – in large part because our technologies provide our customers a high return on their investment. We unleash new levels of information access and allow better, faster decision-making. So far, it appears the appetite for our offerings is growing in this economic climate.

What are the three major trends you see with regards to search and content processing in 2009?

The biggest trend we see in 2009 is that search will become a development platform. Open product platforms like Exalead will become a platform for new, unexpected solutions by 3rd party vendors.

Other big trends in 2009 will be a continuation of what we’ve seen over the past few years: smarter context around search results and better searching of rich content including audio and video.

Can you hint at what’s coming in 2009 in terms of features in the CloudView system?

The launch of Exalead CloudView 360 later this year will be a game changer for the industry. Exalead CloudView 360 will have functionality that will transform heterogeneous corporate data into contextualized building blocks of business information that can be directly searched and queried – and allow for an explosion of new applications to be built on top of the platform.

Stephen Arnold, January 5, 2009

Expert System’s Luca Scagliarini

December 18, 2008

ArnoldIT.com’s Search Wizards Speak series has landed another exclusive. Hard on the heels of the interview with Autonomy’s chief operating officer, Luca Scagliarini, one of the senior executives at Expert System in Modena, Italy, explains the company’s technology and strategy for 2009. Mr. Scagliarini is a technologist’s technologist and a recognized leader in next generation search systems. The company’s COGITO technology has cut a wide swath through European markets and is now available in North America. Mr. Scagliarini told ArnoldIT.com’s Beyond Search:

A major mobile handheld manufacturer uses our technology to address the issue of supporting new users in learning how to use the device. The objective was to reduce the return rate of the device AND to reduce the customer support costs. This natural language-based solution leverages our semantic technology to provide their customers with a simple and effective tool to answer questions and how-to queries with consistency and high precision. As of today the system has answered, in only 5 months, more than 4 million questions with more than 87% precision.

Search is no longer key word matching and long lists of results. Mr. Scagliarini said:

To deliver an effective question and answer system that works on more than a small set of FAQ, it is very important to have a deep understanding of the text. This is possible only through deep semantic analysis. We have several implementations of our natural language Q&A product recently renamed COGITO Answer. In the next 12 months, we will be investing to expand our footprint worldwide–especially in the U.S. and in the Persian Gulf region to replicate our European success there. In the U.S., we are now supporting customer service operations with natural language Q&A for a government unit of the Department of the Interior and we are one of only 5 semantic partners actively promoted by Oracle.

You can read the complete interview with Mr. Scagliarini on the ArnoldIT.com Web site or you can click here. More information about the company and its technology may be found on the firm’s Web site http://www.expertsystem.net or click here.

Semantic Search Laid Bare

December 17, 2008

Yahoo’s Search Blog here has an interesting interview with Dr. Rudi Studer. The focus is semantic search technologies, which are all the rage in enterprise search and Web search circles. Dr. Studer, according to Yahoo:

is no stranger to the world of semantic search. A full professor in Applied Informatics at University of Karlsruhe, Dr. Studer is also director of the Karlsruhe Service Research Institute, an interdisciplinary center designed to spur new concepts and technologies for a services-based economy. His areas of research include ontology management, semantic web services, and knowledge management. He has been a past president of the Semantic Web Science Association and has served as Editor-in-Chief of the journal Web Semantics.

If you are interested in semantics, you will want to read and save the full text of this interview. I want to highlight three points that caught my attention and then–in my goosely manner–offer several observations.

First, Dr. Studer suggests that “lightweight semantic technologies” have a role to play. He said:

In the context of combining Web 2.0 and Semantic Web technologies, we see that the Web is the central point. In terms of short term impact, Web 2.0 has clearly passed the Semantic Web, but in the long run there is a lot that Semantic Web technologies can contribute. We see especially promising advancements in developing and deploying lightweight semantic approaches.

The key idea is lightweight, not giant semantic engines grinding in a lights out data center.

Second, Dr. Studer asserts:

Once search engines index Semantic Web data, the benefits will be even more obvious and immediate to the end user. Yahoo!’s SearchMonkey is a good example of this. In turn, if there is a benefit for the end user, content providers will make their data available using Semantic Web standards.

The idea is that in this chicken and egg problem, it will be the Web page creators’ job to make use of semantic tags.

Finally, Dr. Studer identifies tools as an issue. He said:

One problem in the early days was that the tool support was not as mature as for other technologies. This has changed over the years as we now have stable tooling infrastructure available. This also becomes apparent when looking at this year’s Semantic Web Challenge. Another aspect is the complexity of some of the technologies. For example, understanding the foundation of languages such as OWL (being based on Description Logics) is not trivial. At the same time, doing useful stuff does not require being an expert in Logics – many things can already be done exploiting only a small subset of all the language features.

I am no semantic expert. I have watched several semantic centric initiatives enter the world and–somewhat sadly–watched them die. Against this background, let me offer three observations:

  1. Semantic technology is plumbing and like plumbing, semantic technology should be kept out of sight. I want to use plumbing in a user friendly, problem free setting. Beyond that, I don’t want to know anything about plumbing. Lightweight or heavyweight, I think some other users may feel the same way. Do I look at inverted indexes? Do you?
  2. The notion of putting the burden on Web page or content creators is a great idea, but it won’t work. When I analyzed the five Programmable Search Engine inventions by Ramanathan Guha as part of an analysis for the late, great Bear Stearns, it was clear that Google’s clever Dr. Guha assumed most content would not be tagged in a useful way. Sure, if content was properly tagged, Google could ingest that information. But the core of the PSE invention was Google’s method for taking the semantic bull by the horns. If Dr. Guha’s method works, then Google will become the semantic Web because it will do the tagging work that most people cannot or will not do.
  3. The tools are getting better, but I don’t think users want to use tools. Users want life to be easy, and figuring out how to create appropriate tags, inserting them, and conforming to “standards” such as they are is no fun. The tools will thrill developers and leave most people cold. Check out the tools section at a hardware store. What do you see? Hobbyists and tinkerers and maybe a few professionals who grab what they need and head out. Semantic tools will be like hardware: of interest to a few.

In my opinion, the Google – Guha approach is the one to watch. The semantic Web is gaining traction, but it is in its infancy. If Google jump starts the process by saying, “We will do it for you”, then Google will “own” the semantic Web. Then what? The professional semantic Web folks will grouse, but the GOOG will ignore the howls of protest. Why do you think the GOOG hired Dr. Guha from IBM Almaden? Why did the GOOG create an environment for Dr. Guha to write five patent applications, file them on the same day, and have the USPTO publish five documents on the same day in February 2007? No accident tell you I.

Stephen Arnold, December 17, 2008

New Open Source Search Vendor

December 5, 2008

Hans-Christian Brockmann, founder of brox, an open source search and content processing company, reveals his vision for his company here. Mr. Brockmann wants to provide organizations with an alternative to proprietary search systems. In an exclusive interview with ArnoldIT.com, a management consulting firm founded by Stephen E. Arnold, Mr. Brockmann said:

Having spent 10 years catering to enterprise customers, we are very much aware of the strategic and political aspects of deploying IT-infrastructures to large organizations. Our project SMILA (SeMantic Information Logistics Architecture) surely can be considered an infrastructure project which embraces open source search within it. Open source solutions are being absorbed in the enterprise, as evidenced by Lucene interest.

The statistical significance of the 180,000 Lucene projects is it underscores the sheer amount of “do it yourself” projects out there, actually massively more than there are professional commercial search implementations.

We want to provide these “do it yourself” search projects with the tools they all are missing.

Mr. Brockmann’s estimate of the number of Lucene installations is one of the first ArnoldIT.com has been able to obtain.

Mr. Brockmann added:

Putting in the plumbing for the next generation of semantic applications is something every organization will have to do in order to remain competitive. In the course of this they will more or less all stumble across the same issues. To name a few: Security, scalability, connectivity, longevity and of course, maintenance and support for such a large infrastructure. We suggest it is best to share the cost of implementing and maintaining the infrastructure – which is not part of the strategic competencies of any company – on a shared basis. If you look at the top 1000 companies globally and ask them to deliver a list of application software they are using, the lists will be almost identical. The breadth and depth of use of certain products may be different, but basically the tools are similar.

More information about brox, the company Mr. Brockmann founded, is here. You can read the full text of the interview here.

Stephen Arnold, December 5, 2008

ISYS Search Software CEO Interview

December 1, 2008

Scott Coles has joined ISYS Search Software as the firm’s chief executive officer. Ian Davies, founder, remains the chairman of the company. Among Mr. Coles’s tasks will be to lead the firm’s new strategic direction characterized by an expanded presence in Europe and Asia, specialized vertical-market offerings, a broader channel sales strategy, and a deeper set of embedded search solutions for original equipment manufacturers and independent software vendors.

Coles joins ISYS with a significant background in the commercialization of innovation for multinational corporations, holding senior executive roles with companies such as EDS, Lucent Technologies and Avaya. In the mid-1990s, Scott was the driving force behind the establishment and success of AT&T Bell Labs in Australia.

In his interview with ArnoldIT.com’s Search Wizards Speak, Coles provided information about the company’s focus in 2009.

On this topic, he said:

We are seeing significant increase in other software vendors coming to us to license our engine for incorporation into their products. This marks a general industry trend that I believe will increase significantly in the coming year. A number of applications today that previously had either none or only rudimentary search are finding that their products can be significantly enhanced with a sophisticated search engine. The amount of data that these applications have to deal with is now becoming so large that some form of pre-processing to narrow down to that which is relevant is becoming essential.

Mr. Coles also noted that Microsoft SharePoint continues to capture market share in content management and collaboration. However, the SharePoint user needs access to a range of content and:

ISYS can search all data, both inside and outside of SharePoint. In addition, ISYS provides high quality relevant results through features such as Boolean search operators, multi-dimensional clustering, and many others for which SharePoint users have expressed a desire that are currently not available in the native SharePoint product…we’ve taken great care to ensure our new “intelligent content analysis” methods are reliable, predictable and easily understood by the end user. These include parametric search and navigation, visual timeline refinement bars, intelligence clouds, de-duplication and intelligent query expansion. We’ve even added additional post-query processing to help streamline the e-discovery process. The end result is a core set of new capabilities that help our customers better cull and refine efficiently, without cutting corners on accuracy or relevance.

You can read the full text of the interview with Scott Coles at http://www.arnoldit.com/search-wizards-speak or click here.
