An Interview with Norbert Weitkämper
In December 2008, I met Norbert Weitkämper, the chief executive officer of Weitkämper Technology in Staffelsee, Germany. I followed up with Mr. Weitkämper in January 2009, gathering more information about the company's multi-source search suite. The company's technology processes text rapidly. Mr. Weitkämper is an industrial engineer. He has developed a system that can complement existing enterprise search and content processing systems or be used as a standalone solution. The full text of the interview appears below.
What's your background? How did you become interested in search and information access?
I am an industrial engineer and graduated in Management Decision Systems. In 1988 I started as the head of the Electronic Publishing division for WEKA Publishing. We developed CD-ROM applications containing professional information in the field of law and tax, and search was already the nuts and bolts.
In 1992 I started my own company, focusing on CD-ROM development and consulting in this field. We partnered with several retrieval providers like Folio, CD-SEARCH, LASEC or DATAWARE, companies which don't exist in this market anymore today. In 1997 we established a fruitful cooperation with VERITY as an OEM partner and developed the leading CD-ROM publishing system in the field of law and tax in Germany. We expanded our focus to Intranet applications and also partnered with Information Dimensions, now OpenText.
As search took on a key role in so many applications, we recognized that more and more companies would like to keep their main system and architecture, so we started to develop search enhancements for use on top of nearly any search engine or database system, namely our Clustering Engine, Linguistic, DidYouMean and Suggest.
Can you give me an overview of the search system your team has developed?
While developing our Suggest solution we recognized the speed limits of the inverted index structures that nearly every search engine uses. But these inverted indexes are not so easy to improve, as their programming is quite handy and efficient. We did a lot of trial and error over the last three years, but finally in mid-2008 we found the philosopher's stone for the problem of minimal entropy in combination with completions.
Being so much faster than any other solution we know, our new HitEngine provides the whole hit list, all categories and completions instantly while typing. With the HitEngine the Return key is no longer needed to run a search.
What problems does your system solve? What's an example of a representative use of your system?
The iterative process of a human search always looks like this:
- setting oneself a target
- typing a keyword
- pressing Return
- scanning the result list
- redefining keywords
- scanning the result list.
The last steps are repeated as long as required to obtain the desired results.
By providing the result list while typing, our HitEngine is changing this user behavior and shortens the search process efficiently.
Let's consider an average eCommerce application as an example:
You are looking for the Eidos game "Tomb Raider" for the Nintendo Wii. While typing the first letters "NIN", you will already get all entries containing words starting with "NIN". When you refine this search by typing "WII" and "TOMB" as the next keywords, the relevant hits will be on your screen. Categories for faceted navigation are also provided dynamically, so you can immediately choose the best price, a favorite seller, etc. within these categories.
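The HitEngine's internals are not disclosed in this interview, so the following is only a minimal sketch of the search-as-you-type behavior described above. It uses a conventional sorted vocabulary with binary search plus an inverted index, not the company's method. The catalog entries and the function names (`build_index`, `completions`, `search`) are hypothetical.

```python
from bisect import bisect_left

# Hypothetical toy catalog; the titles are illustrative only.
PRODUCTS = [
    "Tomb Raider Underworld Nintendo Wii",
    "Tomb Raider Anniversary PlayStation 2",
    "Mario Kart Nintendo Wii",
    "Nintendogs Nintendo DS",
]

def build_index(products):
    """Map each upper-cased word to the set of product ids containing it."""
    postings = {}
    for pid, title in enumerate(products):
        for word in title.upper().split():
            postings.setdefault(word, set()).add(pid)
    vocab = sorted(postings)  # sorted so that a prefix maps to a contiguous range
    return vocab, postings

def completions(vocab, prefix):
    """Return all vocabulary words starting with prefix, located by binary search."""
    prefix = prefix.upper()
    start = bisect_left(vocab, prefix)
    found = []
    for word in vocab[start:]:
        if not word.startswith(prefix):
            break
        found.append(word)
    return found

def search(vocab, postings, query):
    """Return ids of products that, for every query term, contain a word starting with it."""
    result = None
    for term in query.split():
        ids = set()
        for word in completions(vocab, term):
            ids |= postings[word]
        result = ids if result is None else result & ids
    return sorted(result or [])

vocab, postings = build_index(PRODUCTS)
for typed in ("NIN", "NIN WII", "NIN WII TOMB"):
    print(typed, "->", [PRODUCTS[i] for i in search(vocab, postings, typed)])
```

Each additional keyword narrows the hit list by intersecting the matching product sets, which mirrors the "NIN", "WII", "TOMB" refinement in the example.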
Additionally the HitEngine is equipped with our Linguistic and Did-You-Mean technology, which enables the engine to provide hits for misspelled words or irregular nouns (for example if you typed "TOBM" instead of "TOMB").
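As an illustration of how a misspelling such as "TOBM" can be mapped back to "TOMB", here is a minimal sketch based on an edit distance that counts an adjacent letter swap as a single step (optimal string alignment). This is a generic textbook technique, assumed here for illustration; the interview does not say how the company's Did-You-Mean technology works, and the sketch does not cover irregular nouns. The names `osa_distance` and `did_you_mean` and the sample vocabulary are hypothetical.

```python
def osa_distance(a, b):
    """Edit distance where insert, delete, substitute and adjacent swap each cost 1."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "BM" <-> "MB".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]

def did_you_mean(word, vocabulary, max_distance=1):
    """Return the closest vocabulary word within max_distance, or None."""
    word = word.upper()
    if word in vocabulary:
        return word
    best = min(vocabulary, key=lambda v: osa_distance(word, v), default=None)
    if best is not None and osa_distance(word, best) <= max_distance:
        return best
    return None

# Hypothetical vocabulary taken from the product example.
print(did_you_mean("TOBM", {"TOMB", "RAIDER", "WII", "NINTENDO"}))
```

Comparing the typed word against every vocabulary entry is fine for a toy vocabulary; a production system would need an index to keep this fast.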
Oracle tried to sell its Secure Enterprise Search by pitching one benefit--security. What are the two or three major unique selling propositions for your system?
It's the speed of this completion task, which makes the Return key unnecessary and therefore speeds up any search. This leads to a new user behavior.
With search vendors pushing into customer support, business intelligence, and eDiscovery, what are the market sectors with the greatest appetite for your system?
First we will focus on traditional eCommerce applications and other content with roughly structured data, like trademark or business information systems.
When we did the demo, you emphasized the speed of the system. What is the throughput of your system when processing content? Think about a dual processor server with two dual core Intel chips, 4 GB of RAM, and a Linux distribution with one terabyte of SATA discs.
For a database containing 5 million product entries, about 700 completions for NIN (words starting with NIN) and 20,000 hits containing words starting with NIN are returned within 5 milliseconds on one core. Counting these 700 words starting with NIN, this equals 140,000 words per second.
It is easy to run different processes for every core, so this figure scales linearly with the number of cores.
Astonishingly, the index size is very similar to that of conventional, well-compressed inverted index structures.
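The arithmetic behind these figures can be checked with a few lines. This sketch only restates the numbers quoted in the answer (700 completions in 5 milliseconds on one core) and takes the linear scaling across cores as the stated claim, not as a measured result. The function name `completions_per_second` is hypothetical.

```python
def completions_per_second(completions, latency_ms, cores=1):
    """Throughput implied by a completion count and latency, assuming linear scaling per core."""
    return completions * 1000 / latency_ms * cores

# Figures quoted in the interview: 700 completions in 5 ms on one core.
print(completions_per_second(700, 5))
# The server described in the question has two dual core chips, i.e. four cores.
print(completions_per_second(700, 5, cores=4))
```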
I have a list of more than 350 vendors of search and content processing systems. The majority of these are destined to be small companies with revenue of about $1 million per year. What is your plan to differentiate your system and grow revenues?
As we have specialized in search for more than a decade, our package is very well tuned, not only for speed but also, for example, for content. We will combine our new HitEngine with our established technologies like Linguistic, Did-You-Mean, clustering, synonyms and ontologies, or our personal ranking mechanisms. They are already released; we just have to merge them.
For the complex continental European languages like German, our linguistic engine with its morphological analysis is a big advantage, because algorithmic approaches like Bayesian or Porter, which do a good job for English, are a miserable failure.
At the moment we are not looking for any VC money. It couldn't make the technology much better and we feel that the pressure would take too much of our success. Developing algorithms is a long term business. And success is not really controllable by money in the long run. But really good technology will nearly always end in monetary success.
Are you going to seek partnerships, OEM deals to embed your technology in other enterprise software, or work with value added resellers?
That's a very good approach, especially for expanding business into other countries. We have problems entering the very big US market on our own. Partnerships are certainly ideal for avoiding large investments in marketing.
What is the applicability of your search system to mobile phone search?
Currently we don't focus on mobile phone search, but definitely speed plays a key role in this sector too.
Can you provide a broad brush description of how your system works? Have you obtained a patent or patents for your approach?
In Europe it's quite difficult to get a patent for software, and impossible for pure algorithms. Some companies file a patent for software because it could take some years until the application is officially declined. In the meantime it's a good marketing idea to promote the system as "patent pending". So the best method to keep competitors at bay is to seal the knowledge within your company and stay constantly one step ahead by steadily developing enhancements. Please understand that I can't tell you more.
A number of companies have reduced their expenditures for information technology. What's your view of the financial opportunities for a company like yours in 2009?
Our HitEngine offers a real ROI. Let's take a look at an eCommerce seller: By integrating our technology, users will spend perhaps 50% less time searching for their products. Customer loyalty rises and people will come back even if the store doesn't offer the lowest prices on all products. Our estimate is a 10% to 15% turnover increase, so perhaps our technology could help certain online stores avoid a significant sales collapse in 2009.
A number of companies -- Attivio and Clearwell Systems -- are providing interfaces that answer questions. Search is in the system, but the emphasis is on the value added software that eliminates the need for a user to run queries to find an answer. What's your view of these types of systems? How will that trend in search influence your innovations in 2009?
Semantic analysis is much more difficult for European languages than for English. We are already able to integrate thesauri or ontologies. I have not yet seen any system which meets the requirements for semantic analysis, at least when you take a closer look at the system. But storing information in a quick and accessible way is even more important for this approach, as you have to consider much more than only keywords and positions. So I can imagine that our optimized index structure may also help in this field to achieve adequate results in an acceptable amount of time.
As you look toward 2009, what are the major trends that you see gaining momentum?
I feel that in 2009 and the years after we will have many opportunities as a smaller company. The sales overhead of global players is huge, and Fortune 500 companies will learn to trust smart, reliable technology from smaller companies for relatively little money. Perhaps VCs and shareholders will withdraw their money and look for other opportunities.
Technologically, I feel that graphical analysis of information could become more important than in the past.
If a person wants more information about your system, where should they go? What should they do?
We would be delighted to welcome them at www.weitkamper.com, or they can email me at weitkamper@weitkamper.de. We can easily prepare a demo with their specific content to show our HitEngine or other applications.
ArnoldIT Comment
The technology performs a number of interesting processes very rapidly. We explored the clusterpat.com service, which includes patents from the US and European Patent Offices and had good results from our test queries. If you want more information, navigate to the company's Web site at www.weitkamper.com. The system warrants a close look.
Stephen E. Arnold, January 12, 2009