TeezIR BV: Coquette or Quitter
September 26, 2008
For my first visit to Utrecht, once a bastion of Catholicism and now Rabobank stronghold, I wanted to speak with interesting companies engaged in search and content processing. After a little sleuthing, I spotted TeezIR, a company founded in November 2007. When I tried to track down one of the principals–Victor Van Tol, Arthus Van Bunningen, and Thijs Westerveld–I was stonewalled. I snagged a taxi and visited the firm’s address (according to trusty Google Maps) at Kanaalweg 17L-E, Building A6. I made my way to the second floor but was unable to rouse the TeezIR team. I am hesitant to say, “No one was there”. My ability to peer through walls after a nine hour flight is limited.
I asked myself, “Is TeezIR playing the role of a coquette or has the aforementioned team quit the search and content processing business?” I still don’t know. At the Hartmann conference, no one had heard of the company. One person asked me, “How did you find out about the company?” I just smiled my crafty goose grin and quacked in an evasive manner.
The trick was that one of my two or three readers of this Web log sent me a snippet of text and asked me if I knew of the company:
Proprietary, state-of-the-art technology is information retrieval and search technology. Technology is built up in “standardized building blocks” around search technology.
So, let’s assume TeezIR is still in business. I hope this is true because search, content processing, and the enterprise systems dependent on these functions are in a sorry state. Cloud computing is racing toward traditional on premises installations the way hurricanes line up to smash the American south east. There’s a reason cloud computing is gaining steam–on premises installations are too expensive, too complicated, and too much of a drag on a struggling business. I wanted to know if TeezIR was the next big thing.
My research revealed that TeezIR had some ties to the University of Twente. One person at the Hartmann conference told me that he thought he heard that a company in Ede had been looking for graduate students to do some work in information retrieval. Beyond that tantalizing comment, I was able to find some references to Antal van den Bosch, who has expertise in entity extraction. I found a single mention of Luuk Kornelius, who may have been an interim officer at TeezIR and at one time a laborer in the venture capital field with Arengo (no valid link found on September 16, 2009). Other interesting connections emerged from TeezIR to Arjen P. de Vries (University of Twente), Thomas Roelleke (once hooked up with Fredhopper), and Guido van’t Noordende (security specialist). Adding these names to the management team here, TeezIR looked like a promising start up.
Since I was drawing a blank on getting people affiliated with TeezIR to speak with me, I turned to my own list of international search engines here, and I began the thrilling task of hunting for needles in hay stacks. I tell people that research for me is a matter of running smart software. But for TeezIR, the work was the old-fashioned variety.
Overview
Here’s what I learned:
First, the company seemed to focus on the problem of locating experts. I grudgingly must call this a knowledge problem. In a large organization, it can be hard to find a colleague who, in theory, knows an answer to another employee’s question. Here’s a depiction of the areas in which TeezIR is (was?) working:
Second, TeezIR’s approach is (was?) to make search an implicit function. Like me, the TeezIR team realized that by itself search is a commodity, maybe a non starter in the revenue department. Here’s how TeezIR relates content processing to the problem of finding experts:
The idea behind the expert finder function seems to be that the company will process email, résumés, documents, and other content. The people, or more precisely entities, are identified and tagged. The TeezIR technology would then match people to a topic or problem. Representative operations, based on the modest amount of data I could locate in my hay stacks, include matching résumés to job specifications, matching a broader notion from a strategy document to the people known to the system, and retrieving a document by a person matching a query.
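To make the matching idea concrete, here is a minimal sketch of one common way to match people to a topic: pool each person's text into a profile and rank the profiles against a query by TF-IDF cosine similarity. This is an assumption for illustration only, not TeezIR's method. The names `tokenize` and `rank_experts` and the sample data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and strip simple punctuation from each word."""
    words = (w.strip(".,;:!?()\"'").lower() for w in text.split())
    return [w for w in words if w]


def rank_experts(documents, query):
    """Rank people against a query.

    documents: list of (person, text) pairs (hypothetical input shape).
    Returns a list of (person, score) pairs, best match first.
    """
    # Pool everything written by or about a person into one profile.
    profiles = defaultdict(Counter)
    for person, text in documents:
        profiles[person].update(tokenize(text))

    # Count in how many profiles each term appears.
    n = len(profiles)
    df = Counter()
    for counts in profiles.values():
        df.update(counts.keys())

    def idf(term):
        # Smoothed inverse document frequency; rare terms weigh more.
        return math.log((1 + n) / (1 + df[term])) + 1.0

    def vector(counts):
        return {t: c * idf(t) for t, c in counts.items()}

    def cosine(a, b):
        dot = sum(w * b.get(t, 0.0) for t, w in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    q = vector(Counter(tokenize(query)))
    scored = [(p, cosine(q, vector(c))) for p, c in profiles.items()]
    return sorted(scored, key=lambda pair: -pair[1])


if __name__ == "__main__":
    docs = [
        ("alice", "XML retrieval and ranking of XML documents"),
        ("bob", "quarterly sales forecast and budget"),
        ("alice", "query languages for structured retrieval"),
    ]
    for person, score in rank_experts(docs, "XML retrieval"):
        print(person, round(score, 3))
```

A real system would also need the entity tagging step that ties a document to a person. The sketch simply assumes that link is already known.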
Here’s a diagram showing the alleged work flow for the TeezIR system:
I was able to locate a comparison of the “classic” expert profiling functions, which I assume means the types of operations performed by a company like Tacit Software, a sector leader.
Other factoids asserted in the documents I located suggested that the TeezIR system performs these operations:
- Seamless processing of a range of “daily life” documents
- Analysis of relationships from email and other message traffic
- Support for multi lingual document sets
Third, the company seems (seemed?) to be probing the “sentiment” analysis sector. “Sentiment analysis” means that software “reads” user generated content or information in magazines and trade journals. During the “read” process, the software determines if the intent of the document is positive, negative, or neutral. Now this is a vast over simplification–my particular field of expertise to be frank–but marketers with the zeal of those under the age of 20 find these broad signals useful. The figure below shows an example of the TeezIR sentiment report:
Based on the flimsy sources to which I had access, this TeezIR system can process sentiment in “near” real time. The TeezIR system also performs Web log monitoring, which is a tough problem when high traffic systems are generating the log data. In order to figure out if the sentiment is running for you or against you, TeezIR performs:
- Linguistic analysis
- Aggregation
- Source identification, which is useful for public relations planning.
TeezIR, if I translated the Dutch document I located, calls this “opinion mining,” which I think is a nifty phrase. (If the company is shutting down, I might usurp this Utrecht coinage).
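The three steps listed above (linguistic analysis, aggregation, source identification) can be sketched in a few lines. This is a deliberately crude illustration under stated assumptions: a tiny hand-made word list stands in for real linguistic analysis, and the names `POSITIVE`, `NEGATIVE`, `score_text`, and `aggregate_by_source` are hypothetical. Nothing here reflects how TeezIR actually implemented opinion mining.

```python
from collections import defaultdict

# Hypothetical toy lexicon; a real system would use far richer analysis.
POSITIVE = {"good", "great", "excellent", "love", "reliable"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "broken"}

LABELS = {1: "positive", -1: "negative", 0: "neutral"}


def score_text(text):
    """Return 1, -1, or 0 depending on which word list dominates."""
    words = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos > neg:
        return 1
    if neg > pos:
        return -1
    return 0


def aggregate_by_source(posts):
    """Count positive, negative, and neutral posts per source.

    posts: iterable of (source, text) pairs (hypothetical input shape).
    """
    totals = defaultdict(lambda: {"positive": 0, "negative": 0, "neutral": 0})
    for source, text in posts:
        totals[source][LABELS[score_text(text)]] += 1
    return dict(totals)


if __name__ == "__main__":
    sample = [
        ("blog-a", "Great product, I love it"),
        ("blog-a", "Terrible support"),
        ("forum-b", "It arrived on Tuesday"),
    ]
    for source, counts in aggregate_by_source(sample).items():
        print(source, counts)
```

Grouping the counts by source is what makes the output useful for public relations planning: it shows where the negative posts come from, not just how many there are.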
Fourth, a wizard named Thijs Westerveld may (note the conditional) have been working with TeezIR before the company refused to speak with me. Mr. Westerveld is involved with a government project called INEX. The idea is for companies to participate in a TREC like project to demonstrate the efficacy of their different image and multimedia search projects. I located one example of a result of a query on a system that may be part of the TeezIR line up. Here it is:
I found a draft technical paper on a university server that seemed to suggest TeezIR was developing a system that used a Weibull distribution and other jazzy methods to generate a ranked list of similar images and then cluster those results by concept. The draft document linked to this screen shot which may be representative of the TeezIR approach:
Then, according to the draft document, the image similarity function delivers this type of output:
Technology
I want to caution the reader that this discussion of the TeezIR technology must be verified by your own research. Maybe the company will talk with you too? What I did was gather up the fruits of my manual research and make a list of the technologies that were mentioned or used by the individuals whom I could tie to the TeezIR operation. I don’t think this summary is more than 85 percent accurate, but it does provide a useful glimpse of the thought processes in operation based on the research results I could locate. Herewith is what I think is germane to TeezIR:
- The engine is / was PF Tijah, an XML system which seems to be a bit of a hobby horse for the University of Twente crowd. If this link is valid, you can get more information here. PF Tijah, as you may know, is part of the open source MonetDB / XQuery code. More information is here.
- TeezIR may have used the PathFinder (PF) XQuery compiler
- TeezIR tweaked the open source code to permit biasing longer pertinent segments in a source document and using “relevance priors”. I take it that “relevance priors” are a type of comparative analysis of a document with the indexed collection. The idea is to make certain that important short items are included with the longer text segments / documents.
- The TeezIR relevance method uses an infinite random walk method for entity extraction. If you are a fan of equations, my research suggests that this expression is relevant to the TeezIR approach. Remember, you need to verify my open source research:
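For readers who prefer code to equations, here is a rough sketch of what an infinite random walk for ranking people can look like. It is a generic illustration, not TeezIR's expression. The assumptions are these: documents and people form a two-sided graph linked by authorship, the walker starts at documents in proportion to their relevance to a query, and at every step there is a fixed chance (`restart`) of jumping back to a starting document. The function name, parameters, and sample data are all hypothetical.

```python
def infinite_random_walk(doc_relevance, authorship, restart=0.15, iterations=100):
    """Approximate the long-run visit probability of each person.

    doc_relevance: {doc: non-negative relevance score for the query}
    authorship:    {doc: [people associated with the doc]}
    Returns {person: probability}; higher means a stronger candidate.
    """
    # Build an undirected graph with document nodes and person nodes.
    neighbours = {}
    for doc, people in authorship.items():
        for person in people:
            neighbours.setdefault(("doc", doc), []).append(("person", person))
            neighbours.setdefault(("person", person), []).append(("doc", doc))

    # The walk starts (and restarts) at documents, weighted by relevance.
    linked_docs = [name for (kind, name) in neighbours if kind == "doc"]
    total = sum(doc_relevance.get(doc, 0.0) for doc in linked_docs)
    if total <= 0:
        raise ValueError("at least one linked document needs positive relevance")
    start = {node: 0.0 for node in neighbours}
    for doc in linked_docs:
        start[("doc", doc)] = doc_relevance.get(doc, 0.0) / total

    # Power iteration: repeat the one-step update until it settles.
    prob = dict(start)
    for _ in range(iterations):
        nxt = {node: restart * start[node] for node in neighbours}
        for node, p in prob.items():
            share = (1.0 - restart) * p / len(neighbours[node])
            for nb in neighbours[node]:
                nxt[nb] += share
        prob = nxt

    return {name: p for (kind, name), p in prob.items() if kind == "person"}


if __name__ == "__main__":
    relevance = {"d1": 0.9, "d2": 0.1, "d3": 0.0}
    authors = {"d1": ["alice"], "d2": ["bob"], "d3": ["alice", "bob"]}
    for person, p in sorted(infinite_random_walk(relevance, authors).items()):
        print(person, round(p, 3))
```

The restart jump is what keeps an unbounded walk tied to the query: without it, the result would depend only on the shape of the graph and not on which documents were relevant.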
Technology
Working through the handful of Dutch documents I located using my list of international search engines, I found that TeezIR seems to have concentrated on methods for retrieving arbitrary parts of XML documents. The methods do not require a definition at indexing time of what constitutes a document or a document field. The company relied on the NEXI query language, which is a sub set of the XQuery language. If my translation is accurate, TeezIR can / did support text search plus traditional database querying.
The company has / had a crawler. Among its features was the ability to download only content and then perform classification of coherent pieces of a Web page. One document I reviewed said:
For blogs we investigate how to separately classify individual posts and their comments in a generic manner that needs no (or little) adaptation for new blogs or when the structure or layout of a blog has changed.
Net Net
I am not sure if TeezIR gets classified as a profile or a case history of a dead search system. I will post this as a feature for now. My conclusion is that TeezIR is / was following the path of using open source to perform certain core functions. The company’s technical wizards then crafted bits and pieces of proprietary software to build out the unique features of the system.
Oh, if someone from TeezIR reads this, I am still interested in getting your input on the topics raised in this diary-like Web log post. I would really appreciate it if you abandoned your Google-like approach to inquiries and answered the email, fax, and telephone calls I made to your company during the week of September 15, 2008. The world doesn’t need another company operating with Google-style methods for analysts.
Stephen Arnold, September 26, 2008
Comments
4 Responses to “TeezIR BV: Coquette or Quitter”
Dear Arnold,
First of all, thank you for the interest in and all the free publicity around Teezir! Although we quite regularly hit the news with our innovative solutions around Opinion Mining, Expert Finding, People Matching and Audio Detection (for instance: the launch of our Online Audio Detection Tool for Buma/Stemra last week) as well as our prize-winning concepts (for instance our double nomination for the Accenture Innovation Awards), it’s always nice to see people go to great lengths to get to know our company.
And you for sure did your homework! However, despite your implications in that direction, Teezir is more alive than ever and we are indeed located in our brand new office in Utrecht. Very unfortunate that we missed you when you visited our offices, but next time you are around please give us a heads up, so we can make sure we have a beer chilled for you and we can discuss our mutual views on the Search Arena.
Best regards and see you in Utrecht!
Victor van Tol
Funnily enough they were presenting at a British Computer Society event in London on the 23rd. The main focus was on the reputation management/opinion mining side – not enough data to confidently say whether it works or not.
My company has been using Tacit Software’s expert profiling functions for more than two years on a global basis. I don’t agree with the assessment listed on the comparison of the “classic” expert systems. Tacit’s solution automatically creates user profiles and matches the top experts based on users’ questions. It doesn’t require users to do any manual maintenance of their profiles. All the connections are shown in the experts’ email inbox, so they can easily reply and help the person who asked the question. One successful connection through Tacit’s software saved my company millions of dollars and new success stories are still frequently being submitted.
aOne,
Thanks. I have liked Tacit Software since I saw a demo a number of years ago. Glad you like a social system that monitors and makes data useful.
Stephen Arnold, October 7, 2008