Featured

Concept Searching Update

Founded in 2002, Concept Searching provides licensees with search, auto-classification, taxonomy management and metadata tagging solutions. You can download a fact sheet about the privately held firm here. The software can be used on an individual user’s computer or mounted on servers to deliver enterprise solutions. The company’s secret sauce is its statistical metadata generation and classification method. The technology uses concept extraction and compound term processing to facilitate access to unstructured information. The company operates from Stevenage in Hertfordshire. A list of the Concept Searching offices is here.

The company emphasizes the value of lateral thinking, and its approach to content analysis implements numerical recipes to find insights and linkages within unstructured text.

When I updated my profile for this company earlier this year, I noted that the firm had signed Portal Solutions, a company that focuses on things Microsoft. The idea is to make it possible for a user to search for “insider dealing” and retrieve documents where that bound phrase does not appear but a related phrase such as “insider trading” does appear. This type of system appeals to intelligence officers and financial analysts. Concept Searching’s methods generated lists of related topics. You can see an example of the system in action by navigating to this page. I ran several test queries and the interface provided useful information and suggestions about other related content in the processed corpus. A screen shot of the output appears below:

[Screen shot: concept hmso]
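To make the “insider dealing” idea concrete, here is a minimal sketch of one way a statistical system could link two phrases that never share a document. It is not Concept Searching’s method, which I have not seen. The corpus, the phrase list, the function names, and the 0.25 threshold are all my own assumptions for illustration. The sketch treats two phrases as related when the words around them overlap, then expands the query with the related phrases.

```python
from collections import Counter
import math

# Toy corpus and phrase list; everything here is made up for illustration.
DOCS = {
    "d1": "regulator fines broker for insider trading in bank shares",
    "d2": "court hears insider dealing case against broker over bank shares",
    "d3": "regulator opens insider trading probe",
    "d4": "football club signs new striker",
}
PHRASES = ["insider trading", "insider dealing", "football club"]


def context_vector(phrase):
    """Count the words that appear in the same documents as the phrase."""
    own_words = set(phrase.split())
    vector = Counter()
    for text in DOCS.values():
        if phrase in text:
            vector.update(w for w in text.split() if w not in own_words)
    return vector


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def related_phrases(query, threshold=0.25):
    """Return known phrases whose contexts resemble the query's contexts."""
    query_vector = context_vector(query)
    return [
        p for p in PHRASES
        if p != query and cosine(query_vector, context_vector(p)) >= threshold
    ]


def concept_search(query):
    """Match documents containing the query phrase or any related phrase."""
    terms = [query] + related_phrases(query)
    return sorted(d for d, text in DOCS.items() if any(t in text for t in terms))
```

With this toy data, a search for “insider dealing” also pulls back the two documents that only mention “insider trading”, and the football story stays out. A production system would need far more text and a sturdier similarity measure than this.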

Concept Searching is a Microsoft and Fast Search partner. The idea is that Concept Searching’s technology complements and in some cases extends the search and content processing services in Microsoft products. In May 2009, the company sponsored a best practices site for Microsoft SharePoint. The deal involves a number of companies, including SchemaLogic, KnowledgeLake, and K2 Technologies among others. The site is supposed to go live in the next couple of weeks, but I don’t have a URL or a date at this time.

The company had a busy May, signing deals with Allianz Global Investors, Directory, and AT&T Government Solutions.

For me, the most interesting system that Concept Searching offers is its ability to generate and classify terms found in SharePoint documents into a taxonomy. The company has prepared a brief video that demonstrates this functionality. You can find the video here. The company’s approach does not require a separate index. Microsoft Enterprise Search can use the outputs of the Concept Searching system. I noted two “uniques” in the narrative to the video, and I remain skeptical about categorical affirmatives. I think the bound phrase extraction and the close integration with SharePoint are benefits. I just bristle when I hear “unique”, which means the one and only anywhere in the world. Broad assertion in my experience.

[Figure: Concept Searching block diagram]

Concept Searching’s president, Martin Garland, said here:

Our intellectual property is still unique as we are the only statistical search technology able to identify multi-word patterns within text and insert these patterns directly into the index at ingestion or creation time. We call this “Compound Term Processing”.
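Mr. Garland’s description, multi-word patterns written into the index at ingestion time, can be illustrated with a small sketch. This is my own toy version and not the company’s code. It uses raw pair frequency as a stand-in for whatever statistical test the real system applies, and the documents, function names, and `min_count` parameter are assumptions I made up for the example.

```python
from collections import Counter, defaultdict

# Made-up documents for illustration only.
DOCS = {
    "a": "insider trading probe widens",
    "b": "new insider trading rules",
    "c": "trading floor insider speaks",
}


def tokenize(text):
    """Lowercase and split on whitespace; a real system would do much more."""
    return text.lower().split()


def find_compounds(docs, min_count=2):
    """Return adjacent word pairs seen at least min_count times in the corpus."""
    pairs = Counter()
    for text in docs.values():
        tokens = tokenize(text)
        pairs.update(zip(tokens, tokens[1:]))
    return {" ".join(pair) for pair, count in pairs.items() if count >= min_count}


def build_index(docs, min_count=2):
    """Build an inverted index holding single words and detected compounds."""
    compounds = find_compounds(docs, min_count)
    index = defaultdict(set)
    for doc_id, text in docs.items():
        tokens = tokenize(text)
        for token in tokens:
            index[token].add(doc_id)
        # Insert each detected compound as its own index term at ingestion time.
        for first, second in zip(tokens, tokens[1:]):
            term = f"{first} {second}"
            if term in compounds:
                index[term].add(doc_id)
    return index


INDEX = build_index(DOCS)
```

The payoff is that a lookup on the compound “insider trading” returns documents a and b only, while a lookup on the single word “insider” returns all three. Document c contains both words but not the bound phrase, so it does not pollute the compound’s posting list.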

Last week I sat in a briefing given by a member of Microsoft’s enterprise search team. I thought I heard descriptions of functions that struck me as quite similar to those performed by Concept Searching and such companies as Interse in Copenhagen, Denmark.

I think it will be fruitful to watch what features and functions are baked into the upcoming Microsoft Fast ESP version of the old Fast Search & Transfer system. Remember: the roots of Fast Search stretch deep to 1997, a year before Google poked its nose from the Stanford baby crib.

Partners like Concept Searching have invested significant resources in Microsoft technologies. Will Microsoft respect these investments, or will Microsoft, in an effort to recoup its $1.23 billion investment, take a hard line toward such companies as Concept Searching?

I am on the fence regarding this issue.

Stephen Arnold, July 3, 2009

Interviews

Francois Schiettecatte, FS Consulting

Through a mutual contact, I reconnected with François Schiettecatte, a search engine expert with other computer wizard skills in his toolbox. Mr. Schiettecatte worked on a natural language processing project in the late 1990s. He shifted focus and was a co-founder of Feedster.com. He told me that he had contributed to a number of interesting projects and revealed that he was working on a new search and content processing system.

Mr. Schiettecatte consented to an interview. I spoke with him on May 29, and I put the full text of our discussion in the ArnoldIT.com Search Wizards Speak collection. You can find that series of interviews with influential figures in search and content processing here.

Mr. Schiettecatte and I had a lively discussion and he offered some interesting insights into the trajectory of search and retrieval. Let me highlight two of his comments and invite you to read the full text of the discussion here.

In response to a question about the new start ups entering the search and retrieval sector, Mr. Schiettecatte said:

You can apply different search approaches to different data sets, for example traditional search as well as NLP search to the same set of documents. And certain data sets will lend themselves more naturally to one type of search as opposed to another. Of course user needs are key here in deciding what approaches work best for what data. I would also add that we have only begun to tackle search and that there is much more to be done, and new companies are usually the ones willing to bring new approaches to the market.

We then discussed the continuing interest in semantic technology. On this matter, Mr. Schiettecatte offered:

More data to search usually means more possible answers to a search, which means that I have to scan more to arrive at the answer, improved precision will go a long way to address that issue. A more pedestrian way to put this is: “I don’t care if there are about a million results, I just want the one result”. Also, having the search engine take the extra step in extracting data out of the search results and synthesizing that data into a meaningful table/report. This is more complicated but it has the potential to really save time in the long run.

For more information about Mr. Schiettecatte’s most recent project, read the full text of the interview here.

Stephen Arnold, June 2, 2009

Profiles

Vyre: Software, Services, Search, and More

A happy quack to the reader who sent me a link to Vyre, whose catchphrase is “dissolving complexity.” The last time I looked at the company, I had pigeonholed it as a consulting and content management firm. The news release my reader sent me pointed out that the company has a mid market enterprise search solution that is now at version 4.x. I am getting old, or at least too sluggish to keep pace with content management companies that offer search solutions. My recollection is that Crown Point moved in this direction. I have a rather grim view of CMS because software cannot help organizations create high quality content or at least what I think is high quality content.

The Wikipedia description of Vyre matches up with the information in my archive:

VYRE, now based in the UK, is a software development company. The firm uses the catchphrase “Enterprise 2.0” to describe its enterprise solutions for business. The firm’s core product is Unify. The Web-based service allows users to build applications and manage content. The company has technology that manages digital assets. The firm’s clients in 2006 included Diageo, Sony, Virgin, and Lowe and Partners. The company has reinvented itself several times since the late 1990s, doing business as NCD (Northern Communication and Design), Salt, and then Vyre.

You can read the Wikipedia summary here. You can read a 2006 Butler Group analysis here. My old link worked this evening (March 5, 2009), but click quickly. In my files I had a link to a Vyre presentation, but it was not about search. It is dated 2008, and you may find the information useful. The Vyre presentations are here. The link worked for me on March 5, 2009. The only name I have in my archive is Dragan Jotic. Other names of people linked to the company are here. Basic information about the company’s Web site is here. Traffic, if these data are correct, seems to be trending down. I don’t have current interface examples. The wiki for the CMS service is here. (Note: the company does not use its own CMS for the wiki. The wiki system is from MediaWiki. No problem for me, but I was curious about this decision because the company offers its own CMS system.) You can get a taste of the system here.

Administrative Vyre screen.

After a bit of poking around, it appears that Vyre has turned up the heat on its public relations activities. The Seybold Report here presented a news story / news release about the search system here. I scanned the release and noted this passage as interesting for my work:

…version 4.4 introduces powerful new capabilities for performing facetted and federated searching across the enterprise. Facetted search provides immediate feedback on the breakdown of search results and allows users to quickly and accurately drill down within search results. Federated search enables users to eradicate content silos by allowing users to search multiple content repositories.
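For readers who have not bumped into the term, here is a minimal sketch of what “breakdown of search results” and “drill down” mean in practice. It is not Vyre’s implementation. The result records, the field names, and both functions are my own assumptions for the example.

```python
from collections import Counter

# Hypothetical result set for the query "sales"; fields are assumptions.
RESULTS = [
    {"title": "Q1 sales report", "type": "pdf", "dept": "sales"},
    {"title": "Q2 sales report", "type": "pdf", "dept": "sales"},
    {"title": "Sales deck", "type": "ppt", "dept": "marketing"},
    {"title": "Sales contract", "type": "doc", "dept": "legal"},
]


def facet_counts(results, fields):
    """Count how many results fall under each value of each facet field."""
    return {field: Counter(r[field] for r in results) for field in fields}


def drill_down(results, field, value):
    """Keep only the results whose facet field equals the chosen value."""
    return [r for r in results if r[field] == value]


# Show the breakdown, let the user pick "pdf", then recompute the breakdown.
counts = facet_counts(RESULTS, ["type", "dept"])
pdf_only = drill_down(RESULTS, "type", "pdf")
pdf_counts = facet_counts(pdf_only, ["dept"])
```

The counts are what the user sees beside each facet label. Each click narrows the result list and the counts are recomputed on what remains, which is why this gets expensive on large result sets.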

Vyre includes a taxonomy management function with its search system, if I read the Seybold article correctly. I gravitate to the taxonomy solution available from Access Innovations, a company run by my friends and colleagues Marje Hlava and Jay Ven Eman. Their system generates ANSI standard thesauri and word lists, which is the sort of stuff that revs my engine.

Endeca has been the pioneer in the enterprise sector for “guided navigation,” which is a synonym in my mind for faceted search. Federated search gets into the functions that I associate with Bright Planet, Deep Web Technologies, and Vivisimo, among others. I know that shoving large volumes of data through systems that both facetize content and federate it is computationally intensive. Consequently, some organizations are not able to put the plumbing in place to make these computationally intensive systems hum like my grandmother’s sewing machine.
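The federated half of the story can be sketched the same way: send one query to several repositories at the same time and merge what comes back. This is a toy, not any vendor’s product. The repository names, titles, scores, and the `search_repository` connector are hypothetical, and the sketch assumes that scores from different sources can be compared, which is rarely true in real life.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical repositories mapping document titles to relevance scores.
REPOSITORIES = {
    "cms": {"Brand guidelines": 0.9, "Press kit": 0.4},
    "file_share": {"Brand logo files": 0.7},
    "email_archive": {"Re: brand launch": 0.5},
}


def search_repository(name, query):
    """Stand-in for a connector that queries one repository's own search."""
    needle = query.lower()
    return [
        (score, title, name)
        for title, score in REPOSITORIES[name].items()
        if needle in title.lower()
    ]


def federated_search(query, limit=10):
    """Query every repository in parallel and merge the hits by score."""
    with ThreadPoolExecutor() as pool:
        batches = list(
            pool.map(lambda name: search_repository(name, query), REPOSITORIES)
        )
    merged = [hit for batch in batches for hit in batch]
    return sorted(merged, reverse=True)[:limit]
```

Calling `federated_search("brand")` returns one hit from each of the three repositories, ordered by score. The slowest repository sets the response time, and every query fans out to every source, which is part of why the plumbing is hard to get humming.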

If you are in the market for a CMS and asset management company’s enterprise search solution, give the company’s product a test drive. You can buy a report from UK Data about this company here. I don’t have solid pricing data. My notes to myself record the phrase, “Sensible pricing.” I noted that the typical cost for the system begins at about $25,000. Check with the company for current license fees.

Stephen Arnold, March 6, 2009

Latest News

Ask, Search Marketing, and NASCAR – A Winner as NASCAR Attendance Drops

Michael Smith’s “Ask’s Next Question” provides a useful case study of search engine marketing. The story appeared in Sports Business Journal and reviews...

July 3, 2009

Vivisimo Lands HCPro Deal

Vivisimo has a new client. HCPro, a health care regulation and revenue cycle management company, will use the Velocity platform to power MedicareFind.com. That...

July 3, 2009

Sci Tech Publishers: Doom Looms for the Tech Challenged

Quite interesting essay by Michael Nielsen: “Is Scientific Publishing about to Be Disrupted?” The answer is soon. I don’t agree. Sci tech publishing is in...

July 3, 2009

OECD Data Diving

Short honk: Want to explore OECD country data? First, read the BBC story “Exploring the OECD Web Site” then navigate to OECD Explorer. Ideal for those who want...

July 3, 2009

UFC 2010: HTML 5, Air, and Silverlight

Mary Jo Foley opened my eyes to a new unlimited online fighting battle in 2010. Her story with a lamentably cryptic headline appeared on June 11, 2009 as “Microsoft...

July 3, 2009 | 1 Comment

Google Books: Legal Eagles Carry On… Er, Carrion

It is official. An investigation of Google Books is stumbling forward. You can get the “D” word on this by reading DOJ Confirms Antitrust Investigation Into Google...

July 2, 2009

YAGG: Google App Engine Takes a Long Lunch

Short honk: Fresh from its criticism of Microsoft’s approach to data centers, Google makes clear its engineering approach to reliability. TechCrunch reported “Google...

July 2, 2009