Semantic Search Laid Bare

December 17, 2008

Yahoo’s Search Blog has an interesting interview with Dr. Rudi Studer. The focus is semantic search technologies, which are all the rage in enterprise search and Web search circles. Dr. Studer, according to Yahoo:

is no stranger to the world of semantic search. A full professor in Applied Informatics at University of Karlsruhe, Dr. Studer is also director of the Karlsruhe Service Research Institute, an interdisciplinary center designed to spur new concepts and technologies for a services-based economy. His areas of research include ontology management, semantic web services, and knowledge management. He has been a past president of the Semantic Web Science Association and has served as Editor-in-Chief of the journal Web Semantics.

If you are interested in semantics, you will want to read and save the full text of this interview. I want to highlight three points that caught my attention and then–in my goosely manner–offer several observations.

First, Dr. Studer suggests that “lightweight semantic technologies” have a role to play. He said:

In the context of combining Web 2.0 and Semantic Web technologies, we see that the Web is the central point. In terms of short term impact, Web 2.0 has clearly passed the Semantic Web, but in the long run there is a lot that Semantic Web technologies can contribute. We see especially promising advancements in developing and deploying lightweight semantic approaches.

The key idea is lightweight, not giant semantic engines grinding in a lights out data center.

Second, Dr. Studer asserts:

Once search engines index Semantic Web data, the benefits will be even more obvious and immediate to the end user. Yahoo!’s SearchMonkey is a good example of this. In turn, if there is a benefit for the end user, content providers will make their data available using Semantic Web standards.

The idea is that, in this chicken-and-egg problem, it will be the Web page creators’ job to make use of semantic tags.

Finally, Dr. Studer identifies tools as an issue. He said:

One problem in the early days was that the tool support was not as mature as for other technologies. This has changed over the years as we now have stable tooling infrastructure available. This also becomes apparent when looking at this year’s Semantic Web Challenge. Another aspect is the complexity of some of the technologies. For example, understanding the foundation of languages such as OWL (being based on Description Logics) is not trivial. At the same time, doing useful stuff does not require being an expert in Logics – many things can already be done exploiting only a small subset of all the language features.

I am no semantic expert. I have watched several semantic centric initiatives enter the world and–somewhat sadly–watched them die. Against this background, let me offer three observations:

  1. Semantic technology is plumbing and like plumbing, semantic technology should be kept out of sight. I want to use plumbing in a user friendly, problem free setting. Beyond that, I don’t want to know anything about plumbing. Lightweight or heavyweight, I think some other users may feel the same way. Do I look at inverted indexes? Do you?
  2. The notion of putting the burden on Web page or content creators is a great idea, but it won’t work. When I analyzed the five Programmable Search Engine inventions by Ramanathan Guha as part of an analysis for the late, great Bear Stearns, it was clear that Google’s clever Dr. Guha assumed most content would not be tagged in a useful way. Sure, if content was properly tagged, Google could ingest that information. But the core of the PSE invention was Google’s method for taking the semantic bull by the horns. If Dr. Guha’s method works, then Google will become the semantic Web because it will do the tagging work that most people cannot or will not do.
  3. The tools are getting better, but I don’t think users want to use tools. Users want life to be easy, and figuring out how to create appropriate tags, inserting them, and conforming to “standards” such as they are is no fun. The tools will thrill developers and leave most people cold. Check out the tools section at a hardware store. What do you see? Hobbyists and tinkerers and maybe a few professionals who grab what they need and head out. Semantic tools will be like hardware: of interest to a few.

In my opinion, the Google – Guha approach is the one to watch. The semantic Web is gaining traction, but it is in its infancy. If Google jump starts the process by saying, “We will do it for you”, then Google will “own” the semantic Web. Then what? The professional semantic Web folks will grouse, but the GOOG will ignore the howls of protest. Why do you think the GOOG hired Dr. Guha from IBM Almaden? Why did the GOOG create an environment for Dr. Guha to write five patent applications, file them on the same day, and have the USPTO publish five documents on the same day in February 2007? No accident tell you I.

Stephen Arnold, December 17, 2008


