Sorry, Experts. NLP and Semantic Technology Will Guarantee Higher Precision and Recall

August 3, 2015

I read “5 Reasons for Developers to Build NLP and Semantic Search Skills.” It is one of those bait and switch write ups. The title suggests that NLP and semantic search are “skills.” The content of the article presents, without factual substantiation, assertions about the differences between Web search and enterprise search. The reality is that the two are more closely related than they appear to some “experts.” Neither works particularly well for reasons which have to do with cost control, system management, and focus. The technology is, from my point of view, more stable than some search mavens believe.

Here’s the passage I highlighted in pale mauve because I did not have purple:

It at times feels magical that Search engines know, with unbelievable accuracy, exactly what you are looking for. This is the result of a heavy investment in NLP and Semantic technologies. These, along with speech-recognition, have the potential of enabling a future where search will transform into a smart machine that uses “connected knowledge” to answer significantly complex questions – a Star Trek Computer may not be too far away after all, if Amit Singhal – brain behind Google’s search engine evolution, has be to believed.

More remarkable was the introduction of the phrase “big, unstructured data.” I also found the notion of “commoditization” of data science amusing.

One idea warrants comment. The article calls attention to the “widening gap between enterprise search platforms and general purpose search engines.” Anyone who has attempted to index Web content quickly learns that it is a fruit basket which is in the process of being shoved into a blender. The notion of the enterprise search system was to process the content normally found inside an organization. But guess what? After the first query run on a restricted domain of content, the user says, “I need access to Internet content.” The “gap” is one of perception. The underlying components of the system and much of the gee whiz technology are similar. The fact that the Web search systems have been shaped to handle a restricted body of content is lost on some folks. Similarly the enterprise search systems are struggling because they, like Web search engines, cannot handle efficiently and automatically certain types of content. In short, neither works particularly well.

Will NLP and semantic skills help a developer? Not too much if the search system is not focused, the content is not reliable, and the functions are poorly defined. Forget big data, little data, and unstructured or structured data. Get the basics wrong and one has a lousy search system, which, sadly, is more common than not.

Stephen E Arnold, August 3, 2015
