Featured

Understanding Intention: Fluffy and Frothy with a Few Factoids Folded In
Introduction

One of my colleagues forwarded me a document called “Understanding Intention: Using Content, Context, and the Crowd to Build Better Search Applications.” To get a copy of the collateral, one has to register at this link. My colleague wanted to know what I thought about this “book” by Lucidworks. That’s what Lucidworks calls the 25-page marketing brochure. I read the PDF file and was surprised at what I perceived as fluff, not facts or a cohesive argument.


The topic was of interest to my colleague because we completed a five-month review and analysis of “intent” technology. In addition to writing two white papers about using smart software to figure out and tag (index) content, we had to immerse ourselves in computational linguistics, multi-language content processing technology, and semantic methods for “making sense” of text.

The Lucidworks document purported to explain intent in terms of content, context, and the crowd. The company explains:

With the challenges of scaling and storage ticked off the to-do list, what’s next for search in the enterprise? This ebook looks at the holy trinity of content, context, and crowd and how these three ingredients can drive a personalized, highly-relevant search experience for every user.

The presentation of “intent” was quite different from what I expected. The details of figuring out what content “means” were sparse. The focus was not on methodology but on selling integration services. I found this interesting because I have Lucidworks in my list of open source search vendors. These are companies which repackage open source technology, create some proprietary software, and assist organizations with engineering and integration services.

The book was an explanation anchored in buzzwords, not the type of detail we expected. After reading the text, I was not sure how Lucidworks would go about figuring out what an utterance might mean. The intent-centric systems we reviewed over the course of five months followed several different paths.

Some companies relied upon statistical procedures. Others used dictionaries and pattern matching. A few combined multiple approaches in a content pipeline. Our client, a firm based in Madrid, focused on computational linguistics plus a series of procedures which combined proprietary methods with “modules” to perform specific functions. The idea for this approach was to raise the accuracy of intent identification from between 65 and 80 percent to a level approaching and often exceeding 90 percent. For text processing in multi-language corpuses, the Spanish company’s approach was a breakthrough.

I was disappointed but not surprised that Lucidworks’ approach was breezy. One of my colleagues used the word “frothy” to describe the information in the “Understanding Intention” document.

As I read the document, which struck me as a shotgun marriage of generalizations and examples of use cases in which “intent” was important, I made some notes.

Let me highlight five of the observations I made. I urge you to read the original Lucidworks’ document so you can judge the Lucidworks’ arguments for yourself.

Imitation without Attribution

My first reaction was that Lucidworks had borrowed conceptually from ideas articulated by Dr. Gregory Grefenstette in his book Search Based Applications: At the Confluence of Search and Database Technologies. You can purchase this 2011 book on Amazon at this link. Lucidworks’ approach, unlike Dr. Grefenstette’s, borrowed some of the analysis but did not include the detail which supports the increasing importance of using search as a utility within larger information access solutions. Without detail, the Lucidworks document struck me as a description of the type of solutions that a company like Tibco is now offering its customers.

Read more »

Interviews

Bitext: Exclusive Interview with Antonio Valderrabanos

On a recent trip to Madrid, Spain, I was able to arrange an interview with Dr. Antonio Valderrabanos, the founder and CEO of Bitext. The company has its primary research and development group in Las Rosas, the high-technology complex a short distance from central Madrid. The company has an office in San Francisco and a number of computational linguists and computer scientists in other locations. Dr. Valderrabanos worked at IBM in an adjacent field before moving to Novell and then making the jump to his own start up. The hard work required to invent a fundamentally new way to make sense of human utterance is now beginning to pay off.


Dr. Antonio Valderrabanos, founder and CEO of Bitext. Bitext’s business is growing rapidly. The company’s breakthroughs in deep linguistic analysis solve many difficult problems in text analysis.

Founded in 2008, the firm specializes in deep linguistic analysis. The systems and methods invented and refined by Bitext improve the accuracy of a wide range of content processing and text analytics systems. What’s remarkable about the Bitext breakthroughs is that the company supports more than 40 different languages, and its platform can support additional languages with sharp reductions in the time, cost, and effort required by old-school systems. With the proliferation of intelligent software, Bitext, in my opinion, puts the digital brains in overdrive. Bitext’s platform improves the accuracy of many smart software applications, ranging from customer support to business intelligence.

In our wide ranging discussion, Dr. Valderrabanos made a number of insightful comments. Let me highlight three and urge you to read the full text of the interview at this link. (Note: this interview is part of the Search Wizards Speak series.)

Linguistics as an Operating System

One of Dr. Valderrabanos’ most startling observations addresses the future of operating systems for increasingly intelligent software and applications. He said:

Linguistic applications will form a new type of operating system. If we are correct in our thought that language understanding creates a new type of platform, it follows that innovators will build more new things on this foundation. That means that there is no endpoint, just more opportunities to realize new products and services.

Better Understanding Has Arrived

Some of the smart software I have tested is unable to understand what seem to be very basic instructions. The problem, in my opinion, is context. Most smart software struggles to figure out the knowledge cloud which embraces certain data. Dr. Valderrabanos observed:

Search is one thing. Understanding what human utterances mean is another. Bitext’s proprietary technology delivers understanding. Bitext has created an easy to scale and multilingual Deep Linguistic Analysis or DLA platform. Our technology reduces costs and increases user satisfaction in voice applications or customer service applications. I see it as a major breakthrough in the state of the art.

If he is right, the Bitext DLA platform may be one of the next big things in technology. The reason? As smart software becomes more widely adopted, the need to make sense of data and text in different languages becomes increasingly important. Bitext may be the digital differential that makes the smart applications run the way users expect them to.

Snap In Bitext DLA

Advanced technology like Bitext’s often comes with a hidden cost. The advanced system works well in a demonstration or a controlled environment. When that system has to be integrated into “as is” systems from other vendors or from a custom development project, difficulties can pile up. Dr. Valderrabanos asserted:

Bitext DLA provides parsing data for text enrichment for a wide range of languages, for informal and formal text and for different verticals to improve the accuracy of deep learning engines and reduce training times and data needs. Bitext works in this way with many other organizations’ systems.

When I asked him about integration, he said:

No problems. We snap in.

I am interested in Bitext’s technical methods. In the last year, the company has signed deals with companies like Audi, Renault, a large mobile handset manufacturer, and an online information retrieval company.

When I thanked him for his time, he was quite polite. But he did say, “I have to get back to my desk. We have received several requests for proposals.”

Las Rosas looked quite a bit like Silicon Valley when I left the Bitext headquarters. Despite the thousands of miles separating Madrid from the US, interest in Bitext’s deep linguistic analysis is surging. Silicon Valley has its charms, and now it has a Bitext US office for what may be the fastest growing computational linguistics and text analysis system in the world. Worth watching this company I think.

For more about Bitext, navigate to the firm’s Web site at www.bitext.com.

Stephen E Arnold, April 11, 2017

Latest News

Online Fraud: Loophole, Soft Freeze, Hard Freeze, or Just Business in 2017?

Consumer Alert: A credit freeze may not do what one expects. After the Equifax data loss, I promptly put a credit freeze on my unwanted “credit rating”... Read more »

October 19, 2017

IBM Watson: A Fashionista Never Says Sorry

I haven’t paid much attention to IBM Watson since the popular media began poking holes in IBM’s marketing assertions. However, I feel compelled to highlight... Read more »

October 19, 2017

Equifax Hack Has Led to Oracle Toughening Up

A timely piece in SearchOracle reports that its parent company has muscled up in response to its recent troubles. The article is “Machine Learning... Read more »

October 19, 2017

Google Eyes AI in China

Google wants to elbow its way into Chinese markets. As the most populous country in the world, China is more or less untouched by Google, and it is a big, potential... Read more »

October 19, 2017

Palantir Technologies: Valuation Doubts?

I read “Palantir Will Struggle to Hold On to $20 Billion Valuation, Study Says.” Interesting stuff because beating up on hapless Silicon Valley companies is... Read more »

October 18, 2017

IBM: A Roll Downhill?

I read “IBM Reports Marginal Dip in Quarterly Revenue.” I think the headline qualifies as politically correct information. I don’t have the energy to point... Read more »

October 18, 2017

Listen Notes: A Podcast Search Engine

I read “This Podcast Search Engine Helps You Discover New Shows You’ll Love.” The search engine is called Listen Notes. The content pool is 350,000 podcasts... Read more »

October 18, 2017

Big Data Might Just Help You See Through Walls

It might sound like science fiction or, worse, like a waste of time, but scientists are developing cameras that can see around corners. More importantly, these visual... Read more »

October 18, 2017

Social Media Should Be Social News

People are reading news more than ever due to easy information access on the Internet.  While literacy rates soar, where people are reading news stories has changed... Read more »

October 18, 2017

Google Management: Technology and Management Wobbles

Bloomberg has another “Google is a bum” story. Remember the cheerful days of 2002 when Google was the cat’s pajamas. 23 skidoo now invokes the bum’s rush. Navigate... Read more »

October 17, 2017

