Progress: From Selling NLP to Providing NLP Services

December 11, 2017

Years ago, Progress Software owned an NLP system. I recall conversations with natural language processing wizards from EasyAsk. Larry Harris developed a natural language system in 1999 or 2000. Progress purchased EasyAsk in 2005 if memory serves. I interviewed Craig Bassin in 2010 as part of my Search Wizards Speak series.

The recollection I have is that Progress divested itself of EasyAsk in order to focus on enterprise applications other than NLP. No big deal. Software companies are bought and sold every day.

However, what makes this recollection interesting to me is the information in “Beyond NLP: 8 Challenges to Building a Chatbot.” Progress went from a software company that owned an NLP system to a company advising people like me how challenging a chatbot system can be to build and make work. (I noted that the Wikipedia entry for Progress does not mention the EasyAsk acquisition and subsequent de-acquisition. Either small potatoes or a milestone best jumped over, I assume.)

Presumably it is easier to advise and get paid to implement than to fund and refine an NLP system like EasyAsk. If you are not familiar with EasyAsk, the company positions itself in eCommerce site search with its “cognitive eCommerce” technology. EasyAsk’s capabilities include voice-enabled natural language mobile search. This strikes me as a capability similar to that of a chatbot as I understand the concept.

History is history, one of my high school teachers once observed. Let’s move on.

What are the eight challenges to standing up a chatbot which sort of works? Here they are:

  1. The chat interface
  2. NLP
  3. The “context” of the bot
  4. Loops, splits, and recursions
  5. Integration with legacy systems
  6. Analytics
  7. Handoffs
  8. Character, tone, and persona.

As I review this list, I note that I have to decide whether to talk to a chatbot or type into a box so a “customer care representative” can assist me. The “representative,” one assumes, is a smart software robot.

I also notice that the bot has to have context. Think of a car dealer and the potential customer. The bot has to know that I want to buy a car. Seems obvious. But okay.

“Loops, splits, and recursions.” Frankly I have no idea what this means. I know that chatbot-centric companies use jargon. I assume that this means “programming” so the NLP system returns a semi-on-point answer.
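My guess is that “loops, splits, and recursions” is chatbot jargon for dialog flow control: re-prompting on unrecognized input (a loop), branching on what the user says (a split), and nesting sub-dialogs (recursion). A minimal, entirely hypothetical Python sketch using the car dealer example:

```python
# Hypothetical sketch of chatbot dialog flow control. Every name and
# dialog step here is invented for illustration.

def ask(prompt, answers, replies):
    """Loop: re-prompt until the reply matches a known answer."""
    while True:
        print(prompt)
        reply = next(replies)          # stand-in for live user input
        if reply in answers:
            return answers[reply]
        # unrecognized input falls through, and the loop re-prompts

def car_dialog(replies):
    # Split: branch on the user's choice.
    choice = ask("New or used?", {"new": "new", "used": "used"}, replies)
    if choice == "new":
        # Recursion: descend into a nested sub-dialog.
        return finance_dialog(replies)
    return "browse used inventory"

def finance_dialog(replies):
    plan = ask("Lease or buy?", {"lease": "lease", "buy": "buy"}, replies)
    return f"set up {plan} paperwork"

# One garbled reply triggers the re-prompt loop, then the dialog branches.
print(car_dialog(iter(["huh?", "new", "lease"])))
```

If that reading is right, the jargon boils down to ordinary control flow wearing a conversational costume.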

Integration with legacy systems and handoffs seem to be similar to me. I would just call these two steps “integration” and be done with it.

The “character, tone, and persona” seems to apply to how the chatbot sounds; for example, the nasty, imperious tone of a Kroger automated check out system.

Net net: Progress is in the business of selling advisory and engineering services. The reason, in my opinion, was that Progress could not crack the code to make search and retrieval generate expected payoffs. Like some Convera executives, selling search related services was a more attractive path.

Stephen E Arnold, December 11, 2017

Google: Headphones and Voice Magic

November 23, 2017

I read two interesting articles. Each provides some insight into Google’s effort to put the NLP and chatbot doggies in an Alphabet corral.

The first article is “Google SLING: An Open Source Natural Language Parser.” To refresh your memory, “SLING is a combination of recurrent neural networks and frame based parsing.”

The second article is “Google Introduces Dialogflow Enterprise Edition, a Conversational Apps Building Platform.” The idea is to provide “a platform for building voice and text conversational applications.”

Both are interesting because each seems to be “free.” I won’t drag you, gentle reader, through the consequences of building a solution around a “free” Google service. One Xoogler watches me like a hawk to remind me that Google doesn’t treat people in a will-o’-the-wisp way. Okay. Let’s move on, shall we?

Both of these systems advance Google’s quest to become the Big Dog of where computer interaction is heading. Both are germane to the wireless headphones Google introduced. These headphones, unlike other wireless alternatives, can translate. Hence, the largesse of free NLP and voice freebies.

I read “Trying Out Google’s Translating Headphones,” which informed me that:

The most important thing you should know about Pixel Buds is that their full features only work with Google’s newest smartphone, the Pixel 2.

Is this vendor lock in?

I learned from the write up:

To be honest, it’s not exactly real-time. You call up the feature by tapping on your right earbud and asking Google Assistant to “help me speak” one of 40 languages. The phone will then open the Google Translate app. From there, the phone will translate what it hears into the language of your choice, and you’ll hear it in your ear.

Not quite like Star Trek’s universal translator, suggests the article. I noted this statement:

it’s worth realizing that the Pixel Buds are more than just a pair of headphones. They’re an early illustration of what we can expect from Google, which will try to make products that stand out from the pack with unusual artificial intelligence services such as translation.

A demo. I suppose doing the lock in tactic with a demo is better than basing lock in on vaporware.

Then there are the free APIs. These, of course, will never go away or cost too much money. The headphones are $159. The phone adds another $649.

Almost free.

Stephen E Arnold, November 23, 2017

Natural Language Processing: Tomorrow and Yesterday

October 31, 2017

I read “Will Natural Language Processing Change Search as We Know It?” The write up is by a search specialist who, I believe, worked at Convera. The Search Technologies’ Web site asserts:

He was the architect and inventor of RetrievalWare, a ground-breaking natural-language based statistical text search engine which he started in 1989 and grew to $50 million in annual sales worldwide. RetrievalWare is now owned by Microsoft Corporation.

I think Fast Search acquired a portion of Convera. When Microsoft purchased Fast Search, the Convera technology was part of the deal. When Convera faded, one rumor I captured in 2007 was that some of the Convera technology was used by Ntent, formed as the result of a merger between Convera Corporation and Firstlight ERA. If accurate, the history of Convera is fascinating with Excalibur, ConQuest, and Allen & Co. in the mix.

In the “Will Natural Language Processing Change Search As We Know It” blog post, I noted these points:

  • Intranets incorporating NLP, semantic search and AI can fuel chatbots as well as end-to-end question-answering systems that live on top of search. It is a truly semantic extension to the search box with far-reaching implications for all types of search.
  • With NLP, enterprise knowledge contained in paper documentation can be encoded in a machine-readable format so the machine can read, process and understand it enough to formulate an intelligent response.
  • it’s good to know about established tool sets and methodologies for developing and creating effective solutions for use cases like technical support. But like all development projects, take care to create the tools based on mimicking the responses of actual human domain experts. Otherwise, you may run into the proverbial development problem of “garbage in, garbage out” which has plagued many such expert system initiatives.

Mr. Nelson is painting a reasonable picture about the narrow use of widely touted technologies. In fact, the promise of NLP has been part of enterprise search marketing for decades.

What I found interesting was the Convera document called “Accurate Search: What a Concept,” published by Convera in 2002. I noted this passage on page 4 of the document:

Concept Search capitalizes on the richness of language, with its multiple term meanings, and transforms it from a problem into an advantage. RetrievalWare performs natural language processing and search term expansion to paraphrase queries, enabling retrieval of documents that contain the specific concepts requested rather than just the words typed during the query while also taking advantage of its semantic richness to rank documents in results lists. RetrievalWare’s powerful pattern search abilities overcome common errors in both content and queries, resulting in greater recall and user satisfaction.
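The “search term expansion” Convera describes can be illustrated with a toy sketch. The synonym table and matching logic below are invented for illustration; RetrievalWare’s actual semantic networks were far richer:

```python
# Toy illustration of search term expansion ("concept search"): each query
# term is expanded with synonyms before matching, so a document can match
# the concept requested rather than just the words typed. The synonym
# table is invented.

SYNONYMS = {
    "car": {"car", "auto", "automobile", "vehicle"},
    "cheap": {"cheap", "inexpensive", "affordable"},
}

def expand(query):
    """Replace each query word with its synonym set (or itself)."""
    terms = set()
    for word in query.lower().split():
        terms |= SYNONYMS.get(word, {word})
    return terms

def matches(query, document):
    """True if any expanded query term appears in the document."""
    return bool(expand(query) & set(document.lower().split()))

# A keyword-only match would fail here: neither "cheap" nor "car" appears.
print(matches("cheap car", "affordable automobiles for sale"))  # True
```

The toy version also shows where the trouble starts: expansion boosts recall, but an over-broad synonym table drags in false drops.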

I find the shift from a broad solution to a more narrow solution interesting. In the span of 15 years, the technology of search seems to be struggling to deliver.

Perhaps consulting and engineering services are needed to make search “work”? Contrast search with mobile phone technology. Progress has been evident. For search, success narrows to improving “documentation” and “customer support.”

Has anyone tried to reach PayPal’s customer support or United Airlines’ customer support? Try it. United was at one time a “customer” of Convera’s. From my point of view, United Airlines’ customer service has remained about the same over the last decade or two.

Enterprise search, broad or narrow, remains a challenge for marketers and users in my opinion. NLP, I assume, has arrived after a long journey. For a free profile of Convera, check out this link.

Stephen E Arnold, October 31, 2017

Free Language Learning Resources That Are Not Duolingo

October 25, 2017

For those who wish to learn a foreign language, the fun and engaging Duolingo has become a go-to free resource, offering courses in more than 20 languages. However, it is not the only game in town; MakeUseOf gives us a rundown of “The Best (Completely Free) Language Learning Alternatives to Duolingo.” Writer Briallyn Smith tells us:

One of the reasons some people are looking to move away from Duolingo is the recent introduction of in-app purchases. While the core functions of Duolingo are still free, the purchase options can give learners a boost when playing games — much like the bonuses and extra lives you can purchase on Bejewelled or other addictive gaming apps. Learners may become frustrated when they are prevented from working on a specific language skill or accomplishment because they ran out of ‘hearts’ or need to purchase ‘gems’ to continue. Other in-app purchases allow users to remove ads from their learning experience and to download offline content.

While there’s nothing wrong with Duolingo charging fees for its services, it can be frustrating for those looking for a truly free resource. Other language learners simply do not enjoy learning through games. This is especially true for those who require industry-specific vocabulary or who already have a background in the language. Thankfully, there are many other online resources available for language learners. While you won’t get the same kind of program as Duolingo for free, you can easily use these resources to put together a language learning strategy that works well for you.

Before getting to her list, Smith takes a moment to advocate for paid language-learning services, like Babbel. Basically, if you are serious about your language studies and can afford it, they are worth the investment.

The resource list begins with a compound entry, Online Communities; included here are Fluent in 3 Months, /r/LanguageLearning, and The Polyglot Club. Then there are Rhino Spike, Mango Languages, the Yojik Website, and, of course, YouTube (with a list of 10 suggested channels). Furthermore, Smith supplies a link to OpenCulture for even more options. See the article for more about each of these entries.

Cynthia Murrell, October 25, 2017

Chatbots: The Negatives Seem to Abound

September 26, 2017

I read “Chatbots and Voice Assistants: Often Overused, Ineffective, and Annoying.” I enjoy a Hegelian antithesis as much as the next veteran of Dr. Francis Chivers’ course in 19th Century European philosophy. Unlike some of Hegel’s fans, I am not confident that taking the opposite tack in a windstorm is the ideal tactic. There are anchors, inboard motors, and distress signals.

The article points out that quite a few people are excited about chatbots. Yep, sales and marketing professionals earn their keep by creating buzz in order to keep their often-exciting corporate Beneteau 22’s afloat. With VCs getting pressured by those folks who provided the cash to create chatbots, the motive force for an exciting ride hurtles onward.

The big Sillycon Valley guns have been arming the chatbot army for years. Anyone remember when Ask Jeeves pivoted from a human-powered question answering machine into a customer support recruit? My recollection is that the recruit washed out, but your mileage may vary.

With Amazon, Facebook, Google, IBM, and dozens and dozens of companies with hard-to-remember names on the prowl, chatbots are “the future.” The Infoworld article is a thinly disguised “be careful” presented as “real news.”

That’s why I wrote a big exclamation point and the words “A statement from the Captain Obvious crowd” next to this passage:

Most of us have been frustrated with misunderstandings as the computer tries to take something as imprecise as your voice and make sense of what you actually mean. Even with the best speech processing, no chatbots are at 100-percent recognition, much less 100-percent comprehension.

I am baffled by this fragment, but I am confident it makes sense to those who were unaware that dealing with human utterances is a pretty tough job for the Googlers and Microsofties who insist their systems are the cat’s pajamas. Note this indication of Infoworld quality in thought and presentation:

It seems very inefficient to resort to imprecise systems when we have [sic]

Yep, an incomplete thought which my mind filled in as saying, “humans who can maybe answer a question sometimes.”

The technology for making sense of human utterance is complex. Baked into the systems is the statistical imprecision that undermines the value of some chatbot implementations.

My thought is that Infoworld might help its readers if it were to answer questions like these:

  • What are the components of a chatbot system? Which introduce errors on a consistent basis?
  • How can error rates of chatbot systems be reduced in a cost-effective manner?
  • What companies are providing third party software to the big girls and boys in the chatbot dodge ball game?
  • Which mainstream chatbot systems have exemplary implementations? What are the metrics behind “exemplary”?
  • What companies are making chatbot technology strides for languages other than English?

I know these questions are somewhat more difficult to answer than a write up which does little more than make Captain Obvious roll his eyes. Perhaps Infoworld and its experts might throw a bone to their true believers?

Stephen E Arnold, September 26, 2017

New Beyond Search Overflight Report: The Bitext Conversational Chatbot Service

September 25, 2017

Stephen E Arnold and the team at Arnold Information Technology analyzed Bitext’s Conversational Chatbot Service. The BCBS taps Bitext’s proprietary Deep Linguistic Analysis Platform to provide greater accuracy for chatbots regardless of platform.

Arnold said:

The BCBS augments chatbot platforms from Amazon, Facebook, Google, Microsoft, and IBM, among others. The system uses specific DLAP operations to understand conversational queries. Syntactic functions, semantic roles, and knowledge graph tags increase the accuracy of chatbot intent and slotting operations.

One unique engineering feature of the BCBS is that specific Bitext content processing functions can be activated to meet specific chatbot applications and use cases. DLAP supports more than 50 languages. A BCBS licensee can activate additional language support as needed. A chatbot may be designed to handle English language queries, but Spanish, Italian, and other languages can be activated via an instruction.
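For readers unfamiliar with “intent and slotting,” a generic rule-based sketch may help. This is an invented illustration of the concept, not Bitext’s DLAP API; the intent patterns and slot extractor are hypothetical:

```python
# Generic sketch of chatbot "intent and slotting": classify what the user
# wants (the intent) and extract the parameters (the slots). Rule-based
# stand-in for illustration only; real systems use linguistic analysis
# or machine learning rather than regular expressions.

import re

INTENTS = {
    "book_flight": re.compile(r"\b(fly|flight)\b"),
    "check_weather": re.compile(r"\bweather\b"),
}

CITY_SLOT = re.compile(r"\bto (\w+)")

def parse(utterance):
    text = utterance.lower()
    # Intent: first pattern that matches, else "unknown".
    intent = next(
        (name for name, pat in INTENTS.items() if pat.search(text)),
        "unknown",
    )
    # Slot: a destination city, if one is present.
    slot = CITY_SLOT.search(text)
    return {"intent": intent,
            "slots": {"city": slot.group(1)} if slot else {}}

print(parse("Book a flight to Madrid"))
# {'intent': 'book_flight', 'slots': {'city': 'madrid'}}
```

The accuracy claims in the report concern exactly these two steps: getting the intent right and filling the slots correctly despite variable phrasing.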

Dr. Antonio Valderrabanos said:

People want devices that understand what they say and intend. BCBS (Bitext Chatbot Service) allows smart software to take the intended action. BCBS allows a chatbot to understand context and leverage deep learning, machine intelligence, and other technologies to turbo-charge chatbot platforms.

Based on ArnoldIT’s test of the BCBS, tagging accuracy jumped as much as 70 percent. Another surprising finding was that the time required to perform content tagging decreased.

Paul Korzeniowski, a member of the ArnoldIT study team, observed:

The Bitext system handles a number of difficult content processing issues easily. Specifically, the BCBS can identify negation regardless of the structure of the user’s query. The system can understand double intent; that is, a statement which contains two or more intents. BCBS is one of the most effective content processing systems to deal correctly with variability in human statements, instructions, and queries.

Bitext’s BCBS and DLAP solutions deliver higher accuracy, enable more reliable sentiment analyses, and even output critical actor-action-outcome content processing. Such data are invaluable for disambiguation in Web and enterprise search applications, content processing for discovery solutions used in fraud detection and law enforcement, and consumer-facing mobile applications.

Because Bitext was one of the first platform solution providers, the firm was able to identify market trends and create its unique BCBS service for major chatbot platforms. The company focuses solely on solving problems common to companies relying on machine learning and, as a result, has done a better job delivering such functionality than other firms have.

A copy of the 22 page Beyond Search Overflight analysis is available directly from Bitext at this link on the Bitext site.

Once again, Bitext has broken through the barriers that block multi-language text analysis. The company’s Deep Linguistics Analysis Platform supports more than 50 languages at a lexical level and more than 20 at a syntactic level, and makes the company’s technology available for a wide range of applications in Big Data, Artificial Intelligence, social media analysis, text analytics, and the new wave of products designed for voice interfaces supporting multiple languages, such as chatbots. Bitext’s breakthrough technology solves many complex language problems and integrates machine learning engines with linguistic features. Bitext’s Deep Linguistics Analysis Platform allows seamless integration with commercial, off-the-shelf content processing and text analytics systems. Bitext’s innovative system reduces costs for processing multilingual text for government agencies and commercial enterprises worldwide. The company has offices in Madrid, Spain, and San Francisco, California. For more information, visit www.bitext.com.

Kenny Toth, September 25, 2017

Quote to Note: The Role of US AI Innovators

March 24, 2017

I read “Opening a New Chapter of My Work in AI.” After working through the non-AI output, I concluded that money beckons the fearless leader, Andrew Ng. However, I did note one interesting quotation in the apologia:

The U.S. is very good at inventing new technology ideas. China is very good at inventing and quickly shipping AI products.

What this suggests to me is that the wizard of AI sees the US as good at “ideas,” and China as an implementer. A quick implementer at that.

My take is that China sucks up intangibles like information and ideas. Then China cranks out products. Easy to monetize things, avoiding the question, “What’s the value of that idea, pal?”

Ouch. On the other hand, software is the new electricity. So who is Thomas Edison? I wish I “knew”.

Stephen E Arnold, March 24, 2017

Search Like Star Trek: The Next Frontier

February 28, 2017

I enjoy the “next frontier”-type article about search and retrieval. Consider “The Next Frontier of Internet and Search,” a write up in the estimable “real” journalism site Huffington Post. As I read the article, I heard “Scotty, give me more power.” I thought I heard 20 somethings shouting, “Aye, aye, captain.”

The write up told me, “Search is an everyday part of our lives.” Yeah, maybe in some demographics and geo-political areas. In others, search is associated with finding food and water. But I get the idea. The author, Gianpiero Lotito of FacilityLive, is talking about people with computing devices, an interest in information like finding a pizza, and the wherewithal to pay the fees for zip zip connectivity.

And the future? I learned:

The future of search appears to be in the algorithms behind the technology.

I understand algorithms applied to search and content processing. Since humans are expensive beasties, numerical recipes are definitely the go-to way to perform many tasks. Humans, however, are still needed for fact checking, curating, and indexing textual information. The math does not work the way some expect when algorithms are applied to images and other rich media. Hey, sorry about that false drop in the face recognition program used by Interpol.

I loved this explanation of keyword search:

The difference among the search types is that: the keyword search only picks out the words that it thinks are relevant; the natural language search is closer to how the human brain processes information; the human language search that we practice is the exact matching between questions and answers as it happens in interactions between human beings.

This is as fascinating as the fake information about Boolean being a probabilistic method. What happened to string matching and good old truncation? The truism about people asking questions is intriguing as well. I wonder how many mobile users ask questions like, “Do manifolds apply to information spaces?” or “What is the chemistry allowing multi-layer ion deposition to take place?”

Yeah, right.
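For the record, the “good old truncation” of classic Boolean systems is plain string matching, no probabilities involved. A minimal sketch, with an invented word list:

```python
# Sketch of old-school truncation matching: a trailing "*" on a query term
# matches any word sharing that stem. Pure string matching, as found in
# classic Boolean search systems; the word list is invented.

def truncation_match(term, word):
    if term.endswith("*"):
        return word.startswith(term[:-1])
    return word == term

words = ["compute", "computer", "computing", "commute"]
print([w for w in words if truncation_match("comput*", w)])
# ['compute', 'computer', 'computing']
```

Deterministic, explainable, and decades old, which is exactly the author’s point about calling Boolean “probabilistic.”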

The write up drags in the Internet of Things. Talk to one’s Alexa or one’s thermostat via Google Home. That’s sort of natural language; for example, Alexa, play Elvis.

Here’s the paragraph I highlighted in NLP crazy red:

Ultimately, what the future holds is unknown, as the amount of time that we spend online increases, and technology becomes an innate part of our lives. It is expected that the desktop versions of search engines that we have become accustomed to will start to copy their mobile counterparts by embracing new methods and techniques like the human language search approach, thus providing accurate results. Fortunately these shifts are already being witnessed within the business sphere, and we can expect to see them being offered to the rest of society within a number of years, if not sooner.

Okay. No one knows the future. But we do know the past. There is little indication that mobile search will “copy” desktop search. Desktop search is a bit like digging in an archeological pit on Cyprus: Fun, particularly for the students and maybe a professor or two. For the locals, there often is a different perception of the diggers.

There are shifts in “the business sphere.” Those shifts are toward monopolistic, choice limited solutions. Users of these search systems are unaware of content filtering and lack the training to work around the advertising centric systems.

I will just sit here in Harrod’s Creek and let the future arrive courtesy of a company like FacilityLive, an outfit engaged in changing Internet searching so I can find exactly what I need. Yeah, right.

Stephen E Arnold, February 28, 2017

CREST Includes Additional Documents

January 22, 2017

Short honk: The CIA has responded to a Freedom of Information Act request and posted additional documents. These are searchable via the CREST system. The content is accessible at this link.

Stephen E Arnold, January 22, 2017

Smart Software: An Annoying Flaw Will Not Go Away

December 22, 2016

“Machines May Never Master the Distinctly Human Elements of Language” captures one of the annoying flaws in smart software. Machines are not human—at least not yet. The write up explains that “intelligence is mysterious.” Okay, big surprise for some of the Sillycon Valley crowd.

The larger question is, “Why are some folks skeptical about smart software and its adherents’ claims?” Part of the reason is that publications have to show some skepticism after cheerleading. Another reason is that marketing presents a vision of reality which often runs counter to one’s experience. Try using that voice stuff in a noisy subway car. How’s that working out?

The write up caught my attention with this statement from the Google, one of the leaders in smart software’s ability to translate human utterances:

“Machine translation is by no means solved. GNMT can still make significant errors that a human translator would never make, like dropping words and mistranslating proper names or rare terms, and translating sentences in isolation rather than considering the context of the paragraph or page.”

The write up quotes a Stanford wizard as saying:

She [wizard Li] isn’t convinced that the gap between human and machine intelligence can be bridged with the neural networks in development now, not when it comes to language. Li points out that even young children don’t need visual cues to imagine a dog on a skateboard or to discuss one, unlike machines.

My hunch is that quite a few people know that smart software works in some use cases and not in others. The challenge is to get those with vested interests and the marketing millennials to stick with “as is” without confusing the “to be” with what can be done with available tools. I am all in on research computing, but the assertions of some of the cheerleaders spell S-I-L-L-Y. Louder now.

Stephen E Arnold, December 22, 2016
