Real Time Translation: Chatbots Emulate Sci Fi

April 16, 2018

The language barrier is still one of the world’s major problems. Translation software, such as Google Translate, is reasonably accurate, but it still makes mistakes that native speakers are needed to correct. Instantaneous translation is still a pipe dream, but the technology is improving with each new development. Mashable shares a current translation innovation, and it belongs to Google: “Google Pixel Buds Vs. Professional Interpreters: Which Is More Accurate?”

Apple angered many devout users when it removed the headphone jack from its phones and pushed Bluetooth headphones called AirPods instead. They have the same sleek, minimalist design as other Apple products, but Google’s Pixel Buds are far superior to them because of real time translation, or so we are led to believe. Author Raymond Wong tested the Pixel Buds’ translation features at the United Nations to see how they fared against professional translators. He and his team tested French, Arabic, and Russian. The Pixel Buds did well with simple conversations, but certain words and phrases caused errors.

One hilarious example was when Google translated the Arabic for “I want to eat salad” to “I want to eat power” in English. When it comes to real time translation, the experts are still the best because they can understand the context and other intricacies, such as tone, that come with human language. The professional translators liked the technology, but it still needs work:

“Ayad and Ivanova both agreed that Pixel Buds and Google Translate are convenient technologies, but there’s still the friction of holding out a Pixel phone for the other person to talk into. And despite the Pixel Buds’ somewhat speedy translations, they both said it doesn’t compare to a professional conference interpreters, who can translate at least five times faster Google’s cloud.”

Keep working on those foreign language majors, kids. Marketing noses in front of products that deliver, in my view.

Whitney Grace, April 17, 2018

Udpipe for R: An NLP Solution for R

March 19, 2018

Natural language processing is a huge component of not only big data but also machine learning, wherever reading and understanding language is involved. NLP matters not only for English but for any language whose speakers want to take advantage of AI and machine learning. R-bloggers takes a look at another new tool in the area of NLP and its updated features in “Natural Language Processing For Non-English Languages With Udpipe.”

We learned from the write up:

“BNOSAC is happy to announce the release of the udpipe R package (https://bnosac.github.io/udpipe/en) which is a Natural Language Processing toolkit that provides language-agnostic ‘tokenization’, ‘parts of speech tagging’, ‘lemmatization’, ‘morphological feature tagging’ and ‘dependency parsing’ of raw text. Next to text parsing, the package also allows you to train annotation models based on data of ‘treebanks’ in ‘CoNLL-U’ format…”

The udpipe R package supports a wide range of languages from Latin-based to Asian, including Slavonic, Russian, Vietnamese, Finnish, Turkish, Serbian, Japanese, Basque, and Greek.

BNOSAC designed the udpipe R package for designers who build NLP applications that integrate parts of speech tags, tokens, morphological features, and dependency parsing output. BNOSAC really wants non-English speaking designers to take advantage of the upgrade for their applications, because tools like this should not be restricted to English only communities.
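For readers who have not seen the “CoNLL-U” format mentioned in the quote, here is a minimal sketch in Python of reading such annotations. It is not the udpipe R package and does not call it. It only assumes the standard ten tab-separated CoNLL-U columns (ID, FORM, LEMMA, UPOS, XPOS, FEATS, HEAD, DEPREL, DEPS, MISC). The sample sentence, the `Token` class, and the helper names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Token:
    """One CoNLL-U token row; field order follows the ten standard columns."""
    id: str
    form: str
    lemma: str
    upos: str
    xpos: str
    feats: dict
    head: str
    deprel: str
    deps: str
    misc: str


def parse_feats(raw):
    """Turn 'Number=Plur|Tense=Pres' into a dict; '_' means no features."""
    if raw == "_":
        return {}
    return dict(pair.split("=", 1) for pair in raw.split("|"))


def parse_conllu(text):
    """Return a list of sentences, each a list of Token objects."""
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            # Comment lines carry sentence metadata; skipped in this sketch.
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            raise ValueError(f"expected 10 columns, got {len(cols)}: {line!r}")
        cols[5] = parse_feats(cols[5])
        current.append(Token(*cols))
    if current:
        sentences.append(current)
    return sentences


def row(*cols):
    """Build one tab-separated row (keeps the sample readable)."""
    return "\t".join(cols)


# Hypothetical sample, hand-written for this sketch.
SAMPLE = "\n".join([
    "# text = Dogs bark.",
    row("1", "Dogs", "dog", "NOUN", "_", "Number=Plur", "2", "nsubj", "_", "_"),
    row("2", "bark", "bark", "VERB", "_", "Number=Plur|Tense=Pres", "0", "root", "_", "SpaceAfter=No"),
    row("3", ".", ".", "PUNCT", "_", "_", "2", "punct", "_", "_"),
    "",
])

sentences = parse_conllu(SAMPLE)
for token in sentences[0]:
    print(token.form, token.lemma, token.upos, token.head, token.deprel)
```

Each row carries the token, its lemma, its part of speech tag, its morphological features, and its dependency head and relation, which are the same outputs the write up lists.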

Whitney Grace, March 19, 2018

The New York Times Wants to Change Your Google Habit

March 1, 2018

Sunday is a slightly less crazy day. I took time to scan “The Case Against Google.” I had the dead tree edition of the New York Times Magazine for February 25, 2018. You may be able to access this remarkable hybridization of Harvard MBA think, DNA engineered to stick pins in Google, and good old establishment journalism toasted at Yale University.

The author is a wildly successful author. Charles Duhigg loves his family, makes time for his children, writes advice books, and immerses himself in a single project at a time. When he comes up for air, he breathes deeply of Google outputs in order to obtain information. If the Google fails, he picks up the phone. I assume those whom he calls answer the ring tone. I find that most people do not answer their phones, but that’s another habit which may require analysis.

I worked through the write up. I noted three things straight away.

First, the timeline structure of the story is logical. However, leaving it up to me to figure out which date matched which egregious Google action was annoying. Fortunately, after writing The Google Legacy, Google Version 2.0, and Google: The Digital Gutenberg, I had the general timeline in mind. Other readers may not.

Second, the statement early in the write up reveals the drift of the essay’s argument. The best selling author of The Power of Habit writes:

Within computer science, this kind of algorithmic alchemy is sometimes known as vertical search, and it’s notoriously hard to master. Even Google, with its thousands of Ph.D.s, gets spooked by vertical-search problems.

I am not into arguments about horizontal and vertical search. I ran around that mulberry tree with a number of companies, including a couple of New York investment banks. Been there. Done that. There are differences in how the components of a findability solution operate, but the basic plumbing is similar. One must not confuse search with the specific technology employed to deliver a particular type of output. Want to argue? First, read The New Landscape of Search, published by Pandia before the outfit shut down. Then, send me an email with your argument.

Third, cherry picking from Google’s statements makes it possible to paint a somewhat negative picture of the great and much loved Google. With more than 60,000 employees, many blogs, many public presentations, oodles of YouTube videos, and a library full of technical papers and patents, the Google folks say a lot. The problem is that finding a quote to support almost any statement is not hard; it just takes persistence. Here’s an example:

We absolutely do not make changes to our search algorithm to disadvantage competitors.

Progress: From Selling NLP to Providing NLP Services

December 11, 2017

Years ago, Progress Software owned an NLP system. I recall conversations with natural language processing wizards from EasyAsk. Larry Harris developed a natural language system in 1999 or 2000. Progress purchased EasyAsk in 2005 if memory serves. I interviewed Craig Bassin in 2010 as part of my Search Wizards Speak series.

The recollection I have is that Progress divested itself of EasyAsk in order to focus on enterprise applications other than NLP. No big deal. Software companies are bought and sold every day.

However, what makes this recollection interesting to me is the information in “Beyond NLP: 8 Challenges to Building a Chatbot.” Progress went from a software company that owned an NLP system to a company which is advising people like me how challenging a chatbot system can be to build and make work. (I noted that the Wikipedia entry for Progress does not mention the EasyAsk acquisition and subsequent de-acquisition. Either small potatoes or a milestone best jumped over, I assume.)

Presumably it is easier to advise and get paid to implement than to fund and refine an NLP system like EasyAsk. If you are not familiar with EasyAsk, the company positions itself in eCommerce site search with its “cognitive eCommerce” technology. EasyAsk’s capabilities include voice enabled natural language mobile search. This strikes me as a capability which is similar to that of a chatbot as I understand the concept.

History is history, one of my high school teachers once observed. Let’s move on.

What are the eight challenges to standing up a chatbot which sort of works? Here they are:

  1. The chat interface
  2. NLP
  3. The “context” of the bot
  4. Loops, splits, and recursions
  5. Integration with legacy systems
  6. Analytics
  7. Handoffs
  8. Character, tone, and persona.

As I review this list, I note that I have to decide whether to talk to a chatbot or type into a box so a “customer care representative” can assist me. The “representative,” one assumes, is a smart software robot.

I also notice that the bot has to have context. Think of a car dealer and the potential customer. The bot has to know that I want to buy a car. Seems obvious. But okay.

“Loops, splits, and recursions.” Frankly I have no idea what this means. I know that chatbot centric companies use jargon. I assume that this means “programming” so the NLP system returns a semi-on point answer.

Integration with legacy systems and handoffs seem to be similar to me. I would just call these two steps “integration” and be done with it.

The “character, tone, and persona” seems to apply to how the chatbot sounds; for example, the nasty, imperious tone of a Kroger automated check out system.

Net net: Progress is in the business of selling advisory and engineering services. The reason, in my opinion, is that Progress could not crack the code to make search and retrieval generate expected payoffs. As some Convera executives found, selling search related services was a more attractive path.

Stephen E Arnold, December 11, 2017

Google: Headphones and Voice Magic

November 23, 2017

I read two interesting articles. Each provides some insight into Google’s effort to put the NLP and chatbot doggies in an Alphabet corral.

The first article is “Google SLING: An Open Source Natural Language Parser.” To refresh your memory, “SLING is a combination of recurrent neural networks and frame based parsing.”

The second article is “Google Introduces Dialogflow Enterprise Edition, a Conversational Apps Building Platform.” The idea is to provide “a platform for building voice and text conversational applications.”

Both are interesting because each seems to be “free.” I won’t drag you, gentle reader, through the consequences of building a solution around a “free” Google service. One Xoogler watches me like a hawk to remind me that Google doesn’t treat people in a will-o’-the-wisp way. Okay. Let’s move on, shall we?

Both of these systems advance Google’s quest to become the Big Dog of where the world is heading for computer interaction. Both are germane to the wireless headphones Google introduced. These headphones, unlike other wireless alternatives, can translate. Hence, the largesse for free NLP and voice freebies.

“Trying Out Google’s Translating Headphones” informed me that:

The most important thing you should know about Pixel Buds is that their full features only work with Google’s newest smartphone, the Pixel 2.

Is this vendor lock in?

I learned from the write up:

To be honest, it’s not exactly real-time. You call up the feature by tapping on your right earbud and asking Google Assistant to “help me speak” one of 40 languages. The phone will then open the Google Translate app. From there, the phone will translate what it hears into the language of your choice, and you’ll hear it in your ear.

Not quite like Star Trek’s universal translator, suggests the article. I noted this statement:

it’s worth realizing that the Pixel Buds are more than just a pair of headphones. They’re an early illustration of what we can expect from Google, which will try to make products that stand out from the pack with unusual artificial intelligence services such as translation.

A demo. I suppose doing the lock in tactic with a demo is better than basing lock in on vaporware.

Then there are the free APIs. These, of course, will never go away or cost too much money. The headphones are $159. The phone adds another $649.

Almost free.

Stephen E Arnold, November 23, 2017

Natural Language Processing: Tomorrow and Yesterday

October 31, 2017

I read “Will Natural Language Processing Change Search as We Know It?” The write up is by a search specialist who, I believe, worked at Convera. The Search Technologies’ Web site asserts:

He was the architect and inventor of RetrievalWare, a ground-breaking natural-language based statistical text search engine which he started in 1989 and grew to $50 million in annual sales worldwide. RetrievalWare is now owned by Microsoft Corporation.

I think Fast Search acquired a portion of Convera. When Microsoft purchased Fast Search, the Convera technology was part of the deal. When Convera faded, one rumor I captured in 2007 was that some of the Convera technology was used by Ntent, formed as the result of a merger between Convera Corporation and Firstlight ERA. If accurate, the history of Convera is fascinating with Excalibur, ConQuest, and Allen & Co. in the mix.

In the “Will Natural Language Processing Change Search as We Know It?” blog post, I noted these points:

  • Intranets incorporating NLP, semantic search and AI can fuel chatbots as well as end-to-end question-answering systems that live on top of search. It is a truly semantic extension to the search box with far-reaching implications for all types of search.
  • With NLP, enterprise knowledge contained in paper documentation can be encoded in a machine-readable format so the machine can read, process and understand it enough to formulate an intelligent response.
  • it’s good to know about established tool sets and methodologies for developing and creating effective solutions for use cases like technical support. But like all development projects, take care to create the tools based on mimicking the responses of actual human domain experts. Otherwise, you may run into the proverbial development problem of “garbage in, garbage out” which has plagued many such expert system initiatives.

Mr. Nelson is painting a reasonable picture about the narrow use of widely touted technologies. In fact, the promise of NLP has been part of enterprise search marketing for decades.

What I found interesting was the document called “Accurate Search: What a Concept,” published by Convera in 2002. I noted this passage on page 4 of the document:

Concept Search capitalizes on the richness of language, with its multiple term meanings, and transforms it from a problem into an advantage. RetrievalWare performs natural language processing and search term expansion to paraphrase queries, enabling retrieval of documents that contain the specific concepts requested rather than just the words typed during the query while also taking advantage of its semantic richness to rank documents in results lists. RetrievalWare’s powerful pattern search abilities overcome common errors in both content and queries, resulting in greater recall and user satisfaction.
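To make “search term expansion” concrete, here is a toy sketch in Python. It is not how RetrievalWare worked, which the passage does not describe in detail. It only illustrates the general idea of matching on concepts rather than on the exact words typed. The synonym table and function names are hypothetical.

```python
# Hypothetical, hand-built synonym table; a real system would draw on a
# much larger semantic network or thesaurus.
SYNONYMS = {
    "car": {"automobile", "vehicle"},
    "buy": {"purchase"},
}


def expand_query(query, synonyms=SYNONYMS):
    """Return one set of acceptable terms per query word."""
    return [{term} | synonyms.get(term, set()) for term in query.lower().split()]


def matches(document, expanded):
    """A document matches if every concept group is hit by at least one word."""
    words = set(document.lower().split())
    return all(words & group for group in expanded)


expanded = expand_query("buy car")
print(matches("Where to purchase an automobile", expanded))
print(matches("car wash", expanded))
```

The first document contains neither “buy” nor “car” yet still matches, because each query term was expanded to a group of related terms. The second document fails because nothing in it covers the “buy” concept.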

I find the shift from a broad solution to a more narrow solution interesting. In the span of 15 years, the technology of search seems to be struggling to deliver.

Perhaps consulting and engineering services are needed to make search “work”? Contrast search with mobile phone technology. Progress has been evident. For search, success narrows to improving “documentation” and “customer support.”

Has anyone tried to reach PayPal’s customer support or United Airlines’ customer support? Try it. United was at one time a “customer” of Convera’s. From my point of view, United Airlines’ customer service has remained about the same over the last decade or two.

Enterprise search, broad or narrow, remains a challenge for marketers and users in my opinion. NLP, I assume, has arrived after a long journey. For a free profile of Convera, check out this link.

Stephen E Arnold, October 31, 2017

Free Language Learning Resources That Are Not Duolingo

October 25, 2017

For those who wish to learn a foreign language, the fun and engaging Duolingo has become a go-to free resource, offering courses in more than 20 languages. However, it is not the only game in town; MakeUseOf gives us a rundown of “The Best (Completely Free) Language Learning Alternatives to Duolingo.” Writer Briallyn Smith tells us:

One of the reasons some people are looking to move away from Duolingo is the recent introduction of in-app purchases. While the core functions of Duolingo are still free, the purchase options can give learners a boost when playing games — much like the bonuses and extra lives you can purchase on Bejewelled or other addictive gaming apps. Learners may become frustrated when they are prevented from working on a specific language skill or accomplishment because they ran out of ‘hearts’ or need to purchase ‘gems’ to continue. Other in-app purchases allow users to remove ads from their learning experience and to download offline content.

While there’s nothing wrong with Duolingo charging fees for its services, it can be frustrating for those looking for a truly free resource. Other language learners simply do not enjoy learning through games. This is especially true for those who require industry-specific vocabulary or who already have a background in the language. Thankfully, there are many other online resources available for language learners. While you won’t get the same kind of program as Duolingo for free, you can easily use these resources to put together a language learning strategy that works well for you.

Before getting to her list, Smith takes a moment to advocate for paid language-learning services, like Babbel. Basically, if you are serious about your language studies and can afford it, they are worth the investment.

The resource list begins with a compound entry, Online Communities; included here are Fluent in 3 Months, /r/LanguageLearning, and The Polyglot Club. Then there are RhinoSpike, Mango Languages, the Yojik Website, and, of course, YouTube (with a list of 10 suggested channels). Furthermore, Smith supplies a link to OpenCulture for even more options. See the article for more about each of these entries.

Cynthia Murrell, October 25, 2017

Chatbots: The Negatives Seem to Abound

September 26, 2017

I read “Chatbots and Voice Assistants: Often Overused, Ineffective, and Annoying.” I enjoy a Hegelian antithesis as much as the next veteran of Dr. Francis Chivers’ course in 19th Century European philosophy. Unlike some of Hegel’s fans, I am not confident that taking the opposite tack in a windstorm is the ideal tactic. There are anchors, inboard motors, and distress signals.

The article points out that quite a few people are excited about chatbots. Yep, sales and marketing professionals earn their keep by creating buzz in order to keep their often-exciting corporate Beneteau 22’s afloat. With VCs getting pressured by those folks who provided the cash to create chatbots, the motive force for an exciting ride hurtles onward.

The big Sillycon Valley guns have been arming the chatbot army for years. Anyone remember Ask Jeeves when it pivoted from a human powered question answering machine into a customer support recruit? My recollection is that the recruit washed out, but your mileage may vary.

With Amazon, Facebook, Google, IBM, and dozens and dozens of companies with hard-to-remember names on the prowl, chatbots are “the future.” The Infoworld article is a thinly disguised “be careful” presented as “real news.”

That’s why I wrote a big exclamation point and the words “A statement from the Captain Obvious crowd” next to this passage:

Most of us have been frustrated with misunderstandings as the computer tries to take something as imprecise as your voice and make sense of what you actually mean. Even with the best speech processing, no chatbots are at 100-percent recognition, much less 100-percent comprehension.

I am baffled by this fragment, but I am confident it makes sense to those who were unaware that dealing with human utterances is a pretty tough job for the Googlers and Microsofties who insist their systems are the cat’s pajamas. Note this indication of Infoworld quality in thought and presentation:

It seems very inefficient to resort to imprecise systems when we have [sic]

Yep, an incomplete thought which my mind filled in as saying, “humans who can maybe answer a question sometimes.”

The technology for making sense of human utterance is complex. Baked into the systems is the statistical imprecision that undermines the value of some chatbot implementations.

My thought is that Infoworld might help its readers if it were to answer questions like these:

  • What are the components of a chatbot system? Which introduce errors on a consistent basis?
  • How can error rates of chatbot systems be reduced in an affordable, cost effective manner?
  • What companies are providing third party software to the big girls and boys in the chatbot dodge ball game?
  • Which mainstream chatbot systems have exemplary implementations? What are the metrics behind “exemplary”?
  • What companies are making chatbot technology strides for languages other than English?

I know these questions are somewhat more difficult to answer than a write up which does little more than make Captain Obvious roll his eyes. Perhaps Infoworld and its experts might throw a bone to their true believers?

Stephen E Arnold, September 26, 2017

New Beyond Search Overflight Report: The Bitext Conversational Chatbot Service

September 25, 2017

Stephen E Arnold and the team at Arnold Information Technology analyzed Bitext’s Conversational Chatbot Service. The BCBS taps Bitext’s proprietary Deep Linguistic Analysis Platform to provide greater accuracy for chatbots regardless of platform.

Arnold said:

The BCBS augments chatbot platforms from Amazon, Facebook, Google, Microsoft, and IBM, among others. The system uses specific DLAP operations to understand conversational queries. Syntactic functions, semantic roles, and knowledge graph tags increase the accuracy of chatbot intent and slotting operations.

One unique engineering feature of the BCBS is that specific Bitext content processing functions can be activated to meet specific chatbot applications and use cases. DLAP supports more than 50 languages. A BCBS licensee can activate additional language support as needed. A chatbot may be designed to handle English language queries, but Spanish, Italian, and other languages can be activated via an instruction.

Dr. Antonio Valderrabanos said:

People want devices that understand what they say and intend. BCBS (Bitext Chatbot Service) allows smart software to take the intended action. BCBS allows a chatbot to understand context and leverage deep learning, machine intelligence, and other technologies to turbo-charge chatbot platforms.

Based on ArnoldIT’s test of the BCBS, tagging accuracy jumped by as much as 70 percent. Another surprising finding was that the time required to perform content tagging decreased.

Paul Korzeniowski, a member of the ArnoldIT study team, observed:

The Bitext system handles a number of difficult content processing issues easily. Specifically, the BCBS can identify negation regardless of the structure of the user’s query. The system can understand double intent; that is, a statement which contains two or more intents. BCBS is one of the most effective content processing systems to deal correctly with variability in human statements, instructions, and queries.
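For readers unfamiliar with the terms, the toy Python sketch below shows what “negation” and “double intent” mean for a chatbot. It is a keyword-based illustration only and is not Bitext’s method, which relies on linguistic analysis rather than word lists. The intent names, keyword table, and function name are hypothetical.

```python
import re

# Hypothetical intents and trigger words, invented for this sketch.
INTENT_KEYWORDS = {
    "book_flight": ("book", "reserve"),
    "cancel_order": ("cancel",),
}
NEGATIONS = {"not", "don't", "never", "no"}


def detect_intents(utterance):
    """Return a list of (intent, negated) pairs, one per intent found.

    The utterance is split into clauses on simple conjunctions and
    punctuation so that a statement with two intents yields two results.
    """
    results = []
    clauses = re.split(r"\b(?:and|but|then)\b|[,;]", utterance.lower())
    for clause in clauses:
        tokens = re.findall(r"[a-z']+", clause)
        negated = any(token in NEGATIONS for token in tokens)
        for intent, keywords in INTENT_KEYWORDS.items():
            if any(keyword in tokens for keyword in keywords):
                results.append((intent, negated))
    return results


print(detect_intents("Book a flight to Madrid and cancel my hotel order"))
print(detect_intents("Do not cancel my order, but book a flight"))
```

A word list like this breaks as soon as the phrasing varies, which is the problem the quote says deeper linguistic analysis is meant to solve.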

Bitext’s BCBS and DLAP solutions deliver higher accuracy, enable more reliable sentiment analyses, and even output critical actor-action-outcome content processing. Such data are invaluable for disambiguation in Web and enterprise search applications, content processing for discovery solutions used in fraud detection and law enforcement, and consumer-facing mobile applications.

Because Bitext was one of the first platform solution providers, the firm was able to identify market trends and create its unique BCBS service for major chatbot platforms. The company focuses solely on solving problems common to companies relying on machine learning and, as a result, has done a better job delivering such functionality than other firms have.

A copy of the 22 page Beyond Search Overflight analysis is available directly from Bitext at this link on the Bitext site.

Once again, Bitext has broken through the barriers that block multi-language text analysis. The company’s Deep Linguistics Analysis Platform supports more than 50 languages at a lexical level and more than 20 at a syntactic level and makes the company’s technology available for a wide range of applications in Big Data, Artificial Intelligence, social media analysis, text analytics, and the new wave of products designed for voice interfaces supporting multiple languages, such as chatbots. Bitext’s breakthrough technology solves many complex language problems and integrates machine learning engines with linguistic features. Bitext’s Deep Linguistics Analysis Platform allows seamless integration with commercial, off-the-shelf content processing and text analytics systems. The innovative Bitext system reduces costs for processing multilingual text for government agencies and commercial enterprises worldwide. The company has offices in Madrid, Spain, and San Francisco, California. For more information, visit www.bitext.com.

Kenny Toth, September 25, 2017

Quote to Note: The Role of US AI Innovators

March 24, 2017

I read “Opening a New Chapter of My Work in AI.” After working through the non-AI output, I concluded that money beckons the fearless leader, Andrew Ng. However, I did note one interesting quotation in the apologia:

The U.S. is very good at inventing new technology ideas. China is very good at inventing and quickly shipping AI products.

What this suggests to me is that the wizard of AI sees the US as good at “ideas” and China as an implementer. A quick implementer at that.

My take is that China sucks up intangibles like information and ideas. Then China cranks out products. Easy to monetize things, avoiding the question, “What’s the value of that idea, pal?”

Ouch. On the other hand, software is the new electricity. So who is Thomas Edison? I wish I “knew”.

Stephen E Arnold, March 24, 2017
