Google: Translation King?

March 1, 2017

I read “Google’s AI Software Wins Top Score among Machines in Translation Battle.” Good news for the GOOG. The company recently limited free online translation, and I noted when I was translating a test passage from Persian to English that the free Google system truncated the passage, a problem which did not plague the FreeTranslations.org system. Persian is a bit more of a hill climb than translating Spanish to Italian, but the unpredictable behavior was telling.

The write up, however, encountered no glitches it seems. I learned:

Artificial intelligence language software by US Internet giant Google Inc., scored higher than its rival AI machines in a translation battle between humans and machines held in South Korea [in February 2017].

The Google system made kimchi of four human translators, Systran (a go to fave for many years), and the Naver system (anyone remember Naver search?).

The Google system performed well, according to the “real” news outfit Korea Herald:

the organizers said the four professional translators scored better in translating random English articles — literature and non-literature — into Korean and other Korean articles into English than the machines. Of the machines, Google scored a total of 28 out of 60, followed by Naver’s automated translation app called Papago with 17 and Systran with 15, the tech company officials with knowledge of the matter said.

Yikes. Humans did better. No guaranteed annual income for these folks.
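For the spreadsheet averse, here is a minimal sketch of the arithmetic behind the machine scores. The numbers come straight from the Korea Herald passage quoted above; the variable and function names are my own, and the quote does not give the human translators' totals, so they are left out.

```python
# Machine scores out of 60, as quoted from the Korea Herald above.
MAX_SCORE = 60
scores = {"Google": 28, "Naver Papago": 17, "Systran": 15}


def as_percent(score, max_score=MAX_SCORE):
    """Return a score as a percentage of the maximum, rounded to one decimal."""
    return round(100 * score / max_score, 1)


# Percentage of the maximum for each machine, plus a best-to-worst ranking.
percentages = {name: as_percent(points) for name, points in scores.items()}
ranking = sorted(scores, key=scores.get, reverse=True)

for name in ranking:
    print(f"{name}: {scores[name]}/{MAX_SCORE} = {percentages[name]}%")
```

Even the top machine lands below half of the available points, which squares with the humans coming out ahead.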

Who lost the battle? Systran International.

The factoid I noted was that the new systems considered an “entire sentence as one unit.”

But humans? Better.

Stephen E Arnold, March 1, 2017

IQwest IT Steps Up Its Machine Translation Marketing

February 3, 2017

Machine translation means that a computer converts one language into another. The idea is that the translation is accurate; that is, presents the speaker’s or writer’s message payload without distortion, odd ball syntax, and unintended humor. What’s a “nus”? The name of a nuclear consulting company or a social mistake? Machine translation, as an idea, has been around since that French whiz Descartes allegedly cooked up the idea in the 17th century.

I read two almost identical articles, which triggered my content marketing radar. The first write up appeared in KV Empty Pages as “Finding the Needle in the Digital Multilingual Haystack.” The second article appeared in the Medium online publication as “Finding the Needle in the Digital Multilingual Haystack.”

image

image

Notice the similarity. Intrigued I ran a query for IQwest. I noted that the domain name IQwest.com refers to a bum domain name. I did a bit of poking around and learned that there are companies using IQwest for engineering services, education, and legal technologies. The IQwest.com domain is owned by Qwest Communications in Denver.

The machine translation write up belongs to the IQwestIT.com group. No big deal, of course, but knowing which company’s name overlaps with other companies’ usage is interesting.

Now what’s the message in these two identical essays beyond content marketing? For me, the main point is that a law firm can use software translation to eliminate documents irrelevant to the legal matter at hand. For documents not in the lawyer’s native language, machine translation can churn out a good enough translation. The value of machine translation is that it is cheaper than a human translator and a heck of a lot less expensive.

Okay, I understand, but I have understood the value of machine translation since I had access to a Systran based system years ago. Furthermore, machine translation systems have been an area of interest in some of the government agencies with which I am familiar for decades.

The write up states:

building a model and process that takes advantage of benefits of various technologies, while minimizing the disadvantages of them would be crucial. In order to enhance any and all of these solution’s capabilities, it is important to understand that machines and machine learning by itself cannot be the only mechanism we build our processes on. This is where human translations come into the picture. If there was some way to utilize the natural ability of human translators to analyze content and build out a foundation for our solutions, would we be able improve on the resulting translations? The answer is a resounding yes!

Another okay from me. The solution, which I anticipated, is a rah rah for the IQwest machine translation system. What caught my attention is the number of buzzwords used to explain the system; for instance:

  • Classification
  • Clustering
  • N grams
  • Summarization

These standard indexing functions are part of the IQwest machine translation system. That system, the write up notes, can be supplemented with humans who ride herd on the outputs and who interact with the system to make sure that entities (people, places, things, events, etc.) are identified and translated. This is a slippery fish because some persons of interest have different names, handles, nicknames, code words, and legends. Informed humans might be able to spot these entities because no system with which I am familiar is able to knit together some well crafted aliases. Remember those $5,000 teddy bears on eBay. What did they represent?
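For readers who have not bumped into one of the buzzwords listed above, here is a minimal sketch of what “n grams” means in practice: runs of n consecutive words pulled from a text. The function name and the sample sentence are mine, used only for illustration. This is not a description of how the IQwest system works, which the write up does not spell out.

```python
def word_ngrams(text, n=2):
    """Return the list of word n grams (as tuples) found in text."""
    words = text.lower().split()
    # Slide a window of n words across the text, one word at a time.
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


sample = "finding the needle in the digital multilingual haystack"
bigrams = word_ngrams(sample, 2)
print(bigrams[:3])
```

Counting which n grams turn up most often is one common building block for the classification and clustering functions in the same list.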

The write up seems to be aimed at attorneys. I suppose that group of professionals may not be aware of the machine translation systems available online and for on premises installation. For the non attorney reader, the write up tills some familiar ground.

I understand the need to whip up sales leads, but the systems available from Google and Microsoft, to name just two, work reasonably well. When those systems are not suitable, one can turn to SDL or Systran, to name two vendors with workable systems.

Net net: My thought is that two identical versions of the same article directed at a legal audience represent a bit of marketing wonkiness. The write up’s shotgun approach to reaching attorneys is interesting. I noticed the duplication of content, and my hunch is that Google’s duplicate detection system did as well.
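How might software spot two copies of the same essay? One textbook approach is to break each document into overlapping word sequences (shingles) and compare the two sets with Jaccard similarity. The sketch below assumes that approach. I have no knowledge of what Google actually uses, and the function names and sample strings are mine.

```python
import re


def shingles(text, n=3):
    """Return the set of word n grams (shingles) in text, ignoring case and punctuation."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0  # Two empty sets are treated as identical.
    return len(a & b) / len(a | b)


def similarity(doc_a, doc_b, n=3):
    """Score two documents from 0.0 (no shared shingles) to 1.0 (same shingle set)."""
    return jaccard(shingles(doc_a, n), shingles(doc_b, n))


original = "Finding the needle in the digital multilingual haystack."
reposted = "Finding the Needle in the Digital Multilingual Haystack"
unrelated = "Guan, guan, trill the ospreys, upon the island in the creek."

print(similarity(original, reposted))
print(similarity(original, unrelated))
```

A score near 1.0 flags a probable duplicate. Systems working at Web scale typically compare compact fingerprints of the shingle sets rather than the sets themselves, but the idea is the same.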

Perhaps placing the write up in an online publication reaching lawyers would be a helpful use of the information?  What’s clear is that IQwest represents an opportunity for some motivated marketing expert to offer his or her services to the company.

My take is that IQwest offers a business process for reducing costs for litigation related document processing. The translation emphasis is okay, but the idea of making a phone call and getting the job done is what differentiates IQwest from, for example, the GOOG. I remember Rocket Docket. A winner. When I looked at that “package,” the attorneys with whom I spoke did not care about what was under the hood. The hook was speed, reduced cost, and more time to do less dog work.

But the lawyers may need to hurry. “Lawyers Are Being Replaced by Machines That Read.” Dragging one’s feet technologically and demanding high salaries despite a glut of legal eagles may change the game and quickly.

Plus, keep in mind FreeTranslations.org. You can get voice translations as well as text translations. The increasingly frugal Google has trimmed its online translation service. Sigh. The days of pasting lengthy text into a box are gone like a Loon balloon drifting away from Sri Lanka.

There are options, gentle reader.

Stephen E Arnold, February 3, 2017

Google and Its Smart Chinese Translation Neural Machine Thing

October 5, 2016

Google has a new neural translation system for Chinese. Read more here. It sort of works, but poetry is not its strong suit. Many Chinese students memorize Shi Jing’s “Cry of the Ospreys.” In Chinese, the first line of the poem is:

image

Google produces this translation of the line:

“Guan guanju dove, in the river of the continent.”

image

A standard English translation is:

Guan, guan, trill the ospreys, upon the island in the creek.

The standard English translation makes evident the sound of the ospreys from the island in the creek. Google sticks in a “dove” and dumps the island. Close enough for ospreys if not making the meaning clear to a non Chinese reader. Shi Jing is not around to offer an opinion which is probably a good thing.

Stephen E Arnold, October 5, 2016

Microsoft Changing Everything: At Least What Daesh Means in Redmond

August 30, 2016

I reported that Microsoft’s chief envisioning officer (I love that title) asserted that artificial intelligence will change everything. I pointed out that Microsoft has not been able to “change” China. Now Microsoft has learned that it cannot change the meaning of the word “Daesh,” which is one of the names of the Islamic State. I read “Bing Translates “Daesh” As “Saudi Arabia”, Angers Entire Kingdom.” The write up points out:

Bing Translation of “Daesh” the Arabic acronym for a global terrorist group backed by Kingdom of Saudi Arabia to “Saudi Arabia” has put the Microsoft Corporation in hot water with the Kingdom. Apparently, when the Arabic word

image

was typed into Bing Translate, the words “Saudi Arabia” would appear as the English translation, according to Khaberni. The so-called technical error caused an uproar in Saudi Arabia, where many Saudi social media users called for a boycott of Bing and Microsoft. The Microsoft Corporation has formally issued an apology to the Kingdom of Saudi Arabia, calling the error “unintentional”.

In what seems like the blink of an eye, Microsoft rolled out the bot which quickly learned to be somewhat interesting. The bot rolled away. Then Microsoft made Windows 10 Web cam hostile. Now Microsoft’s smart translation system has managed to anger the nation state Saudi Arabia. I assume Microsoft’s professionals anticipate smooth, seamless processing when entering the Kingdom from the USA. Now let’s think about the “change everything” statement. Doesn’t seem exactly correct, does it? How about some snap inspections of luggage to brighten one’s day? What’s the word for that? Sheesh? Oh, tay?

Stephen E Arnold, August 30, 2016

Statistical Translation: Dead Like Marley

June 16, 2016

I read “Facebook Says Statistical Machine Translation Has Reached End of Life.” Hey, it is Facebook. Truth for sure. I learned:

Scale is actually one reason Facebook has invested in its own MT technology. According to Packer [Facebook wizard], there are more than two trillion posts and comments, which grows by over a billion each day. “Pretty clearly, we’re not going to solve this problem with a roomful or even a building-full of human translators,” he quipped, adding that to have even “a hope of solving this problem, we need AI; we need automation.” The other reason is adaptability. “We tried that,” said Packer about using third-party MT, but it “did not work well enough for our needs.” The reason? The language of Facebook is different from what is on the rest of the Web. Packer described Facebook language as “extremely informal. It’s full of slang, it’s very regional.” He said it is also laden with metaphors, idiomatic expressions, and is riddled with misspellings (most of them intentional). Additionally, as in the rest of the world, there is a marked difference in the way different age groups communicate on Facebook.

I wonder if it is time to send death notices to the vendors who use statistical methods? Perhaps I should wait a bit. Predictions are often different from reality.

Stephen E Arnold, June 16, 2016

Be the CIA Librarian

May 3, 2016

Research is a vital tool for the US government, especially the Central Intelligence Agency, which is why it employs librarians. The Central Intelligence Agency is one of the main forces of the US Intelligence Community, focused on gathering information for the President and the Cabinet. The CIA is also the topic of much fictionalized speculation in stories, mostly spy and law enforcement dramas. Given the important part the agency has played in United States history, can you imagine the files in its archives?

If you have a penchant for information, the US government, and a library degree, then maybe you should apply for the CIA’s current job opening: CIA librarian. CNN Money explains that one of the perks of the job is its salary: “The CIA Is Hiring…A $100,000 Librarian.” The salary, CNN is quick to point out, is more than the typical family income. Librarians serve as more than people who recommend decent books to read; they serve as an entry point for research and bridge the gap between understanding knowledge and applying it in the actual field.

“In addition to the cachet of working at the CIA, ‘librarians also have opportunities to serve as embedded, or forward deployed, information experts in CIA offices and select Intelligence Community agencies.’  Translation: There may be some James Bond-like opportunities if you want them.”

Most of this librarian’s job duties will probably be assisting agents with tracking down information related to intelligence missions and interpreting it. It is just a guess, however. Who knows, maybe the standard CIA agent totes a gun to the stacks?

 

Whitney Grace, May 3, 2016
Sponsored by ArnoldIT.com, publisher of the CyberOSINT monograph

Online Translation: Google or Microsoft?

March 1, 2016

I have solved the translation problem. I live in Harrod’s Creek, Kentucky. Folks here speak Kentucky. No other language needed. However, gentle reader, you may want to venture into lands where one’s native language is not spoken or written. You will need online translation.

Should I forget Systran and other industrial strength solutions of yesteryear? Today the choice is Google or Microsoft if I understand “2 Main Reasons Why Google Translate Is Ahead of Microsoft and Skype.” (The link worked on February 22, 2016. If it does not work when you read this blog post, you may have to root around. That’s life in the zip zip world today.)

Reason one is that Google supports more languages than Microsoft. The total is 100 plus. The write up is sufficiently amazed to describe the language support of the Alphabet Google thing as “mind blowing.” Okay.

Reason two is that Google’s translation function works on smartphones. The write up points out:

You can hand-write, speak, type, or even take a picture of a given language and Google Translate will translate it for you. Not only this but on Android, some of the translation features are available offline. So, some features are accessible even if you do not have access to the internet.

The write up does not dig too deeply into Microsoft’s translation capability. If you are interested in Microsoft’s quite capable and useful services, navigate to the Microsoft Language Portal. Google is okay, but one service may not do the job a person who does not speak Kentucky requires.

Stephen E Arnold, February 27, 2016

When Google Translate Is Not Enough

September 16, 2015

I read a delightful article called “The British Library Is Crowdsourcing the Translation of a Mysterious 13th-Century Sword Inscription.” I am not too keen on edged weapons. Nevertheless, I am interested in becoming sharper when it comes to translation methods.

The write up states:

+NDXOXCHWDRGHDXORVI+ This inscription, engraved on a 13th-century double-edged sword owned by the British Museum, is the medieval mystery of the moment. Stumped by its cryptic engraving, last week the British Library tapped the interwebs for its crowd wisdom, asking commenters to help decode the meaning.

What makes the article entertaining is the fact that the British Library, backed with the formidable talents of British universities where linguistics absolutely thrives, is turning to the hoi polloi for assistance.

And assist did the rustics. Consult the original article for the full span of human ingenuity. Here’s the comment I enjoyed from a non rustic:

“Everything is explained in Winnie the Pooh.”

A Google search reveals more questions:

image

Helpful.

Stephen E Arnold, September 16, 2015

Captain Page Delivers the Google Translator

January 15, 2015

Well, one of the Star Trek depictions is closer to reality. Google announced a new, Microsoft-maiming translate app. You can read about this Bing body blow in “Hallo, Hola, Ola to the New, More Powerful Google Translate App.” Google has more translation goodies in its bag of floppy discs. My hunch is that we will see them when Microsoft responds to this new Google service.

The app includes an image translation feature. From my point of view, this is helpful when visiting countries that do not make much effort to provide US English language signage. Imagine that! No English signs in Xi’an or Kemerovo Oblast.

The broader impact will be on the industrial strength, big buck translation systems available from the likes of BASIS Tech and SDL. These outfits will have to find a way to respond, not to the functions, but the Google price point. Ouch. Free.

Stephen E Arnold, January 15, 2015

Machine versus Human Translations

January 7, 2015

I am fascinated with the notion of real time translation. I recall with fondness lunches with my colleagues at Ziff in Foster City. Then we talked about the numerous opportunities to create killer software solutions. Translation would be “solved”. Now 27 years later, progress has been made, just slowly.

Every once in a while an old technical cardboard box gets hauled out from under the car port. There are old ideas that just don’t have an affordable, reliable, practical solution. After rummaging in the box, the enthusiasts put it back on the shelf and move on to the next YouTube video.

I read “The Battle of the Translators: Man vs Machine.” The write up tackles Skype’s real time translation feature. Then there is a quick excursion through Google Translate.

The passage I noted was:

So, while machine translations may be great for rudimentary translations or even video calls, professional human translators are expert craftsmen, linguists, wordsmiths and proofreaders all wrapped in one. In addition to possessing cultural insight, they also are better editors who shape and perfect a piece for better public consumption, guaranteeing a level of faithfulness to the original document — a skill that not even the most cutting-edge machine translation technology is capable of doing just yet. Machine translators are simply not yet at the level of their chess-playing counterparts, which can beat humans at their own game. As long as automatic translators lack the self-awareness, insight and fluency of a professional human translator, a combination of human translation assisted by machine translation may be the optimal solution.

I include a chapter about automated translation in CyberOSINT: Next Generation Information Access. You can express interest in ordering by writing benkent2020 at yahoo dot com. In the CyberOSINT universe, machine translation exists cheek-by-jowl with humans.

For large flows of information in many different languages, there are not enough human translators to handle the work load. Machine based translations, therefore, are an essential component of most cyber OSINT systems. For certain content, a human has to make sure that the flagged item is what the smart software thinks it is.

The problem becomes one of having enough capacity to handle first the machine translation load and then the human part of the process. For many language pairs, there are not enough humans. I don’t see a quick fix for this multi-lingual talent shortfall.

The problem is a difficult one. Toss in slang, aliases, code words and phrases, and neologisms. Stir in a bit of threat with or without salt. Do the best you can with what you have.

Translation is a thorny problem. The squabbles of the math oriented and the linguistic camps are of little interest to me. Good enough translation is what we have from both machines and humans.

I don’t see a fix that will allow me to toss out the cardboard box with its musings from 30 years ago.

Stephen E Arnold, January 7, 2015
