HonkinNews for 20 June 2017 Now Available
June 20, 2017
HonkinNews reminds everyone that success may be measured in the size of one’s golden parachute. We report that Yahoot (sorry, I meant, Yahoo) is now Oath with a colon. As we ponder the end of Yahoot, we mention that Yahoot’s former president is leaving the company in a cloud of purple haze with about $250 million. Yahoooo. The Dark Web presentations at the TechnoSecurity & Digital Forensics Conference seemed to be a hit. The two public lectures attracted 310 people. The special hands-on session was sold out. We report that the launch of Dark Web Notebook (available at gum.com/darkweb) caught some attendees’ attention as well. This week’s program has the details. Concerned that your Big Data or content processing system is an error-generation machine? The solution is editorial controls before one starts crunching. HonkinNews reveals that using the term “data governance” is no substitute for management and planning ahead. What about Palantir? Watch this week’s program to learn that Palantir, once an outsider for some government work, is now an insider. You can find this week’s program at this link.
Kenny Toth, June 20, 2017
Academic Publisher Retracts Record Number of Papers
June 20, 2017
To the scourge of fake news we add the problem of fake research. Retraction Watch announces “A New Record: Major Publisher Retracting More Than 100 Studies from Cancer Journal over Fake Peer Reviews.” We learn that Springer has just retracted 107 papers from a single journal after discovering their peer reviews had been falsified. Faking the integrity of cancer research? That’s pretty low. The article specifies:
To submit a fake review, someone (often the author of a paper) either makes up an outside expert to review the paper, or suggests a real researcher — and in both cases, provides a fake email address that comes back to someone who will invariably give the paper a glowing review. In this case, Springer, the publisher of Tumor Biology through 2016, told us that an investigation produced “clear evidence” the reviews were submitted under the names of real researchers with faked emails. Some of the authors may have used a third-party editing service, which may have supplied the reviews. The journal is now published by SAGE. The retractions follow another sweep by the publisher last year, when Tumor Biology retracted 25 papers for compromised review and other issues, mostly authored by researchers based in Iran.
The article shares Springer’s response to the matter, some from their official statement and some from a spokesperson. For example, we learn the company cut ties with the “Tumor Biology” owners, and that the latest fake reviews were caught during a process put in place after that debacle. See the story for more details.
Cynthia Murrell, June 20, 2017
Siri Becomes Smarter and More Human
June 20, 2017
When Apple introduced Siri, it was a shiny, new toy, but the more people used it, the more they realized it was a dumb digital assistant. It is true that Siri can accurately find a place’s location, conduct a Web search, or even call someone in your contact list, but beyond simple tasks “she” cannot do much. In “Siri Gets Language Translation And A More Human Voice,” TechCrunch reports that Apple realizes there is a flaw in its flagship digital assistant and that, in order to compete with Google Assistant, Amazon Alexa, and even Microsoft’s Cortana, it needs to upgrade Siri’s capabilities.
Apple decided that Siri would receive a big overhaul with iOS 11. Not only will Siri sound more human, but the digital assistant will also have female and male voices, a clearer voice, the ability to answer more complex questions, and, even better, a translation feature:
Apple is bringing translation to Siri so that you can ask the voice assistant how do say a certain English phrase in a variety of languages, including, at launch, Chinese, French, German, Italian and Spanish.
Apple has changed their view of Siri. Instead of it being a gimmicky way to communicate with a device, Apple is treating Siri as a general AI that extends a device’s usage. Apple is making the right decision to make these changes. For the translation aspect, Apple should leverage tools like Bitext’s DLAP to improve the accuracy.
Whitney Grace, June 20, 2017
Algorithms Are Getting Smarter at Identifying Human Behavior
June 19, 2017
Algorithms deployed by large tech firms are getting better at understanding human behavior, reveals a former Google data scientist.
In an article published by Business Insider titled A Former Google Data Scientist Explains Why Netflix Knows You Better Than You Know Yourself, Seth Stephens-Davidowitz says:
Many gyms have learned to harness the power of people’s over-optimism. Specifically, he said, “they’ve figured out you can get people to buy monthly passes or annual passes, even though they’re not going to use the gym nearly enough to warrant this purchase.”
Companies like Netflix use this to their benefit. For instance, during its early years, Netflix encouraged users to create playlists. However, most users ended up watching the same run-of-the-mill content. Netflix thus made changes and started recommending content similar to what users actually watched. It proves one thing: algorithms are getting smarter at understanding and predicting human behavior, and that is both good and bad.
Vishal Ingole, June 19, 2017
Russia Demands Google Register or Leave
June 19, 2017
Say, this could be good news for Yandex, the Russian search giant. RT News reports, “Google News Given 3 Mths to Comply with New Law to Stay in Russia.” The article explains:
According to the Russian communications regulator Roskomnadzor, major news sites with traffic exceeding a million visitors per day will be put on a special register in 2017. ‘At the moment, only large and popular aggregators such as Yandex, Google, Mail.ru, and others have such a high level of traffic,’ Roskomnadzor spokesman Vadim Ampelonsky told Izvestia daily.
Foreign news aggregators will have three months from January 1 to register their legal entities, thus allowing them to operate in Russia. Currently, there are two major news aggregators owned by foreign companies in Russia – Google and Bing. Bing belongs to Microsoft which already has a Russian subsidiary called Microsoft Rus.
If Google fails to register, it could be fined and, eventually, blocked within Russia’s borders. A quote from Sergey Kopylov, representative of the Russian National Internet Domain, seems to indicate advertising will be against the new rules. We know Google makes most of its money through AdWords, so how will the company respond to this demand?
Cynthia Murrell, June 19, 2017
Editorial Controls and Data Governance: A Rose by Any Other Name?
June 16, 2017
I read “Why Interest In “Data Governance” Is Increasing.” The write up uses a number of terms to describe what I view as editorial controls. The idea in my experience is that an organization decides what is okay and not okay with regard to the information it wants to process. The object is to know what content will be processed before the organization kick starts indexing, metadata tagging, or text analysis.
The organization then has to figure out and implement the rules of the game. Questions like “What do we do when entities are not recognized?” and “Who goes through the exceptions file?” must be answered. Rules, procedures, processes, and corrective actions have to be implemented in the work flow. One cannot calculate costs, headcount, or software expenses unless one knows what’s going to happen.
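To make the “exceptions file” idea concrete, here is a minimal sketch of one way such a rule could look in practice. Everything in it is an assumption for illustration: the controlled vocabulary `KNOWN_ENTITIES`, the `apply_editorial_controls` function, and the sample documents are hypothetical, not taken from any particular system. The point is only that unrecognized entities are routed to a list a person must review before tagging proceeds.

```python
from dataclasses import dataclass, field

# Hypothetical controlled vocabulary: entities the organization has agreed to recognize.
KNOWN_ENTITIES = {"palantir", "yahoo", "google"}


@dataclass
class EditorialResult:
    # doc_id -> sorted list of recognized entities (safe to index and tag)
    tagged: dict = field(default_factory=dict)
    # (doc_id, candidate) pairs that a human editor must review
    exceptions: list = field(default_factory=list)


def apply_editorial_controls(docs, known=KNOWN_ENTITIES):
    """Split candidate entities into recognized tags and an exceptions list.

    `docs` maps a document id to the candidate entity strings found in it.
    Matching is case-insensitive; anything not in `known` is not tagged
    and goes to the exceptions list instead.
    """
    result = EditorialResult()
    for doc_id, candidates in docs.items():
        recognized = sorted(c.lower() for c in candidates if c.lower() in known)
        result.tagged[doc_id] = recognized
        for candidate in candidates:
            if candidate.lower() not in known:
                result.exceptions.append((doc_id, candidate))
    return result


if __name__ == "__main__":
    sample = {"d1": ["Palantir", "Yahoot"], "d2": ["Google"]}
    outcome = apply_editorial_controls(sample)
    print("tagged:", outcome.tagged)
    print("exceptions for review:", outcome.exceptions)
```

The length of the exceptions list is the number that feeds the cost and headcount estimate: someone has to go through it, and the rule for what happens next has to be written down.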
The write up explains that data governance is important. I agree. The write up hooks the notion of editorial controls and editorial process to a number of buzzwords. I don’t think this type of jargon catalog is particularly helpful. Jargon distracts some people from focusing on Job One; that is, putting appropriate controls in place before nuking the budget or creating the type of editorial craziness which Facebook and Google are now trying to contain and manage.
The notion that an organization has to perform “data program management” is fine. But this is nothing more than hooking the editorial rules of the road to the responsibilities of the people who have to set up, oversee, and change the work flow.
Jargon does not help implement editorial controls. Clear thinking and speaking do.
Stephen E Arnold, June 16, 2017
HPE IDOL Released with Natural Language Processing Capabilities Aimed at Enterprise-Level Tasks
June 16, 2017
The article titled Hewlett Packard Enterprise Enriches HPE IDOL Machine Learning Engine With Natural Language Processing on SDTimes discusses the enhancements to HPE IDOL. The challenges of creating an effective interactive experience based on Big Data for enterprise-class inquiries stem from the sheer complexity of the inquiries. Additional issues arise around context, specificity, and source validation. The article examines the new and improved model:
HPE Natural Language Question Answering deciphers the intent of a question and provides an answer or initiates an action drawing from an organization’s own structured and unstructured data assets, in addition to available public data sources to provide actionable, trusted answers and business critical responses… HPE IDOL Natural Language Question Answering is a core feature of the new HPE IDOL 11.2 software release that features four key capabilities for natural language processing for the enterprise.
These capabilities are the IDOL Answer Bank (with pre-set reference questions), Fact Bank (with structured and unstructured data extraction abilities), Passage Extract (for text-based summaries), and Answer Server (for question analysis and integration of the other three areas). The goal is natural conversations between people and computers, an “information exchange.” The four capabilities work together to deliver a complex answer with the utmost accuracy and relevance.
Chelsea Kerwin, June 16, 2017
Will the Smartest Virtual Assistant Please Stand Up?
June 16, 2017
The devices are driving sales. However, AI-powered virtual assistants are far from perfect. Alexa, Google Assistant, Siri, and Cortana are good for basic questions on weather, radio stations, and calendars. But when it comes to complicated questions, all fail.
MarketWatch in an article titled This Is the Smartest Virtual Assistant — and It’s NOT Siri, or Alexa says:
A number of factors will shape the market moving forward, including changes in consumers’ comfort over the security and collection of private data, the progress of natural language processing and advances in voice interface functionalities, and regulatory requirements that could alter the market.
A survey revealed that none of the virtual assistants tested was able to answer 100% of questions (let alone attempt them). Even when the assistants attempted an answer, they did not always get it right. Google was at the top of the heap while Siri came last.
The article also points out that people want complicated questions answered rather than the simple ones these virtual assistants handle. It seems the days of perfect virtual assistants are still far away. Till then, the Google search engine is the best bet (the survey says so).
Vishal Ingole, June 16, 2017
Bookeyes for Free Classic Literature
June 15, 2017
We want to let you in on a nifty new resource—Bookeyes lets users download classic literature, in eBook form, for free. As of this writing, the site has 65 works to choose from, with the option to request something specific not yet on the page. (I requested Wuthering Heights by Emily Brontë.)
Despite the lack of Brontë sisters, the selection is pretty representative of the traditional Western-centric canon, from Machiavelli to Thoreau. There’s your Homer, your Shakespeare, Jack London, Tolstoy, and Twain. Beowulf too, naturally. We also see books by Jane Austen, Harriet Beecher Stowe, and Frederick Douglass.
The search box works as expected: typing the first few letters of a title or author’s name narrows the list without reloading the page. To say the “About” page is succinct is an understatement; it simply declares:
Bookeyes is your home for book classics. Pick a title on our homepage and enjoy!
On the Contact page is a photo of site creator Kermitt Davis, who is either quite young or incredibly well-preserved. We applaud his effort to bring classic literature to the masses; perhaps he could use more suggestions for works that are out of copyright. Know of any good ones that might fall outside the syllabus for a Survey of Prominent Western Literature?
Cynthia Murrell, June 15, 2017
Maybe Trump Speak Pretty One Day
June 15, 2017
US President Donald Trump is not the most popular person in the world. He is a cherished scapegoat for media outlets, US citizens, and other world leaders. One favorite point of ridicule for people is his odd use of the English language. Trump’s take on the English tongue is so confusing that translators are left scratching their heads, says The Guardian in “Trump In Translation: President’s Mangled Language Stumps Translators.” For probably the first time in his presidency, Trump followed proper sentence structure and grammar when he withdrew the US from the Paris Accord. While the world was in an uproar about the climate change deniers, translators were happy that they could translate his words more easily.
Asian translators are especially worried about what comes out of Trump’s mouth. Asian languages have different roots than European ones, so direct translations of the colloquial expressions Trump favors are nearly impossible.
India has problems translating Trump into Hindi:
‘Donald Trump is difficult to make sense of, even in English,’ said Anshuman Tiwari, editor of IndiaToday, a Hindi magazine. ‘His speech is unclear, and sometimes he contradicts himself or rambles or goes off on a tangent. Capturing all that confusion in writing, in Hindi, is not easy,’ he added. ‘To get around it, usually we avoid quoting Trump directly. We paraphrase what he has said because conveying those jumps in his speech, the way he talks, is very difficult. Instead, we summarise his ideas and convey his words in simple Hindi that will make sense to our readers.’
Indian translators also do Trump a favor by translating his words using the same level of rhetoric as Indian politicians. It makes him sound smarter than he appears to English speakers. Trump needs to learn to trust his speechwriters, but translators should learn they can rely on Bitext’s DLAP to supplement their work and improve local colloquialisms.
Whitney Grace, June 15, 2017