Chatbots: Yak, Yak, Yak

May 24, 2018

We want to keep an open mind about smart software and the go-to application designed to terminate the folks with thrilling phone and email customer support jobs.

Just the name “chatbot” is likely to elicit eye rolls from readers. While we have frequently been told these online oddities will be stepping up into the big leagues of usability, they don’t seem to have really found their niche. That’s what made it all the more surprising when their creators began demanding a little respect in a recent Qrius piece, “Chatbots Deserve More Than Being a Joke, Here’s Why.”

“In the most successful (and useful) applications we were able to schedule meetings and order pizza. …

“[But] We remember the failures. And when Microsoft’s Tay turned into a racist within 24 hours of release, we all laughed. If one of the biggest technology companies in existence couldn’t prevent a chatbot from becoming an anti-semite, what hope was there for the technology writ large?”

The reason we remember the failures and not the successes is that the benefits of one are outweighed by the regret of the other. However, more and more businesses are aiming to change this. Forbes recently reported on how AI was helping make chatbots more useful (go figure!). It’s a compelling point and maybe one that is finally on the verge of becoming relevant. Relevant is not the same as annoying and sometimes very, very dumb.

Patrick Roland, May 25, 2018

Short Honk: Online Translation Services

May 10, 2018

I read “Five of the Best Free Online Translators to Translate Foreign Languages.” Not a great headline, but I pulled out the list of services. Here they are:

I would suggest that you take a look at SDL’s FreeTranslation.com service at https://www.freetranslation.com/. Sometimes useful.

For accurate translations, one needs a native language speaker. Software is okay, but it does not do well with jargon, insider lingo, and words with loaded meanings.

Stephen E Arnold, May 10, 2018

Houston, We May Want to Do Fake News

May 2, 2018

The fake news phenomenon might be in the public eye more, thanks to endless warnings and news stories, but that has not dulled its impact. In fact, this shadowy form of propaganda seems to flourish under the spotlight, according to a recent ScienceNews story, “On Twitter, The Lure of Fake News is Stronger than Truth.”

According to the research:

“Discussions of false stories tended to start from fewer original tweets, but some of those retweet chains then reached tens of thousands of users, while true news stories never spread to more than about 1,600 people. True news stories also took about six times as long as false ones to reach 1,500 people. Overall, fake news was about 70 percent more likely to be retweeted than real news.”

That’s an interesting set of data. However, anyone quick to blame spambots for this amazing proliferation of fake news needs to give it a second look. According to the research, bots are not as much to blame for this trend as humans. This is actually good news. Ideally, changes can be made on the personal level and we can eventually stamp out this misleading trend of fake news.

But if fake news “works”, why not use it? Not even humans can figure out what’s accurate, allegedly accurate, and sort of correct but not really. Smart software plus humans makes curation complex, slow, and costly.

That sounds about right, or does it?

Patrick Roland, May 2, 2018

Fake News: Magnetic Content with Legs

April 30, 2018

The fake news phenomenon might be in the public eye more, thanks to endless warnings and news stories, but that has not dulled its impact. In fact, this shadowy form of propaganda seems to flourish under the spotlight, according to a recent ScienceNews story, “On Twitter, The Lure of Fake News is Stronger than Truth.”

According to the research:

“Discussions of false stories tended to start from fewer original tweets, but some of those retweet chains then reached tens of thousands of users, while true news stories never spread to more than about 1,600 people. True news stories also took about six times as long as false ones to reach 1,500 people. Overall, fake news was about 70 percent more likely to be retweeted than real news.”

That’s a shocking set of statistics. However, anyone quick to blame spambots for this amazing proliferation of fake news needs to give it a second look. According to the research, bots are not as much to blame for this trend as humans. This is actually good news. Ideally, changes can be made on the personal level and we can eventually stamp out this misleading trend of fake news.

Patrick Roland, April 30, 2018

Text Classification: Established Methods Deliver Good Enough Results

April 26, 2018

Short honk: If you are a cheerleader for automatic classification of text centric content objects, you are convinced that today’s systems are home run hitters. If you have some doubts, you will want to scan the data in “Machine Learning for Text Categorization: Experiments Using Clustering and Classification.” The paper was free when I checked at 9:20 am US Eastern time. For the test sets, Latent Dirichlet Allocation performed better than other widely used methods. Worth a look. From my vantage point in Harrod’s Creek, automated processes, regardless of method, perform in a manner one expert explained to me at Cebit several years ago: “Systems are good enough.” Improvements are now incremental but, like getting the last few percentage ticks of pollutants out of a catalytic converter, an expensive and challenging engineering task.

Stephen E Arnold, April 26, 2018

Picking and Poking Palantir Technologies: A New Blood Sport?

April 25, 2018

My reaction to “Palantir Has Figured Out How to Make Money by Using Algorithms to Ascribe Guilt to People, Now They’re Looking for New Customers” is a sigh and a groan.

I don’t work for Palantir Technologies, although I have been a consultant to one of its major competitors. I do lecture about next generation information systems at law enforcement and intelligence centric conferences in the US and elsewhere. I also wrote a book called “CyberOSINT: Next Generation Information Access.” That study has spawned a number of “experts” who are recycling some of my views and research. A couple of government agencies have shortened my word “cyberosint” to “cyint.” In a manner of speaking, I have an information base which can be used to put the actions of companies which offer services similar to those available from Palantir in perspective.

The article in Boing Boing falls into the category of “yikes” analysis. Suddenly, it seems, people have discovered the idea that cook book mathematical procedures can be used to make sense of a wide range of data. Let me assure you that this is not a new development, and Palantir is definitely not the first of the companies developing applications for law enforcement and intelligence professionals to land customers in financial and law firms.

A Palantir bubble gum card shows details about a person of interest and links to underlying data from which the key facts have been selected. Note that this is from an older version of Palantir Gotham. Source: Google Images, 2015

Decades ago, a friend of mine (Ev Brenner, now deceased) was one of the pioneers using technology and cook book math to make sense of oil and gas exploration data. How long ago? Think 50 years.

The focus of “Palantir Has Figured Out…” is that:

Palantir seems to be the kind of company that is always willing to sell magic beans to anyone who puts out an RFP for them. They have promised that with enough surveillance and enough secret, unaccountable parsing of surveillance data, they can find “bad guys” and stop them before they even commit a bad action.

Okay, that sounds good in the context of the article, but Palantir is just one vendor responding to the need for next generation information access tools from many commercial sectors.

Real Time Translation: Chatbots Emulate Sci Fi

April 16, 2018

The language barrier is still one of the world’s major problems. Translation software, such as Google Translate, is accurate, but it still makes mistakes that native speakers are needed to correct. Instantaneous translation is still a pipe dream, but the technology is improving with each new development. Mashable shares a current translation innovation and it belongs to Google: “Google Pixel Buds Vs. Professional Interpreters: Which Is More Accurate?”

Apple angered many devout users when it deleted the headphone jack on phones, instead replacing it with Bluetooth headphones called AirPods. They have the same minimalist sleek design as other Apple products, but Google’s Pixel Buds are far superior to them because of real time translation, or so we are led to believe. Author Raymond Wong tested the Pixel Buds translation features at the United Nations to see how they fared against professional translators. He and his team tested French, Arabic, and Russian. The Pixel Buds did well with simple conversations, but certain words and phrases caused errors.

One hilarious example was when Google translated the Arabic for “I want to eat salad” to “I want to eat power” in English. When it comes to real time translation, the experts are still the best because they can understand the context and other intricacies, such as tone, that come with human language. The professional translators liked the technology, but it still needs work:

“Ayad and Ivanova both agreed that Pixel Buds and Google Translate are convenient technologies, but there’s still the friction of holding out a Pixel phone for the other person to talk into. And despite the Pixel Buds’ somewhat speedy translations, they both said it doesn’t compare to a professional conference interpreters, who can translate at least five times faster Google’s cloud.”

Keep working on those foreign language majors, kids. In my view, marketing noses in front of products that deliver.

Whitney Grace, April 17, 2018

Fake Fighters Flourish: Faux or No?

April 16, 2018

An article at Buyers Meeting Point draws our attention to a selection of emerging tools meant to stem the tsunami of false information online. In “Will New Startup NewsGuard Address Fake News in Internet Research?” editor Kelly Barner cites an article in the Wall Street Journal by NewsGuard creator L. Gordon Crovitz when she describes:

“The premise of the NewsGuard value proposition is interesting – Crovitz detailed the challenges caused by what has become a ‘news supply chain’. In many cases, we don’t get our news directly from the publisher, like we did in the olden days of newspapers. Instead we get news from another platform that is probably not dedicated to news: Google, YouTube, Facebook, Twitter, etc. This obscures our awareness of the actual source and increases the risk of reading and sharing fake news. NewsGuard, set to be released in advance of the Midterm elections this November, will charge platforms – not publishers – to rate the reliability of the news sources running content on their site. ‘Instead of black-box algorithms, NewsGuard will use human beings to rate news brands Green, Yellow or Red depending on whether they are trying to produce real journalism, fail to disclose their interests, or are intentional purveyors of fake news.’ (WSJ, 3/4/2018). The largest investor in NewsGuard is Publicis Groupe, a France-based multi-national advertising and public relations agency. According to the Commentary piece, the ratings will be based on both NewsGuard’s experts and wisdom of the crowd. We are all wise to be concerned about the fake news in our midst. Is this the right solution?”

Good question. There are several competing AI tools designed to root out fake news, and the article lists Factmata, Storyzy, Trive, and Our.News, among others, as examples. (See the piece for more details.) Our primary question, however, remains—do they work?

The post reminds us that nothing can really replace critical thinking skills. As Barner concludes, “readers and researchers bear the ultimate responsibility for the information and sources they cite.” Indeed.

Cynthia Murrell, April 16, 2018

CyberOSINT: Next Generation Information Access Explains the Tech Behind the Facebook, GSR, Cambridge Analytica Matter

April 5, 2018

In 2015, I published CyberOSINT: Next Generation Information Access. This is a quick reminder that the profiles of the vendors who have created software systems and tools for law enforcement and intelligence professionals remain timely.

The 200 page book provides examples, screenshots, and explanations of the tools which are available to analyze social media information. The book is the most comprehensive run down of the open source, commercial, and cloud based systems which can make sense of social media data, lawful intercept data, and general text and imagery content.

Companies described in this collection of “tools” include:

  • Cyveillance (now LookingGlass)
  • Decisive Analytics
  • IBM i2 (Analysts Notebook)
  • Geofeedia
  • Leidos
  • Palantir Gotham
  • and more than a dozen other vendors of commercial and open source, high impact cyberOSINT tools.

The book is available for $49. Additional information is available on my Xenky.com Web site. You can buy the PDF book online at this link: gum.co/cyberosint.

Get the CyberOSINT monograph. It’s the standard reference for practical and effective analysis, text analytics, and next generation solutions.

Stephen E Arnold, April 5, 2018

Insight into the Value of Big Data and Human Conversation

April 5, 2018

Big data and AI have been tackling tons of written material for years. But actual spoken human conversation has been largely overlooked in this world, mostly due to the difficulty of collecting this information. However, that is on the cusp of changing, as we discovered from a white paper from the Business and Local Government Resource Center, “The SENSEI Project: Making Sense of Human Conversations.”

According to the paper:

“In the SENSEI project we plan to go beyond keyword search and sentence-based analysis of conversations. We adapt lightweight and large coverage linguistic models of semantic and discourse resources to learn a layered model of conversations. SENSEI addresses the issue of multi-dimensional textual, spoken and metadata descriptors in terms of semantic, para-semantic and discourse structures.”

While some people are excited about the potential for advancement this kind of big data research presents, others are a little more nervous; for example, one or two of the 87 million individuals whose Facebook data found its way into the capable hands of GSR and Cambridge Analytica.

In fact, there is a growing movement, according to the Guardian, to scale back big data intrusion. What makes this interesting is that advocates are demanding companies that harvest our information for big data purposes give some of that money back to the people from whom the information originates, not unlike how songwriters are given royalties every time their music is used for film or television. Putting a financial stipulation on big data collection could cause SENSEI to tap its brake pedal. Maybe?

Patrick Roland, April 5, 2018
