AI Will Not Replace Lawyers, Not Yet

May 9, 2017

Robot or AI lawyers may be effective at locating relevant cases for reference, but they are far from replacing lawyers, who still need to go to court and represent clients.

ReadWrite, in a recently published analytical article titled “Look at All the Amazing Things AI Can (and Can’t Yet) Do for Lawyers,” says:

Even if AI can scan documents and predict which ones will be relevant to a legal case, other tasks such as actually advising a client or appearing in court cannot currently be performed by computers.

The author further explains what the present generation of AI tools and robots actually does: they merely find relevant cases based on indexing and keywords, a task that used to be time-consuming and cumbersome for humans. Thus, what robots do is eliminate the tedious work once performed by interns or lower-level employees. Lawyers still need to collect evidence, prepare the case, and argue in court to win it. The robots are coming, but only for the lower-level jobs, not for the lawyers’.

Vishal Ingole, May 9, 2017

Salesforce Einstein and Enterprise AI

May 5, 2017

One customer-relationship-management (CRM) firm is determined to leverage the power of natural language processing within its clients’ organizations. VentureBeat examines “What Salesforce Einstein Teaches Us About Enterprise AI.” The company positions its AI tool as a layer within its “Clouds” that brings the AI magic to CRM. It vows that its some 150,000 existing Salesforce customers can deploy Einstein quickly and easily.

Salesforce has invested much in the project, having snapped up RelateIQ for $390 million, BeyondCore for $110 million, PredictionIO for $58 million, and MetaMind for an undisclosed sum. Competition is fierce in this area, but the company is very pleased with the results so far. Writer Mariya Yao cites Salesforce chief scientist Richard Socher as she explains:

The Salesforce AI Research team is innovating on a ‘joint many-task’ learning approach that leverages transfer learning, where a neural network applies knowledge of one domain to other domains. In theory, understanding linguistic morphology should also accelerate understanding of semantics and syntax.

In practice, Socher and his deep learning research team have been able to achieve state-of-the-art results on academic benchmark tests for named entity recognition (identifying key objects, locations, and persons) and semantic similarity (identifying words and phrases that are synonyms). Their approach can solve five NLP tasks — chunking, dependency parsing, semantic relatedness, textual entailment, and part of speech tagging — and also builds in a character model to handle incomplete, misspelled, or unknown words.

Socher believes that AI researchers will achieve transfer learning capabilities in more comprehensive ways in 2017 and that speech recognition will be embedded in many more aspects of our lives. ‘Right now, consumers are used to asking Siri about the weather tomorrow, but we want to enable people to ask natural questions about their own unique data.’
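The character-model idea the quote mentions — handling incomplete, misspelled, or unknown words — can be illustrated with a toy sketch. This is a generic character n-gram backoff, not Salesforce’s implementation; the vocabulary and function names are made up for the example:

```python
from collections import Counter

def char_ngrams(word, n=3):
    """Split a word into overlapping character n-grams, with boundary markers."""
    padded = f"#{word.lower()}#"
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))

def similarity(a, b):
    """Jaccard-style overlap between two n-gram profiles (0.0 to 1.0)."""
    ca, cb = char_ngrams(a), char_ngrams(b)
    shared = sum((ca & cb).values())
    total = sum((ca | cb).values())
    return shared / total if total else 0.0

def nearest_known(word, vocabulary):
    """Map an out-of-vocabulary word to its closest known word."""
    return max(vocabulary, key=lambda known: similarity(word, known))

vocab = ["parsing", "chunking", "entailment", "tagging"]
print(nearest_known("parsng", vocab))   # the misspelling maps back to "parsing"
```

Because the misspelled word still shares most of its character n-grams with the correct form, the system can recover a sensible reading instead of failing on the unknown token.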

That would indeed be helpful. The article goes on to discuss the potentials for NLP in the enterprise and emphasizes the great challenge of implementing solutions into a company’s workflow. See the article for more discussion. Based in San Francisco, Salesforce was launched in 1999 by a former Oracle executive.

Cynthia Murrell, May 5, 2017

Amazon Aims to Ace the Chatbots

April 26, 2017

Amazon aims to insert itself into every aspect of daily life, and the newest way it does so is with the digital assistant Alexa.  Reuters reports, in “Amazon Rolls Out Chatbot Tools In Race To Dominate Voice-Powered Tech,” how Amazon plans to expand Alexa’s development.  The retail giant recently released the technology behind Alexa to developers, so they can build chat features into apps.

Amazon is eager to gain dominance in voice-controlled technology.  Apple and Google both reign supreme when it comes to talking computers, chatbots, and natural language processing.  Amazon has a huge reach, perhaps even greater than Apple and Google, because people have come to rely on it for shopping.  Chatbots have a notorious history for being useless and Microsoft’s Tay even turned into a racist, chauvinist program.

The new development tool is called Amazon Lex, which is hosted in the cloud.  Alexa is already deployed in millions of homes, and it is fed a continuous data stream that is crucial to the AI’s learning:

Processing vast quantities of data is key to artificial intelligence, which lets voice assistants decode speech. Amazon will take the text and recordings people send to apps to train Lex – as well as Alexa – to understand more queries.

That could help Amazon catch up in data collection. As popular as Amazon’s Alexa-powered devices are, such as Echo speakers, the company has sold an estimated 10 million or more.

Amazon Alexa is a competent digital assistant, able to respond to vocal commands; it even offers voice-only shopping via Amazon.  As noted, Alexa’s power rests in its data collection and its ability to learn natural language processing.  Bitext uses a similar method, but instead uses trained linguists to build its analytics platform.

Whitney Grace, April 26, 2017

AI Might Not Be the Most Intelligent Business Solution

April 21, 2017

Big data was the buzzword a few years ago, but now artificial intelligence is the tech jargon of the moment.  While big data was a more plausible solution for companies trying to mine information from their digital data, AI is proving difficult to implement.  Forbes discusses AI difficulties in the article, “Artificial Intelligence Is Powerful Stuff, But Difficult To Scale To Real-Life Business.”

There is a lot of excitement brewing around machine learning and AI business possibilities; while the technology is ready for use, workers are not.  People need to be prepped and taught how to use AI and machine learning technology; without the proper lessons, the rollout will hurt a company’s bottom line.  The problem comes from companies rolling out digital solutions without changing the way they conduct business.  Workers cannot just adapt to changes instantly.  They need to feel like they are part of the solution, instead of being shifted to the side in the latest technological trend.

Dr. David Bray, CIO for the Federal Communications Commission, said:

The growth of AI may shift thinking in organizations. ‘At the end of the day, we are changing what people are doing,’ Bray says. ‘You are changing how they work, and they’re going to feel threatened if they’re not bought into the change. It’s almost imperative for CIOs to really work closely with their chief executive officers, and serve as an internal venture capitalist, for how we bring data, to bring process improvements and organizational performance improvements – and work it across the entire organization as a whole.’

Artificial intelligence and machine learning are an upgrade not only to a company’s technology but also to how a company conducts business.  Business processes will need to be updated to integrate the new technology, as will the way workers use and interface with it.  Businesses will continue to face problems if they think that changing technology, but not their procedures, is the final solution.

Whitney Grace, April 21, 2017

Image Search: Biased by Language. The Fix? Use Humans!

April 19, 2017

Houston, we (male, female, uncertain) have a problem. Bias is baked into some image analysis and just about every other type of smart software.

The culprit?

Numerical recipes.

The first step in solving a problem is to acknowledge that a problem exists. The second step is more difficult.

I read “The Reason Why Most of the Images That Show Up When You Search for Doctor Are White Men.” The headline identifies the problem. However, what does one do about biases rooted in human utterance?

My initial thought was to eliminate human utterances. No fancy dancing required. Just let algorithms do what algorithms do. I realized that although this approach has a certain logical completeness, implementation may meet with a bit of resistance.

What does the write up have to say about the problem? (Remember. The fix is going to be tricky.)

I learned:

Research from Princeton University suggests that these biases, like associating men with doctors and women with nurses, come from the language taught to the algorithm. As some data scientists say, “garbage in, garbage out”: Without good data, the algorithm isn’t going to make good decisions.

Okay, right coast thinking. I feel more comfortable.

What does the write up present as wizard Aylin Caliskan’s view of the problem? A postdoctoral researcher seems to be a solid choice for a source. I assume the wizard is a human, so perhaps he, she, it is biased? Hmmm.

I highlighted in true blue several passages from the write up / interview with he, she, it. Let’s look at three statements, shall we?

Regarding genderless languages like Turkish:

when you directly translate, and “nurse” is “she,” that’s not accurate. It should be “he or she or it” is a nurse. We see that it’s making a biased decision—it’s a very simple example of machine translation, but given that these models are incorporated on the web or any application that makes use of textual data, it’s the foundation of most of these applications. If you search for “doctor” and look at the images, you’ll see that most of them are male. You won’t see an equal male and female distribution.

If accurate, this observation means that the “fix” is going to be difficult. Moving from a language without gender identification to a language with gender identification requires changing the target language. Easy for software. Tougher for a human. If the language and its associations are anchored in the brain of a target language speaker, change may be, how shall I say it, a trifle difficult. My fix looks pretty good at this point.

And what about images and videos? I learned:

Yes, anything that text touches. Images and videos are labeled so they can be used on the web. The labels are in text, and it has been shown that those labels have been biased.

And the fix is a human doing the content selection, indexing, and dictionary tweaking. Not so fast: indexing with humans is very expensive. Don’t believe me? Download 10,000 Wikipedia articles and hire some folks to index them from a controlled term list humans set up. Let me know if you can hit $17 per indexed article. My hunch is that you will exceed this target by several orders of magnitude. (Want to know where the number comes from? Contact me and we can discuss a for-fee deal for this high-value information.)

How does the write up solve the problem? Here’s the capper:

…you cannot directly remove the bias from the dataset or model because it’s giving a very accurate representation of the world, and that’s why we need a specialist to deal with this at the application level.

Notice that my solution is to eliminate humans entirely. Why? The pipe dream of humans doing indexing won’t fly due to [a] time, [b] cost, [c] the massive flows of data to index. Forget the mother of all bombs.

Think about the mother of all indexing backlogs. The gap would make the Modern Language Association’s “gaps” look like a weekend catch-up party. Is this a job for the operating system for machine intelligence?

Stephen E Arnold, April 19, 2017

Watson and Block: Tax Preparation and Watson

April 19, 2017

Author’s Note:

Tax season is over. I am now releasing a write-up I did in the high-pressure run-up to tax filing day, April 18, 2017, as this blog post. I want to comment on one marketing play IBM used in 2016 and 2017 to make Watson its Amazon Echo or its Google Pixel. IBM has been working overtime to come up with clever, innovative, effective ways to sell Watson, a search-and-retrieval system spiced with home-brew code, algorithms which make the system “smart,” acquired technology from outfits like Vivisimo, and some free and open source search software.

IBM Watson is being sold to Wall Street and stakeholders as IBM’s next, really big thing. With years of declining revenue under its belt, the marketing of Watson as “cognitive software” is different from the marketing of most other companies pitching artificial intelligence.

One unintended consequence of IBM’s saturation advertising of its Watson system is making the word “cognitive” shorthand for software magic. The primary beneficiaries of IBM’s relentless use of the word “cognitive” have been its competitors. IBM’s fuzziness and lack of concrete products have allowed companies with modest marketing budgets to pick up the IBM jargon and apply it to their products. Examples include the reworked Polyspot (now doing business as CustomerMatrix) and dozens of enterprise search vendors; for example, LucidWorks (Really?), Attivio, Microsoft, Sinequa, and Squirro (yep, Squirro). IBM makes it possible for competitors to slap the word cognitive on their products and compete against IBM’s Watson. I am tempted to describe IBM Watson as a “straw man,” but it is a collection of components, not a product.

Big outfits like Amazon have taken a short cut to the money machine. The Echo and Dot sell millions of units and drive sales of Amazon’s music and hard goods sales. IBM bets on a future hint of payoff; for example, Watson may deliver a “maximum refund” for an H&R Block customer. That sounds pretty enticing. My accountant, beady eyed devil if there ever were one, never talks about refunds. He sticks to questions about where I got my money and what I did with it. If anything, he is a cloud of darkness, preferring to follow the IRS rules and avoid any suggestion of my getting a deal, a refund, or a free ride.

Below is the story I wrote a month ago shortly after I spent 45 minutes chatting with three folks who worked at the H&R Block office near my home in rural Kentucky. Have fun reading.

Stephen E Arnold, April 18, 2017

IBM Watson is one of Big Blue’s strategic imperatives. I have enjoyed writing about Watson, mixing up my posts with the phrase “Watson weakly” instead of “Watson weekly.” Strategic imperatives are supposed to generate new revenue to replace the loss of old revenues. The problem IBM has to figure out how to solve is pace. Will IBM Watson and other strategic imperatives generate sustainable, substantial revenue quickly enough to keep the company’s revenue healthy?

The answer seems to be, “Maybe, but not very quickly.” According to IBM’s most recent quarterly report, Big Blue has now reported declining revenues for 20 consecutive quarters. Yep, that’s five years. Some stakeholders are patient, but IBM’s competitors are thrilled with IBM’s strategic imperatives. For the details of the most recent IBM financials, navigate to “IBM Sticks to Its Forecast Despite Underwhelming Results.” Kicking the can down the road is fun for a short time.

The revenue problem is masked by promises about the future. Watson, the smart software, is supposed to be a billion-dollar baby who will end up with a $10 billion revenue stream any day now. IBM’s stock buybacks and massive PR campaigns have helped the company sell its vision of a bright new Big Blue. But selling software and consulting is different from selling hardware. In today’s markets, services and consulting are tough businesses. Companies struggling to gain traction must compete with outfits like Gerson Lehrman, unemployed senior executives hungry for work, new graduates willing to do MBA chores for a pittance, and vendors like Elastic, a search vendor which sells add-ons to open source software and consulting for those who need it. IBM is trying almost everything. Still, those declining revenues tell a somewhat dismal tale.

I assume you have watched the Super Bowl ads if not the game. I just watched the ads. I was surprised to see a one minute, very expensive, and somewhat ill conceived commercial for IBM Watson and H&R Block, the walk in store front tax preparer.

The Watson-Block Super Bowl ad featured this interesting image: A sled going downhill. Was this a Freudian slip about declining revenues?


Does it look to you as if the sled is speeding downhill? Is this a metaphor for IBM Watson’s prospects in the tax advisory business?

One of IBM’s most visible promotions of its company-saving, revenue-gushing dreams is IBM Watson. You may have seen the Super Bowl ad about Watson providing H&R Block with a sure-fire way to kill off pesky competitors. How has that worked out for H&R Block?


Smart Software, Dumb Biases

April 17, 2017

Math is objective, right? Not really. Developers of artificial intelligence systems, what I call smart software, rely on what they learned in math school. If you have flipped through math books ranging from the Googler’s tome on artificial intelligence Artificial Intelligence: A Modern Approach to the musings of the ACM’s journals, you see the same methods recycled. Sure, the algorithms are given a bath and their whiskers are cropped. But underneath that show dog’s sleek appearance is a familiar pooch. K-means. We have k-means. Decision trees? Yep, decision trees.
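For readers who have not flipped through those math books, the recycled workhorse looks something like this — a minimal k-means (Lloyd’s algorithm) in NumPy, a sketch of the textbook method rather than any vendor’s production code:

```python
import numpy as np

def kmeans(points, k, iterations=20, seed=0):
    """Plain-vanilla Lloyd's algorithm: pick k starting centroids, then
    alternate between assigning points and recomputing centroid means."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Move each centroid to the mean of its assigned points.
        for j in range(k):
            if (labels == j).any():
                centroids[j] = points[labels == j].mean(axis=0)
    return labels, centroids

# Two obvious clusters; k-means separates them cleanly.
data = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                 [5.0, 5.0], [5.1, 5.2], [4.9, 5.1]])
labels, _ = kmeans(data, k=2)
print(labels)
```

That is the whole pooch: a distance computation and a mean, repeated. The “bath and cropped whiskers” in commercial systems are mostly initialization tricks and scaling work around this same loop.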

What happens when developers feed content into Rube Goldberg machines constructed of mathematical procedures known and loved by math wonks the world over?

The answer appears in “Semantics Derived Automatically from Language Corpora Contain Human Like Biases.” The headline says it clearly, “Smart software becomes as wild and crazy as a group of Kentucky politicos arguing in a bar on Friday night at 2:15 am.”

Biases are expressed and made manifest.

The article in Science reports with considerable surprise it seems to me:

word embeddings encode not only stereotyped biases but also other knowledge, such as the visceral pleasantness of flowers or the gender distribution of occupations.

Ah, ha. Smart software learns biases. Perhaps “smart” correlates with bias?

The canny whiz kids who did the research crawfish a bit:

We stress that we replicated every association documented via the IAT that we tested. The number, variety, and substantive importance of our results raise the possibility that all implicit human biases are reflected in the statistical properties of language. Further research is needed to test this hypothesis and to compare language with other modalities, especially the visual, to see if they have similarly strong explanatory power.

Yep, nothing like further research to prove that when humans build smart software, “magic” happens. The algorithms manifest biases.
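The paper’s core measurement — an embedding analogue of the IAT — is easy to sketch. The score below follows the general WEAT idea (mean cosine similarity to one attribute set minus the other); the two-dimensional “embeddings” are fabricated for illustration and are not data from the study:

```python
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(w, A, B):
    """WEAT-style score: mean cosine similarity of word vector w to attribute
    set A minus its mean similarity to attribute set B. Positive => leans A."""
    return np.mean([cosine(w, a) for a in A]) - np.mean([cosine(w, b) for b in B])

# Toy 2-d vectors: the first axis loosely encodes gender, the second topic.
male   = [np.array([1.0, 0.1]), np.array([0.9, 0.2])]
female = [np.array([-1.0, 0.1]), np.array([-0.9, 0.2])]
doctor = np.array([0.7, 0.7])    # placed nearer the 'male' direction
nurse  = np.array([-0.7, 0.7])   # placed nearer the 'female' direction

print(association(doctor, male, female) > 0)   # doctor leans male here
print(association(nurse,  male, female) < 0)   # nurse leans female here
```

In real word embeddings trained on web text, occupation vectors end up positioned like these toy ones — which is exactly the bias the researchers replicated.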

What the write up did not address is a method for developing less biased smart software. Is such a method beyond the ken of computer scientists?

To get more information about this question, I asked the world leader in the field of computational linguistics, Dr. Antonio Valderrabanos, the founder and chief executive officer at Bitext. Dr. Valderrabanos told me:

We use syntactic relations among words instead of using n-grams and similar statistical artifacts, which don’t understand word relations. Bitext’s Deep Linguistics Analysis platform can provide phrases or meaningful relationships to uncover more textured relationships. Our analysis will provide better content to artificial intelligence systems using corpuses of text to learn.
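The contrast Dr. Valderrabanos draws — syntactic relations versus n-grams — can be seen in a toy example. The dependency triples below are hand-written for illustration, not output from Bitext’s DLA platform:

```python
def bigrams(tokens):
    """The statistical artifact: adjacent word pairs only."""
    return list(zip(tokens, tokens[1:]))

sentence = "the doctor who treated the patient smiled".split()

# A bigram model sees only neighbours, so the real subject-verb relation
# between 'doctor' and 'smiled' never appears as a pair.
print(("doctor", "smiled") in bigrams(sentence))

# A hand-written dependency analysis captures that relation directly.
dependencies = [
    ("smiled", "nsubj", "doctor"),   # doctor is the subject of smiled
    ("treated", "nsubj", "who"),
    ("treated", "obj", "patient"),
]
print(any(head == "smiled" and dep == "doctor"
          for head, _, dep in dependencies))
```

The relative clause pushes the subject five tokens away from its verb, so any fixed-window statistic misses the relationship that a syntactic parse states explicitly.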

Bitext’s approach is explained in the exclusive interview which appeared in Search Wizards Speak on April 11, 2017. You can read the full text of the interview at this link and review the public information about the breakthrough DLA platform on the Bitext Web site.

It seems to me that Bitext has made linguistics the operating system for artificial intelligence.

Stephen E Arnold, April 17, 2017

A Peek at the DeepMind Research Process

April 14, 2017

Here we have an example of Alphabet Google’s organizational prowess. Business Insider describes how “DeepMind Organises Its AI Researchers Into ‘Strike Teams’ and ‘Frontiers’.” Writer Sam Shead cites a report by Madhumita Murgia in the Financial Times. He writes:

Exactly how DeepMind’s researchers work together has been something of a mystery but the FT story sheds new light on the matter. Researchers at DeepMind are divided into four main groups, including a ‘neuroscience’ group and a ‘frontiers’ group, according to the report. The frontiers group is said to be full of physicists and mathematicians who are tasked with testing some of the most futuristic AI theories. ‘We’ve hired 250 of the world’s best scientists, so obviously they’re here to let their creativity run riot, and we try and create an environment that’s perfect for that,’ DeepMind CEO Demis Hassabis told the FT. […]

DeepMind, which was acquired by Google in 2014 for £400 million, also has a number of ‘strike teams’ that are set up for a limited time period to work on particular tasks. Hassabis explained that this is what DeepMind did with the AlphaGo team, who developed an algorithm that was able to learn how to play Chinese board game Go and defeat the best human player in the world, Lee Se-dol.

Here’s a write-up we did about that significant AlphaGo project, in case you are curious. The creative-riot approach Shead describes is in keeping with Google’s standard philosophy on product development—throw every new idea at the wall and see what sticks. We learn that researchers report on their progress every two months, and team leaders allocate resources based on those reports. Current DeepMind projects include algorithms for healthcare and energy scenarios.

Hassabis launched DeepMind in London in 2010, where offices remain after Google’s 2014 acquisition of the company.

Cynthia Murrell, April 14, 2017

The Algorithm to Failure

April 12, 2017

Algorithms have practically changed the way the world works. However, this nifty code also has limitations that lead to failures.

In a whitepaper titled Failures of Deep Learning, posted on arXiv (hosted by Cornell University), authors Shai Shalev-Shwartz, Ohad Shamir, and Shaked Shammah say:

It is important, for both theoreticians and practitioners, to gain a deeper understanding of the difficulties and limitations associated with common approaches and algorithms.

The whitepaper touches on four pain points of deep learning. The authors propose remedial measures that could possibly overcome these impediments and lead to better AI.

Eminent personalities like Stephen Hawking, Bill Gates, and Elon Musk have, however, warned against advancing AI. Google in the past abandoned robotics as the machines were becoming too intelligent. What remains to be seen is who will win in the end: commercial interests or unfounded fear?

Vishal Ingole, April 12, 2017

Bitext: Exclusive Interview with Antonio Valderrabanos

April 11, 2017

On a recent trip to Madrid, Spain, I was able to arrange an interview with Dr. Antonio Valderrabanos, the founder and CEO of Bitext. The company has its primary research and development group in Las Rozas, the high-technology complex a short distance from central Madrid. The company has an office in San Francisco and a number of computational linguists and computer scientists in other locations. Dr. Valderrabanos worked at IBM in an adjacent field before moving to Novell and then making the jump to his own start up. The hard work required to invent a fundamentally new way to make sense of human utterance is now beginning to pay off.

Antonio Valderrabanos of Bitext

Dr. Antonio Valderrabanos, founder and CEO of Bitext. Bitext’s business is growing rapidly. The company’s breakthroughs in deep linguistic analysis solve many difficult problems in text analysis.

Founded in 2008, the firm specializes in deep linguistic analysis. The systems and methods invented and refined by Bitext improve the accuracy of a wide range of content processing and text analytics systems. What’s remarkable about the Bitext breakthroughs is that the company supports more than 40 different languages, and its platform can support additional languages with sharp reductions in the time, cost, and effort required by old-school systems. With the proliferation of intelligent software, Bitext, in my opinion, puts the digital brains in overdrive. Bitext’s platform improves the accuracy of many smart software applications, ranging from customer support to business intelligence.

In our wide ranging discussion, Dr. Valderrabanos made a number of insightful comments. Let me highlight three and urge you to read the full text of the interview at this link. (Note: this interview is part of the Search Wizards Speak series.)

Linguistics as an Operating System

One of Dr. Valderrabanos’ most startling observations addresses the future of operating systems for increasingly intelligent software and applications. He said:

Linguistic applications will form a new type of operating system. If we are correct in our thought that language understanding creates a new type of platform, it follows that innovators will build more new things on this foundation. That means that there is no endpoint, just more opportunities to realize new products and services.

Better Understanding Has Arrived

Some of the smart software I have tested is unable to understand what seems to be very basic instructions. The problem, in my opinion, is context. Most smart software struggles to figure out the knowledge cloud which embraces certain data. Dr. Valderrabanos observed:

Search is one thing. Understanding what human utterances mean is another. Bitext’s proprietary technology delivers understanding. Bitext has created an easy to scale and multilingual Deep Linguistic Analysis or DLA platform. Our technology reduces costs and increases user satisfaction in voice applications or customer service applications. I see it as a major breakthrough in the state of the art.

If he is right, the Bitext DLA platform may be one of the next big things in technology. The reason? As smart software becomes more widely adopted, the need to make sense of data and text in different languages becomes increasingly important. Bitext may be the digital differential that makes the smart applications run the way users expect them to.

Snap In Bitext DLA

Advanced technology like Bitext’s often comes with a hidden cost. The advanced system works well in a demonstration or a controlled environment. When that system has to be integrated into “as is” systems from other vendors or from a custom development project, difficulties can pile up. Dr. Valderrabanos asserted:

Bitext DLA provides parsing data for text enrichment for a wide range of languages, for informal and formal text and for different verticals to improve the accuracy of deep learning engines and reduce training times and data needs. Bitext works in this way with many other organizations’ systems.

When I asked him about integration, he said:

No problems. We snap in.

I am interested in Bitext’s technical methods. In the last year, he has signed deals with companies like Audi, Renault, a large mobile handset manufacturer, and an online information retrieval company.

When I thanked him for his time, he was quite polite. But he did say, “I have to get back to my desk. We have received several requests for proposals.”

Las Rozas looked quite a bit like Silicon Valley when I left the Bitext headquarters. Despite the thousands of miles separating Madrid from the US, interest in Bitext’s deep linguistic analysis is surging. Silicon Valley has its charms, and now it has a Bitext US office for what may be the fastest growing computational linguistics and text analysis system in the world. Worth watching this company, I think.

For more about Bitext, navigate to the firm’s Web site.

Stephen E Arnold, April 11, 2017
