IBM Debate Contest: Human Judges Are Unintelligent

February 12, 2019

I was a high school debater. I was a college debater. I did extemp. I did an event called readings. I won many cheesy medals and trophies. Also, I have a number of recollections about judges who shafted me and my teammate or just hapless, young me.

I learned:

Human judges mean human biases.

When I learned that the audience voted a human the victor over the Jeopardy-winning, subject-matter-expert-sucking, and recipe-writing IBM Watson, I knew the human penchant for distortion, prejudice, and foul play made an objective, scientific assessment impossible.

Humans may not be qualified to judge state of the art artificial intelligence from sophisticated organizations like IBM.

The rundown and the video of the 25-minute travesty are on display via Engadget, with a nonargumentative written explanation in the write-up “IBM AI Fails to Beat Human Debating Champion.” The real news report asserts:

The face-off was the latest event in IBM’s “grand challenge” series pitting humans against its intelligent machines. In 1996, its computer system beat chess grandmaster Garry Kasparov, though the Russian later accused the IBM team of cheating, something that the company denies to this day — he later retracted some of his allegations. Then, in 2011, its Watson supercomputer trounced two record-winning Jeopardy! contestants.

Yes, past victories.

Now what about the debate and human judges?

My thought is that the dust-up should have been judged by a panel of digital devastators; specifically:

Google DeepMind. DeepMind trashed a human Go player and understands the problems humanoids have being smart and proud.
Amazon SageMaker. This is a system tuned with work for a certain three-letter agency and, therefore, has a Deep Lens eye to spot the truth.
Microsoft Brainwave (remember that?). This is a system which was the first hardware accelerated model to make Clippy the most intelligent “bot” on the planet. Clippy, come back.

Here’s how this judging should have worked.

Each system “learns” what it takes to win a debate, including voice tone, rapport with the judges and audience, and physical gestures (presence).
Each system processes the video, audio, and sentiment expressed when the people in attendance clap, whistle, laugh, subvocalize “What a load of horse feathers,” etc.
Each system generates a score, with 0.000001 the low and 0.999999 the high.
The final tally would be calculated by Facebook FAIR (Facebook AI Research). The reason? Facebook is among the most trusted, socially responsible smart software companies.

The notion of a human judging a machine is what I call “deep stupid.” I am working on a short post about this important idea.

A human judged by humans is neither just nor impartial. Not Facebook FAIR.

An “also participated” award goes to IBM marketing.

IBM snagged an “also participated” medal. Well done.

Stephen E Arnold, February 13, 2019

Written by Stephen E. Arnold · Filed Under AI, Feature, IBM Watson

Stephen E. Arnold monitors search, content processing, text mining and related topics from his high-tech nerve center in rural Kentucky. He tries to winnow the goose feathers from the giblets. He works with colleagues worldwide to make this Web log useful to those who want to go "beyond search". Contact him at sa [at] arnoldit.com. His Web site with additional information about search is arnoldit.com.