Bias: Female Digital Assistant Voices

October 17, 2019

It was a seemingly benign choice based on consumer research, but there is an unforeseen complication. TechRadar considers, “The Problem with Alexa: What’s the Solution to Sexist Voice Assistants?” From smart speakers to cell phones, voice assistants like Amazon’s Alexa, Microsoft’s Cortana, Google’s Assistant, and Apple’s Siri generally default to female voices (and usually sport female-sounding names) because studies show humans tend to respond best to female voices. Seems like an obvious choice—until you consider the long-term consequences. Reporter Olivia Tambini cites a report UNESCO issued earlier this year that suggests the practice sets us up to perpetuate sexist attitudes toward women, particularly subconscious biases. She writes:

“This progress [society has made toward more respect and agency for women] could potentially be undone by the proliferation of female voice assistants, according to UNESCO. Its report claims that the default use of female-sounding voice assistants sends a signal to users that women are ‘obliging, docile and eager-to-please helpers, available at the touch of a button or with a blunt voice command like “hey” or “OK”.’ It’s also worrying that these voice assistants have ‘no power of agency beyond what the commander asks of it’ and respond to queries ‘regardless of [the user’s] tone or hostility’. These may be desirable traits in an AI voice assistant, but what if the way we talk to Alexa and Siri ends up influencing the way we talk to women in our everyday lives? One of UNESCO’s main criticisms of companies like Amazon, Google, Apple and Microsoft is that the docile nature of our voice assistants has the unintended effect of reinforcing ‘commonly held gender biases that women are subservient and tolerant of poor treatment’. This subservience is particularly worrying when these female-sounding voice assistants give ‘deflecting, lackluster or apologetic responses to verbal sexual harassment’.”

So what is a voice-assistant maker to do? Certainly, male voices could be used and are, in fact, selectable options for several models. Another idea is to give users a wide variety of voices to choose from—not just different genders, but different accents and ages, as well. Perhaps the most effective solution would be to use a gender-neutral voice; one dubbed “Q” has now been created, proving it is possible. (You can listen to Q through the article or on YouTube.)

Of course, this and other problems might have been avoided had there been more diversity on the teams behind the voices. Tambini notes that just seven percent of information- and communication-tech patents across G20 countries are generated by women. As more women move into STEM fields, will unintended gender bias shrink as a natural result?

Cynthia Murrell, October 17, 2019

Machine Learning Tutorials

October 16, 2019

Want to know more about smart software? A useful list of instructional, reference, and learning materials appears in “40+ Modern Tutorials Covering All Aspects of Machine Learning.”

Materials range from free books about machine learning to lists of related material from SAP. DarkCyber noted that a short explanation about how to download documents posted on LinkedIn is included. (This says more about LinkedIn’s interesting approach to content than the compiler of the list probably intended.)

Quite a useful roundup.

Stephen E Arnold, October 16, 2019

Chatbot: Baloney Sliced and Served as Steak

October 15, 2019

DarkCyber noted “The Truth about Chatbots: Five Myths Debunked.” Silver bullets are keenly desired. Use smart software to eliminate most of the costs of customer support. (Anyone remember the last time customer support was painless, helpful, and a joy?)

IT Pro Portal seems to be aware that smart software dispensing customer service is in need of a bit of reality-marketing mustard. My goodness. Interesting. What’s next? Straight talk about quantum computing?

The write up identifies five “myths.” Viewing these from some sylvan viewshed, the disabused “myths” are:

  1. You will need multiple bots. Now multiple bots increase the costs of eliminating most humans from customer support and other roles. Yep, expensive.
  2. Humans won’t go away. That means sick days, protests, healthcare, and other peculiarly human costs are here to stay. Shocker! Smart software is not as smart as the pitch decks assert?
  3. Bots can do a lot. View this “myth” in the context of item 1.
  4. Bots require a support staff. Of course not. Buy a bot service and everything is just peachy.
  5. Bots don’t mean lock in.

Now this dose of reality is a presentation of baloney and hand waving.

What is the truth about chatbots? Are they works in progress? Are they cost cutting mechanisms? Are they fairly narrow demonstrations of machine learning?

The reality is that bots, like customer service, are not yet as good as the marketers, PR professionals, and managers of firms selling bots assert.

Think about these five myths. It’s not one bot. It’s multiple bots. Bots can’t do human stuff as well as some humans. Bots do many things not so well. Rely on providers; you can trust vendors, right? Don’t worry about lock in even though the goal of bot providers is to slap on those handcuffs.

To get a glimpse of unadulterated rah rah cheerleading, check out “Robots Are Catching Up to Humans in the Jobs Race.” That write up states:

In real terms, the price for an industrial robot has fallen by more than 60% in 20 years. They also get better as they get cheaper.

What’s not to like? Better, faster, cheaper.

Stephen E Arnold, October 15, 2019

About Those Cut and Paste Smart Software Recipes

October 15, 2019

DarkCyber noted this write up in Vice: “A Code Glitch May Have Caused Errors In More Than 100 Published Studies.” Okay, no big deal. A few errors.

The write up quotes another source as saying:

“This simple glitch in the original script calls into question the conclusions of a significant number of papers on a wide range of topics in a way that cannot be easily resolved from published information because the operating system is rarely mentioned,” the new paper reads. “Authors who used these scripts should certainly double-check their results and any relevant conclusions using the modified scripts in the [supplementary information].”

So what?

if the code led Williams [expert who made the mistake] to wrongly identify the contents of his sample, chemists trying to recreate the molecule to test as a potential cancer drug would be chasing after the wrong compound

So what?

What unknown, unrealized errors exist within the cut and paste world of smart software?

What about errors in warfighting or crime fighting smart systems?

Don’t know?

No one does.

That’s the issue, isn’t it?

Stephen E Arnold, October 15, 2019

Hot Buzzword: Continuous Intelligence

October 11, 2019

No, I don’t know what “continuous intelligence” means. When I worked at Booz, Allen, one of the presidents from that era remarked to me, “I have a sixth sense for great jargon.” That fellow, James Farley, would have embraced “continuous intelligence.” The phrase sounds good. It is metaphorical. It could support a new practice area.

I heard the phrase at the TechnoSecurity & Digital Forensics Conference. I am not sure which session speaker dropped it. Maybe Cisco’s and Coalfire’s? At the time, I noted the phrase but did not think much about it.

This morning it surfaced again in “Clear the Path to Continuous Intelligence with Machine Learning, Consultancy Urges.” Not a Booz, Allen pitch, which is interesting. The jargon outputters are from ThoughtWorks.

The write up defines the phrase “continuous intelligence” this way:

… The continuous intelligence state: This is where CD4ML platform thinking and a data DevOps culture become the norm. This is “continuous delivery for data,” the ThoughtWorks team explains. “As data scientists create more refined and accurate models, they can easily deploy these into production as replacements for prior models. Being able to create products which learn and complete the intelligence cycle in a continuous fashion is what sets this stage apart. The loops become more seamless and most of the hurdles are removed. Loops become tighter and faster with more use and more experimentation, which is a key indicator of the health of intelligence cycle.”

Got it? If not, a mid tier consulting firm will assist you as you travel the learning curve. A conference opportunity? Absolutely.

“Continuous intelligence” has arrived.

Stephen E Arnold, October 11, 2019

The Roots of Common Machine Learning Errors

October 11, 2019

It is a big problem when faulty data analysis underpins big decisions or public opinion, and it is happening more often in the age of big data. Data Science Central outlines several “Common Errors in Machine Learning Due to Poor Statistics Knowledge.” Easy to make mistakes? Yep. Easy to manipulate outputs? Yep. We believe the obvious fix is to make math point and click—let developers decide for a clueless person.

Blogger Vincent Granville describes what he sees as the biggest problem:

“Probably the worst error is thinking there is a correlation when that correlation is purely artificial. Take a data set with 100,000 variables, say with 10 observations. Compute all the (99,999 * 100,000) / 2 cross-correlations. You are almost guaranteed to find one above 0.999. This is best illustrated in my article How to Lie with P-values (also discussing how to handle and fix it.) This is being done on such a large scale, I think it is probably the main cause of fake news, and the impact is disastrous on people who take for granted what they read in the news or what they hear from the government. Some people are sent to jail based on evidence tainted with major statistical flaws. Government money is spent, propaganda is generated, wars are started, and laws are created based on false evidence. Sometimes the data scientist has no choice but to knowingly cook the numbers to keep her job. Usually, these ‘bad stats’ end up being featured in beautiful but faulty visualizations: axes are truncated, charts are distorted, observations and variables are carefully chosen just to make a (wrong) point.”
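
Granville’s arithmetic can be checked on a small scale. The sketch below is an illustration, not his method: it counts the variable pairs in his scenario, then generates independent random variables (so any correlation found is spurious) and reports the largest absolute correlation. The function names, the seed, and the scaled-down size of 200 variables are assumptions chosen so that pure Python finishes quickly.

```python
import itertools
import math
import random


def pair_count(n_variables: int) -> int:
    """Number of distinct variable pairs: n * (n - 1) / 2."""
    return n_variables * (n_variables - 1) // 2


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def max_spurious_correlation(n_variables, n_observations, seed=0):
    """Largest |r| among all pairs of independent random variables."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    data = [[rng.gauss(0.0, 1.0) for _ in range(n_observations)]
            for _ in range(n_variables)]
    return max(abs(pearson(x, y))
               for x, y in itertools.combinations(data, 2))


print(pair_count(100_000))  # 4999950000 pairs in the quoted scenario
print(max_spurious_correlation(200, 10))  # scaled-down demo: 19,900 pairs
```

Even at 200 variables the strongest pair tends to look impressive, although the variables are unrelated by construction. Scaling toward 100,000 variables only adds more pairs in which a fluke can turn up.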

Granville goes on to specify several other sources of mistakes. Analysts sometimes take the accuracy of their data sets for granted, for example, instead of performing a walk-forward test. Relying too much on old standbys like R-squared measures and normal distributions can also lead to errors. Furthermore, he reminds us, scale-invariant modeling techniques must be used when data is expressed in different units (like yards and miles). Finally, one must be sure to handle missing data correctly: do not assume that bridging the gap with an average will produce accurate results. See the post for more explanation on each of these points.
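
The missing-data point is easy to demonstrate. The sketch below is a toy example with made-up numbers, not taken from Granville’s post. It fills gaps with the average of the observed values and compares the sample variance before and after. The helper name `mean_impute` and the data are hypothetical.

```python
import statistics


def mean_impute(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = statistics.mean(observed)
    return [fill if v is None else v for v in values]


# Hypothetical measurements; None marks a missing observation.
raw = [1.0, 3.0, None, 5.0, 7.0, None, 9.0]
observed = [v for v in raw if v is not None]
imputed = mean_impute(raw)

print(statistics.variance(observed))  # 10.0
print(statistics.variance(imputed))   # smaller: the filled values add no spread
```

The mean is unchanged, but the variance shrinks because every filled value sits exactly on the average. Anything computed from that variance, such as a confidence interval, will look tighter than the data justify.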

Cynthia Murrell, October 11, 2019

Higher Education: A Disconnect between Data Analysis and Behavior

October 9, 2019

DarkCyber worked through “Artificial Intelligence and Big Data in Higher Education: Promising or Perilous?” The write up tries to strike a balance between commercial practices and the job universities are supposed to do. Smart software can help an admissions officer determine which students are “really” interested in a particular institution. In view of the payments parents have made to get their children into prestigious universities, we’re just not buying this argument.

We noted this statement about other uses of smart software:

Other uses extend to student support, which for example, makes recommendations on courses and career paths based on how students with similar data profiles performed in the past. Traditionally this was a role of career service officers or guidance counselors, the data-based recommendation service arguably provides better solutions for students. Student support is further elevated by the use of predictive analytics and its potential to identify students who are at risk of failing or dropping out of university. Traditionally, institutions would rely on telltale signs of attendance or falling GPA to assess whether a student is at risk. AI systems allow for the analysis of more granular patterns of the student’s data profile. Real-time monitoring of the student’s risk allows for timely and effective action to be taken.

The indicator of student performance is grades. Maybe one can consider certain extra curricular activities as useful signals.

DarkCyber is not certain that today’s institutions of higher education are much more than loan agency middlemen.

The notion that today’s academic environment will improve when the work falls to adjunct professors who earn less than “regular” professors seems odd. Will poorly paid adjuncts chase down a student who has lousy grades and doesn’t attend lectures or go through the online work? Then will that adjunct or maybe a graduate assistant know what magic words to say to get the student on track?

DarkCyber doubts the present academic environment encourages this type of behavior. At a recent conference, a professor on the program asked me, “Do you think I need to contact an agent to get me more speaking engagements?”

Typical. Students are not a primary concern, it seems.

Stephen E Arnold, October 9, 2019

Amazon AWS, DHS Tie Up: Meaningful or Really Meaningful?

October 7, 2019

In my two lectures at the TechnoSecurity & Digital Forensics conference in San Antonio last week, my observations about Amazon AWS and the US government generated puzzled faces. Let’s face it. Amazon means a shopping service for golf shirts and gym wear.

I would like to mention — very, very briefly because interest in Amazon’s non shopping activities is low among some market sectors — “DHS to Deploy AWS-Based Biometrics System.” The deal is for Homeland Security:

to deploy a cloud-based system that will process millions of biometrics data and support the department’s efforts to modernize its facial recognition and related software.

The system will run on the AWS GovCloud platform. Amazon snagged this deal from the incumbent Northrop Grumman. AWS takes over the program in 2021. DarkCyber estimates that the contract will be north of $80 million, excluding ECOs and scope changes.

This is not a new biometrics system. It’s been up and running since the mid 1990s. What’s interesting is that the seller of golf shirts displaced one of the old line vendors upon which the US government has traditionally relied.

DarkCyber finds this suggestive which is a step toward really meaningful. Watch for “Dark Edge: Amazon Policeware”. It will be available in the next few months.

Stephen E Arnold, October 7, 2019

Google: Practical, Pragmatic, and Logical

October 3, 2019

I am not sure if this news item about the GOOG is accurate. Nevertheless, it does provide a Googley solution to a thorny problem; namely, where can one get images with which to train a facial recognition system.

One could scrape Ancestry.com. One could scrape Yandex Images. One could retain the enterprising Yahoo engineer who hacked accounts for interesting images.

Or

One could snap pix of homeless people. Atlanta. Hmmm. Atlanta?

The allegedly accurate factoids appear in “Google Contractors Reportedly Targeted Homeless People for Pixel 4 Facial Recognition.” I noted:

a Google contractor may be using some questionable methods to get those facial scans, including targeting groups of homeless people and tricking college students who didn’t know they were being recorded. According to several sources who allegedly worked on the project, a contracting agency named Randstad sent teams to Atlanta explicitly to target homeless people and those with dark skin, often without saying they were working for Google, and without letting on that they were actually recording people’s faces.

Legal? Illegal? I don’t care.

The idea and the execution are troubling.

If true, classy. Like the yacht death involving drugs and an alleged person for hire. Somehow Googley.

Stephen E Arnold, October 3, 2019

Microsoft Finds the UK an AI Loser

October 1, 2019

DarkCyber was surprised to learn that the United Kingdom is a loser when it comes to smart software. This conclusion will be a surprise to those in UK universities engaged in artificial intelligence research and development.

“New Microsoft Report Claims U.K. Is Behind The Rest Of The World On AI” alleges that:

British organizations risk being overtaken by their global counterparts unless the use of artificial intelligence (AI) technology is accelerated.

The full report is available at this link.

Much of Google’s smart software shares some DNA with the DeepMind outfit. Cambridge University, a reasonably good school, has been cranking out smart software luminaries for decades.

What’s Microsoft’s agenda?

That’s an easy question to answer.

The Microsoft report wants the UK to use more Microsoft. The Redmond giant needs 80 pages and an 80 gigabyte file to make its point. Terse. Tasty. Terrific. Nope.

Will this approach cause a spike in the UK grabbing more Microsoft goodness?

Yeah, well, maybe. But the report seems to have an agenda; specifically, making the point that the UK should use more Microsoft and less of the “other guy’s technology.” The other guy may be none other than Google. Microsoft wants AI to work for everyone (page 39). The “other guy” may be less catholic.

Stephen E Arnold, October 2, 2019
