Bias in Biometrics
August 26, 2020
How can we solve bias in facial recognition and other AI-powered biometric systems? We humans could try to correct for it, but guess where AI learns its biases—yep, from us. Researcher Samira Samadi explored whether using a human evaluator would make an AI less biased or, perhaps, even more so. We learn of her project and others in BiometricUpdate.com’s article, “Masks Mistaken for Duct Tape, Researchers Experiment to Reduce Human Bias in Biometrics.” Reporter Luana Pascu writes:
“Curious to understand if a human evaluator would make the process fair or more biased, Samadi recruited users for a human-user study. She taught them about facial recognition systems and how to make decisions about system accuracy. ‘We really tried to imitate a real-world scenario, but that actually made it more complicated for the users,’ Samadi said. The experiment confirmed the difficulty in finding an appropriate dataset with ethically sourced images that would not introduce bias into the study. The research was published in a paper called A Human in the Loop is Not Enough: The Need for Human-Subject Experiments in Facial Recognition.”
Many other researchers are studying the bias problem. One NIST report found that much of the software tested produced a 10-fold to 100-fold increase in the probability of Asian and African American faces being inaccurately recognized (though a few systems showed negligible differences). Meanwhile, a team at Wunderman Thompson Data found tools from big players Google, IBM, and Microsoft to be less accurate than expected. For one thing, the systems had trouble accounting for masks—still a persistent reality as of this writing. The researchers also found gender bias in all three systems, even though the underlying technologies are markedly different.
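How do testers quantify such a differential? One common measure compares false match rates across demographic groups at a fixed decision threshold. Here is a minimal sketch of that arithmetic, with hypothetical impostor-pair scores (all group names and numbers are invented for illustration, not NIST data):

```python
# Hypothetical impostor-pair match scores by demographic group; in a real
# evaluation these come from running an algorithm over a labeled test set.
impostor_scores = {
    "group_a": [0.12, 0.34, 0.81, 0.22, 0.05, 0.91, 0.18, 0.40],
    "group_b": [0.10, 0.15, 0.85, 0.21, 0.30, 0.12, 0.09, 0.11],
}

THRESHOLD = 0.80  # scores at or above this count as a (false) match

def false_match_rate(scores, threshold):
    """Fraction of impostor pairs the system wrongly declares a match."""
    return sum(score >= threshold for score in scores) / len(scores)

rates = {g: false_match_rate(s, THRESHOLD) for g, s in impostor_scores.items()}
print(rates)  # {'group_a': 0.25, 'group_b': 0.125}

# The differential: ratio of the worst group's rate to the best's. Ratios
# of 10 or 100 are the "10-fold to 100-fold" gaps the NIST report describes.
print("differential:", max(rates.values()) / min(rates.values()))  # 2.0
```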
There is reason to hope. Researchers in Durham University’s Computer Science Department managed to reduce racial bias by one percent and improve ethnicity accuracy. To achieve these results, the team used a synthesized data set with a greater focus on feature identification. We also learn:
“New software to cut down on demographic differences in face biometric performance has also reached the market. The ethnicity-neutral facial recognition API developed by AIH Technology is officially available in the Microsoft Azure Marketplace. In March, the Canadian company joined the Microsoft Partners Network (MPN) and announced the plans for the global launch of its Facial-Recognition-as-a-Service (FRaaS).”
Bias in biometrics, and AI in general, is a thorny problem with no easy solution. At least now people are aware of the issue and bright minds are working to solve it. Now, if only companies would be willing to delay profitable but problematic implementations until solutions are found. Hmmm.
Cynthia Murrell, August 26, 2020
Informatica: An Old Dog Is Trying to Learn New Tricks?
August 20, 2020
Old dogs. Many people have to pause a moment when standing. Balancing is not the same when one is getting old. Others have to extend an arm, knee, or finger slowly. Joints? Don’t talk about those points of failure to a former athlete. Can bee pollen, a vegan diet, a training session with Glennon Doyle, or an acquisition do the trick?
“Informatica Buys AI Startup for Entity and Schema Matching” explains a digital rejuvenation. The article reports:
Informatica’s latest acquisition extends machine learning capabilities into matching of data entities and schemas.
Entities and schemas are important when fiddling with data. I want to point out that Informatica was founded in 1993 and has been in the data entities and schema business for more than a quarter century. Obviously the future is arriving at the venerable software development company.
The technology employed by Green Bay Technologies is what the article calls “Random Forest” machine learning. The article explains that Green Bay’s method possesses:
the ability to handle more diverse data across different domains, including semi-structured and unstructured data, and a crowd-sourcing approach that improves performance.
The Green Bay method employs:
a machine learning approach where multiple decision trees are run, and then subjected to a crowd sourced consensus process to identify the best results. It is a supervised approach where models are auto generated after the user applies some declarative rules – that is, he or she labels a sample set of record pairs, and from there the system infers “blocking rules” to build the models.
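That description maps onto garden-variety supervised entity matching: label some record pairs, reduce each pair to similarity features, and let an ensemble of decision trees vote. A minimal sketch, assuming scikit-learn and invented toy features (Green Bay’s actual blocking rules and feature set are not disclosed in the write up):

```python
from sklearn.ensemble import RandomForestClassifier

# Each row is a candidate record pair reduced to similarity features,
# e.g. (name_similarity, address_similarity, phone_exact_match).
pair_features = [
    [0.95, 0.90, 1],  # near-identical records
    [0.20, 0.10, 0],  # clearly different records
    [0.88, 0.75, 1],
    [0.35, 0.40, 0],
    [0.91, 0.60, 1],
    [0.15, 0.85, 0],
]
# Labels supplied by a human: 1 = same real-world entity, 0 = different.
labels = [1, 0, 1, 0, 1, 0]

# The forest of decision trees; the majority vote across trees stands in
# for the consensus step the article describes.
matcher = RandomForestClassifier(n_estimators=100, random_state=0)
matcher.fit(pair_features, labels)

# Score a new candidate pair: probabilities for (different, same entity).
print(matcher.predict_proba([[0.80, 0.70, 1]]))
```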
Informatica will add Green Bay’s capabilities to its existing smart software engine called CLAIRE.
The write up does not dig into issues related to performance, overfitting, or dealing with rare outcomes or predictors.
Glennon Doyle does not dwell on her flaws either.
Stephen E Arnold, August 20, 2020
Predictive Analytics: A Time and a Place, Not Just in LE?
August 17, 2020
The concept seems sound: analyze data from past crimes to predict future crimes and stop them before they happen. In practice, however, the reality is not so simple, as Popular Mechanics explains in “Why Hundreds of Mathematicians Are Boycotting Predictive Policing.” Academic mathematicians are in a unique position—many were brought into the development of predictive policing algorithms in 2016 by The Institute for Computational and Experimental Research in Mathematics (ICERM). One of the workshop partners, PredPol, makes and sells predictive policing tools. Reporter Courtney Linder informs us:
“Several prominent academic mathematicians want to sever ties with police departments across the U.S., according to a letter submitted to Notices of the American Mathematical Society on June 15. The letter arrived weeks after widespread protests against police brutality, and has inspired over 1,500 other researchers to join the boycott. These mathematicians are urging fellow researchers to stop all work related to predictive policing software, which broadly includes any data analytics tools that use historical data to help forecast future crime, potential offenders, and victims. … Some of the mathematicians include Cathy O’Neil, author of the popular book Weapons of Math Destruction, which outlines the very algorithmic bias that the letter rallies against. There’s also Federico Ardila, a Colombian mathematician currently teaching at San Francisco State University, who is known for his work to diversify the field of mathematics.”
Linder helpfully explains what predictive policing is and how it came about. The embedded four-minute video is a good place to start (interestingly, it is produced from a pro-predictive policing point of view). The article also details why many object to the use of this technology. Chicago’s Office of the Inspector General has issued an advisory with a list of best practices to avoid bias, while Santa Cruz has banned the software altogether. We’re told:
“The researchers take particular issue with PredPol, the high-profile company that helped put on the ICERM workshop, claiming in the letter that its technology creates racist feedback loops. In other words, they believe that the software doesn’t help to predict future crime, but instead reinforces the biases of the officers.”
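The feedback loop is easy to demonstrate in miniature. A deterministic toy model, with every number invented: two districts with identical true crime rates, patrols allocated by recorded crime, and crime recorded only where patrols are looking:

```python
# Two districts with identical true crime rates; district 0 starts with
# more patrols because of historically higher *recorded* crime.
true_rate = [0.10, 0.10]
patrols = [8.0, 2.0]
recorded = [0.0, 0.0]

for day in range(1000):
    for d in (0, 1):
        # Expected crimes recorded = patrols present * true crime rate.
        recorded[d] += patrols[d] * true_rate[d]
    # "Predictive" step: reallocate the 10 patrols by share of recorded crime.
    total = sum(recorded)
    patrols = [10 * recorded[d] / total for d in (0, 1)]

print([round(r) for r in recorded])  # [800, 200]: the 80/20 skew never washes out
```

The historically skewed allocation reproduces itself indefinitely; the “prediction” is the old bias wearing a lab coat.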
Structural bias also comes into play, as does the fact that some crimes go underreported, skewing the data. The piece wraps up by describing how widespread this technology is, an account that can be summarized by quoting PredPol’s own claim that one in 33 Americans is “protected” by its software.
With disciplines from physics to Google’s online advertising built on probabilities and predictive analytics, what is the scientific limit on real-world applications? Subjective perceptions?
Cynthia Murrell, August 17, 2020
Search and Predicting Behavior
August 3, 2020
DarkCyber is interested in predictive analytics. Bayesian and other “statistical methods” are a go-to technique, and they find their way into many smart software systems. Developers rarely explain that these systems share many features and functions. Marketers, usually kept in the dark like mushrooms, are free to formulate an interesting assertion or two.
I read “Google Searches During Pandemic Hint at Future Increase in Suicide,” and I was not sure about the methodology. Nevertheless, the write up provides some insight into what can be wiggled from Google search data.
Specifically, Columbia University experts have concluded that financial distress is “strongly linked to suicide.”
Okay.
I learned:
The researchers used an algorithm to analyze Google trends data from March 3, 2019, to April 18, 2020, and identify proportional changes over time in searches for 18 terms related to suicide and known suicide risk factors.
What algorithm?
The method is described this way:
The proportion of queries related to depression was slightly higher than the pre-pandemic period, and moderately higher for panic attack.
Perhaps the researchers looked at the number of searches and noted the increase? So comparing raw numbers? Tenure tracks and grants await! Because that leap between search and future behavior…
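Whatever the precise algorithm, a proportional comparison is simple to sketch. A minimal illustration with hypothetical weekly query shares for a single term (numbers invented; real Google Trends values are normalized to a 0–100 scale):

```python
# Hypothetical weekly shares of all queries going to one tracked term.
pre_pandemic = [0.0010, 0.0011, 0.0009, 0.0010, 0.0012]
pandemic     = [0.0014, 0.0015, 0.0013, 0.0016, 0.0015]

def mean(values):
    return sum(values) / len(values)

# Proportional change between the period averages: the kind of "slightly"
# or "moderately higher" result the study reports.
change = (mean(pandemic) - mean(pre_pandemic)) / mean(pre_pandemic)
print(f"proportional change: {change:+.1%}")  # +40.4%
```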
Stephen E Arnold, August 3, 2020
Off the Shelf Fancy Math
July 17, 2020
Did you wish you were in the Student Union instead of the engineering lab? Did you long for hanging out with your besties instead of sitting in the library trying to get some answer, any answer to a differential equation? Answer “yes” to either question, and you will enjoy “Algorithms Are Now Commodities.” The write up states:
Algorithms are now like the bolts in a bridge: very important, but nobody talks about them. Today developers talk about story points, features, business logic, etc. Given a well-defined problem, many are now likely to search for an existing package, rather than write code from scratch (I certainly work this way). New algorithms are still being invented, and researchers continue to look for improvements to existing algorithms. This is a niche activity. There are companies where algorithms are not commodities.
The author points out:
Algorithms have not yet completed their journey to obscurity, which has to wait until people can tell computers what they want and not be concerned about the implementation details (or genetic algorithm programming gets a lot better).
With productized code modules proliferating like Star Trek’s Tribbles, math is on the way to the happy condition of a mouse click.
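The commodity point is easy to demonstrate. A small Python illustration: the hand-rolled top-k routine from the engineering lab versus the one-line standard library call that replaces it:

```python
import heapq

response_times_ms = [220, 45, 990, 310, 12, 640, 87, 430]

# Yesterday's homework: find the k largest values by maintaining a
# bounded min-heap yourself.
def k_largest_by_hand(values, k):
    heap = []
    for v in values:
        if len(heap) < k:
            heapq.heappush(heap, v)
        elif v > heap[0]:
            heapq.heapreplace(heap, v)
    return sorted(heap, reverse=True)

# Today's commodity version: one call to the standard library.
print(k_largest_by_hand(response_times_ms, 3))   # [990, 640, 430]
print(heapq.nlargest(3, response_times_ms))      # [990, 640, 430]
```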
One commenter pointed out:
This is as misguided as a chef claiming recipes are now commodities, and the common chef need not be familiar with any. As with cooking, any organized programming of a machine necessarily involves algorithms, although lesser programmers won’t notice them.—Verisimilitude
This individual then pointed out:
The ‘chefs’ in most restaurants heat precooked components of a meal and combine them on the plate. Progress requires being able to treat what used to be important as commonplace.
An interesting topic. Amazon, among others, is pushing hard toward the “off the shelf” and “ready to consume” approach to a number of computer-centric functions.
Push the wrong button, then what? An opportunity to push another button and pay again. Iteration is the name of the game, not figuring out mere exercise problems.
Stephen E Arnold, July 17, 2020
Smart Software and an Intentional Method to Increase Revenue
July 6, 2020
The excellent write up titled “How Researchers Analyzed Allstate’s Car Insurance Algorithm” deserves attention. My suggestion? Read it.
The “how to” information is detailed and instructive. The article reveals the thought process and logic that allow a giant company with “good hands” to manipulate its revenues.
Here’s the most important statement in the article:
In other words, it appears that Allstate’s algorithm built a “suckers list” that would simply charge the big spenders even higher rates.
The information in the article illustrates how difficult it may be for outsiders to figure out how some smart numerical procedures are assembled into “intentional machines.”
The idea is that data allow the implementation of quite simple big ideas in a slick, automated, obfuscated way.
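A hypothetical sketch of how a “simple big idea” can hide inside an automated rate plan. Everything below is invented for illustration; it is not Allstate’s actual model, only the shape of the behavior the researchers describe:

```python
def proposed_premium(current_premium: float, risk_based_premium: float) -> float:
    """Hypothetical 'price optimization' rule: move each customer toward the
    risk-based price, but let big spenders absorb much larger increases.
    Invented for illustration; not Allstate's actual algorithm."""
    BIG_SPENDER_FLOOR = 1500.0  # annual premium marking the "suckers list"
    increase_cap = 0.05 if current_premium < BIG_SPENDER_FLOOR else 0.20
    max_allowed = current_premium * (1 + increase_cap)
    return min(risk_based_premium, max_allowed)

# Two customers whose risk models both call for higher rates.
print(proposed_premium(800.0, 1200.0))   # capped at a 5% bump -> 840.0
print(proposed_premium(2000.0, 2600.0))  # allowed a 20% bump -> 2400.0
```

Different caps for different wallets: the customer already paying the most absorbs the biggest increase. That is the “suckers list” in one function.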
As my cranky grandfather observed, “It all comes down to money.”
Stephen E Arnold, July 6, 2020
Australia: Facial Recognition Diffuses
June 17, 2020
Facial recognition is in the news in the US. High-profile outfits have waved brightly colored virtue signaling flags. The flags indicate, “We are not into this facial recognition thing.” Interesting if accurate. “Facial Surveillance Is Slowly Being Trialed around the Country” provides some information about using smart software to figure out who is who. (Keep in mind that Australia uses the Ripper device to keep humans from becoming a snack for a hungry shark.)
The write up reports:
Facial recognition technology uses artificial intelligence to identify individuals based on their unique facial features and match it with existing photos on a database, such as a police watch list. While it’s already part of our everyday lives, from tagging photos on Facebook to verifying identities at airport immigration, its use by law enforcement via live CCTV is an emerging issue.
That’s the spoiler. Facial recognition is useful and the technology is becoming a helpful tool, like a flashlight or hammer.
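The matching step usually reduces to comparing feature vectors against a database. A minimal sketch, assuming some upstream model has already converted each face into an embedding (vectors and threshold invented for illustration):

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms

# Watch list of precomputed face embeddings (tiny vectors for illustration;
# real systems use hundreds of dimensions from a trained network).
watch_list = {
    "person_a": [0.11, 0.92, 0.31],
    "person_b": [0.78, 0.10, 0.62],
}

MATCH_THRESHOLD = 0.95  # tune this and the error rates move with it

def identify(probe_embedding):
    """Return the best watch-list match above the threshold, else None."""
    name, score = max(
        ((n, cosine_similarity(probe_embedding, e)) for n, e in watch_list.items()),
        key=lambda pair: pair[1],
    )
    return (name, score) if score >= MATCH_THRESHOLD else None

print(identify([0.10, 0.90, 0.33]))  # matches person_a
```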
The article explains that “All states and territories [in Australia] are using facial recognition software.”
Police in all states and territories confirmed to 7.30 they do use facial recognition to compare images in their databases, however few details were given regarding the number of live CCTV cameras which use the technology, current trials and plans for its use in the future.
The interesting factoid in the write up is that real time facial recognition systems are now in use in Queensland and Western Australia and under consideration in New South Wales.
The article points out:
Real-time facial recognition software can simply be added to existing cameras, so it is difficult to tell which CCTV cameras are using the technology and how many around the country might be in operation.
DarkCyber believes that this means real time facial recognition is going to be a feature update, not unlike getting a new swipe action with a mobile phone operating system upgrade.
The article does not identify the vendors providing these features, nor does it offer data about accuracy, costs, and the supporting infrastructure required.
What’s intriguing is that the article raises the thought that Australia might be on the information highway leading to a virtual location where Chinese methods are part of the equipment for living.
Will Australia become like China?
Odd comparison, that. There’s the issue of population, the approach to governance, and the coastline to law enforcement ratio.
The write up also sidesteps the point that facial recognition is a subset of pattern recognition, statistical cross-correlation, and essential plumbing for Ripper.
Who provides the smart software for that shark spotting drone? Give up? Maybe Amazon, the company not selling facial recognition to law enforcement in the US.
Interesting, right?
Stephen E Arnold, June 17, 2020
Rounding Error? Close Enough for Horse Shoes in Michigan
June 9, 2020
Ah, Michigan. River Rouge, the bridge to Canada, and fresh, sparkling water. These cheerful thoughts diminished when I read “Government’s Use of Algorithm Serves Up False Fraud Charges.”
The write up describes a smart system. The smart system was not as smart as some expected. The article states:
While the agency still hasn’t publicly released details about the algorithm, class actions lawsuits allege that the system searched unemployment datasets and used flawed assumptions to flag people for fraud, such as deferring to an employer who said an employee had quit — and was thus ineligible for benefits — when they were really laid off.
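A hypothetical reconstruction of the kind of rule the lawsuits allege. The field names and logic below are invented for illustration; Michigan has not released the actual MiDAS logic:

```python
def flag_for_fraud(claim: dict) -> bool:
    """Hypothetical MiDAS-style rule that defers entirely to the employer's
    separation code. Invented for illustration; the real logic is not public."""
    employer_says_quit = claim["employer_separation_code"] == "QUIT"
    claimant_says_laid_off = claim["claimant_separation_code"] == "LAID_OFF"
    # The flawed assumption: any disagreement resolves against the claimant,
    # so a laid-off worker whose employer filed "quit" becomes a fraud case.
    return employer_says_quit and claimant_says_laid_off

claim = {
    "employer_separation_code": "QUIT",      # what the employer reported
    "claimant_separation_code": "LAID_OFF",  # what the worker says happened
}
print(flag_for_fraud(claim))  # True: benefits denied, fraud charge generated
```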
Where did the system originate? A D student in the University of Michigan’s Introduction to Algorithms class? No. The article reports:
The state’s unemployment agency hired three private companies to develop MiDAS, as well as additional software. The new system was intended to replace one that was 30 years old and to consolidate data and functions that were previously spread over several platforms, according to the agency’s 2013 self-nomination for an award with the National Association of State Chief Information Officers. The contract to build the system was for more than $47 million. At the same time as the update, the agency also laid off hundreds of employees who had previously investigated fraud claims.
Cathy O’Neil may want to update her 2016 “Weapons of Math Destruction.” Michigan has produced some casualties. What other little algorithmic surprises are yet to be discovered? Will online learning generate professionals who sidestep these types of mathiness? Sure.
Stephen E Arnold, June 9, 2020
Mathematica Not Available? Give Penrose a Whirl
June 7, 2020
If you want to visualize mathematical procedures, you can use any number of tools. Wolfram Mathematica is a go-to choice for some folks. However, Penrose, a new tool, is now available. The system is described in “CMU’s ‘Penrose’ Turns Complex Math Notations Into Illustrative Diagrams.” The article reports:
The CMU team similarly designed Penrose to codify the best practices of mathematical illustrators in a way that is reusable and widely accessible. Ye says Penrose enables users to create diagrams by simply typing in mathematical expressions that describe relationships, whereupon “the tool automatically takes care of laying everything out.”
More information is available at this link.
Stephen E Arnold, June 7, 2020
Facial Recognition: A Partial List
June 3, 2020
DarkCyber noted “From RealPlayer to Toshiba, Tech Companies Cash in on the Facial Recognition Gold Rush.” The write up provides two interesting things and one idea which is like a truck tire retread.
First, the write up points out that facial recognition or FR is a “gold rush.” That’s a comparison which eluded the DarkCyber research team. There’s no land. No seller of heavy duty pants. No beautiful scenery. No wading in cold water. No hydro mining. Come to think of it, FR is not like a gold rush.
Second, the write up provides a partial list of outfits engaged in facial recognition. The word partial is important. There are some notable omissions, but 45 is an impressive number. That’s the point. Just 45?
The aspect of the write up the DarkCyber team ignored is this “from the MBA classroom” observation:
Despite hundreds of vendors currently selling facial recognition technology across the United States, there is no single government body registering the technology’s rollout, nor is there a public-facing list of such companies working with law enforcement. To document which companies are selling such technology today, the best resource the public has is a governmental agency called the National Institute of Standards and Technology.
Governments are doing a wonderful job, it seems. Perhaps the European Union should step forward? What about Brazil? China? Russia? The United Nations? With Covid threats apparently declining, maybe the World Health Organization? Yep, governments.
Then, after calling for a central listing of FR vendors, this passage snagged the attention of one of my researchers:
“NIST is a government organization responsible for setting scientific measurement standards and testing novel technology. As a public service, NIST also provides a rolling analysis of facial recognition algorithms, which evaluates the accuracy and speed of a vendor’s algorithms. Recently, that analysis has also included aspects of the facial recognition field, like algorithmic bias based on race, age, and sex. NIST has previously found evidence of bias in a majority of algorithms studied.”
Yep, NIST. The group has done an outstanding job for enterprise search. Plus the bias in algorithms has been documented and run through the math grinding wheel for many years. Put in snaps of bad actors and the FR system does indeed learn to match one digital watermark with a similar digital watermark. Run kindergarten snaps through the system and FR matches are essentially useless. Bias? Sure enough.
Consider these ideas:
- An organization, maybe Medium, should build a database of FR companies
- An organization, maybe Medium, should test each of the FR systems using available datasets or better yet building a training set
- An organization, maybe Medium, should set up a separate public policy blog to track government organizations which are not doing the job to Medium’s standards.
There is an interest in facial recognition because there is a need to figure out who is who. There are some civil disturbances underway in a certain high profile country. FR systems may not be perfect, but they may offer a useful tool to some. On the other hand, why not abandon modern tools until they are perfect?
We live in an era of good enough, and that’s what is available.
Stephen E Arnold, June 3, 2020