Poisoning Smart Software: More Than Sparkly Sunglasses
March 22, 2020
DarkCyber noted “FYI: You Can Trick Image-Recog AI into, Say, Mixing Up Cats and Dogs – by Abusing Scaling Code to Poison Training Data.” The article provides some information about a method “to subvert neural network frameworks so they misidentify images without any telltale signs of tampering.”
Kudos to the Register for providing links to the papers referenced in the article: “Adversarial Preprocessing: Understanding and Preventing Image-Scaling Attacks in Machine Learning” and “Backdooring and Poisoning Neural Networks with Image-Scaling Attacks.”
The Register article points out:
Their key insight is that algorithms used by AI frameworks for image scaling – a common preprocessing step to resize images in a dataset so they all have the same dimensions – do not treat every pixel equally. Instead, these algorithms, in the imaging libraries of Caffe’s OpenCV, TensorFlow’s tf.image, and PyTorch’s Pillow, specifically, consider only a third of the pixels to compute scaling.
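To make the mechanism concrete, here is a minimal Python sketch of the general idea rather than the exact method from the cited papers. It uses a toy nearest-neighbour downscaler (and made-up image data) to show how a resizer that samples only a fraction of the pixels lets an attacker overwrite just those pixels, so the full-size picture looks unchanged while the preprocessed version becomes a hidden second image.

```python
# Toy illustration of an image-scaling attack (hypothetical, not the papers' code):
# a nearest-neighbour downscaler keeps one pixel per block, so overwriting only
# those sampled pixels hides a second image that appears after preprocessing.
import numpy as np

BLOCK = 8                    # source is BLOCK times larger than the scaled output
SMALL, LARGE = 64, 64 * 8    # 64x64 hidden target inside a 512x512 source

rng = np.random.default_rng(0)
source = rng.integers(0, 256, (LARGE, LARGE), dtype=np.uint8)  # stand-in for a "cat" photo
target = rng.integers(0, 256, (SMALL, SMALL), dtype=np.uint8)  # stand-in for a "dog" image

def nearest_downscale(img: np.ndarray, factor: int) -> np.ndarray:
    """Toy nearest-neighbour resize: keep the top-left pixel of each block."""
    return img[::factor, ::factor].copy()

# Craft the poisoned image: copy the source, then replace only the pixels the
# scaler will sample (one per 8x8 block, under 2% of all pixels) with the target.
poisoned = source.copy()
poisoned[::BLOCK, ::BLOCK] = target

# To a human viewer the poisoned image is nearly identical to the source...
changed = np.mean(poisoned != source)
print(f"fraction of pixels modified: {changed:.3f}")  # roughly 0.016

# ...but after the framework's resizing step it becomes the hidden target.
scaled = nearest_downscale(poisoned, BLOCK)
print("downscaled image equals hidden target:", np.array_equal(scaled, target))
```

Real frameworks use bilinear or bicubic scaling rather than this toy resizer, but the weakness the researchers describe is the same: whatever pixels the algorithm weights heavily are the only ones an attacker needs to control.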
DarkCyber wants to point out:
- The method can be implemented by bad actors seeking to reduce the precision of certain types of specialized software; for example, compromising Anduril’s system.
- Smart software is vulnerable to flaws in training data procedures. Some companies train once and forget it. Smart software can drift even with well-crafted training data.
- Information which may have national security implications finds its way into what seems to be a dry, academic analysis. If one does not read these papers, one may remain unaware of impending or actual issues.
Net net: Cutting corners on training or failing to retrain systems is a problem. However, failing to apply rigor to the entire training process does more than reduce the precision of outputs. Systems simply fail to deliver what users assume they provide.
Stephen E Arnold, March 22, 2020