Should Social Media Algorithms be Used to Predict Crime?
September 18, 2019
Do we want Thought Police? Because this is how you get Thought Police. Though tragedies like the recent mass shootings in El Paso and Dayton are horrifying, some “solutions” are bound to do more harm than good. President Trump’s recent call for social-media companies to predict who will become a mass shooter so authorities can preemptively move against them is right out of Orwell’s 1984. Digital Trends asks, “Can Social Media Predict Mass Shootings Before They Happen?” Technically, it probably can, but with limited accuracy. Journalist Mathew Katz writes:
“Companies like Google, Facebook, Twitter, and Amazon already use algorithms to predict your interests, your behaviors, and crucially, what you like to buy. Sometimes, an algorithm can get your personality right – like when Spotify somehow manages to put together a playlist full of new music you love. In theory, companies could use the same technology to flag potential shooters. ‘To an algorithm, the scoring of your propensity [to] purchase a particular pair of shoes is not very different from the scoring of your propensity to become a mass murderer—the main difference is the data set being scored,’ wrote technology and marketing consultant Shelly Palmer in a newsletter on Sunday. But preventing mass shootings before they happen raises some thorny legal questions: how do you determine if someone is just angry online rather than someone who could actually carry out a shooting? Can you arrest someone if a computer thinks they’ll eventually become a shooter?”
That is what we must decide as a society. We also need to ask whether algorithms are really up to the task. We learn:
“The Partnership on AI, an organization looking at the future of artificial intelligence, conducted an intensive study on algorithmic tools that try to ‘predict’ crime. Their conclusion? ‘These tools should not be used alone to make decisions to detain or to continue detention.’”
But we all know that once people get an easy-to-use tool, convenience can quickly trump accuracy. Think of how often you see ads online for products you would never buy, Katz prompts. Then consider how it would feel to be arrested for a crime you would never commit.
Cynthia Murrell, September 18, 2019
Handy Visual Reference of Data Model Evaluation Techniques
September 12, 2019
There are many ways to evaluate one’s data models, and Data Science Central presents an extensive yet succinct reference in visual form—“Model Evaluation Techniques in One Picture.” Together, the image and links make for a useful resource. Creator Stephanie Glen writes:
“The sheer number of model evaluation techniques available to assess how good your model is can be completely overwhelming. As well as the oft-used confidence intervals, confusion matrix and cross validation, there are dozens more that you could use for specific situations, including McNemar’s test, Cochran’s Q, Multiple Hypothesis testing and many more. This one picture whittles down that list to a dozen or so of the most popular. You’ll find links to articles explaining the specific tests and procedures below the image.”
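To make two of the techniques in the picture concrete (cross validation and the confusion matrix), here is a minimal sketch in Python using scikit-learn. The dataset and classifier are placeholders chosen for illustration; they are not taken from Glen’s post or her links.

# Minimal sketch: cross validation plus a confusion matrix with scikit-learn.
# The dataset and classifier are illustrative placeholders, not from the cited post.
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Five-fold cross validation gives a distribution of accuracy scores.
scores = cross_val_score(model, X, y, cv=5)
print("cross-validated accuracy:", scores.mean().round(3), "+/-", scores.std().round(3))

# A held-out split plus a confusion matrix shows where the errors land.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
model.fit(X_train, y_train)
print(confusion_matrix(y_test, model.predict(X_test)))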
Glen may be underselling her list of links after the graphic; it would be worth navigating to her post for that alone. The visual, though, elegantly simplifies a complex topic. It is divided into these subtopics: general tests and tools; regression; classification: visual aids; and classification: statistics and tools. Interested readers should check it out; you might just decide to bookmark it for future reference, too.
Cynthia Murrell, September 12, 2019
Disrupting Neural Nets: Adversarial Has a More Friendly Spin Than Weaponized
August 28, 2019
In my lecture about manipulation of algorithms, I review several methods for pumping false signals into a data set in order to skew outputs.
The basic idea is that if an entity generates content pulses which are semantically or otherwise related in a way the smart software “counts”, then the outputs are altered.
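To make the idea concrete, here is a minimal, hypothetical sketch of the effect: flip a slice of training labels (a crude stand-in for pumped-in false signals) and watch a classifier’s accuracy slide. The data, model, and flip rates are synthetic placeholders, not drawn from any of the systems discussed.

# Sketch: label flipping as a crude stand-in for pumping false signals into training data.
# Synthetic data and a simple classifier; the numbers are illustrative only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

for poison_rate in (0.0, 0.1, 0.3):
    y_poisoned = y_train.copy()
    n_flip = int(poison_rate * len(y_poisoned))
    flip_idx = np.random.RandomState(0).choice(len(y_poisoned), n_flip, replace=False)
    y_poisoned[flip_idx] = 1 - y_poisoned[flip_idx]   # inject false signals
    model = LogisticRegression(max_iter=1000).fit(X_train, y_poisoned)
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"flipped {poison_rate:.0%} of training labels -> test accuracy {acc:.3f}")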
A good review of some of these flaws in neural network classifiers appears in “How Reliable Are Neural Networks Classifiers Against Unforeseen Adversarial Attacks.”
DarkCyber noted this statement in the write up:
attackers could target autonomous vehicles by using stickers or paint to create an adversarial stop sign that the vehicle would interpret as a ‘yield’ or other sign. A confused car on a busy day is a potential catastrophe packed in a 2000 pound metal box.
Dramatic, yes. Far-fetched? Not too much.
Providing weaponized data objects to smart software can screw up the works. Examples range from adversarial clothing, discussed in the DarkCyber video program for August 27, 2019, to the wonky predictions that Google makes when displaying personalized ads.
The article reviews an expensive and time-consuming method for minimizing the probability of weaponized data mucking up the outputs.
The problem, of course, is that smart software is supposed to handle the tricky, expensive, and slow process of assembling and refining a training set of data. Talk about smart software is really cheap. Delivering systems which operate in the real world is another kettle of what appear to be fish as determined by a vector’s norm.
The Analytics India article is neither broad nor deep. It does raise awareness of the rather interesting challenges which lurk within smart software.
Understanding how smart software can get off base and drift into LaLa Land begins with identifying the problem.
Smart software cannot learn and discriminate with the type of accuracy many people assume it delivers. Humans assume a system’s output is 99 percent accurate, even for a simple question such as “Is it raining?”
The reality is that adversarial inputs can reduce the accuracy rate significantly.
On good days, smart software can hit 85 to 90 percent accuracy. That’s good enough unless a self-driving car hits you. But with adversarial or weaponized data, that accuracy rate can drop below the 65 percent level which most of the systems DarkCyber has tested can reliably achieve.
To sum up, smart software makes mistakes. Weaponized data fed into smart software increases the likelihood of an error.
The methods can be used in commercial and military theaters.
Neither humans nor software can prevent this from happening on a consistent basis.
So what? Yes, that’s a good question.
Stephen E Arnold, August 29, 2019
Smart Software but No Mention of Cathy O’Neil
August 21, 2019
I read “Flawed Algorithms Are Grading Millions of Students’ Essays.” I also read Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy by Cathy O’Neil, which was published in 2016. My recollection is that Ms. O’Neil made appearances on some podcasts; for instance, Econ Talk, a truly exciting economics-centric program. It is indeed possible that the real news outfit Motherboard/Vice did not catch those. Before I read the article, I did a search of the text for “O’Neil” and “Weapons of Math Destruction.” I found zero hits. Why? The author, editor, and publisher did not include a pointer to her book. Zippo. There’s a reference to the “real news” outfit ProPublica. There’s a reference to the Vice investigation. Would this approach work in freshman composition with essays graded by a harried graduate student?
Here’s the point. Ms. O’Neil did a very good job of explaining the flaws of automated systems. Recycling is the name of the game. After all, DarkCyber is recycling this “original” essay containing “original” research, isn’t it?
I noted this passage in the write up:
Research is scarce on the issue of machine scoring bias, partly due to the secrecy of the companies that create these systems. Test scoring vendors closely guard their algorithms, and states are wary of drawing attention to the fact that algorithms, not humans, are grading students’ work. Only a handful of published studies have examined whether the engines treat students from different language backgrounds equally, but they back up some critics’ fears.
Yeah, but there is a relatively recent book on the subject.
I noted this statement in the write up:
Here’s the first sentence from the essay addressing technology’s impact on humans’ ability to think for themselves…
I like the phrase “ability to think for themselves.” In fact, I would suggest that this write up is an example of the loss of that very ability.
A mere 2,000 words and not a mention, a thought, or a tiny footnote about Ms. O’Neil. Flawed? I leave it to you to decide.
Stephen E Arnold, August 21, 2019
More on Biases in Smart Software
August 7, 2019
Bias in machine learning strikes again. Citing a study performed by Facebook AI Research, The Verge reports, “AI Is Worse at Identifying Household Items from Lower-Income Countries.” Researchers studied the accuracy of five top object-recognition services, Microsoft Azure, Clarifai, Google Cloud Vision, Amazon Rekognition, and IBM Watson, using a dataset of household objects from around the world. Writer James Vincent tells us:
“The researchers found that the object recognition algorithms made around 10 percent more errors when asked to identify items from a household with a $50 monthly income compared to those from a household making more than $3,500. The absolute difference in accuracy was even greater: the algorithms were 15 to 20 percent better at identifying items from the US compared to items from Somalia and Burkina Faso.”
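A back-of-the-envelope way to see this kind of gap is to compute accuracy separately for each group of images. The sketch below is hypothetical; the labels, predictions, and income buckets are invented stand-ins, not the Facebook AI Research data.

# Sketch: measuring an accuracy gap across groups (for example, household income buckets).
# Every value here is an invented stand-in, not data from the study.
from collections import defaultdict

# (true_label, predicted_label, group) for each image
records = [
    ("soap", "soap", "high_income"), ("stove", "stove", "high_income"),
    ("toothbrush", "toothbrush", "high_income"),
    ("soap", "food", "low_income"), ("stove", "stove", "low_income"),
    ("toothbrush", "stick", "low_income"),
]

hits, totals = defaultdict(int), defaultdict(int)
for truth, prediction, group in records:
    totals[group] += 1
    hits[group] += int(truth == prediction)

for group in totals:
    print(group, "accuracy:", round(hits[group] / totals[group], 2))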
Not surprisingly, researchers point to the usual suspect—the similar backgrounds and financial brackets of most engineers who create algorithms and datasets. Vincent continues:
“In the case of object recognition algorithms, the authors of this study say that there are a few likely causes for the errors: first, the training data used to create the systems is geographically constrained, and second, they fail to recognize cultural differences. Training data for vision algorithms, write the authors, is taken largely from Europe and North America and ‘severely under sample[s] visual scenes in a range of geographical regions with large populations, in particular, in Africa, India, China, and South-East Asia.’ Similarly, most image datasets use English nouns as their starting point and collect data accordingly. This might mean entire categories of items are missing or that the same items simply look different in different countries.”
Why does this matter? For one thing, it means object recognition performs better for certain audiences than others in systems as benign as photo storage services, as serious as security cameras, and as crucial as self-driving cars. Not only that, we’re told, the biases found here may be passed into other types of AI that will not receive similar scrutiny down the line. As AI products pick up speed throughout society, developers must pay more attention to the data on which they train their impressionable algorithms.
Cynthia Murrell, August 7, 2019
Trovicor: A Slogan as an Equation
August 2, 2019
We spotted this slogan on the Trovicor Web site:
The Trovicor formula: Actionable Intelligence = f (data generation; fusion; analysis; visualization)
The function consists of four buzzwords used by vendors of policeware and intelware:
- Data generation (which suggests metadata assigned to intercepted, scraped, or provided content objects)
- Fusion (which means in DarkCyber’s world a single index to disparate data)
- Analysis (numerical recipes to identify patterns or other interesting data)
- Virtualization (use of technology to replace old-school methods like 1950s-style physical wire taps, software-defined components, and software-centric widgets).
The buzzwords make it easy to identify other companies providing somewhat similar services.
Trovicor maintains a low profile. But obtaining open source information about the company may be a helpful activity.
Stephen E Arnold, August 2, 2019
Smart Software: About Those Methods?
July 23, 2019
An interesting paper germane to machine learning and smart software is available from Arxiv.org. The title? “Are We Really Making Much Progress? A Worrying Analysis of Recent Neural Recommendation Approaches”.
The punch line for this academic document is, in the view of DarkCyber:
No way.
Your view may be different, but you will have to read the document, check out the diagrams, and scan the supporting information available on Github at this link.
The main idea is:
In this work, we report the results of a systematic analysis of algorithmic proposals for top-n recommendation tasks. Specifically, we considered 18 algorithms that were presented at top-level research conferences in the last years. Only 7 of them could be reproduced with reasonable effort. For these methods, it however turned out that 6 of them can often be outperformed with comparably simple heuristic methods, e.g., based on nearest-neighbor or graph-based techniques. The remaining one clearly outperformed the baselines but did not consistently outperform a well-tuned non-neural linear ranking method. Overall, our work sheds light on a number of potential problems in today’s machine learning scholarship and calls for improved scientific practices in this area.
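The “comparably simple heuristic methods” the authors mention include item-based nearest neighbors. A minimal, hypothetical sketch of such a baseline for top-n recommendation (toy interaction matrix, cosine similarity between items) might look like this; it is not the paper’s code.

# Sketch: an item-based nearest-neighbor baseline for top-n recommendation.
# The interaction matrix is a toy example; this is not the paper's code.
import numpy as np

# rows = users, columns = items, 1 = interacted
interactions = np.array([
    [1, 1, 0, 0, 1],
    [0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0],
    [0, 0, 1, 1, 1],
], dtype=float)

# Cosine similarity between item columns.
norms = np.linalg.norm(interactions, axis=0) + 1e-9
item_sim = (interactions.T @ interactions) / np.outer(norms, norms)
np.fill_diagonal(item_sim, 0.0)

def top_n(user_idx, n=2):
    """Score unseen items by similarity to the items the user already has."""
    seen = interactions[user_idx]
    scores = item_sim @ seen
    scores[seen > 0] = -np.inf   # do not re-recommend items the user has seen
    return np.argsort(scores)[::-1][:n]

print("top-2 items for user 0:", top_n(0))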
So back to my summary, “No way.”
Here’s an “oh, how interesting” chart. Note the spikes:
Several observations:
- In an effort to get something to work, those who think in terms of algorithms take shortcuts; that is, operate in a clever way to produce something that’s good enough. “Good enough” is pretty much a C grade or “passing.”
- Math whiz hand waving and MBA / lawyer ignorance of which human judgments operate within an algorithmic process guarantee that “good enough” becomes “Let’s see if this makes money.” You can substitute “reduce costs” if you wish. No big difference.
- Users accept whatever outputs a smart system delivers. Most people believe that “computers are right.” There’s nothing DarkCyber can do to make people more aware.
- Algorithms can be fiddled in the following ways: [a] let these numerical recipes and the idiosyncrasies of calculation just do their thing; for example, drift off in a weird direction or produce the equivalent of white noise; [b] get skewed because of the data flowing into the system automagically (very risky) or via human subject matter experts (also very risky; a minimal sketch follows below); [c] the programmers implementing the algorithm focus on the code, speed, and deadline, not how the outputs flow; for example, k-means can be really mean and Bayesian methods can bay at the moon.
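As a minimal, hypothetical illustration of point [b] above, here is a sketch showing how a small pulse of injected outliers drags k-means centroids away from the clean clusters. The data is synthetic and the effect is exaggerated for clarity.

# Sketch: a few injected outliers drag the k-means centroids (point [b] above).
# Synthetic data; illustrative only.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.RandomState(0)
clean = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(8, 1, (200, 2))])
outliers = rng.normal(40, 1, (10, 2))   # a small, weaponized pulse of points
skewed = np.vstack([clean, outliers])

for name, data in (("clean", clean), ("skewed", skewed)):
    centers = KMeans(n_clusters=2, n_init=10, random_state=0).fit(data).cluster_centers_
    print(name, np.round(centers, 2))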
Net net: Worth reading this analysis.
Stephen E Arnold, July 23, 2019
Need a Machine Learning Algorithm?
July 17, 2019
The R-Bloggers.com Web site published “101 Machine Learning Algorithms for Data Science with Cheat Sheets.” The write up recycles information from DataScienceDojo, and some of the information looks familiar. But lists of algorithms are not original. They are useful. What sets this list apart is the inclusion of “cheat sheets.”
What’s a cheat sheet?
In this particular collection, a cheat sheet looks like this:
You can see the entry for one algorithm, Bernoulli Naive Bayes, with a definition. The “cheat sheet” is a link to a Python example. In this case, the example is a link to an explanation on the Chris Albon blog.
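For readers who do not follow the link, a minimal Bernoulli Naive Bayes example in Python (scikit-learn, synthetic binary features) looks roughly like the sketch below. It is a stand-in, not the cheat-sheet example from the Chris Albon blog.

# Sketch: Bernoulli Naive Bayes on binary features with scikit-learn.
# Synthetic data; a stand-in, not the cheat-sheet example itself.
import numpy as np
from sklearn.naive_bayes import BernoulliNB

rng = np.random.RandomState(0)
X = rng.randint(2, size=(100, 10))   # 100 samples, 10 binary features
y = rng.randint(2, size=100)         # binary labels

model = BernoulliNB()
model.fit(X, y)
print(model.predict(X[:3]))
print(model.predict_proba(X[:3]).round(3))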
What’s interesting is that the 101 algorithms are grouped under 18 categories. Of these 18, Bayes and derivative methods total five.
No big deal, but in my lectures about widely used algorithms I highlight 10, mostly because it is a nice round number. The point is that most of the analytics vendors use the same basic algorithms. Variations among products built on these algorithms are significant.
As analytics systems become more modular — that is, like Lego blocks — it seems that the trajectory of development will be to select, preconfigure thresholds, and streamline processes in a black box.
Is this good or bad?
It depends on whether one’s black box is a dominant solution or platform.
Will users know that this almost inevitable narrowing has upsides and downsides?
Nope.
Stephen E Arnold, July 17, 2019
Exclusive: DataWalk Explained by Chris Westphal
July 9, 2019
“An Interview with Chris Westphal” provides an in-depth review of a company now disrupting the analytic and investigative software landscape.
DataWalk is a company shaped by a patented method for making sense of different types of data. The technique is novel and makes it possible for analysts to extract high-value insights from large flows of data in near real time with unprecedented ease of use.
In late June 2019, DarkCyber interviewed Chris Westphal, the innovator who co-founded Visual Analytics. That company’s combination of analytics methods and visualizations was acquired by Raytheon in 2013. Now Westphal is applying his talents to a new venture, DataWalk.
Westphal, who monitors advanced analytics, learned about DataWalk and joined the firm in 2017 as the Chief Analytics Officer. The company has grown rapidly and now has client relationships with corporations, governments, and ministries throughout the world. Users of the DataWalk technology include investigators focused on fraud, corruption, and serious crimes.
Unlike most investigative and analytics systems, users can obtain actionable outputs by pointing and clicking. The system captures these clicks on a ribbon. The actions on the ribbon can be modified, replayed, and shared.
In an exclusive interview with Mr. Westphal, DarkCyber learned:
The [DataWalk] system gets “smarter” by encoding the analytical workflows used to query the data; it stores the steps, values, and filters to produce results thereby delivering more consistency and reliability while minimizing the training time for new users. These workflows (aka “easy buttons”) represent domain or mission-specific knowledge acquired directly from the client’s operations and derived from their own data; a perfect trifecta!
One of the differentiating features of DataWalk’s platform is that it squarely addresses the shortage of trained analysts and investigators in many organizations. Westphal pointed out:
…The workflow idea is one of the ingredients in the DataWalk secret sauce. Not only do these workflows capture the domain expertise of the users and offer management insights and metrics into their operations such as utilization, performance, and throughput, they also form the basis for scoring any entity in the system. DataWalk allows users to create risk scores for any combination of workflows, each with a user-defined weight, to produce an overall, aggregated score for every entity. Want to find the most suspicious person? Easy, just select the person with the highest risk-score and review which workflows were activated. Simple. Adaptable. Efficient.
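The interview includes no code, but the weighted aggregation Westphal describes can be pictured with a small, hypothetical sketch: each workflow produces a score, each carries a user-defined weight, and an entity’s risk is the weighted sum. The workflow names, weights, and scores below are invented for illustration and are not DataWalk’s implementation.

# Sketch: aggregating user-weighted workflow scores into one risk score per entity.
# Workflow names, weights, and scores are invented for illustration only.
workflow_weights = {"structuring_pattern": 0.5, "shared_address": 0.2, "watchlist_hit": 0.3}

entities = {
    "entity_a": {"structuring_pattern": 0.9, "shared_address": 0.1, "watchlist_hit": 0.0},
    "entity_b": {"structuring_pattern": 0.2, "shared_address": 0.8, "watchlist_hit": 1.0},
}

def risk_score(workflow_scores):
    """Weighted sum of the workflow scores activated for an entity."""
    return sum(workflow_weights[name] * score for name, score in workflow_scores.items())

for name in sorted(entities, key=lambda e: risk_score(entities[e]), reverse=True):
    print(name, round(risk_score(entities[name]), 2))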
Another problem some investigative and analytic system developers face is user criticism. According to Westphal, DataWalk takes a different approach:
We listen carefully to our end-user community. We actively solicit their feedback and we prioritize their inputs. We try to solve problems versus selling licenses… DataWalk is focused on interfacing to a wide range of data providers and other technology companies. We want to create a seamless user experience that maximizes the utility of the system in the context of our client’s operational environments.
For more information about DataWalk, navigate to www.datawalk.com. For the full text of the interview, click this link. You can view a short video summary of DataWalk in the July 2, 2019, DarkCyber Video available on Vimeo.
Stephen E Arnold, July 9, 2019
Knowledge Graphs: Getting Hot
July 4, 2019
Artificial intelligence, semantics, and machine learning may lose their pride of place in the techno-jargon, whiz-bang marketing world. I read “A Common Sense View of Knowledge Graphs,” and noted this graph:
This is a good, old-fashioned, Gene Garfield (remember him, gentle reader) citation analysis. The idea is that one can “see” how frequently an author or, in this case, a concept has been cited in the “literature.” Now publishers are dropping like flies and are publishing bunk. Nevertheless, one can see that use of the phrase knowledge graph is becoming more common within the sample of “literature” parsed for this graph. (No, I don’t recommend trying to perform citation analysis in Bing, Facebook, or Google. The reasons will just depress me and you, gentle reader.)
The section of the write up I found useful and worthy of my “research” file is the collection of references to documents defining “knowledge graph.” This is useful, helpful research.
The write up also includes a diagram which may be one of the first representations of a graph-centric triple. I thought this was something cooked up by Drs. Bray, Guha, and others in the tsunami of semantic excitement.
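For readers new to the notation, a triple is just a subject-predicate-object statement. A minimal, hypothetical sketch of storing and querying a few of them in Python (the facts are invented examples, not from the cited article):

# Sketch: subject-predicate-object triples and a trivial pattern query.
# The facts are invented examples, not from the cited article.
triples = [
    ("knowledge_graph", "is_a", "data_model"),
    ("knowledge_graph", "contains", "entities"),
    ("knowledge_graph", "contains", "relationships"),
]

def match(subject=None, predicate=None, obj=None):
    """Return triples matching the given pattern; None acts as a wildcard."""
    return [t for t in triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)]

print(match(predicate="contains"))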
One final point: The list of endnotes is also useful. In short, a good write up. The downside is that if the article gets wider distribution, a feeding frenzy among money-desperate consultants, advisers, and analysts will be ignited like a Fourth of July fountain of flame.
Stephen E Arnold, July 4, 2019