Amazon Data: Yes, There Is a Good Reason
August 28, 2020
About three years ago, I gave my first lecture about Amazon’s streaming data marketplace. The audience was about 150 law enforcement and intelligence professionals. My goal was to describe some technical capabilities Amazon had set up since 2006. I stumbled upon the information while reading through open source AWS public sector material; for example, patent documents and Amazon’s blog posts.
I was greeted with “We buy quite a bit from Amazon, but policeware?” I have included a description of the streaming data marketplace in my talks and posted some information in this blog. I was interviewed by reporters from Le Monde, the New York Times, and a couple of other “real news” outfits. Like the law enforcement and intelligence professionals, none of them cared.
One company developing specialized software expressed surprise when I recommended taking a look at what capabilities resided in the Amazon Web Services’ construct. The reaction was, “Everyone uses Microsoft Azure.” Most recently I gave three lectures at the 2020 National Cyber Crime Conference. One of them was about Amazon. I had about 250 people at my talks on investigative software and alternatives to the Dark Web. I still don’t know who listened to my Amazon lecture. I assume not too many people.
I read “Kindle Collects a Surprisingly Large Amount of Data.” The write up makes a single point. Read a book or some other text on an Amazon Kindle and data flows to Amazon. There’s no awareness of the online book store’s streaming data marketplace or any of the related technology, features, and functions. Well, there is one article. That’s a start.
I scanned the comments and noted one which struck me as interesting:
There’s definitely no good reason why it should be sent to Amazon at all.
A good reason exists. Amazon is poised to provide a number of useful services to government agencies. Let me spark thinking with a question:
What’s the value of a service which can generate a “value” score or “reliability” score or a “credibility” score for an individual?
Answer it and one is well on the way to grasping the Amazon policeware and intelware construct in my opinion. You can learn more by writing benkent2020 at yahoo dot com and inquiring about our for-fee Amazon reports.
Stephen E Arnold, August 28, 2020
Facial Recognition: Recognizing Elsie the Cow
August 28, 2020
Facial recognition remains a contentious subject. In one of my 2020 National Cyber Crime Conference presentations, I showed a video snip. In Australia, facial recognition systems have been adapted to spot sharks. When the system “recognizes” a shark, an alert is relayed to individuals who patrol a beach. The idea is that shark threats can be minimized. That’s animal recognition.
“Orwell’s Nightmare? Facial Recognition for Animals Promises a Farmyard Revolution” is a different type of story. I presented the shark system as an example of an intelligent application of pattern recognition. The write up evokes images of George Orwell and presents a different picture of these “recognition” technologies.
The write up states:
China has led the world in developing facial recognition capabilities. There are almost 630 million facial recognition cameras in use in the country, for security purposes as well as for everyday conveniences like entering train stations and paying for goods in stores. But authorities also use the technology for sinister means, such as monitoring political dissidents and ethnic minorities.
The write up points out:
One Chinese AI company, Megvii, which has been blacklisted by the Department of Commerce for alleged involvement in the Chinese government’s repression of Uighurs in Xinjiang, is applying its technology to a program to recognize dogs by their nose prints. Other tech companies around the world have had a go at identifying chimpanzees, dolphins, horses and lions, with varying degrees of success.
The article reluctantly turns its attention to the animal recognition “hook” for the reporter’s political commentary:
Farmers load information such as health conditions, insemination dates and pregnancy test results into the system, which syncs up with cameras installed above troughs and milking stations. If everything works, farmers can amass valuable data without lifting a finger.
So what? It seems that the reporter (possibly working for the Washington Post, a Jeff Bezos property) was unaware that the Australian shark recognition example was built on Amazon technology. Yep, Mr. Bezos has a stake in Amazon as well.
Interesting stuff. Perhaps the ace reporter could have explored the use of pattern recognition applied to animals? That’s work, of course.
Stephen E Arnold, August 28, 2020
Autonomy: One Chapter Closes but the Saga Continues
August 27, 2020
Just a quick pointer to a story from Reuters, the trusted source (that’s what the Thomson Reuters outfit says, believe me): “Ex-Autonomy CFO’s Conviction for Hewlett-Packard Fraud Is Upheld by U.S. Appeals Court,” about an Autonomy executive. The news report states that Autonomy’s CFO is in deeper legal hot water. Sushovan Hussain was convicted in April 2018 on a number of charges, including wire and securities fraud. DarkCyber still marvels that Hewlett Packard, its Board of Directors, auditors, and third party advisors applied “warp speed,” to use a popular phrase, to buy the search and content processing company for $11.1 billion. One fact is unchallengeable: This legal process is moving along at turtle speed. Is the HP Autonomy saga well suited for a Quibi video?
Stephen E Arnold, August 27, 2020
Free As a Dark Pattern
August 27, 2020
A number of online services offer free products. DarkCyber has spotted a semi-clever play used by a developer of “free” video editing software. Three-dimensional models were not on our radar. The “free” software constructs are now identified and monitored by our steam-powered intelligence system. (We operate from rural Kentucky. What did you expect? Reinforcement learning?)
“3D Printering: The World of Non-Free 3D Models Is Buyer Beware” contains some information. Let’s take a quick look at a couple of revelations which caught the DarkCyber team’s attention:
First, a company has developed what appears to be a fresh approach to direct sales. The write up explains:
A standout success is a site like Hero Forge, which allows users to create custom tabletop gaming miniatures with a web-based interface. Users can pay to download the STL of their creation, or pay for a printed version. Hero Forge is a proprietary system, but a highly successful one judging by their recent Kickstarter campaign.
Second, you can acquire 3D models via “begging for dollars.” The article explains that these are requests for money paid via Patreon. I assume PayPal may work too.
Third is a kit. The customer gets a 3D model when buying some physical good. The write up points out electrical parts, fasteners, or a “kit,” which DarkCyber assumes is a plastic bag with stuff in it.
The problem?
According to the write up, the problems are:
- Vendors don’t offer “test drives, fitting rooms, or refunds”
- Models have lousy design for manufacture. (DarkCyber assumes this means whatever emerges from the 3D printer is not going to carry water. Nice 3D printed thermos you have there, Wally.)
These two problems boil down to “quality.”
After reading the article, DarkCyber thinks that one could interpret the word “quality” as a synonym for “fraud.”
Dark patterns are becoming increasingly common. Let’s blame it on an error, an oversight, or, best of all, the pandemic.
Stephen E Arnold, August 27, 2020
Elastic: Making Improvements
August 27, 2020
Elasticsearch is one of the most popular open-source enterprise search platforms. While Elasticsearch is free for developers to download, Elastic offers subscriptions for customer support and enhanced software. Now the company offers some new capabilities and features, HostReview reveals in “Elastic Announces a Single, Unified Agent and New Integrations to Bring Speed, Scale, and Simplicity to Users Everywhere.” The press release tells us:
“With this launch, portions of Elastic Workplace Search, part of the Elastic Enterprise Search solution, have been made available as part of the free Basic distribution tier, enabling organizations to build an intuitive internal search experience without impacting their bottom line. Customers can access additional enterprise features, such as single sign-on capabilities and enhanced support, through a paid subscription tier, or can deploy as a managed service on Elastic Cloud. This launch also marks the first major beta milestone for Elastic in delivering comprehensive endpoint security fully integrated into the Elastic Stack, under a unified agent. This includes malware prevention that is provided under the free distribution tier. Elastic users gain third-party validated malware prevention on-premises or in the cloud, on Windows and macOS systems, centrally managed and enabled with one click.”
The upgrades are available across the company’s enterprise search, observability, and security solutions as well as Elastic Stack and Elastic Cloud. (We noted Elastic’s welcome new emphasis on security last year.) See the write-up for the specific updates and features in each area. Elasticsearch underpins operations in thousands of organizations around the world, including the likes of Microsoft, the Mayo Clinic, NASA, and Wikipedia. Founded in 2012, Elastic is based in Silicon Valley. They also happen to be hiring for many locations as of this writing, with quite a few remote (“distributed”) positions available.
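For readers who have not poked at it, here is a minimal sketch of the index-and-query workflow Elasticsearch supports, using the 7.x-era official Python client against a hypothetical local node. The index name, fields, and sample document are illustrative, not Elastic’s own example:

    from elasticsearch import Elasticsearch

    # Connect to a locally running Elasticsearch node (illustrative URL).
    es = Elasticsearch("http://localhost:9200")

    # Index a sample document into a hypothetical "articles" index.
    es.index(index="articles", id=1, body={
        "title": "Elastic announces a single, unified agent",
        "body": "New integrations bring speed, scale, and simplicity to users.",
    })

    # Refresh so the document is searchable, then run a full-text query.
    es.indices.refresh(index="articles")
    results = es.search(index="articles", body={"query": {"match": {"body": "speed and scale"}}})

    for hit in results["hits"]["hits"]:
        print(hit["_score"], hit["_source"]["title"])

The paid tiers and Elastic Cloud layer management, support, and security features on top, but this basic index-and-search loop is what ships in the free download.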
Cynthia Murrell, August 27, 2020
The Possibilities of GPT-3 from OpenAI Are Being Explored
August 27, 2020
Unsurprisingly, hackers have taken notice of the possibilities presented by OpenAI’s text-generating software. WibestBroker News reports, “Fake Blog Posts Land at the Top of Hacker News.” The post was generated by college student Liam Porr, who found it easy to generate content with OpenAI’s latest iteration, GPT-3, that could fool readers into thinking it had been crafted by a person. Writer John Marley describes the software:
“GPT-3, like all deep learning systems, looks for patterns in data. To simplify, the program has been trained on a huge corpus of text mined for statistical regularities. These regularities are unknown to humans. Between the different nodes in GPT-3’s neural network, they are stored as billions of weighted connections. There’s no human input involved in this process. Without any guidance, the program looks and finds patterns.”
Rather than being unleashed upon the public at large, the software has been released to select researchers in a private beta. Marley continues:
“Porr is a computer science student at the University of California, Berkeley. He was able to find a PhD student who already had access to the API. The student agreed to work with him on the experiment. Porr wrote a script that gave GPT-3 a headline and intro for the blog post. It generated some versions of the post, and Porr chose one for the blog. He copy-pasted from GPT-3’s version with very little editing. The post went viral in a matter of a few hours and had more than 26,000 visitors. Porr wrote that only one person reached out to ask if the post was AI-generated. Albeit, several commenters did guess GPT-3 was the author. But, the community down voted those comments, Porr says.”
Little did the down-voters know. Porr reports he applied for his own access to the tool, but it has yet to be granted. Perhaps OpenAI is not too pleased with his post, he suggests. We wonder whether this blogger received any backlash from the software’s creators.
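The script Porr describes need not be elaborate. Here is a minimal sketch of that kind of workflow, assuming access to the GPT-3 completion API as it existed in the 2020 private beta; the headline, intro, and parameters below are made up for illustration and are not Porr’s actual inputs:

    import openai

    # Requires a key from OpenAI's private beta (placeholder value).
    openai.api_key = "YOUR_API_KEY"

    headline = "Why doing less can help you get more done"   # illustrative headline
    intro = "I used to think productivity meant filling every hour. I was wrong."

    # Ask the model to continue the post from the headline and intro.
    response = openai.Completion.create(
        engine="davinci",        # engine name from the 2020 beta
        prompt=f"{headline}\n\n{intro}\n",
        max_tokens=500,
        temperature=0.7,
    )

    print(response["choices"][0]["text"])

Generate a handful of completions, pick the most plausible one, give it a light touch-up, and the “blog post” is ready, which appears to be roughly what Porr did.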
Cynthia Murrell, August 27, 2020
Be Smart: Live in Greenness
August 27, 2020
I do not want to be skeptical. I do not want to suggest that a study may need what might be called verification. Please, read “Residential Green Space and Child Intelligence and Behavior across Urban, Suburban, and Rural Areas in Belgium: A Longitudinal Birth Cohort Study of Twins.” To add zip to your learning, navigate to a “real” news outfit’s article called “Children Raised in Greener Areas Have Higher IQ, Study Finds.” Notice any significant differences?
First, the spin in the headline. The PLOS article points out that the sample comes from Belgium. How representative is this country when compared to Peru or Syria? How reliable are “intelligence” assessments? What constitutes bad behavior? Are these “qualities” subject to statistically significant variations due to exogenous factors?
I don’t want to do a line by line comparison of the write up which wants to ring the academic gong. Nor do I want to read how “real” journalists deal with a scholarly article.
I would point out this sentence in the scholarly article:
To our knowledge, this is the first study investigating the association between residential green space and intelligence in children.
Yeah, let’s not get too excited about a sample of 620 in Belgium. Skip school. Play in a park or wander through thick forests.
Stephen E Arnold, August 27, 2020
Silicon Valley Style Journalists Want to Be Digital Peter Druckers
August 27, 2020
I wrote about the New York Times’ journalists who wrote about one another. I suggested that these “real” news professionals were taking a short cut. Boy, I mischaracterized what was happening. The problem is escalating because Silicon Valley-style journalists want to be like Peter Drucker. The management guru had a Ph.D. and a knack for nailing trends. Perhaps his most timely contribution is the phrase “knowledge worker.” Today’s “real” journalists want to skip the Drucker trajectory of management by objectives and thought leadership over decades. The Silicon Valley journalists want to get right to it: jumping from stories about technology to grand statements about the way life should be. Not just tech life, but life in general.
The most recent example, in my opinion, is “Why People Can’t Stand Tech Journalists: An Interview with Casey Newton.” Mr. Newton lit up my radar when he suggested that Jeff Bezos give a weekly speech to explain Amazon. After viewing the Congressional hearings, I am not confident Mr. Bezos knows what is going on at Amazon, but I know that a “real” journalist telling the world’s richest man what to do is the journalistic equivalent of Google’s sense of entitlement.
Once again we have two “real” journalists talking to one another. And what do we learn?
The spine of the “real” journalists’ baseline assumption is, “People don’t like us.” Once again a big generalization created to make “real” journalists into underdogs. I noted this statement:
The tech press, I think, has done probably the best work of its life collectively over the past four years.
Let’s do a quick show of hands. Who thinks that this statement from the “real” journalist pundit person will resonate like Dr. Drucker’s knowledge worker or management by objectives phrases? Here’s the Sillycon Valley statement:
I’ve been thinking about what we’ve lost because everything is so noisy and spicy.
Memorable. Noisy and spicy. Like a budget quasi-ethnic restaurant and its comidas?
Several observations:
- The merging of entitlement, social justice, and inside baseball produces information that in some ways is as destructive as the output of nation states seeking to influence behavior in the US
- The laziness of talking to someone in the next cubicle or while standing on line at Philz Coffee is evidence of a serious problem. Thought leadership does not flow from unsubstantiated opinions.
- Journalism, whether at the New York Times, the Murdoch outfits, or zippy Silicon Valley-type “real” news producers, is becoming indistinguishable from the yammerings of mid-tier consulting firms and students who couldn’t get an “A” from Dr. Drucker.
That’s a less-than-positive situation in my view. Has Mr. Bezos attended to the demand that he give a weekly speech? The answer is, “No.”
Stephen E Arnold, August 27, 2020
IDC Has a New Horse to Flog: Artificial Intelligence
August 26, 2020
Okay, it is official. IDC has a new horse to flog. “Artificial Intelligence” will be carrying a load. Navigate to “Worldwide Spending on AI Expected to Double in 4 Years, Says IDC.” Consulting firms and specialized research outfits need to have a “big thing” about which to opine. IDC has discovered one: AI. The write up states:
Global spending on artificial intelligence (AI) is forecast to double over the next four years, growing from US$50.1 billion in 2020 to more than US$110 billion in 2024, according to the IDC. Spending on AI systems will accelerate over the next several years as organizations deploy AI as part of their digital transformation efforts, said IDC. The CAGR for the 2019-2024 period will be 20.1%.
In the Age of Rona, we have some solid estimates. A 2X jump in 48 months.
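As a quick sanity check on that arithmetic, here is a back-of-the-envelope sketch; note that IDC’s 20.1% CAGR covers the 2019-2024 period, while the dollar figures span 2020 to 2024:

    # Back-of-the-envelope check of the implied growth rate using the figures in the release.
    start, end, years = 50.1, 110.0, 4            # US$ billions, 2020 -> 2024

    cagr = (end / start) ** (1 / years) - 1
    print(f"Implied 2020-2024 CAGR: {cagr:.1%}")  # roughly 22%, in the ballpark of IDC's 20.1%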
Why, pray tell, is AI now moving into the big leagues of hyper growth? Check out this explanation:
Two of the leading drivers for AI adoption are delivering a better customer experience and helping employees to get better at their jobs.
Quite interesting. My DarkCyber research team believes that AI growth will be encouraged by these factors:
- Government investments in smart weapons and aggressive pushes for projects like “loyal wingman”
- A sense that staff must be terminated and replaced with systems which do not require health care, retirement plans, vacations, and special support for issues like addiction
- Packaged “smart” solutions like Amazon’s off the shelf products and services for machine learning.
These are probably trivial in the opinion of the IDC estimators, but DarkCyber is not convinced that baloney like customer experience and helping employees “get better at their jobs” is providing much oomph.
Stephen E Arnold, August 26, 2020
Bias in Biometrics
August 26, 2020
How can we solve bias in facial recognition and other AI-powered biometric systems? We humans could try to correct for it, but guess where AI learns its biases—yep, from us. Researcher Samira Samadi explored whether using a human evaluator would make an AI less biased or, perhaps, even more so. We learn of her project and others in BiometricUpdate.com’s article, “Masks Mistaken for Duct Tape, Researchers Experiment to Reduce Human Bias in Biometrics.” Reporter Luana Pascu writes:
“Curious to understand if a human evaluator would make the process fair or more biased, Samadi recruited users for a human-user study. She taught them about facial recognition systems and how to make decisions about system accuracy. ‘We really tried to imitate a real-world scenario, but that actually made it more complicated for the users,’ Samadi said. The experiment confirmed the difficulty in finding an appropriate dataset with ethically sourced images that would not introduce bias into the study. The research was published in a paper called A Human in the Loop is Not Enough: The Need for Human-Subject Experiments in Facial Recognition.”
Many other researchers are studying the bias problem. One NIST report found a lot of software that produced a 10-fold to 100-fold increase in the probability of Asian and African American faces being inaccurately recognized (though a few systems had negligible differences). Meanwhile, a team at Wunderman Thompson Data found tools from big players Google, IBM, and Microsoft to be less accurate than they had expected. For one thing, the systems had trouble accounting for masks—still a persistent reality as of this writing. The researchers also found gender bias in all three systems, even though the technologies used are markedly different.
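To make the “n-fold” framing concrete, here is a small illustrative sketch of how one might compare false match rates across demographic groups from labeled matcher outcomes. The data and group labels are invented; NIST’s actual methodology is far more involved:

    from collections import defaultdict

    # Hypothetical (group, predicted_match, actual_match) outcomes from a face matcher.
    results = [
        ("group_a", True, False), ("group_a", False, False), ("group_a", True, True),
        ("group_b", True, False), ("group_b", True, False), ("group_b", False, False),
    ]

    false_matches = defaultdict(int)
    non_matching_pairs = defaultdict(int)

    for group, predicted, actual in results:
        if not actual:                      # only non-matching pairs can yield false matches
            non_matching_pairs[group] += 1
            if predicted:
                false_matches[group] += 1

    for group in sorted(non_matching_pairs):
        rate = false_matches[group] / non_matching_pairs[group]
        print(f"{group}: false match rate {rate:.0%}")

A 10-fold or 100-fold gap between those per-group rates is the kind of disparity the NIST report describes.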
There is reason to hope. Researchers at Durham University’s Computer Science Department managed to reduce racial bias by one percent and improve ethnicity accuracy. To achieve these results, the team used a synthesized data set with a higher focus on feature identification. We also learn:
“New software to cut down on demographic differences in face biometric performance has also reached the market. The ethnicity-neutral facial recognition API developed by AIH Technology is officially available in the Microsoft Azure Marketplace. In March, the Canadian company joined the Microsoft Partners Network (MPN) and announced the plans for the global launch of its Facial-Recognition-as-a-Service (FRaaS).”
Bias in biometrics, and AI in general, is a thorny problem with no easy solution. At least now people are aware of the issue and bright minds are working to solve it. Now, if only companies would be willing to delay profitable but problematic implementations until solutions are found. Hmmm.
Cynthia Murrell, August 26, 2020