Amazon: The Surveillance Mesh Play

September 30, 2019

DarkCyber received a complaint about the small size of the image from my webinar about Amazon Policeware. There are two remedies for tiny images. You can attend my policeware lecture at the TechnoSecurity & Digital Forensics Conference in San Antonio on Wednesday, October 2. Qualified attendees can request a PDF of the image. Second, you can contact DarkCyber at benkent2020 at yahoo dot com and sign up for our LE, security, and intel personnel webinar.

Today, I want to provide several findings from our research related to Amazon Policeware. These are:

  • Amazon’s mesh network in the Sidewalk product provides a solution to blanketing a city with a data collection component. This wide field outdoor mesh network may fail. In the meantime, you may be able to locate your dog if it is wearing a Fetch.
  • Amazon’s Ring doorbell provides an anchor for fixed video feeds. The resolution is poor and the system is far from comprehensive, but the test mechanism is sufficiently compelling for several hundred police departments to show interest.
  • The supplementary data collection devices shown in the figure below feed into the AWS policeware platform. That platform performs a number of analytic functions. Cross correlation is one of these.

image

© Stephen E Arnold, 2019

So what?

In the US, Amazon is moving forward to put in place a next generation service which provides a new tool to enforcement authorities. The system delivers other benefits to Amazon as well.

DarkCyber identifies some parallels between the efforts the government of China is making and Amazon’s activities.

Will the Epstein friendly academic institution get this story straight? Probably not.

Stephen E Arnold, September 30, 2019

Amazon Policeware: The Path to IBM-Style Lock In on Steroids

September 27, 2019

Quite a bit of Amazon news has flowed through the DarkCyber system. The problem is that most of the information is oblivious to Amazon’s policeware initiative. DarkCyber’s research suggests that Amazon is building a surveillance system. One DarkCyber team member said, “Amazon is building what China has been working on for several years.” Is this DarkCyber researcher correct? Who knows?

I do want to provide a diagram from our Amazon webinar which puts Amazon’s activities into a context for enforcement. The scope of Amazon’s business strategy extends beyond local law enforcement and the Ring video doorbell activities, beyond the cloud services for several US government agencies, and beyond the company’s online businesses.

Amazon may be positioning itself to provide:

  • IRS-related services associated with tax investigations
  • Drug enforcement actions related to physicians who allegedly overprescribe or entities which obtain certain compounds using obfuscation methods
  • SEC-related services to determine entity interaction, expenditures, and related financial activities
  • Credit verification, including other financial analyses, for government and retail financial activities.

Other “extensions” are possible. What’s interesting is that few have noticed and even fewer pay much attention beyond hand waving about Alexa. There’s more than Alexa, which is a low level gateway service.

Here’s the diagram, which is copyrighted by Stephen E Arnold, operator of DarkCyber, and author of the forthcoming monograph, Dark Edge: Amazon’s Policeware Initiative.

image

© Stephen E Arnold, 2019.

How do you use this diagram? Just map Amazon’s most recent product announcements into the grid.

The DarkCyber Amazon policeware webinar walks through the tactics and the strategy for this “in plain sight” play. Analysts, journalists, policeware vendors paying Amazon to host their systems, and Microsoft-type outfits are oblivious to what is now the end game for a 12 year push by Amazon to make IBM-style lock in seem as quaint as a Model T Ford.

For those who recycle my information and claim it as your own creative output, why not be somewhat ethical and provide attribution. You know. Old-fashioned stuff like a footnote. Yep, that includes a real journalist who writes for the New York Times and the Epstein linked MIT publication, among others.

Stephen E Arnold, September 27, 2019

Search: Useless Results Finally Recognized?

August 22, 2019

I cannot remember how many years ago it was since I wrote “Search Sucks” for Barbara Quint, the late editor of Searcher. I recall her comment to me, “Finally, someone in the industry speaks out.”

Flash forward a decade. I can now repeat her comment to me with some minor updating: “Finally someone recognized by the capitalist tool, Forbes Magazine, recognizes that search sucks.”

The death of search was precipitated by several factors. Mentioning these after a decade of ignoring Web search still makes me angry. The failure of assorted commercial search vendors, the glacial movement of key trade associations, and the ineffectuality of search “experts” still make me angry.

image

There are other factors contributing to the sorry state of Web search today. Note: I am narrowing my focus to the “free” Web search systems. If I have the energy, I may focus on the remarkable performance of “enterprise search.” But not today.

Here are the reasons Web search fell to laughable levels of utility:

  1. Google adopted the GoTo / Overture / Yahoo approach to determining relevance. This is the pay-to-play model.
  2. Search engine optimization “experts” figured out that Google allowed some fiddling with how it determined “relevance.” Google and other ad supported search systems then suggested that those listings might decay. The fix? Buy ads.
  3. Users who were born with mobile phones and flexible fingers styled themselves “search experts” along with any other individual who obtains information by looking for “answers” in a “free” Web search system.
  4. The willful abandonment of editorial policies, yardsticks like precision and recall, and human indexing guaranteed that smart software would put the nails in the coffin of relevance. Note: artificial intelligence and super duper automated indexing systems are right about 80 percent of the time when hammering scientific, technical, and engineering information. Toss in blog posts, tweets, and Web content created by people who skipped high school English and the accuracy plummets. Way down, folks. Just like facial recognition systems.

The information presented in “As Search Engines Increasingly Turn To AI They Are Harming Search” is astounding. Not because it is new, but because it is a reflection of what I call the Web search mentality.

Here’s an example:

Yet over the past few years, search engines of all kinds have increasingly turned to deep learning-powered categorization and recommendation algorithms to augment and slowly replace the traditional keyword search. Behavioral and interest-based personalization has further eroded the impact of keyword searches, meaning that if ten people all search for the same thing, they may all get different results. As search engines depreciate traditional raw “search” in favor of AI-assisted navigation, the concept of informational access is being harmed and our digital world is being redefined by the limitations of today’s AI.

The problem is not artificial intelligence.

Read more

New Jargon: Consultants, Start Your Engines

July 13, 2019

I read “What Is ‘Cognitive Linguistics’?” The article appeared in Psychology Today. Disclaimer: I did some work for this outfit a long time ago. Anybody remember Charles Tillinghast, “CRM” when it referred to people, not a baloney discipline for a Rolodex filled with sales leads, and the use of Psychology Today as a text in a couple of universities? Yeah, I thought not. The Ziff connection is probably lost in the smudges of thumb typing too.

Onward: The write up explains a new spin on psychology, linguistics, and digital interaction. The jargon for this discipline or practice, if you will, is:

Cognitive Linguistics

I must assume that the editorial processes at today’s Psychology Today are genetically linked to the procedures in use in — what was it, 1972? — but who knows.

image

Here’s the definition:

The cognitive linguistics enterprise is characterized by two key commitments. These are:
i) the Generalization Commitment: a commitment to the characterization of general principles that are responsible for all aspects of human language, and
ii) the Cognitive Commitment: a commitment to providing a characterization of general principles for language that accords with what is known about the mind and brain from other disciplines. As these commitments are what imbue cognitive linguistics with its distinctive character, and differentiate it from formal linguistics.

If you are into psychology and figuring out how to manipulate people or a Google ranking, perhaps this is the intellectual gold worth more than stolen treasure from Montezuma.

Several observations:

  1. I eagerly await an estimate from IDC for the size of the cognitive linguistics market, and I am panting with anticipation for a Gartner magic quadrant which positions companies as leaders, followers, outfits which did not pay for coverage, and names found with a Google search at the Starbucks south of the old PanAm Building. Cognitive linguistics will have to wait until the two giants of expertise figure out how to define “personal computer market”, however.
  2. A series of posts from Dave Amerland and assorted wizards at SEO blogs which explain how to use the magic of cognitive linguistics to make a blog page — regardless of content, value, and coherence — number one for a Google query.
  3. A how to book from Wiley publishing called “Cognitive Linguistics for Dummies” with online reference material which may or may not actually be available via the link in the printed book.
  4. A series of conferences run by assorted “instant conference” organizers with titles like “The Cognitive Linguistics Summit” or “Cognitive Linguistics: Global Impact”.

So many opportunities. Be still, my heart.

Cognitive linguistics: its time has come. Not a minute too soon for a couple of floundering enterprise search vendors to snag the buzzword and pivot to implementing cognitive linguistics for solving “all your information needs.” Which search company will embrace this technology: Coveo, IBM Watson, Sinequa?

DarkCyber is excited.

Stephen E Arnold, July 13, 2019

Amazon and YouTube: The Hong Kong Protests Mark the Day that Twitch.tv Made Clear the Limitations of YouTube

June 16, 2019

I heard there was a small protest underway in Hong Kong. The time is now 6:30 am US Eastern time. I navigated to YouTube, entered the query “Hong Kong protest”, and I saw links to videos from a day ago (today is June 16, 2019). I navigated to the YouTube “Live” page which provides a limited selection of streaming videos on YouTube. If you have not seen that somewhat incomplete index, navigate to https://www.youtube.com/live. No live stream of the Hong Kong protest.

If it’s not on YouTube, then it doesn’t exist, goes some old-time catchphrase.

Well, not quite.

Navigate to Amazon’s Twitch.tv. Run a query for Hong Kong. Here’s what I saw before I clicked on the live stream of Unable to Breath.

image

Amazon Twitch.tv search result. The Unable to Breath stream is not one but an aggregate of eight separate feeds from Hong Kong.

Front and center was a link to Unable to Breath, which presents this streaming image:

image

This is a screen shot of a single screen which is eight different feeds showing different views of the handful of people who are participating in the event. Note: Handful means more than one million.

Notice that there are eight live views of this modest protest. This is one live stream with eight separate views of the modest demonstration in Hong Kong. Eight in one stream! No registration required. No in stream pop up ads. Just high value intelligence in pretty good streaming video quality.

Read more

Google: Can Semantic Relaxing Display More Ads?

June 10, 2019

For some reason, vendors of search systems have shuddered if a user’s query returns a null set. The idea is that a user sends a query to a system or, more correctly, an index. The terms in the query do not match entries in the database. The system displays a message which says, “No results match your query.”

For some individuals, that null set response is high value information. One can bump into null sets when running queries on a Web site; for example, send the anti fungicide query to the Arnold Information Technology blog at this link. Here’s the result:

image

From this response, one knows that there is no content containing the search phrase. That’s valuable for some people.

To address this problem, modern systems “relax” the query. The idea is that the user did not want what he or she typed in the search box. The search system then changes the query and displays those results to the stupid user. Other systems take action and display results which the system determines are related to the query. You can see these relaxed results when you enter the query shadowdragon into Google. Here are the results:

image

Google ignored my spelling and displayed information about a video game, not the little known company Shadowdragon. At least Google told me what it did and offered a way to rerun the query using the word I actually entered. But the point is that the search was “relaxed.”
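The difference between a null set and a “relaxed” result can be sketched in a few lines of Python. This is a toy illustration, not how Google or any other vendor implements relaxation. The index contents, the function names, and the 0.6 similarity cutoff are all assumptions made up for the example; the fuzzy matching comes from the standard library’s difflib.

```python
import difflib

# Hypothetical toy index: term -> ids of documents containing it.
INDEX = {
    "shadow": {1, 2},
    "dragon": {2, 3},
    "fungicide": {4},
}


def strict_search(query):
    """Return ids of documents containing every query term.

    An empty set is the null set: the honest "no results" answer.
    """
    terms = query.lower().split()
    if not terms:
        return set()
    postings = [INDEX.get(term, set()) for term in terms]
    return set.intersection(*postings)


def relaxed_search(query, cutoff=0.6):
    """Replace each unknown term with its closest indexed term, then search.

    Returns (results, rewritten_query) so the caller can tell the user
    what the system changed. Terms with no near match are dropped.
    """
    rewritten = []
    for term in query.lower().split():
        if term in INDEX:
            rewritten.append(term)
            continue
        close = difflib.get_close_matches(term, list(INDEX), n=1, cutoff=cutoff)
        if close:
            rewritten.append(close[0])
    new_query = " ".join(rewritten)
    return strict_search(new_query), new_query


if __name__ == "__main__":
    print(strict_search("fungicde"))   # the strict system reports a null set
    print(relaxed_search("fungicde"))  # the relaxed system rewrites the query
```

Returning the rewritten query alongside the results is the design choice that matters here: it lets the interface say what it did and offer the original query back, which is the one courtesy the Google result above extended.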

The purpose of semantic expansion is a variation of Endeca’s facets. The idea is that a key word belongs to a category. If a system can identify a category, then the user can get more results by selecting the category and maybe finding something useful. Endeca’s wine demonstration makes this function and its value clear.
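A minimal sketch of the facet idea, assuming a handful of invented wine records. The records, the field names, and the two helper functions are hypothetical; Endeca’s actual implementation is not described in this post and is not what is shown here.

```python
from collections import Counter

# Hypothetical records; names and field values are invented for illustration.
WINES = [
    {"name": "Wine A", "type": "red", "region": "Napa"},
    {"name": "Wine B", "type": "red", "region": "Rioja"},
    {"name": "Wine C", "type": "white", "region": "Napa"},
    {"name": "Wine D", "type": "sparkling", "region": "Champagne"},
]


def facet_counts(records, field):
    """Count how many records fall under each value of one category field."""
    return Counter(r[field] for r in records if field in r)


def refine(records, field, value):
    """Keep only the records whose category field matches the selected value."""
    return [r for r in records if r.get(field) == value]


if __name__ == "__main__":
    print(facet_counts(WINES, "type"))         # categories offered to the user
    reds = refine(WINES, "type", "red")        # the user clicks "red"
    print(facet_counts(reds, "region"))        # remaining categories to drill into
```

The user never types a second query. Each click narrows the set, and the counts show how much content sits behind each category before the click.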

Read more

Nosing Beyond the Machine Learning from Human Curated Data Sets: Autonomy 1996 to Smart Software 2019

April 24, 2019

How does one teach a smart indexing system like Autonomy’s 1996 “neurodynamic” system?* Subject matter experts (SMEs) assembled a training collection of textual information. The articles and other content would replicate the characteristics of the content which the Autonomy system would process; that is, index and make searchable or analyzable. The work was important. Get the training data wrong and the indexing system would assign metadata or “index terms” and “category names” which could cause a query to generate results the user could perceive as incorrect.

image

How would a licensee adjust the Autonomy “black box”? (Think of my reference to Autonomy and search as a way of approaching “smart software” and “artificial intelligence.”)

The method was to perform re-training. The approach was practical and, for most content domains, the re-training worked. It was an iterative process. Because the words in the corpus fed into the “black box” included new words, concepts, bound phrases, entities, and key sequences, there were several functions integrated into the basic Autonomy system as it matured. Examples ranged from support for term lists (controlled vocabularies) to dictionaries.

The combination of re-training and external content available to the system allowed Autonomy to deliver useful outputs.

The gap between the optimal results and the real world results usually boiled down to several factors, often working in concert. First, licensees did not want to pay for re-training. Second, maintenance of the external dictionaries was necessary because new entities arrive with reasonable frequency. Third, testing and organizing the freshened training sets and doing the editorial work required to keep dictionaries ship shape were too expensive, time consuming, and tedious.

Not surprisingly, some licensees grew unhappy with their Autonomy IDOL (integrated data operating layer) system. That, in my opinion, was not Autonomy’s fault. Autonomy explained in the presentations I heard what was required to get a system up and running and outputting results that could easily hit 80 percent or higher on precision and recall tests.
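For readers who have not run such a test, precision is the share of returned documents that are relevant, and recall is the share of relevant documents that were returned. A minimal sketch follows; the function name and the document ids are made up for illustration and do not come from any Autonomy test.

```python
def precision_recall(retrieved, relevant):
    """Compute precision and recall for a single query.

    retrieved: ids the system returned.
    relevant:  ids a human judge marked as correct answers.
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    # Five documents returned, five judged relevant, four in common.
    print(precision_recall([1, 2, 3, 4, 5], [1, 2, 3, 4, 9]))
```

In this invented case both scores come out at 80 percent. The expensive part is not the arithmetic. It is the human judge who has to produce the list of relevant documents for every test query.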

The Autonomy approach is widely used. In fact, wherever there is a Bayesian system in use, there is the training, re-training, external knowledge base demand. I just took a look at Haystax Constellation. It’s Bayesian and Haystax makes it clear that the “model” has to be trained. So what’s changed between 1996 and 2019 with regards to Bayesian methods?

Nothing. Zip. Zero.
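The train and re-train loop can be shown with a textbook naive Bayes classifier. To be clear, this is a generic sketch with add-one smoothing, not Autonomy’s IDOL or Haystax’s model, neither of which is described in enough detail here to reproduce. The class name, the category labels, and the sample sentences are all invented for the example.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesTagger:
    """Toy multinomial naive Bayes classifier with add-one smoothing."""

    def __init__(self):
        self.doc_counts = Counter()               # category -> training docs seen
        self.word_counts = defaultdict(Counter)   # category -> word -> count
        self.vocab = set()                        # every word seen in training

    def train(self, labeled_docs):
        """Add labeled examples. Calling this again later is the re-training step."""
        for text, category in labeled_docs:
            words = text.lower().split()
            self.doc_counts[category] += 1
            self.word_counts[category].update(words)
            self.vocab.update(words)

    def classify(self, text):
        """Return the most probable category, or None if nothing was trained."""
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best, best_score = None, -math.inf
        for category, n_docs in self.doc_counts.items():
            score = math.log(n_docs / total_docs)  # prior
            total_words = sum(self.word_counts[category].values())
            for word in words:
                count = self.word_counts[category][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best, best_score = category, score
        return best


if __name__ == "__main__":
    tagger = NaiveBayesTagger()
    tagger.train([
        ("bank loan interest rate", "finance"),
        ("stock market bank shares", "finance"),
        ("football match goal score", "sports"),
        ("tennis match serve score", "sports"),
    ])
    print(tagger.classify("bank interest"))

    # New entities arrive. Until SMEs supply fresh examples, the model can
    # only pick among the categories it already knows.
    tagger.train([
        ("ransomware attack encrypts files", "security"),
        ("phishing attack steals passwords", "security"),
    ])
    print(tagger.classify("ransomware attack"))
```

Before the second call to train, the query about ransomware can only land in one of the two categories the model already knows, and neither is correct. That is the licensee’s dilemma in miniature: the math does not change, but someone has to keep paying for the labeled examples.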

Read more

Expert System: Interesting Financials

April 6, 2019

Expert System SpA is a firm providing semantic software that extracts knowledge from text by replicating human processes. I noticed information on the company’s Web site which informed me:

  • The company had sales revenues of 28.7 million euros for 2018
  • The company’s growth was 343 percent compared to 2017
  • The net financial position was 12.4 million euros up from 8.8 million euros in March 2017.

Remarkable financial performance.

Out of curiosity I navigated to Google Finance and plugged in Expert System Spa to see what data the GOOG could offer.

Here’s the chart displayed on April 6, 2019:

image

The firm’s stock does not seem to be responding as we enter the second quarter of 2019.

Read more

Facebook: Ripples of Confusion, Denial, and Revisionism

March 18, 2019

Facebook contributed to an interesting headline about the video upload issue related to the bad actor in New Zealand. Here’s the headline I noted as it appeared on Techmeme’s Web page:

image

The Reuters’ story ran a different headline:

image

What caught my attention is the statement “blocked at upload.” If a video were blocked at upload, were those videos removed? If blocked, then the number of videos drops to 300 million.

This type of information is typical of the coverage of Facebook, a company which has become the embodiment of social media.

There were two other interesting Facebook stories in my news feed this morning.

The first concerns a high profile Silicon Valley investor, Marc Andreessen. The write up reports and updates a story whose main point is:

Facebook Board Member May Have Met Cambridge Analytica Whistleblower in 2016.

Read more

The Search Wars: When Open Starts to Close

March 12, 2019

Compass Search. The precursor. The result? Elasticsearch. No proprietary code. Free and open source. The world of enterprise search shifted.

As a result of Shay Banon’s efforts, an alternative to proprietary search and interesting financial maneuvers, an individual or organization could download code and set up a functional enterprise search system.

There are proprietary search systems available like Coveo. But most of the offerings are sort of open sourcey. It is a marketing ploy. The forward leaning companies do not use the word search to market their products because zippier functionality is what brings tire kickers and some buyers.

The landscape of search seems to be doing its Hawaii volcano act. No real eruption, but shakes, hot gas, and cracks have begun to appear. The lava flows will come soon enough.

image

The path is clear to the intrepid developer.

The tip off is Amazon’s announcement that it now offers an open distro for Elasticsearch. Why is Amazon taking this step? The company explains:

Elasticsearch has become an essential technology for log analytics and search, fueled by the freedom open source provides to developers and organizations. Our goal is to ensure that open source innovation continues to thrive by providing a fully featured, 100% open source, community-driven distribution that makes it easy for everyone to use, collaborate, and contribute.

DarkCyber’s briefings about Amazon’s policeware initiative suggest that the online bookstore is adding another component to its robust intelligence system and services.

The move involves or will involve:

  • Entrepreneurs who will see Amazon as creating low friction for new products and services
  • Partners because implementing search can be a consulting gold mine
  • Users
  • Developers who will use an Amazon “off the shelf” solution
  • Competitors who may find the “other open source” Elasticsearch lagging behind the Amazon “house brand”.

The move is not much of a surprise. Amazon seeks to implement its version of IBM’s 1960s style vendor lock in. Open source is open source, isn’t it? Amazon now has a version of the popular Elasticsearch system, which has utility in everything from commercial products to add ons which help make log files more mine-able. Plus search snaps into the DNA of the Amazon jungle of services, functions, and features. Where there is confusion, there are opportunities to make money.

Adding a house brand to its ecosystem is a basic tactic in the Amazon playbook. Those T shirts with the great price are Amazon’s, not the expensive stuff with a fancy brand name. T shirts and search? Who cares?

What’s the play mean for over extended proprietary search systems which may never generate a pay day for investors? A lot of explaining seems likely.

What’s the play mean for Elastic, the company which now operates the son of Compass Search? Some long off site meetings may be ahead and maybe some chats with legal eagles.

What’s the play mean for vendors using Amazon as back end plumbing for their enterprise or policeware services? A swap out of the Elasticsearch system for the Amazon version could be in the cards. Amazon Elasticsearch will probably deliver fewer headaches and lost weekends than using the Banon-Elastic version. Who wants headaches in an already complex, expensive implementation?

The Register quotes an evangelist from AWS as saying:

“We will continue to send our contributions and patches upstream to advance these projects.”

DarkCyber interprets this action and Amazon’s explanations from the perspective and context of a high school football coach:

“Front line, listen up, fork that QB. I want that guy put down. Hard. Let’s go.”

Amazon. The best defense is a good offense, right?

The coach shouts:

“Let’s hit those Sheep hard. Arrrgh.”

Stephen E Arnold, March 12, 2019
