Amazon Policeware: One Possible Output

October 1, 2019

Investigations focus on entities and timelines. The context includes the legal wrapper, procedures, impressions, and similar information that usually resides with investigators and their colleagues.

Why gather data unless there is a payoff? The payoff from data in terms of Amazon’s policeware includes these upsides:

  • Data which informs new products and services, especially those signals for latent demand
  • Raw material for analytical processes such as those performed by superordinate Amazon Web Services
  • Outputs which have market magnetism; that is, the product is desirable and LE and intel customers want to buy it.

This illustration, which I have taken from my October 2, 2019, TechnoSecurity lecture and from my Amazon policeware webinar, makes three points:

First, raw data are acquired by Amazon. The sources are diverse and some are unique to Amazon; for example, individual and enterprise purchasing data.

Second, the AWS policeware platform performs normalization, indexing, and analysis of historic and real time data flows; for example, what books did an individual purchase and when.

Third, an output in the form of a profile or report about a person of interest.

image

© Stephen E Arnold 2019

I know the image is difficult to read. There are two ways to address this issue. You can attend my lectures at the San Antonio conference or you can sign up for my Amazon policeware webinar.

No Epstein supporters, fans, and acquaintances should express interest in my research. Sorry. I am old fashioned.

Stephen E Arnold, October 1, 2019

Amazon: The Surveillance Mesh Play

September 30, 2019

DarkCyber received a complaint about the small size of the image from my webinar about Amazon Policeware. There are two remedies for tiny images. You can attend my policeware lecture at the TechnoSecurity & Digital Forensics Conference in San Antonio on Wednesday, October 2. Qualified attendees can request a PDF of the image. Second, you can contact DarkCyber at benkent2020 at yahoo dot com and sign up for our LE, security, and intel personnel webinar.

Today, I want to provide several findings from our research related to Amazon Policeware. These are:

  • Amazon’s mesh network in the Sidewalk product provides a solution to blanketing a city with a data collection component. This wide field outdoor mesh network may fail. In the meantime, you may be able to locate your dog if it is wearing a Fetch.
  • Amazon’s Ring doorbell provides an anchor for fixed video feeds. The resolution is poor and the system is far from comprehensive, but the test mechanism is sufficiently compelling for several hundred police departments to show interest.
  • The supplementary data collection devices shown in the figure below feed into the AWS policeware platform. That platform performs a number of analytic functions. Cross correlation is one of these.

image

© Stephen E Arnold, 2019

So what?

In the US, Amazon is moving forward to put in place a next generation service which provides a new tool to enforcement authorities. The system delivers other benefits to Amazon as well.

DarkCyber identifies some parallels between the efforts the government of China is making and Amazon’s activities.

Will the Epstein friendly academic institution get this story straight? Probably not.

Stephen E Arnold, September 30, 2019

Amazon Policeware: The Path to IBM-Style Lock In on Steroids

September 27, 2019

Quite a bit of Amazon news has flowed through the DarkCyber system. The problem is that most of the information is oblivious to Amazon’s policeware initiative. DarkCyber’s research suggests that Amazon is building a surveillance system. One DarkCyber team member said, “Amazon is building what China has been working on for several years.” Is this DarkCyber researcher correct? Who knows?

I do want to provide a diagram from our Amazon webinar which puts Amazon’s activities into a context for enforcement. The scope of Amazon’s business strategy extends beyond local law enforcement and the Ring video doorbell activities, beyond the cloud services for several US government agencies, and beyond the company’s online businesses.

Amazon may be positioning itself to provide:

  • IRS-related services associated with tax investigations
  • Drug enforcement actions related to physicians who allegedly overprescribe or entities which obtain certain compounds using obfuscation methods
  • SEC-related services to determine entity interaction, expenditures, and related financial activities
  • Credit verification, including other financial analyses, for government and retail financial activities.

Other “extensions” are possible. What’s interesting is that few have noticed and even fewer pay much attention beyond hand waving about Alexa. There’s more than Alexa, which is a low level gateway service.

Here’s the diagram, which is copyrighted by Stephen E Arnold, operator of DarkCyber, and author of the forthcoming monograph, Dark Edge: Amazon’s Policeware Initiative.

image

© Stephen E Arnold, 2019.

How do you use this diagram? Just map Amazon’s most recent product announcements into the grid.

The DarkCyber Amazon policeware webinar walks through the tactics and the strategy for this “in plain sight” play. Analysts, journalists, policeware vendors paying Amazon to host their systems, and Microsoft-type outfits are oblivious to what is now the end game for a 12 year push by Amazon to make IBM-style lock in seem as quaint as a Model T Ford.

For those who recycle my information and claim it as your own creative output, why not be somewhat ethical and provide attribution? You know. Old-fashioned stuff like a footnote. Yep, that includes a real journalist who writes for the New York Times and the Epstein linked MIT publication, among others.

Stephen E Arnold, September 27, 2019

Search: Useless Results Finally Recognized?

August 22, 2019

I cannot remember how many years ago I wrote “Search Sucks” for Barbara Quint, the late editor of Searcher. I recall her comment to me, “Finally, someone in the industry speaks out.”

Flash forward a decade. I can now repeat her comment to me with some minor updating: “Finally someone recognized by the capitalist tool, Forbes Magazine, recognizes that search sucks.”

The death of search was precipitated by several factors. Mentioning these after a decade of ignoring Web search still makes me angry. The failure of assorted commercial search vendors, the glacial movement of key trade associations, and the ineffectuality of search “experts” still make me angry.

Image result for fake information

There are other factors contributing to the sorry state of Web search today. Note: I am narrowing my focus to the “free” Web search systems. If I have the energy, I may focus on the remarkable performance of “enterprise search.” But not today.

Here are the reasons Web search fell to laughable levels of utility:

  1. Google adopted the GoTo / Overture / Yahoo approach to determining relevance. This is the pay-to-play model.
  2. Search engine optimization “experts” figured out that Google allowed some fiddling with how it determined “relevance.” Google and other ad supported search systems then suggested that those listings might decay. The fix? Buy ads.
  3. Users who were born with mobile phones and flexible fingers styled themselves “search experts” along with any other individual who obtains information by looking for “answers” in a “free” Web search system.
  4. The willful abandonment of editorial policies, yardsticks like precision and recall, and human indexing guaranteed that smart software would put the nails in the coffin of relevance. Note: artificial intelligence and super duper automated indexing systems are right about 80 percent of the time when hammering scientific, technical, and engineering information. Toss in blog posts, tweets, and Web content created by people who skipped high school English and the accuracy plummets. Way down, folks. Just like facial recognition systems.
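For readers who have not met the yardsticks named in item 4, here is a minimal sketch of how precision and recall are scored for a single query. The document ids and the relevance judgments are hypothetical; in practice a human editor supplies the judgments.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: document ids the search system returned
    relevant:  document ids a human judged relevant
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    # Precision: what share of the returned documents were relevant?
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    # Recall: what share of the relevant documents were returned?
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical result list and hypothetical human judgments.
returned_by_engine = ["d1", "d2", "d3", "d4", "d5"]
judged_relevant = ["d1", "d3", "d6", "d7"]

p, r = precision_recall(returned_by_engine, judged_relevant)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.40 recall=0.50
```

Two of the five returned documents are relevant, and two of the four relevant documents were found. Without the human judgments, neither number can be computed, which is the point of item 4.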

The information presented in “As Search Engines Increasingly Turn To AI They Are Harming Search” is astounding. Not because it is new, but because it is a reflection of what I call the Web search mentality.

Here’s an example:

Yet over the past few years, search engines of all kinds have increasingly turned to deep learning-powered categorization and recommendation algorithms to augment and slowly replace the traditional keyword search. Behavioral and interest-based personalization has further eroded the impact of keyword searches, meaning that if ten people all search for the same thing, they may all get different results. As search engines depreciate traditional raw “search” in favor of AI-assisted navigation, the concept of informational access is being harmed and our digital world is being redefined by the limitations of today’s AI.

The problem is not artificial intelligence.

Read more

New Jargon: Consultants, Start Your Engines

July 13, 2019

I read “What Is ‘Cognitive Linguistics’?” The article appeared in Psychology Today. Disclaimer: I did some work for this outfit a long time ago. Anybody remember Charles Tillinghast, “CRM” when it referred to people, not a baloney discipline for a Rolodex filled with sales leads, and the use of Psychology Today as a text in a couple of universities? Yeah, I thought not. The Ziff connection is probably lost in the smudges of thumb typing too.

Onward: The write up explains a new spin on psychology, linguistics, and digital interaction. The jargon for this discipline or practice, if you will, is:

Cognitive Linguistics

I must assume that the editorial processes at today’s Psychology Today are genetically linked to the procedures in use in — what was it, 1972? — but who knows.

image

Here’s the definition:

The cognitive linguistics enterprise is characterized by two key commitments. These are:
i) the Generalization Commitment: a commitment to the characterization of general principles that are responsible for all aspects of human language, and
ii) the Cognitive Commitment: a commitment to providing a characterization of general principles for language that accords with what is known about the mind and brain from other disciplines. As these commitments are what imbue cognitive linguistics with its distinctive character, and differentiate it from formal linguistics.

If you are into psychology and figuring out how to manipulate people or a Google ranking, perhaps this is the intellectual gold worth more than stolen treasure from Montezuma.

Several observations:

  1. I eagerly await an estimate from IDC for the size of the cognitive linguistics market, and I am panting with anticipation for a Gartner magic quadrant which positions companies as leaders, followers, outfits which did not pay for coverage, and names found with a Google search at Starbucks south of the old PanAm Building. Cognitive linguistics will have to wait until the two giants of expertise figure out how to define “personal computer market”, however.
  2. A series of posts from Dave Amerland and assorted wizards at SEO blogs which explain how to use the magic of cognitive linguistics to make a blog page — regardless of content, value, and coherence — number one for a Google query.
  3. A how to book from Wiley publishing called “Cognitive Linguistics for Dummies” with online reference material which may or may not actually be available via the link in the printed book.
  4. A series of conferences run by assorted “instant conference” organizers with titles like “The Cognitive Linguistics Summit” or “Cognitive Linguistics: Global Impact”.

So many opportunities. Be still, my heart.

Cognitive linguistics: its time has come. Not a minute too soon for a couple of floundering enterprise search vendors to snag the buzzword and pivot to implementing cognitive linguistics for solving “all your information needs.” Which search company will embrace this technology: Coveo, IBM Watson, Sinequa?

DarkCyber is excited.

Stephen E Arnold, July 13, 2019

Amazon and YouTube: The Hong Kong Protests Mark the Day that Twitch.tv Made Clear the Limitations of YouTube

June 16, 2019

I heard there was a small protest underway in Hong Kong. The time is now 6:30 am US Eastern time. I navigated to YouTube, entered the query “Hong Kong protest”, and I saw links to videos from a day ago (today is June 16, 2019). I navigated to the YouTube “Live” page which provides a limited selection of streaming videos on YouTube. If you have not seen that somewhat incomplete index, navigate to https://www.youtube.com/live. No live stream of the Hong Kong protest.

“If it’s not on YouTube, then it doesn’t exist,” goes some old timer’s catchphrase.

Well, not quite.

Navigate to Amazon’s Twitch.tv. Run a query for Hong Kong. Here’s what I saw before I clicked on the live stream of Unable to Breath.

image

Amazon Twitch.tv search result. The Unable to Breath stream is not one but an aggregate of eight separate feeds from Hong Kong.

Front and center was a link to Unable to Breath, which presents this streaming image:

image

This is a screen shot of a single screen which is eight different feeds showing different views of the handful of people who are participating in the event. Note: Handful means more than one million.

Notice that there are eight live views of this modest protest. This is one live stream with eight separate views of the modest demonstration in Hong Kong. Eight in one stream! No registration required. No in stream pop up ads. Just high value intelligence in pretty good streaming video quality.

Read more

Google: Can Semantic Relaxing Display More Ads?

June 10, 2019

For some reason, vendors of search systems shudder when a user’s query returns a null set. The idea is that a user sends a query to a system or, more correctly, an index. The terms in the query do not match entries in the database. The system displays a message which says, “No results match your query.”

For some individuals, that null set response is high value information. One can bump into null sets when running queries on a Web site; for example, send the anti fungicide query to the Arnold Information Technology blog at this link. Here’s the result:

image

From this response, one knows that there is no content containing the search phrase. That’s valuable for some people.

To address this problem, modern systems “relax” the query. The idea is that the user did not want what he or she typed in the search box. The search system then changes the query and displays those results to the stupid user. Other systems take action and display results which the system determines are related to the query. You can see these relaxed results when you enter the query shadowdragon into Google. Here are the results:

image

Google ignored my spelling and displayed information about a video game, not the little known company Shadowdragon. At least Google told me what it did and offered a way to rerun the query using the word I actually entered. But the point is that the search was “relaxed.”
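The difference between a null set and a relaxed query can be sketched in a few lines. This is a toy illustration, not Google’s method: the index, the document ids, and the closest-spelling rule built on Python’s standard `difflib` module are all assumptions made for the example.

```python
import difflib

# Hypothetical toy index: term -> ids of documents containing that term.
INDEX = {
    "shadow": {1, 2},
    "dragon": {2, 3},
    "fungicide": {4},
}


def strict_search(query):
    """Return documents containing every query term; may be the null set."""
    result = None
    for term in query.lower().split():
        postings = INDEX.get(term, set())
        result = set(postings) if result is None else result & postings
    return result if result is not None else set()


def relaxed_search(query, cutoff=0.6):
    """Swap each unknown term for its closest indexed term, then search.

    Returns (results, rewritten_query) so the caller can tell the user
    what the system changed.
    """
    rewritten = []
    for term in query.lower().split():
        if term in INDEX:
            rewritten.append(term)
            continue
        close = difflib.get_close_matches(term, list(INDEX), n=1, cutoff=cutoff)
        rewritten.append(close[0] if close else term)
    new_query = " ".join(rewritten)
    return strict_search(new_query), new_query


print(strict_search("shadowdragon"))   # null set: no term matches exactly
print(relaxed_search("shadowdragon"))  # non-empty results for a rewritten query
```

The strict function tells the user something true: the term is not in the index. The relaxed function always tries to return something, and the only honest part is handing back the rewritten query so the user can see the substitution.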

Semantic expansion is a variation of Endeca’s facets. The idea is that a key word belongs to a category. If a system can identify a category, then the user can get more results by selecting the category and maybe finding something useful. Endeca’s wine demonstration makes this function and its value clear.
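Facets are easy to sketch. The example below is a generic illustration of counting and selecting categories, not Endeca’s implementation; the catalog records, field names, and values are invented.

```python
from collections import Counter

# Hypothetical catalog records; names, fields, and values are invented.
CATALOG = [
    {"name": "Wine A", "color": "red", "region": "Bordeaux"},
    {"name": "Wine B", "color": "red", "region": "Napa"},
    {"name": "Wine C", "color": "white", "region": "Napa"},
    {"name": "Wine D", "color": "white", "region": "Loire"},
    {"name": "Wine E", "color": "red", "region": "Napa"},
]


def facet_counts(records, field):
    """Count how many records fall under each value of one facet field."""
    return Counter(r[field] for r in records if field in r)


def refine(records, field, value):
    """Keep only the records whose facet field equals the selected value."""
    return [r for r in records if r.get(field) == value]


# Show the categories a user could click, select one, then show what remains.
print(facet_counts(CATALOG, "color"))
napa = refine(CATALOG, "region", "Napa")
print([r["name"] for r in napa])
print(facet_counts(napa, "color"))
```

The user sees the categories and chooses one. That differs from relaxation, where the system rewrites the query on its own.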

Read more

Nosing Beyond the Machine Learning from Human Curated Data Sets: Autonomy 1996 to Smart Software 2019

April 24, 2019

How does one teach a smart indexing system like Autonomy’s 1996 “neurodynamic” system?* Subject matter experts (SMEs) assembled a training collection of textual information. The articles and other content would replicate the characteristics of the content which the Autonomy system would process; that is, index and make searchable or analyzable. The work was important. Get the training data wrong and the indexing system would assign metadata or “index terms” and “category names” which could cause a query to generate results the user could perceive as incorrect.

image

How would a licensee adjust the Autonomy “black box”? (Think of my reference to Autonomy and search as a way of approaching “smart software” and “artificial intelligence.”)

The method was to perform re-training. The approach was practical and for most content domains, the re-training worked. It was an iterative process. Because the words in the corpus fed into the “black box” included new words, concepts, bound phrases, entities, and key sequences, there were several functions integrated into the basic Autonomy system as it matured. Examples ranged from support for term lists (controlled vocabularies) to dictionaries.

The combination of re-training and external content available to the system allowed Autonomy to deliver useful outputs.
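Why re-training matters can be shown with a small sketch. This is a generic naive Bayes categorizer with add-one smoothing, written for illustration; it is not Autonomy’s proprietary method, and the class name, the training sentences, and the categories are hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesIndexer:
    """Tiny multinomial naive Bayes text categorizer with add-one smoothing."""

    def __init__(self):
        self.doc_counts = Counter()              # category -> training docs seen
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.vocabulary = set()

    def train(self, labeled_docs):
        """Add (text, category) pairs; calling this again is the re-training step."""
        for text, category in labeled_docs:
            words = text.lower().split()
            self.doc_counts[category] += 1
            self.word_counts[category].update(words)
            self.vocabulary.update(words)

    def classify(self, text):
        """Return the most probable category for the text."""
        total_docs = sum(self.doc_counts.values())
        best, best_score = None, -math.inf
        for category in sorted(self.doc_counts):
            score = math.log(self.doc_counts[category] / total_docs)
            total_words = sum(self.word_counts[category].values())
            for word in text.lower().split():
                if word not in self.vocabulary:
                    continue  # unseen words carry no evidence until re-training
                count = self.word_counts[category][word]
                score += math.log((count + 1) / (total_words + len(self.vocabulary)))
            if score > best_score:
                best, best_score = category, score
        return best


indexer = NaiveBayesIndexer()
indexer.train([
    ("bank loan interest rate", "finance"),
    ("stock market shares bank", "finance"),
    ("football match goal", "sport"),
])
# A new entity arrives. With no evidence, the system falls back on the majority category.
print(indexer.classify("pickleball tournament"))  # finance

# Re-training with one freshly labeled document fixes the assignment.
indexer.train([("pickleball tournament match", "sport")])
print(indexer.classify("pickleball tournament"))  # sport
```

The fix costs one labeled document here. In a production system it costs SME time for every new entity, which is the maintenance burden some licensees balk at paying for.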

The gap between optimal results and real world results usually boiled down to several factors, often working in concert. First, licensees did not want to pay for re-training. Second, maintenance of the external dictionaries was necessary because new entities arrive with reasonable frequency. Third, testing and organizing the freshened training sets and the editorial work required to keep dictionaries ship shape were too expensive, time consuming, and tedious.

Not surprisingly, some licensees grew unhappy with their Autonomy IDOL (integrated data operating layer) system. That, in my opinion, was not Autonomy’s fault. Autonomy explained in the presentations I heard what was required to get a system up and running and outputting results that could easily hit 80 percent or higher on precision and recall tests.

The Autonomy approach is widely used. In fact, wherever there is a Bayesian system in use, there is the training, re-training, external knowledge base demand. I just took a look at Haystax Constellation. It’s Bayesian and Haystax makes it clear that the “model” has to be trained. So what’s changed between 1996 and 2019 with regard to Bayesian methods?

Nothing. Zip. Zero.

Read more

Expert System: Interesting Financials

April 6, 2019

Expert System SpA is a firm providing semantic software that extracts knowledge from text by replicating human processes. I noticed information on the company’s Web site which informed me:

  • The company had sales revenues of 28.7 million euros for 2018
  • The company’s growth was 343 percent compared to 2017
  • The net financial position was 12.4 million euros up from 8.8 million euros in March 2017.

Remarkable financial performance.

Out of curiosity I navigated to Google Finance and plugged in Expert System SpA to see what data the GOOG could offer.

Here’s the chart displayed on April 6, 2019:

image

The firm’s stock does not seem to be responding as we enter the second quarter of 2019.

Read more

Facebook: Ripples of Confusion, Denial, and Revisionism

March 18, 2019

Facebook contributed to an interesting headline about the video upload issue related to the bad actor in New Zealand. Here’s the headline I noted as it appeared on Techmeme’s Web page:

image

The Reuters’ story ran a different headline:

image

What caught my attention is the statement “blocked at upload.” If a video were blocked at upload, were those videos removed? If blocked, then the number of videos drops to 300 million.

This type of information is typical of the coverage of Facebook, a company which has become the embodiment of social media.

There were two other interesting Facebook stories in my news feed this morning.

The first concerns a high profile Silicon Valley investor, Marc Andreessen. The write up reports and updates a story whose main point is:

Facebook Board Member May Have Met Cambridge Analytica Whistleblower in 2016.

Read more
