Data Are a Problem? And the Solution Is?

January 8, 2020

I attended a conference about managing data last year. I sat in six sessions and listened as enthusiastic people explained that in order to tap the value of data, one has to have a process. Okay? A process is good.

Then in each of the sessions, the speakers explained the problem and outlined that knowing about the data and then putting it in a system is the way to derive value.

Neither Pros Nor Cons: Just Consulting Talk

This morning I read an article called “The Pros and Cons of Data Integration Architectures.” The write up concludes with this statement:

Much of the data owned and stored by businesses and government departments alike is constrained by the silos it’s stuck in, many of which have been built over the years as organizations grow. When you consider the consolidation of both legacy and new IT systems, the number of these data silos only increases. What’s more, the impact of this is significant. It has been widely reported that up to 80 per cent of a data scientist’s time is spent on collecting, labeling, cleaning and organizing data in order to get it into a usable form for analysis.

Now this is mostly true. However, the 80 percent figure is not backed up. An IDG expert whipped up some percentages about data and time, and these, I suspect, have become part of the received wisdom of those struggling with silos for decades. Most of a data scientist’s time is frittered away in meetings, struggling with budgets and other resources, and figuring out what data are “good” and what to do with the data identified by person or machine as “bad.”

The source of this statement is MarkLogic, a privately held company founded in 2001 and a magnet for $173 million from funding sources. That works out to an 18 years young start up if DarkCyber adopts a Silicon Valley T shirt.


A modern silo is made of metal and impervious to some pests and most types of weather.

One question the write up begs is, “After 18 years, why hasn’t the methodology of MarkLogic swept the checker board?” But the same question can be asked of other providers’ solutions, open source solutions, and the home grown solutions creaking in some government agencies in Europe and elsewhere.

Several reasons:

  1. The technical solution offered by MarkLogic-type companies can “work”; however, proprietary considerations linked with the issues inherent in “silos” have caused data management solutions to become consultantized; that is, process becomes the task, not delivering on the promise of data, either dark or sunlit.
  2. Customers realize that the cost of dealing with the secrecy, legal, and technical problems of disparate, digital plastic trash bags of bits cannot be justified. Like odd duck knickknacks one of my failed publishers shoved into his lumber room, ignoring data is often a good solution.
  3. Individuals tasked with organizing data begin with gusto and quickly morph into bureaucrats who treasure meetings with consultants and companies pitching magic software and expensive wizards able to make the code mostly work.

DarkCyber recognizes that with boundaries like budgets, timetables, and measurable objectives, federation can deliver some zip.

Silos: A Moment of Reflection

The article uses the word “silo” five times. That’s the same frequency of its use in the presentations to which I listened in mid December 2019.


So you want to break down this missile silo which is hardened and protected by autonomous weapons? That’s what happens when a data scientist pokes around a pharma company’s lab notebook for a high potential new drug.

Let’s pause a moment to consider what a silo is. A silo is a tower or a pit used to store corn, wheat, or some other grain. Dust in silos can be exciting. Tip: Don’t light a match in a silo on a dry, hot day in a state where farms still operate. A silo can also be a structure used to house a ballistic missile, but one has to be a child of the Cold War to appreciate this connotation.

As applied to data, it seems that a silo is a storage device containing data. Unlike a silo used to house maize or a nuclear capable missile, the data silo contains information of value. How much value? No one knows. Are the data in a digital silo explosive? Who knows? Maybe some people should not know? Who wants to flick a Bic and poke around?

Informatica: A Play for Greater Relevance in an Amazon Chess Game?

January 3, 2020

Informatica was set up in 1993. The company was private, then public, and now private. Its new CEO is a former McKinsey professional, a background which some may find reassuring and others terrifying. (McKinsey had a racketeering lawsuit dismissed. How does a consulting firm ensnare itself in an allegation of racketeering? I will leave it to you to answer that question.)

The big news, however, is that Informatica is making an attempt to retain its relevance and increase its impact among Fortune 1000 firms, investment banks, financial services firms, insurance companies, and other blue chip customers.

The method, it seems to DarkCyber, involves Amazon. Keep in mind that Informatica’s previous attempts to add some zing to its quarter century of database-related work involved Microsoft and Salesforce, both next big things.

According to “Informatica Aims to Better Track Data Lineage with AI-Powered Data Catalog,”

its AI-powered data catalog, called Catalog of Catalogs is notable because it is trying to track data lineage across ecosystems. Catalog of Catalogs includes metadata scanners for business intelligence, data warehouses, big data and third party repositories.

The “new” Informatica is represented in this graphic, which has a remarkable resemblance to Amazon Web Services blockchain diagrams:


Is this an Amazon diagram in recognizable AWS orange or an Informatica diagram?

There’s a hook to Amazon’s data marketplace technology, support for Amazon’s smart workflow, and the federation of metadata.

But what’s missing in this real news story?

Ancient Search Recipes: Bread Pork Chops

December 16, 2019

I noted a report in the Times of Israel titled “Cache of Crypto-Jewish Recipes Dating to Inquisition Found in Miami Kitchen.” One of the recipes explained how to make a pork chop from bread and milk. (Dairy? Guess so.) Here’s what you and I can whip up using this ancient recipe:


The cookbook contains information from an author who, according to the article, “didn’t think to question the idiosyncratic customs her mother and grandmothers practiced in the kitchen.”

By coincidence, my news alert spit out this article in the same list: “The Growth of Cognitive Search in the Enterprise, and Why It Matters.”

Magic. Bread pork chops created from zeros and ones.

Search matters. Cognitive search matters more. Who buys? The enterprise.

The write up recycles the equivalent of the bread pork chop formula. Mix jargon, sprinkle with the notion of federated data, and bake until the checks clear the bank.

The article is fascinating, and it overlooks a few milestones in the history of enterprise search. What for example? Glad you asked:

  1. Forrester, the Wave folks, has created a report for its paying customers which reveals that search is now cognitive, able to tap dark data, and ready for prime time. Again! The Wave returns.
  2. Big companies are into search, including Microsoft with its Fast inspired solution and Amazon Kendra with an open source how de doo to Elastic and LucidWorks. Some use old spices; others, open source flavoring with proprietary special seasonings.
  3. Outfits which have been around for more than a decade like Coveo are now smarter than ever in their decade long effort to pay off their patient investors.
  4. Autonomy gets a nod despite the interesting trial underway in the UK.

The point is that enterprise search is going to be in the news whether or not anyone wants to revisit hyperbole which makes the chatter around artificial intelligence and quantum computing seem rational and credible.

Here’s a quick refresher about why untapped data in an organization is likely to remain untapped or at the very least not tapped by vendors of smart key word search systems:

First, data are in silos for a reason. No enterprise search system with which I am familiar can navigate the permissions and access controls required to put siloed data in one index. There’s a chance that the Amazon blockchain permissions system can deliver this, but for now, the patents are explanations and federated enterprise search is a sales pitch.
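
The permissions problem can be sketched in a few lines of Python. This is a toy illustration, not any vendor's method: the Doc record, the group names, the sample documents, and the search function are all hypothetical. It shows one common approach, sometimes called security trimming, in which each document carries its silo's access list into the shared index and every query is filtered against the searcher's groups.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    """One indexed document plus the access list copied from its silo (hypothetical)."""
    doc_id: str
    silo: str
    text: str
    allowed_groups: frozenset


def search(index, query, user_groups):
    """Return ids of documents that match every query term AND that the user may see."""
    terms = query.lower().split()
    hits = []
    for doc in index:
        body = doc.text.lower()
        if not all(term in body for term in terms):
            continue  # no keyword match
        # Security trimming: keep the hit only if the user shares a group with the document.
        if doc.allowed_groups & user_groups:
            hits.append(doc.doc_id)
    return hits


# A made-up federated index drawn from three silos.
INDEX = [
    Doc("hr-1", "hr", "salary review for lab staff", frozenset({"hr"})),
    Doc("lab-7", "research", "lab notebook for new compound", frozenset({"research"})),
    Doc("pub-3", "intranet", "lab safety policy", frozenset({"hr", "research", "all-staff"})),
]

researcher_hits = search(INDEX, "lab", frozenset({"research"}))
staff_hits = search(INDEX, "lab", frozenset({"all-staff"}))
outsider_hits = search(INDEX, "lab", frozenset())
```

The filter itself is trivial. The hard part, which this sketch assumes away, is keeping allowed_groups accurate for every silo as permissions change, and that is where the sales pitch and the reality tend to part company.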

Arnold Interviewed about Amazon Blockchain Inventions

December 5, 2019

Robert David Steele, former CIA professional and open source intelligence expert, interviewed Stephen E Arnold about Amazon’s blockchain inventions. Arnold recently completed a chapter for a forthcoming academic press book about blockchain. That chapter and its information prompted journalists from the US and France to interview Arnold about his findings. Arnold’s information was included in news stories appearing in the New York Times, MIT Technology Review, and Le Monde.


Steele obtained an exclusive video interview with Arnold about his Amazon blockchain research. Among the topics discussed in the 30 minute program are:

  • The “trigger” for the research
  • Sources of data and research methods
  • The major findings from the 18 month research project
  • The likely trajectory of Amazon’s products and services incorporating the company’s more than 12 blockchain inventions.
  • How to obtain a summary of Arnold’s research findings.

You can view the video at this link. Steele has compiled links to other Amazon information obtained from Arnold at this link.

Kenny Toth, December 5, 2019

Why Is MiningLamp Getting Ink?

December 3, 2019

The question “Why is MiningLamp getting ink?” is an interesting one to some people. The firm was founded in 2014. The company was a product of bunsha practiced by Miaozhen Systems, a company engaged in advertising “analysis.” The company is funded by Tencent, China Renaissance, and Sequoia Capital China. The firm may have revenues in the hundreds of millions of dollars. Data about the influence of the Chinese government is not available to the DarkCyber team at this time. MiningLamp may have received as much as $290 million from its backers.


Companies want publicity to get sales leads, attract investors, create buzz to lure new hires, and become known to procurement professionals in government agencies.


We noted talk about MiningLamp at a couple of law enforcement and intelligence conferences. The company provides policeware and intelware to customers in China and elsewhere. You can read about the firm on its Web site at this link. (Be patient. The service seems to provide a high latency experience.) Product pages also seem to be missing in action.

Nevertheless, “Chinese Data Mining Firm MiningLamp, Now a National AI Champion, Began by Helping Police Solve Crimes” does not talk about a dearth of public information. The write up states that “MiningLamp’s business analytics tools are used by more than 200 companies in the Fortune 200.” That’s a lot of big companies embracing investigative software. Judging from the attendees at law enforcement and intelligence conferences, these big companies are finding out about a Chinese company somehow.

The news story states that “Like Palantir, this Chinese start up uses AI to help corporate clients convert huge volumes of data into actionable information.” Palantir is a big ticket item. Perhaps price is a factor or Fortune 200 companies want to rely on a business intelligence system operated by a company located outside the span of control of some government authorities.

The company has been named a Chinese champion. The article reveals:

Although not as well known as US equivalent Palantir Technologies, which reportedly contributed to America’s success in hunting down Osama bin Laden, MiningLamp’s data mining software is used to spot crime patterns, track drug dealers and prevent human trafficking.

DarkCyber thinks that any company which has 200 Fortune listed companies as customers is reasonably well known.

We learned:

“Cases are being resolved on our platforms every day” in more than 60 cities and regions in China, said founder and CEO Wu Minghui. “We can run fast analysis on potential drug dealers or major suspects, improving the overall case-solving efficiency several hundred times.”

Amazon Loses JEDI: Now What?

October 26, 2019

Friday (October 25, 2019) Amazon and the Bezos bulldozer drove into a granite erratic. The Department of Defense awarded the multi-year, multi-billion dollar contract for cloud services to Microsoft. “Microsoft Snags Hotly Contested $10 Billion Defense Contract, Beating Out Amazon” reported the collision between PowerPoint’s owner and the killing machine which has devastated retail.


CNBC reports:

If the Joint Enterprise Defense Infrastructure deal, known by the acronym JEDI, ends up being worth $10 billion, it would likely be a bigger deal to Microsoft than it would have been to Amazon. Microsoft does not disclose Azure revenue in dollar figures but it’s widely believed to have a smaller share of the market than Amazon, which received $9 billion in revenue from AWS in the third quarter.

The write up pointed out:

While Trump didn’t cite Amazon CEO Jeff Bezos by name at the time, the billionaire executive has been a constant source of frustration for the president. Bezos owns The Washington Post, which Trump regularly criticizes for its coverage of his administration. Trump also has gone after Amazon repeatedly on other fronts, such as claiming it does not pay its fair share of taxes and rips off the U.S. Post Office.

There are other twists and turns to the JEDI story, but I will leave it to you, gentle reader, to determine if the Oracle anti-Amazon campaign played a role.

There are some questions which I discussed with my DarkCyber team when we heard the news as a rather uneventful week in the technology world wound down. Let’s look at four of these and the “answers” my team floated as possibilities.

Question 1: Will this defeat alter Amazon’s strategy for its policeware and intelware business?

Answer 1: No. Since 2007, Amazon has been grinding forward in the manner of the Bezos bulldozer with its flywheel spinning and its electricity sparking. As big as $10 billion is, Amazon has invested significant time and resources in policeware and intelware inventions like DeepLens, software like SageMaker, and infrastructure designed to deliver information that many US government agencies will want and for which many of the more than 60 badge-and-gun entities in the US government will pay. The existing sales team may be juggled as former Microsoft government sales professional Teresa Carlson wrestles with the question, “What next?” Failure turns on a bright spotlight. The DoD is just one entity, albeit a deep pocket one, among many US government agencies needing cloud services. And there is always next year which begins October 1, 2020.

Question 2: Has Amazon tuned its cloud services and functions to the needs of the Department of Defense?

Answer 2: No. Amazon offers services which meet the needs of numerous government agencies at the federal as well as local jurisdictional levels. In fact, there is one US government agency which deals with more money than the DoD and which is a potential ATM for Amazon. The Bezos bulldozer drivers may be uniquely positioned to deliver cloud services and investigative tools with the potential payout to Amazon larger than the JEDI deal.

Amazon Policeware: Getting Visible in Spite of Amazon

October 9, 2019

An enterprising reporter included some information from my Amazon research. You can find these open source factoids in “Meet America’s Newest Military Giant: Amazon.” Like good recipients of Jeffrey Epstein love, the publication will enjoin you to pay to read the recycled version of my research. Hey, that’s capitalism in action.

The write up does veer from “military giant” into policeware, a term I coined to make clear that there are platforms, applications, and tools purpose-built to support law enforcement, analysts, and investigators.


© Stephen E Arnold, 2016

You may want to read the article and take a look at the information I have published in this blog and on YouTube and Vimeo. The search systems struggle to highlight this content, but that’s the way life is in the world of ad-supported search. Tip: To locate the information, use the search box on this Web site, or you can explore these short videos at these links:

October 30, 2018

November 6, 2018

November 13, 2018

November 20, 2018

Another peek at Amazon’s activities is provided in a side mirror attached to a speeding Chevrolet Volt. “Ring’s Police Partnerships Must End, Say More Than 30 Civil Rights Groups” is an “open letter.” That document, according to CNet, “urges local lawmakers to cancel all existing police deals with Amazon’s video doorbell company.”

Good luck with that.

The CNet write up adds:

Ring has more than 500 police partnerships across the US, and a coalition of civil rights groups are calling for local governments to cancel them all. On Tuesday, tech-focused nonprofit Fight For the Future published an open letter to elected officials raising concerns about Ring’s police partnerships and its impacts on privacy and surveillance.  The letter is signed by more than 30 civil rights groups, including the Center for Human Rights and Privacy, Color of Change and the Constitutional Alliance. Along with asking mayors and city councils to cancel existing Ring partnerships, the letter also asks for surveillance oversight ordinances to prevent police departments from making these deals in the future, and also requested members of Congress to investigate Ring’s practices.

Amazon Policeware: One Possible Output

October 1, 2019

Investigations focus on entities and timelines. The context includes the legal wrapper, procedures, impressions, and similar information usually resident in investigators and their colleagues.

Why gather data unless there is a payoff? The payoff from data in terms of Amazon’s policeware includes these upsides:

  • Data which informs new products and services, especially those signals for latent demand
  • Raw material for analytical processes such as those performed by superordinate Amazon Web Services
  • Outputs which have market magnetism; that is, the product is desirable and LE and intel customers want to buy it.

This illustration, which I have taken from my October 2, 2019, TechnoSecurity lecture and from my Amazon policeware webinar, makes three points:

First, raw data are acquired by Amazon. The sources are diverse and some are unique to Amazon; for example, individual and enterprise purchasing data.

Second, the AWS policeware platform performs normalization, indexing, and analysis of historic and real time data flows; for example, what books did an individual purchase and when.

Third, an output in the form of a profile or report about a person of interest.


© Stephen E Arnold 2019

I know the image is difficult to read. There are two ways to address this issue. You can attend my lectures at the San Antonio conference or you can sign up for my Amazon policeware webinar.

No Epstein supporters, fans, and acquaintances should express interest in my research. Sorry. I am old fashioned.

Stephen E Arnold, October 1, 2019

Amazon: The Surveillance Mesh Play

September 30, 2019

DarkCyber received a complaint about the small size of the image from my webinar about Amazon Policeware. There are two remedies for tiny images. You can attend my policeware lecture at the TechnoSecurity & Digital Forensics Conference in San Antonio on Wednesday, October 2. Qualified attendees can request a PDF of the image. Second, you can contact DarkCyber at benkent2020 at yahoo dot com and sign up for our LE, security, and intel personnel webinar.

Today, I want to provide several findings from our research related to Amazon Policeware. These are:

  • Amazon’s mesh network in the Sidewalk product provides a solution to blanketing a city with a data collection component. This wide field outdoor mesh network may fail. In the meantime, you may be able to locate your dog if it is wearing a Fetch.
  • Amazon’s Ring doorbell provides an anchor for fixed video feeds. The resolution is poor and the system is far from comprehensive, but the test mechanism is sufficiently compelling for several hundred police departments to show interest.
  • The supplementary data collection devices shown in the figure below feed into the AWS policeware platform. That platform performs a number of analytic functions. Cross correlation is one of these.


© Stephen E Arnold, 2019

So what?

In the US, Amazon is moving forward to put in place a next generation service which provides a new tool to enforcement authorities. The system delivers other benefits to Amazon as well.

DarkCyber identifies some parallels between the efforts the government of China is making and Amazon’s activities.

Will the Epstein friendly academic institution get this story straight? Probably not.

Stephen E Arnold, September 30, 2019

Amazon Policeware: The Path to IBM-Style Lock In on Steroids

September 27, 2019

Quite a bit of Amazon news has flowed through the DarkCyber system. The problem is that most of the information is oblivious to Amazon’s policeware initiative. DarkCyber’s research suggests that Amazon is building a surveillance system. One DarkCyber team member said, “Amazon is building what China has been working on for several years.” Is this DarkCyber researcher correct? Who knows?

I do want to provide a diagram from our Amazon webinar which puts Amazon’s activities into a context for enforcement. The scope of Amazon’s business strategy extends beyond local law enforcement and the Ring video doorbell activities, beyond the cloud services for several US government agencies, and beyond the company’s online businesses.

Amazon may be positioning itself to provide:

  • IRS-related services associated with tax investigations
  • Drug enforcement actions related to physicians who allegedly overprescribe or entities which obtain certain compounds using obfuscation methods
  • SEC-related services to determine entity interaction, expenditures, and related financial activities
  • Credit verification, including other financial analyses, for government and retail financial activities.

Other “extensions” are possible. What’s interesting is that few have noticed and even fewer pay much attention beyond hand waving about Alexa. There’s more than Alexa, which is a low level gateway service.

Here’s the diagram, which is copyrighted by Stephen E Arnold, operator of DarkCyber, and author of the forthcoming monograph, Dark Edge: Amazon’s Policeware Initiative.


© Stephen E Arnold, 2019.

How do you use this diagram? Just map Amazon’s most recent product announcements into the grid.

The DarkCyber Amazon policeware webinar walks through the tactics and the strategy for this “in plain sight” play. Analysts, journalists, policeware vendors paying Amazon to host their systems, and Microsoft-type outfits are oblivious to what is now the end game for a 12 year push by Amazon to make IBM-style lock in seem as quaint as a Model T Ford.

For those who recycle my information and claim it as your own creative output, why not be somewhat ethical and provide attribution? You know. Old-fashioned stuff like a footnote. Yep, that includes a real journalist who writes for the New York Times and the Epstein linked MIT publication, among others.

Stephen E Arnold, September 27, 2019
