Clearview: More Tradecraft Exposed

March 26, 2020

After years of dancing around the difference between brain dead products like enterprise search, content management, and predictive analytics, anyone can gain insight into the specialized software provided by generally low profile companies. Verint is publicly traded. Do you know what Verint does? Sure, look it up on Bing or Google.

I read with some discomfort “I Got My File From Clearview AI, and It Freaked Me Out.”

Here are some factoids from the write up. Are these true? DarkCyber assumes that everything the team sees on the Internet meets the highest standards of integrity, objectivity, and truthiness. DarkCyber’s comments are in brackets:

  1. “Someone really has been monitoring nearly everything you post to the public internet. And they genuinely are doing “something” with it. The someone is Clearview AI. And the something is this: building a detailed profile about you from the photos you post online, making it searchable using only your face, and then selling it to government agencies and police departments who use it to help track you, identify your face in a crowd, and investigate you — even if you’ve been accused of no crime.”
  2. “Clearview AI was founded in 2017. It’s the brainchild of Australian entrepreneur Hoan Ton-That and former political aide Richard Schwartz. For several years, Clearview essentially operated in the shadows.”
  3. “The Times, not usually an institution prone to hyperbole, wrote that Clearview could ‘end privacy as we know it.’” [This statement is a reference to a New York Times intelware article. The New York Times continues to hunt for real news that advances an agenda of “this stuff is terrible, horrible, unconstitutional, pro anything the NYT believes in, etc.”]
  4. “the company [Clearview] scrapes public images from the internet. These can come from news articles, public Facebook posts, social media profiles, or multiple other sources. Clearview has apparently slurped up more than 3 billion of these images.” [The images are those which are available on the Internet and possibly from other sources; for example, commercial content vendors.]
  5. “The images are then clustered together which allows the company to form a detailed, face-linked profile of nearly anyone who has published a picture of themselves online (or has had their face featured in a news story, a company website, a mug shot, or the like).” [This is called enrichment, context, or machine learning indexing and — heaven help DarkCyber — social graphs or semantic relationships. Jargon varies according to fashion trends.]
  6. “Clearview packages this database into an easy-to-query service (originally called Smartcheckr) and sells it to government agencies, police departments, and a handful of private companies….As of early 2020, the company had more than 2,200 customers using its service.” [DarkCyber wants to point out that law enforcement entities are strapped for cash, and many deals are little more than proofs-of-concept. Some departments cycle through policeware and intelware in order to know what the systems do versus what the marketing people say the systems do. Big difference? Yep, yep.]
  7. “Clearview’s clients can upload a photo of an unknown person to the system. This can be from a surveillance camera, an anonymous video posted online, or any other source.”
  8. “In a matter of seconds, Clearview locates the person in its database using only their face. It then provides their complete profile back to the client.”

Now let’s look at what the write up reported that seemed to DarkCyber to be edging closer to “real news.”

This is the report the author obtained:


The article reports that the individual who obtained this information from Clearview was surprised. DarkCyber noted this series of statements:

The depth and variety of data that Clearview has gathered on me is staggering. My profile contains, for example, a story published about me in my alma mater’s alumni magazine from 2012, and a follow-up article published a year later. It also includes a profile page from a Python coders’ meet up group that I had forgotten I belonged to, as well as a wide variety of posts from a personal blog my wife and I started just after getting married. The profile contains the URL of my Facebook page, as well as the names of several people with connections to me, including my faculty advisor and a family member (I have redacted their information and images in red prior to publishing my profile here).

The write up includes commentary on the service, its threats to individual privacy, and similar sentiments.

DarkCyber’s observations include:

  • Perhaps universities could include information about applications of math, statistics, and machine learning in their business and other courses? At a lecture DarkCyber gave at the University of Louisville in January 2019, cluelessness among students and faculty was the principal takeaway for the DarkCyber team.
  • Clearview’s technology is not unique, nor is it competitive with the integrated systems available from other specialized software vendors, based on information available to DarkCyber.
  • The summary of what Clearview does captures information that would have been considered classified and may still be considered classified in some countries.
  • Clearview does not appear to have video capability like other vendors with richer, more sophisticated technology.

Why did DarkCyber experience discomfort? Some information is not — at this time or in the present environment — suitable for wide dissemination. A good actor with technical expertise can become a bad actor because the systems and methods are presented in sufficient detail to enable certain activities. Knowledge is power, but knowledge in the hands of certain individuals can yield unexpected consequences. DarkCyber is old fashioned and plans to stay that way.

Stephen E Arnold, March 26, 2020

Wolfram Mathematica

March 19, 2020

DarkCyber noted that “In Less Than a Year, So Much New: Launching Version 12.1 of Wolfram Language & Mathematica” contains highly suggestive information. Yes, this is a mathy program. The innovations are significant for analysts and some government professionals. To cite one example:

I’ve been recording hundreds of hours of video in connection with a new project I’m working on. So I decided to try our new capabilities on it. It’s spectacular! I could take a 4-hour video, and immediately extract a bunch of sample frames from it, and then—yes, in a few hours of CPU time—“summarize the whole video”, using SpeechRecognize to do speech-to-text on everything that was said and then generating a word cloud…

DarkCyber reacts positively to other additions and enhancements to the Mathematica “system.” Version 12.1 will make it easier to develop specific functions for policeware and intelware use cases.

Remarkable because the “system” can geo-everything. That’s important in many situations.

Stephen E Arnold, March 19, 2020

Israel and Mobile Phone Data: Some Hypotheticals

March 19, 2020

DarkCyber spotted a story in the New York Times: “Israel Looks to Repurpose a Trove of Cell Phone Data.” The story appeared in the dead tree edition on March 17, 2020, and you can access the online version of the write up at this link.

The write up reports:

Prime Minister Benjamin Netanyahu of Israel authorized the country’s internal security agency to tap into a vast, previously undisclosed trove of cell phone data to retrace the movements of people who have contracted the corona virus and identify others who should be quarantined because their paths crossed.

Okay, cell phone data. Track people. Paths crossed. So what?

Apparently not much.

The Gray Lady does the handwaving about privacy and the fragility of democracy in Israel. There’s a quote about the need for oversight when certain specialized data are retained and then made available for analysis. Standard journalism stuff.

DarkCyber’s team talked about the write up and what the real journalists left out of the story. Remember. DarkCyber operates from a hollow in rural Kentucky and knows zero about Israel’s data collection realities. Nevertheless, my team was able to identify some interesting use cases.

Let’s look at a couple and conclude with a handful of observations.

First, the idea of retaining cell phone data is not exactly a new one. What if these data can be extracted using an identifier for a person of interest? What if a time-series query could extract the geolocation data for each movement of the person of interest captured by a cell tower? What if this path could be displayed on a map? Here’s a dummy example of what the plot for a single person of interest might look like. Please, note these graphics are examples selected from open sources. Examples are not related to a single investigation or vendor. These are for illustrative purposes only.


Source: Standard mobile phone tracking within a geofence. Map with blue lines showing a person’s path. SPIE at

Useful indeed.
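
The time-series query in the first hypothetical can be sketched in a few lines of Python. This is a minimal sketch under loud assumptions: the `Ping` record, the `path_for` helper, the device labels, and the coordinates are all invented for illustration and do not describe any agency's or vendor's actual data model.

```python
from datetime import datetime
from typing import NamedTuple


class Ping(NamedTuple):
    """One hypothetical location record; the field names are assumptions."""
    device_id: str
    seen_at: datetime
    lat: float
    lon: float


def path_for(pings, device_id, start, end):
    """Return one device's pings inside [start, end], oldest first."""
    hits = [p for p in pings
            if p.device_id == device_id and start <= p.seen_at <= end]
    return sorted(hits, key=lambda p: p.seen_at)


# Made-up sample data, deliberately out of order.
PINGS = [
    Ping("device-A", datetime(2020, 3, 17, 9, 30), 38.25, -85.76),
    Ping("device-B", datetime(2020, 3, 17, 9, 0), 38.20, -85.70),
    Ping("device-A", datetime(2020, 3, 17, 9, 0), 38.24, -85.75),
    Ping("device-A", datetime(2020, 3, 18, 9, 0), 38.30, -85.80),
]

route = path_for(PINGS, "device-A",
                 datetime(2020, 3, 17), datetime(2020, 3, 17, 23, 59))
for p in route:
    print(p.seen_at.isoformat(), p.lat, p.lon)
```

The ordered list of coordinates is what a mapping tool would need in order to draw a path like the one pictured.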

Second, what if the intersection of the paths of two or more individuals can be plotted? Here’s a simulation of such a path intersection:


Source: Map showing the location of a person’s mobile phone over a period of time. Tyler Bell at

Would these data provide a way to identify an individual with a mobile phone who was in “contact” with a person of interest? Would the authorities be able to perform additional analyses to determine who is in either party’s social network?
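
A path intersection of this kind reduces to a simple test: were two devices seen at the same place within a short window of time? The sketch below assumes records of the form (device, tower, time); the `crossings` function, the tower labels, and the 15 minute window are hypothetical choices made for illustration, not a description of any real system.

```python
from datetime import datetime, timedelta


def crossings(pings, device_a, device_b, max_gap=timedelta(minutes=15)):
    """List (tower, time) pairs where both devices used the same tower
    within max_gap of each other. Brute force, fine for a toy data set."""
    a = [p for p in pings if p[0] == device_a]
    b = [p for p in pings if p[0] == device_b]
    hits = []
    for _, tower_a, t_a in a:
        for _, tower_b, t_b in b:
            if tower_a == tower_b and abs(t_a - t_b) <= max_gap:
                hits.append((tower_a, min(t_a, t_b)))
    return sorted(hits, key=lambda h: h[1])


# Made-up records: (device, tower, time).
PINGS = [
    ("device-A", "T1", datetime(2020, 3, 17, 9, 0)),
    ("device-A", "T2", datetime(2020, 3, 17, 9, 30)),
    ("device-A", "T3", datetime(2020, 3, 17, 10, 0)),
    ("device-B", "T9", datetime(2020, 3, 17, 9, 0)),
    ("device-B", "T2", datetime(2020, 3, 17, 9, 40)),
    ("device-B", "T3", datetime(2020, 3, 17, 12, 0)),
]

for tower, when in crossings(PINGS, "device-A", "device-B"):
    print(tower, when.isoformat())
```

Sharing a tower is a coarse proxy for proximity, so a hit here is a lead to examine, not proof that two people met.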

Third, could these relationship data be mined so that connections can be further explored?

Source:  Diagram of people who have crossed paths visualized via Analyst Notebook functions.

Can these data be arrayed on a timeline? Can the routes be converted into an animation that shows a particular person of interest’s movements at a specific window of time?


Source: Vertical dots diagram from Recorded Future showing events on a timeline.

These hypothetical displays of data derived from cross correlations, geotagging, and timeline generation based on date stamps seem feasible. If earnest individuals in rural Kentucky can see the value of these “secret” data disclosed in the New York Times’ article, why didn’t the journalist and the others who presumably read the story?

What’s interesting is that systems, methods, and tools clearly disclosed in open source information are overlooked, ignored, or just not understood.

Now the big question: Do other countries have these “secret” troves of data?

DarkCyber does not know; however, it seems possible. Log files are a useful function of data processes. Data exhaust may have value.

Stephen E Arnold, March 19, 2020

Medical Surveillance: Numerous Applications for Government Entities and Entrepreneurs

March 16, 2020

With the Corona virus capturing headlines and disrupting routines, how can smart software monitoring data help with the current problem?

DarkCyber assumes that government health professionals would want to make use of technology that reduced a Corona disruption. Enforcement professionals would understand that monitoring, alerting, and identifying functions could assist in spotting issues; for example, in a particular region.

What’s interesting is that the application of intelware systems and methods to health issues is likely to become a robust business. However, despite the effective application of established techniques, identifying signals in a stream of data is an extension of innovations reaching back to i2 Analyst Notebook and other sensemaking systems in wide use in many countries’ enforcement and intelligence agencies.

What’s different is the keen attention these monitoring, alerting, and identifying systems are attracting.

Let’s take one example: BlueDot, a company operating from Canada and founded by an infectious disease physician, Dr. Kamran Khan. This company was one of the first firms to highlight the threat posed by the Coronavirus. According to Diginomica, BlueDot “alerted its private sector and government clients about a cluster of unusual pneumonia cases happening around a market in Wuhan, China.”


BlueDot, founded in 2013, combined expertise in infectious disease, artificial intelligence, analytics, and flows of open source and specialized information. “How Canadian AI start-up BlueDot Spotted Coronavirus before Anyone Else Had a Clue” explains what the company did to sound the alarm:

The BlueDot engine gathers data on over 150 diseases and syndromes around the world searching every 15 minutes, 24 hours a day. This includes official data from organizations like the Center for Disease Control or the World Health Organization. But, the system also counts on less structured information. Much of BlueDot’s predictive ability comes from data it collects outside official health care sources including, for example, the worldwide movements of more than four billion travelers on commercial flights every year; human, animal and insect population data; climate data from satellites; and local information from journalists and healthcare workers, pouring through 100,000 online articles each day spanning 65 languages. BlueDot’s specialists manually classified the data, developed a taxonomy so relevant keywords could be scanned efficiently, and then applied machine learning and natural language processing to train the system. As a result, it says, only a handful of cases are flagged for human experts to analyze. BlueDot sends out regular alerts to health care, government, business, and public health clients. The alerts provide brief synopses of anomalous disease outbreaks that its AI engine has discovered and the risks they may pose.

DarkCyber interprets BlueDot’s pinpointing of the Corona virus as an important achievement. More importantly, DarkCyber sees BlueDot’s system as an example of innovators replicating the systems, methods, procedures, and outputs from intelware and policeware systems.
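
The taxonomy-and-flag step in the quoted description can be illustrated with a toy filter. This is a sketch under stated assumptions, not BlueDot's method: plain keyword matching stands in for the machine learning and natural language processing the article mentions, and the `TAXONOMY` terms, the `min_hits` threshold, and the sample articles are all invented.

```python
import re
from collections import Counter

# Hypothetical taxonomy: keyword -> category. Real taxonomies are far larger.
TAXONOMY = {
    "pneumonia": "respiratory",
    "fever": "general",
    "cluster": "outbreak-signal",
    "outbreak": "outbreak-signal",
}


def score(text, taxonomy):
    """Count taxonomy categories triggered by the words in one article."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(taxonomy[w] for w in words if w in taxonomy)


def flag(articles, taxonomy, min_hits=2):
    """Keep only articles with at least min_hits keyword matches."""
    flagged = []
    for title, text in articles:
        hits = score(text, taxonomy)
        if sum(hits.values()) >= min_hits:
            flagged.append((title, dict(hits)))
    return flagged


ARTICLES = [
    ("Market report", "Officials describe a cluster of unusual pneumonia cases near a market."),
    ("Sports", "The home team won despite fever pitch excitement."),
    ("Weather", "Rain expected."),
]

for title, hits in flag(ARTICLES, TAXONOMY):
    print(title, hits)
```

The sports item shows why a threshold matters: a single stray keyword is noise, and the point of the filter is that only a handful of items reach a human analyst.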

Independent thinkers arrive at a practical workflow to convert raw data into high-value insights. BlueDot is a company that points the way to the future of deriving actionable information from a range of content.

Some vendors of specialized software work hard to keep their systems and methods confidential and in some cases secret. Now a person interested in how some specialized software and service providers assist government agencies, intelligence professionals, and security experts can read about BlueDot in open source articles like the one cited in this blog post or work through the information on the BlueDot Web site. The company wants to hire a surveillance analyst. Click here for information.

Net net: BlueDot provides a template for innovators wanting to apply systems and methods that once were classified or confidential to commercial problems. Business intelligence may become more like traditional intelligence more quickly than some anticipated.

Stephen E Arnold, March 16, 2020

Banjo: A How To for Procedures Once Kept Secret

March 13, 2020

DarkCyber wrote about BlueDot and its making reasonably clear what steps it takes to derive actionable intelligence from open source and some other types of data. Ten years ago, the processes implemented by BlueDot would have been shrouded in secrecy.

From Secrets to Commercial Systems

Secret and classified information seems to find its way into social media and the mainstream media. DarkCyber noted another example of a company utilizing some interesting methods written up in a free online publication.

DarkCyber can visualize old-school companies depending on sales to law enforcement and the intelligence community asking themselves, “What’s going on? How are commercial firms getting this know how? Why are how to and do it yourself travel guides to intelligence methods becoming so darned public?”

It puzzles DarkCyber as well.

Let’s take a look at the revelations in “Surveillance Firm Banjo Used a Secret Company and Fake Apps to Scrape Social Media.” The write up explains:

  • A company called Pink Unicorn Labs created apps which obtained information from users. Users did not know their data were gathered, filtered, and cross correlated.
  • Banjo, an artificial intelligence firm that works with police, used a shadow company to create an array of Android and iOS apps that looked innocuous but were specifically designed to secretly scrape social media. The developer of the apps was Pink Unicorn. Banjo CEO Damien Patton created Pink Unicorn.
  • Why create apps that seemed to do one thing while performing data inhalation? Dataminr received an investment from Twitter. Dataminr has access to the Twitter fire hose. Banjo, the write up says, “did not have that sort of data access.” The fix? Create apps that sucked data.
  • The apps obtained information from Facebook, Twitter, Instagram, Russian social media app VK, FourSquare, Google Plus, and Chinese social network Sina Weibo.
  • The article points out: “Once users logged into the innocent looking apps via a social network OAuth provider, Banjo saved the login credentials, according to two former employees and an expert analysis of the apps performed by Kasra Rahjerdi, who has been an Android developer since the original Android project was launched. Banjo then scraped social media content.”
  • The write up explains that Banjo, via a deal with Utah, has access to the “state’s traffic, CCTV, and public safety cameras. Banjo promises to combine that input with a range of other data such as satellites and social media posts to create a system that it claims alerts law enforcement of crimes or events in real-time.”

Why social media? On the surface and to most parents and casual users of Facebook, Twitter, and YouTube, there are quite a few cat posts. But via the magic of math, an analyst or a script can look for data which fills in missing information. The idea is to create a record of a person, leave blanks where desirable information is not yet plugged in, and then rely on software to spot the missing item. How is this accomplished? The idea is simple. One known fact appears in the profile and that fact appears in another unrelated item of content. Then the correlated item of content is scanned by a script and any information missing from the profile is plugged in. Using this method and content from different sources, a clever system can compile a dossier on an entity. Open source information yields numerous gems; for example, a cute name applied to a boy friend might become part of a person of interest’s Dark Web handle. Phone numbers, geographic information, friends, and links to other interesting content surface. Scripts work through available data. Data can be obtained in many ways. The methods are those which were shrouded in secrecy before the Internet started publishing essays revealing what some have called “tradecraft.”
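
The fill-in-the-blanks idea described above is, at bottom, a small record linkage routine. The sketch below is a toy under stated assumptions: the `enrich` function, the field names, and the obviously fake values are invented for illustration, and matching on a single shared value is far cruder than what a production system would tolerate.

```python
def enrich(profile, items):
    """Fill blank (None) fields in a profile from items that share a known value.

    Repeats until nothing changes, because a newly filled field (say, a
    handle) may link the profile to further items.
    """
    filled = dict(profile)
    changed = True
    while changed:
        changed = False
        known = {(k, v) for k, v in filled.items() if v is not None}
        for item in items:
            if not known & set(item.items()):
                continue  # no field/value pair in common with the profile
            for key, value in item.items():
                if key in filled and filled[key] is None:
                    filled[key] = value
                    changed = True
    return filled


# Entirely made-up data.
PROFILE = {"name": "Example Person", "handle": None, "city": None, "phone": None}
ITEMS = [
    {"name": "Example Person", "handle": "ex_person"},
    {"handle": "ex_person", "city": "Springfield"},
    {"handle": "someone_else", "phone": "555-0100"},
]

print(enrich(PROFILE, ITEMS))
```

The name links to the handle, and the handle then links to the city; the phone number stays blank because nothing ties that item to the profile. A common name or a recycled handle would produce a false match just as readily, which is one reason analysts treat such output as leads.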

Net Net

Banjo troubles DarkCyber on a number of levels:

  1. Secrecy has significant benefits. Secrets, once let loose, have interesting consequences.
  2. Users are unaware of the risks apps pose. Cluelessness is in some cases problematic.
  3. The “now” world looks more like an intelligence agency than a social construct.

Stephen E Arnold, March 13, 2020

Sintelix Adds Unstructured Text to IBM i2 Solutions

March 12, 2020

DarkCyber noted that IBM is promoting the Sintelix text and data analytics software. The tie up makes it easier for i2 users to make sense of unstructured text. Sintelix does not compete with IBM. Sintelix has filled a gap in IBM’s presentation of the i2 solutions. For more information, navigate to this IBM page. No pricing details. Sintelix’s headquarters are in Australia.

Stephen E Arnold, March 12, 2020

Africa: Booming Intelware and Policeware Markets?

February 20, 2020

DarkCyber has a difficult time determining what information is on the money and what information is on the floor of the data casino. We read “Inside Africa’s Increasingly Lucrative Surveillance Market.” The write up is chock full of details. Some of the allegedly accurate information was interesting.

Here’s a sampling of factoids to evaluate:

Market size, but it is not clear what “market” means, just Africa, the world, or developed countries: The cybersecurity market was worth $118.78bn in 2018. By 2024, this figure is expected to hit $267.73bn.

Name of Gabonese Republic’s enforcement unit: SILAM which is allegedly run by French national Jean-Charles Solon. The write up states: “Solon previously worked for the General Directorate for External Security (Direction générale de la sécurité extérieure – DGSE), France’s intelligence agency.” Allegedly Solon is familiar with the ins and outs of wire tapping. The write up asserts without providing a specific source: “According to our sources, Solon is well equipped and handles everything from wiretap transcripts, text message and WhatsApp conversation interceptions, and email and social media surveillance.” Solon is likely to find the write up in This Is GCN worth some special attention, but that’s just DarkCyber’s hunch.

Entities (governmental and commercial) linked to the Gabonese Republic include: Amesys and its Cerebro tool, SDECE/DGSE, AMES, Nexa Technologies, and Suneris Solutions (Thales).

Current market leaders: The write up reports, “Ercom and Suneris Solutions have a leading position in the African market, especially in the sub-Saharan region.” These two companies are owned by Thales.

What sells and where to buy: The write up notes, “‘Clients want to buy something that has a proven track record. They’re not looking for an experimental gadget.’ For Africa, the two must-see events are Milipol Paris, held in November, and ISS World Middle East and Africa, held in March in Dubai.”

Israeli companies selling or trying to sell in Africa: The write up identifies these firms as eyeing the African markets: Thales (includes Ercom and Suneris Solutions), Mer Group and its unit Athena GS3 (Congo, Guinea, Nigeria and DRC), Verint Systems and Elbit Systems (South Africa, Angola, Ethiopia, Nigeria, etc.), AD Consultants, and NSO Group. The write up asserts, “The Israelis are everywhere. They even managed to equip Saudi Arabia! It’s pretty much impossible to bypass them.”

Other companies trying to sell to African markets include: BAE Systems, Gamma Group, Trovicor (now a unit of Nexa), Hacking Team, VasTech, Protei (a Russian firm), Huawei, and ZTE Corporation (described in the article as a compatriot of Huawei).
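
One way to evaluate the market-size factoid quoted above ($118.78bn in 2018, $267.73bn by 2024) is to work out the annual growth rate it implies. This is a back-of-the-envelope sketch; it assumes six years of steady compounding and that the two figures measure the same market, neither of which the write up confirms.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate implied by two values `years` apart."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures as quoted in the write up, in billions of US dollars.
rate = cagr(118.78, 267.73, 2024 - 2018)
print(f"Implied annual growth: {rate:.1%}")
```

The result is about 14.5 percent a year, a brisk pace that makes the fuzzy definition of “market” matter all the more.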

DarkCyber will leave it to you, gentle reader, to figure out if the write up in This is GCN is fact or fluff. What is known is that most of the named entities in this write up work overtime to avoid big time news coverage, traditional marketing, and noisy public relations. DarkCyber believes that firms providing specialized services should remain low profile.

In closing, if you want information about Senegalese intelligence activities, you may find this thesis by Muhammad Bathily helpful. Its title is “Reform of Senegalese Gendarmerie Intelligence Services.” You can locate the document at this url. (Verified at 10:49 am US Eastern time, 2 20 20.)

Stephen E Arnold, February 20, 2020

NSO Does Not Play the Facebook Game

January 16, 2020

We spotted a write up in Techdirt, an interesting publication indeed. The story is “Malware Marketer NSO Group Looks Like It’s Blowing Off Facebook’s Lawsuit.”

The title suggested to some of the DarkCyber team that NSO is a not so good company. It is a malware marketer. Furthermore, the company is “blowing off” Facebook’s lawsuit.

The Facebook case asserts that the NSO Group exploited WhatsApp. The goal? Compromise an actor’s mobile device via software. This approach is known as an attack vector created by Facebook.

NSO, as DarkCyber has noted in this blog and our videos, has been generating media attention. Specialized software companies providing technology to government entities generally prefer to maintain a lower profile.

What’s the status of Facebook’s legal action? Techdirt states:

Facebook’s lawsuit is going nowhere fast. While it’s not uncommon for there to be a delay between the filing of a complaint and the defendant’s response, NSO hasn’t filed anything — not even a notice of appearance from its corporate counsel — since the filing of the suit.

NSO is not a US company. It is owned by a Japanese firm and most of the technical operations are still under the umbrella of Israeli citizens.

DarkCyber thinks that Facebook’s challenge to NSO was an interesting action.

First, NSO responds to its customers’ needs. This means that outfits like Facebook, which often drag their running shoe shod feet when it comes to dealing with government requests for data, invite attention from specialist firms. Look in the mirror, Facebookers.

Second, Facebook wants to encrypt everything, create its own walled garden, and operate like a country. Okay, Facebookers, that attitude invites some special attention. Look in the mirror, Facebookers.

Third, the challenge to NSO strikes DarkCyber like a New Age slow cooker calling a microwave an unnecessary luxury. Nope. Look in the mirror, Facebookers, or in this case, in the reflection in the slow cooker’s aluminum skin.

Net net: Facebook may want to think a bit harder about the resources available to specialist software firms. Why? Nothing special, of course.

Stephen E Arnold, January 16, 2020

Belated Recognition: Barn Burned, Intelligence Costco Operating

December 18, 2019

Amnesty International has described the “Architecture of Surveillance.” Quick out of the gate?

Concerns about privacy and the ways in which large tech companies use and profit off user data have been more and more in the news lately. A recent report by Amnesty International goes so far as to say Facebook and Google, in particular, maintain a “surveillance-based business model.” Common Dreams discusses the report in its article, “Unprecedented ‘Architecture of Surveillance’ Created by Facebook and Google Poses Grave Human Rights Threat: Report.” Writer Andrea Germanos summarizes:

“With Facebook controlling not only its eponymous social media platform but also WhatsApp, Messenger, and Instagram, and Google parent company Alphabet in control of YouTube and the Android mobile operating system as well as the search engine, the companies ‘control the primary channels that people rely on to engage with the internet.’ In fact, the report continues, the two companies control ‘an architecture of surveillance that has no basis for comparison in human history.’ … The companies hoover up user data—as well as metadata like email recipients—and ‘they are using that data to infer and create new information about us,’ relying in part on artificial intelligence (AI).The report says that ‘as a default Google stores search history across all of an individual’s devices, information on every app and extension they use, and all of their YouTube history, while Facebook collects data about people even if they don’t have a Facebook account.’ Smart phones also offer the companies a ‘rich source of data,’ but the reach of surveillance doesn’t stop there.”

In fact, the reach now extends into homes via AI assistants like Alexa and devices connected to the internet of things. It also extends through public spaces courtesy of smart city implementations. All of this has crept upon us gradually and, largely, with the full cooperation of the subjects being surveilled (a.k.a. “users”), whether they fully understood what they were signing up for or not. The connections and conclusions algorithms can draw from all this information are mind-boggling even to someone who writes about data and AI for a living. See the article for a more in-depth discussion of the possibilities and repercussions.

Because the big tech companies are not going to stop these lucrative practices on their own, Amnesty International insists governments must step in. Companies must stop requiring users to surrender all rights to their data in order to use their services, for example, and the right to not be tracked must be enshrined into law. Transparency is also to be required, and companies mustn’t be allowed to lobby for weakened protections. Society has gone so far down the digital road that opting out of an online existence is simply not a workable option for most—that’s just not how it works anymore. But will it be possible to hold the big techs’ feet to the fire, or have they become too powerful?

Cynthia Murrell, December 18, 2019

This Snooping Stuff

December 14, 2019

The Economist’s story “Offering Software for Snooping to Governments Is a Booming Business” sounds good. The article is locked behind a paywall so you will have to sign up to read the quite British analysis. There are some interesting comments zipping around about the article. For example, a useful thread appears at this link.

Several observations struck me as informative; for example:

  • The Economist does not mention Cisco. This is important because Cisco has an “intelligence” capability with some useful connections to innovators in other countries.
  • Palantir, a recipient of another US government contract, is not mentioned in the write up. For information about this new Palantir project, navigate to “Palantir Wins New Pentagon Deal With $111 Million From the Army.” This is paywalled as well.
  • There is even a reference to surveillance technology delivering a benefit.

Perhaps those interested in surveillance software will find useful the interview Robert Steele, a former CIA professional, conducted with me. You can find that information at this link.

Perhaps the Economist will revisit this topic and move beyond NSO Group and colloquial language like snooping?

Stephen E Arnold
