Academics Can Predict Crime: What about Close Enough for Horseshoes Accuracy?

July 6, 2022

I have no phat phaux phrench bulldog in this upcoming academic free-for-all. I read “Algorithm Predicts Crime a Week in Advance, But Reveals Bias in Police Response.” Yellow lights flash.

The article is a summary of a longer research paper published by wizards at the University of Chicago, an outstanding institution located in a safe, well-lit, and community-oriented area of Chicago. Home of the Bears and once the literal stomping grounds of the P Stone Nation.  (And, Yes, I am intentionally leaving part of the gang’s name out of my reference. Feel free to use the full gang name yourself.)

The write up says:

Data and social scientists from the University of Chicago have developed a new algorithm that forecasts crime by learning patterns in time and geographic locations from public data on violent and property crimes. The model can predict future crimes one week in advance with about 90% accuracy.

Predicting crime a week before the incident or incidents sounds like an application of predictive analytics. I think there was an outfit which started at Indiana University which came up with something similar. That system attracted some attention and some skepticism.

But humans are curious and applying mathematical recipes to available data is for some an interesting way to pursue grants, publicity, and maybe some start up funding.

But 90 percent. That begs the question, “What about that other 10 percent?” How low does the model go for acceptable outputs? Maybe 60 percent confidence? Maybe lower?

The write up continues:

Previous efforts at crime prediction often use an epidemic or seismic approach, where crime is depicted as emerging in “hotspots” that spread to surrounding areas. These tools miss out on the complex social environment of cities, however, and don’t consider the relationship between crime and the effects of police enforcement.

I know I have mentioned Banjo (now SafeX AI) and the firm’s patents. Some of these patent documents provide useful summaries of some of the algorithms used in predictive models. What’s strikes me as  important about math-centric outputs is that methods are useful — up to a point. I have a canned lecture which identifies the 10 most used mathy methods and identifies how the data sets going in can be poisoned by an intentional actor. The culprit can be smart software generating data in the manner of AI synthetic data systems or by humans working for a government funded entity in St. Petersburg, Russia.

However, there have been a few high hurdles predictive systems have to jump over in a clean, fluid manner; for instance:

  • Identifying and filtering certain data. Bad data can have a significant impact of the outputs. My recollection is that analysis of a predictive system in California revealed wide variation in the collection of data and the consistency of the data from both humans and automated sources
  • Refining actionable outputs. Some of these outputs are often wide of the mark. This means that scarce resources may be deployed on a wild goose chase or investigation of actors who are not “bad” or involved in an incident
  • Real time not correlating with the past. Numerous contextual issues arise in real time, and predictive systems operate in what I call a time disconnected mode. For those on the pointy end of the stick, this time variance can create a situation in which the predictive outputs are not just a few degrees off center, they are orbiting around a beach club in Bermuda.

If you want to read the entire academic “we have cracked this problem” article, navigate to this link. You will have to pay to read this remarkable article.

Stephen E Arnold, July 6, 2022

An Analyst Wrestles with the Palantir Realities

May 23, 2022

Palantir Technologies in my world view is a services and software company positioned as a provider of intelware. Intelware means software and services which allow users to extract high-value information from text, numeric, and possibly image and video data.

Palantir, founded in 2003, has been influenced from its inception by precursor software like the original i2 Ltd. Analyst Notebook and BAE Systems Detica. Both of these systems allowed user to intake “content”, enter the names of people or things, and display the outputs so that the higher-value facts were presented in a useful way; for example, a chart or a relationship graph.

The US government works to learn about new and potentially useful software and systems. Not surprisingly, a government agency showed interest in Palantir’s software when the entrepreneurs involved in the company started describing the Palantir features and functions. Appreciate that in its early years almost two decades ago, the presentations and demonstrations captured what I call “to be” systems; that is, at some point in the future, Palantir’s system and software would be everything that Analyst Notebook, Detica, and the other intelware vendors could offer. The pitch is compelling.

Palantir, now almost two decades old, is a publicly traded company, and it is working overtime to move beyond sales to governments in the US and elsewhere. One of the characteristics of selling intelware to non-governmental organizations is that the capabilities of the system and its use by government clients are often disconcerting to a financial institution, a big hospital chain, or consulting firm focused on real estate.

Furthermore, intelware systems require data. Some data can be easily imported into a system like Palantir’s; for example, plain ASCII text and Excel spreadsheets. Other data are in a format which must be transformed so that Palantir can import the information. Other data present challenges like converting an image with a date and time stamp into an indexed content object. That indexing, to be helpful and to reduce the likelihood of errors, has to be accurate. Some non-text data must be enriched. French content processing experts refer to this enrichment as “fertilization.”

The write up “Palantir: Complete Disaster” includes this statement:

We think there are three possible courses of action in the disaster that has been Palantir, all of which are correct.

Here are the three “courses of action”:

  1. Don’t buy shares in Palantir.
  2. Buy shares, maybe short the stock.
  3. Buy shares and ride out the downturn.

Each of these options ignore two issues. The first is why Palantir is not closing deals and showing a profit. The second is why an intelware company is not able to amp up its sales to government agencies in the US, Western Europe, and selected government agencies elsewhere.

My view is that Palantir is a tough sell for these reasons:

  1. To land a deal, the prospect has to know what the payoff from using the Gotham / Foundry system is. “Intelligence” is a hot concept, but it is a tough sell unless there is a “champion” inside the prospect’s organization to grease the skids.
  2. Competitors offer comparable products for as little as $5,000 per month and some of these competitors bundle third party data which can be fused with the licensee’s data with minimal fiddling with filters and file conversions.
  3. Newer systems are easier to use, include automated workflows which speed analysts, investigators, and and researchers work.

The slow sales of Palantir follow the same type of curve that sales of Autonomy, Fast Search & Software, and many other “information” or “intelligence” focused products have. The initial sales are from government agencies which want better mouse traps. When the intelware does not deliver markedly significant payoffs, the licensees keep looking for better, faster, and cheaper options.

Will Palantir be able to generate a profit and deliver organic growth?

If the trajectory of precursor companies is the path Palantir is on, the answer is, “No.”

Stephen E Arnold, May 23, 2022

Open Source: Dietary Insights

May 5, 2022

One of the more benign news briefs about Russia these days concerns the eating habits of the country’s secret police. The Verge explains how delivery apps revealed Russian law enforcement’s food preferences: “Data Leak From Russian Delivery App Shows Dining Habits Of The Secret Police.” A massive data leak from Yandex Food, a large food delivery service in Russia, contained names, addresses, phone numbers, and delivery instructions related to the secret police.

Yandex Food is a subsidiary of the Russian search engine of the same name. The data leak occurred on March 1 and Yandex blamed it on the bad actions of one of its employees. The leak did not include users’ login information. The Roskomnadzor, the Russian government agency responsible for mass media, threatened Yandex with a 100,000 ruble fine and it also blocked a map containing citizen and secret police data.

Bellingcat researchers were investigating leads on the poisoning of Alexey Navalny, the Russian opposition leader. They searched the Yandex Food database collected from a prior investigation and discovered a person who was in contact with Russia’s Federal Security Service (FSB) to plan Navalny’s poisoning. The individual used his work email to register with Yandex Food. They also searched for phone numbers linked to Russia’s Main Intelligence Directorate (GRU). Bellingcat found interesting information in the leak:

“Bellingcat uncovered some valuable information by searching the database for specific addresses as well. When researchers looked for the GRU headquarters in Moscow, they found just four results — a potential sign that workers just don’t use the delivery app, or opt to order from restaurants within walking distance instead. When Bellingcat searched for FSB’s Special Operation Center in a Moscow suburb, however, it yielded 20 results. Several results contained interesting delivery instructions, warning drivers that the delivery location is a military base. One user told their driver “Go up to the three boom barriers near the blue booth and call. After the stop for bus 110 up to the end,” while another said ‘Closed territory. Go up to the checkpoint. Call [number] ten minutes before you arrive!’”

The most scandalous information leaked from the Yandex Food breach was information about Putin’s former mistress and their “suspected daughter.”

While it is hilarious to read about Russian law enforcement’s eating habits, it is alarming when the situation is applied to the United States. Imagine all of the information DoorDash, Grubhub, Uber Eats, and other delivery services collect on customers. There was a DoorDash data leak in 2019 that affected 4.9 million people and it was much larger than the Yandex Food leak.

Whitney Grace, May 5, 2022

NSO Group Knock On: More Attention Directed at Voyager Labs?

April 12, 2022

Not many people know about Voyager Labs, its different businesses, or its work for some government entities. From my point of view, that’s how intelware and policeware vendors should conduct themselves. Since the NSO Group’s missteps have fired up everyone from big newspaper journalists to college professors, the once low profile world of specialized software and services has come to center stage. Unfortunately most of the firms providing these once secret specialized functions are, unlike Tallulah Bankhead, ill prepared for the rigors of questions about chain smoking and a sporty life style. Israeli companies in the specialized software and services business are definitely not equipped for criticism, exposure, questioning by non military types. A degree in journalism or law is interesting, but it is the camaraderie of a military unit which is important. To be fair, this “certain blindness” can be fatal. Will NSO Group be able to survive? I don’t know. What I do know is that anyone in the intelware or policeware game has to be darned careful. The steely gaze, the hardened demeanor, and the “we know more than you do” does not play well with an intrepid reporter investigating the cozy world of secretive conferences, briefings at government hoe downs, or probing into private companies which amass user data from third-party sources for reselling to government agencies hither and yon.

Change happened.

I read “On the Internet, No One Knows You’re a Cop.” The author of the article is Albert Fox-Cahn, the founder and director of STOP. Guess what the acronym means? Give up. The answer is: The Surveillance Technology Oversight Project.

Where does this outfit hang its baseball cap with a faded New York Yankees’ emblem? Give up. The New York University Urban Justice Center. Mr. Fox-Cahn is legal type, and he has some helpers; for example, fledgling legal eagles. (A baby legal eagle is technically eaglets or is it eaglettes. I profess ignorance.) This is not a Lone Ranger operation, and I have a hunch that others at NYU can be enjoined to pitch in for the STOP endeavor. If there is one thing college types have it is an almost endless supply of students who want “experience.” Then there is the thrill of the hunt. Eagles, as you know, have been known to snatch a retired humanoid’s poodle for sustenance. Do legal eagles enjoy the thrill of the kill, or are they following some protein’s chemical make up?

The write up states:

Increasingly, internet surveillance is operating under our consent, as police harness new software platforms to deploy networks of fake accounts, tricking the public into giving up what few privacy protections the law affords. The police can see far beyond what we know is public on these platforms, peaking behind the curtains at what we mean to show and say only to those closest to us. But none of us know these requests come from police, none of us truly consent to this new, invasive form of state surveillance, but this “consent” is enough for the law, enough for the courts, and enough to have our private conversations used against us in a court of law.

Yeah, but use of public data is legal. Never mind, I hear an inner voice speaking for the STOP professionals.

The article then trots through the issues sitting on top of a stack of reports about actions that trouble STOP; to wit, use of fake social media accounts. The idea is to gin up a fake name and operate as a sock puppet. I want to point out that this method is often helpful in certain types of investigations. I won’t list the types.

The write up then describes Voyager Labs’ specialized software and services this way:

Voyager Labs claims to perceive people’s motives and identify those “most engaged in their hearts” about their ideologies. As part of their marketing materials, they touted retrospective analysis they claimed could have predicted criminal activity before it took place based on social media monitoring.

Voyager Labs’ information was disclosed after the Los Angeles government responded to a Brennan Center Freedom of Information Act request. If you are not familiar with these documents, you can locate at this link which I verified on April 9, 2022. Note that there are 10,000 pages of LA info, so plan on spending some time to locate the information of interest. If you want more information about Voyager Labs, navigate to the company’s Web site.

Net net: Which is the next intelware or policeware company to be analyzed by real news outfits and college professors? I don’t know, but the revelations do not make me happy. The knock on from the NSO Group’s missteps are not diminishing. It appears that there will be more revelations. From my point of view, these analyses provide bad actors with a road map of potholes. The bad actors become more informed, and government entities find their law enforcement and investigative efforts are dulled.

Stephen E Arnold, April 12, 2022

ShadowDragon Profiled by Esteemed Tech Expert Kim Komando

January 13, 2022

This is an interesting turn of events. Policeware vendor ShadowDragon has been profiled by computer guru-ette Kim Komando on her Tech Refresh podcast episode, “Software Tracking Everything You Do, New iPhone, Alexa on Wheels.” The video’s description reads:

“Have you heard of ShadowDragon? It collects data from 120 major sites going back a decade. Yes, 10 years of info about YOU. Plus, the iPhone 13 and iOS 15 are here, along with Amazon’s new smart home gear, including Astro, the Echo on wheels.”

Yes, we have heard of ShadowDragon. The security company mines data from more than 120 social-media websites, archives results for a decade, and shares the information with its law-enforcement clients around the world. ShadowDragon boasts its software can take an investigation down “from months to minutes.” The podcast starts discussing the company at timestamp 13:05, warning one would have to refrain from social media altogether to avoid its reach. The inclusion seems to support our prediction that reporters are becoming more aware of, and reporting more on, such specialized service vendors. This will make it harder for such firms to keep their generally preferred low profiles. Based in Cheyenne, Wyoming, ShadowDragon was founded in 2015.

For those curious, that podcast episode also discussed the newest iPhones, covered some weird news stories, and reviewed smart floodlights, among other wide-ranging topics. Their coverage of Amazon’s Astro home robot caught the attention of this Alexa-wary writer—apparently the device is so thirsty to identify folks with facial recognition it will (if left in “patrol” mode) follow guests around until it can identify them. It also, according to Motherboard, tracks everything owners do.

Cynthia Murrell, January 13, 2021

Palantir at the Intersection of Extremists and Prescription Fraud

January 5, 2022

Blogger Ron Chapman II, ESQ, seems to be quite the fan of Palantir Technologies. We get that impression from his post, “Palantir’s Anti-Terror Tech Used to Fight RX Fraud.” The former Marine fell in love with the company’s tech in Afghanistan, where its analysis of terrorist attack patterns proved effective. We especially enjoyed the rah rah write-up’s line about Palantir’s “success on the battlefield.” Chapman is not the only one enthused about the government-agency darling.

As for Palantir’s move into detecting prescription fraud, we learn the company begins with open-source data from the likes of census data, public and private studies, and Medicare’s Meaningful Use program. Chapman describes the firm’s methodology:

“Palantir then cross-references varying sets of Medicare data to determine which providers statistically deviate from the norm amongst large data sets. For instance, Palantir can analyze prescription data to determine which providers rank the highest in opiate prescribing for a local area. Palantir can then cross-reference those claims against patient location data to determine if the providers’ patients are traveling long distances for opiates. Palantir can further analyze the data to determine if the patient population of a provider has been previously treated by a physician on the Office of Inspector General exclusion database (due to prior misconduct) which would indicate that the patients are not ‘legitimate.’ By using ‘big data’ to determine which providers deviate from statistical trends, Palantir can provide a more accurate basis for a payment audit, generate probable cause for search warrants, or encourage a federal grand jury to further investigate a provider’s activities. After the government obtains additional provider-specific data, Palantir can analyze specific patient files, cell phone data, email correspondence, and electronic discovery. Investigators can review cell phone data and email correspondence to determine if networks exist between providers and patients and determine the existence of a healthcare fraud conspiracy or patient brokering.”

Despite his fondness for Palantir, Chapman does include the obligatory passage on privacy and transparency concerns. He notes that healthcare providers, specifically, are concerned about undue scrutiny should their patient care decisions somehow diverge from a statistical norm. A valid consideration. As with law enforcement, the balance between the good of society and individual rights is a tricky one. Palantir was launched in 2003 by Peter Theil, who was also a cofounder of PayPal and is a notorious figure to some. The company is based in Denver, Colorado.

Cynthia Murrell, January 5, 2022

DarkCyber for December 28, 2021, Now Available

December 28, 2021

This is the 26th program in the third series of DarkCyber video news programs produced by Stephen E Arnold and Beyond Search. You can view the ad-free show at this url. This program includes news of changes to the DarkCyber video series. Starting in January 2022, Dark Cyber will focus on smart software and its impact on intelware and policeware. In addition, Dark Cyber will appear once each month and expand to a 15 to 20 minute format.

What will we do with the production time? We begin a new video series called “OSINT Radar.” OSINT is an acronym for open source intelligence. In a December 2021 presentation to cyber investigators, the idea surfaced of a 60 second profile of a high value OSINT site. We have developed this idea and will publish what we hope will be a weekly video “infodeck” in video form of an OSINT resource currently in use by law enforcement and intelligence professionals. Watch Beyond Search for the details of how to view these short, made-for-mobile video infodecks. Now when you swipe left, you will learn how to perform free reverse phone number look ups, obtain a list of a social media user’s friends, and other helpful data collection actions from completely open source data pools.

Also, in this DarkCyber program are: [a] the blame for government agencies and specialized software vendors using Facebook to crank out false identities. Hint: It’s not the vendors’ fault. [b] why 2022 will be a banner year for bad actors. No, it’s not just passwords, insiders, and corner-cutting software developers. There is a bigger problem. [c] Microsoft has its very own Death Star. Does Microsoft know that the original Death Star was a fiction and it did not survive an attack by the rebels?, and [d] a smart drone with kinetic weapons causes the UN to have a meeting and decide to have another meeting.

Kenny Toth, December 28, 2021

NSO Group: How about That Debt?

December 14, 2021

The NSO Group continues to make headlines and chisel worry lines in the faces of the many companies in Israel which create specialized software and systems for law enforcement and intelligence professionals. You can read the somewhat unpleasant news in Bloomberg’s report, the Financial Times’ article,  and Gizmodo’s Silicon Valley-esque write up. Gizmodo said:

the company’s cumbersome mixture of unpaid debts and growing international scrutiny have made NSO a bloated pariah and is forcing its leadership to consider shutting down its Pegasus spyware unit. Selling the entire company is also reportedly on the table.

First, the reports suggest, without much back up, that NSO Group has about a half a billion US in debt. This is important because it underscores what is the number one flaw in the jazzy business plans of companies making sense of data and providing specialized services to law enforcement, intelligence, and war fighting entities. Here’s my take:

Point 1. What was secret is now open and easily available information.

Since Snowden, the systems and methods informing NSO Group and dozens of similar firms are easy to grasp. Former intelligence professionals can blend what Snowden revealed with whatever these individuals picked up in their service to their country, create a “baby” or “similar” solution and market it. This means that there are more surveillance, penetration, intercept, and analysis options available than at any other time in my 50 year career in online information and systems. Toss in what’s in the wild from dumps of FinFisher and Hacking Team techniques and the gold mine of open source code, and it should be no surprise that the NSO Group’s problem is just the tip of an iceberg, a favorite metaphor in the world of surveillance. None of the newsy reports grasp the magnitude of the NSO Group problem.

Point 2. There’s a lot of “smart” money chasing a big pay day from software purpose built for law enforcement, intelligence, and military operations. VC cows in herds, however, are not that smart or full of wisdom.

There are many investors who buy the line “cyber crime and terrorism” drive big, lucrative sales of specialized software and systems. That’s partially correct. But what’s happened is that the flood of cash has generated a number of commercial enterprisers trying to covert those dollars into highly reliable, easy to use systems. The presentations at off the radar trade shows promise functionality that is almost science fiction. The situation today is that there is a lot of hyper marketing going on because there’s money to apply some very expensive computational methods to what used to be largely secret and manual work. A good case for the travails of selling and keeping customers is the Palantir Technologies’ journey which is more than a decade long and still underway. The marketing is seeping from conferences open only to government agencies and those with clearances to advertising trade shows. I think you can see the risk of moving from low profile or secret government solutions to services for Madison Avenue. I sure can.

Point 3. Too few customers to go around.

There are not enough government customers with deep pockets for the abundant specialized services and systems which are on offer. In this week’s DarkCyber at this link, you can learn about the vendors at conferences where surveillance and applied information collection and analysis explain their products and services. You can also learn that the Brennan Center has revealed documents obtained via FOIA about Voyager Labs, a company which is also engaged in the specialized software and services business. Our DarkCyber report makes clear that license fees are in six figures and include more special add ins than a deal from a flea market vendor selling at the Clignancourt flea market. Competition means prices are falling, and quite effective systems are available for as little as a few hundred dollars per month and sometimes even less. Plus, commercial enterprises are often nervous when the potential customer realizes the power of specialized software and services. Stalking made easy? Yep. Spying on competitors facilitated? Yep. Open source intelligence makes it possible to perform specialized work at a quite attractive price point: Free or a few hundred a month.

What’s next?

Financial wizards may be able to swizzle the NSO Group’s financial pickles into a sweet relish for a ball park frank. There will be other companies in this sector which will face comparable money challenges in the future. From my perspective, it is not possible to put the spilled oil back in the tanker and clean the gunk off the birds now coated in crude.

Policeware and intelware vendors have operated out of sight and out of mind in their bubble since i2 Ltd. in the late 19909s rolled out the Analysts Notebook solution and launched the market for specialized software. The NSO Group’s situation could be or has already shoved a hat pin in that big, fat balloon.

More significantly, formerly blind and indifferent news organizations, government agencies, and potential investors can see what issues specialized software and services pose. More reporting will be forthcoming, including books that purport to reveal how data aggregators are spying on hapless Instagram and TikTok users. Like most of the downstream consequences of the so called digital revolution, NSO Group’s troubles are the tip of an information iceberg drifting into equatorial waters.

Stephen E Arnold, December 14, 2021

Siren 12 Security Platform Relies on Elasticsearch

December 13, 2021

Here is an example of Elastic being stretched a different way. The Intelligence Community News announces, “Siren Releases Siren 12.” The new version of Siren’s security search and analysis platform relies heavily on Elasticsearch—it incorporates Elastic Platinum subscriptions and will support Elasticsearch v8 (still in alpha). Siren 12 consolidates investigative tools for law enforcement, intelligence, and cyber security organizations. Writer Loren Blinde specifies:

“Siren’s latest release makes it easier for users to organize and join data in a way that suits their requirements, with intuitive UI driven schema editing and ETL. It allows organizations to forensically analyze device data and link it to other available data sources. Siren 12 enables investigators to not only browse existing information, but also to create new records and edit graphs freely, for the first time merging the ‘analysis’, the ‘data entry’ and ‘hypothesis and presentation’ phases in investigation in a single intuitive interface. Lastly Siren doubles down on Investigative AI capabilities by introducing Siren Vision, a deep learning based toolkit for automatic image annotation and classification, integrating with Elastic’s anomaly and outlier detection in a way that is consistent with Siren Investigative use cases.”

We note the emphasis on AI; it seems the security field is not letting concerns over algorithmic bias slow it down. Siren execs call this version a huge step forward and hopes it will position their platform as the go-to global reference investigative intelligence platform. Founded in 2014, the company is based in Galway, Ireland.

Cynthia Murrell December 13, 2021

Who Remembers Palantir or Anduril? Maybe Peter Thiel?

November 4, 2021

Despite sci-fi stoked fears about artificial general intelligences (AGI) taking over the world, CNBC reports, “Palantir’s Peter Thiel Thinks People Should Be Concerned About Surveillance AI.” Theil, co-founder of Palantir and investor in drone-maker Anduril, is certainly in the position to know what he is talking about. The influential venture capitalist made the remarks at a recent event in Miami. Writer Sam Shead reports:

“Tech billionaire Peter Thiel believes that people should be more worried about ‘surveillance AI’ rather than artificial general intelligences, which are hypothetical AI systems with superhuman abilities. … Those that are worried about AGI aren’t actually ‘paying attention to the thing that really matters,’ Thiel said, adding that governments will use AI-powered facial recognition technology to control people. His comments come three years after Bloomberg reported that ‘Palantir knows everything about you.’ Thiel has also invested in facial recognition company Clearview AI and surveillance start-up Anduril. Palantir, which has a market value of $48 billion, has developed data trawling technology that intelligence agencies and governments use for surveillance and to spot suspicious patterns in public and private databases. Customers reportedly include the CIA, FBI, and the U.S. Army. AGI, depicted in a negative light in sci-fi movies such as ‘The Terminator’ and ‘Ex Machina,’ is being pursued by companies like DeepMind, which Thiel invested in before it was acquired by Google. Depending on who you ask, the timescale for reaching AGI ranges from a few years, to a few decades, to a few hundred years, to never.”

Yes, enthusiasm for AGI has waned as folks accept that success, if attainable at all, is a long way off. Meanwhile, Thiel is now very interested in crypto currencies. For the famously libertarian mogul, that technology helps pave the way for his vision of the future: a decentralized world. That is an interesting position for a friend of law enforcement.

Cynthia Murrell, November 4, 2021

« Previous PageNext Page »

  • Archives

  • Recent Posts

  • Meta