Palantir Technologies: Following a Well Worn Path
August 11, 2022
Most intelware vendors are pretty much search and retrieval with a layer of search based applications. I think of these specialized services like an over-priced foam dog bed. The foam is hidden beneath what looks like a rich, comfy, and pet friendly cover. The dog climbs on, sniffs the fumes and scratches the cover. A bite or two and the cover tears and foam shards litter the floor.
When I think of some intelware vendors’ solutions, I keep thinking about that Alibaba-type dog bed. Wow. Not good.
I read “Palantir Stock Skids As Exec Says Downbeat Forecast Is All the More Disappointing Given Opportunities Ahead”, and I saw that dog bed, the torn cover, and the weird pink and green foam chunks in our family room. I know this association is not one shared by those who cheerlead for Palantir or the stakeholders who must look at the value of their “stakes”.
The write up reports:
Government deals “at the billion-dollar range of the contracts that we are working on…have the bug of them taking too long and the feature of, in a highly difficult, tumultuous and politically uncertain world, that you actually get paid and you actually make free-cash flow,” Chief Executive Alex Karp said on the earnings call.
Yep, that’s true.
However, Palantir has been working hard to convince outfits like chocolate companies, big banks, and some pharma companies to rely on Palantir for their information plumbing and intelligence dashboard. (Dashboards are hot, even though many intelware vendors just recycle the components associated with Elasticsearch, a popular open source search and retrieval system, and other members of the species ELK.)
If Palantir were closing deals with non governmental entities, wouldn’t that revenue make up for the historically slow and sketchy US government procurement process? For those in the know, FAR is a friend. For those who have racked up a track record of grousing about Federal procurement rules, FAR can be associated with the concept “far outside the circle of decision makers.”
Suppose we accept my assertion that intelware is basic search, indexing and classifying content objects, and outputting nice looking reports. These reports, by the way, depend upon some widely used numerical recipes. The outputs of competitive intelware systems which use the same test set of content objects are often similar. In some cases, very similar. (In September at CyCon, we will show some screenshots and challenge the audience of law enforcement and intelligence professionals to match each output with the system generating the diagrams, charts, graphs, and maps. In previous lectures this audience involvement ploy yielded one predictable result: No one could match outputs with the systems producing them.)
What are the paths available to a vendor of intelware which has been chasing huge contracts for close to 20 years? That’s two decades, gentle reader.
Based on my observations and research for my books and monographs, here are the historical precedents I have noticed. Will Palantir follow any of these paths? Probably not, but I enjoy trotting them out in order to provide some color for the search and specialized software sector competitors. What each competitor lacked in applications, stable products and services, and informed and available customer support, the PP (Palantir predecessors) made up for with outstanding marketing, nifty technical jargon, and a bit of the Steve Jobs reality distortion field magic.
- The vendor just gets acquired. Recorded Future is now Insight. Super secretive Detica is BAE Systems, etc. etc. The idea is that the buyer has the resources to make the software work and develop innovations that will keep ahead of open source offerings and pesky start ups. A variation is continuous resales as owners of intelware companies realize there are not enough customers to deliver the revenue projections in the PowerPoint decks. Is this sequence one example? i2 Ltd (UK) –> venture firm –> IBM Corp. –> Harris?
- The vendor hooks up with the government and presents the face of a standalone, independent outfit when affiliated with a government entity. Example: Some intelware firms in China, Israel, and the UK.
- The vendor goes away or turns a few cartwheels and emerges as something else entirely. Example: Cobwebs Technologies doesn’t do intelware; it provides anti money laundering services. I still like LifeRaft’s positioning as a marketing intelligence company.
- Everybody involved with the company moves on, new executives arrive, and the firm emerges as a customer service outfit or a customer experience provider. Rightly or wrongly I think of LucidWorks as this type of outfit.
- A combo deal. The inner workings of this type of deal converts Excalibur into Convera which becomes Ntent and then becomes a property of Allen & Co. Where is Convera today? I heard that some of its DNA survives in Seekr, but I have not heard back from the company to verify this rumor. The firm’s PR professional is apparently busy doing more meaningful PR things.
- Creative accounting. Believe it or not, some senior executives are found guilty of financial fancy dancing. Example: The founder of a certain search vendor with government clients. I think a year in the slammer was talked about.
- The company just closes up. Example: Perhaps Delphis, Entopia, or Stull, among others.
Net net: Vendors selling to law enforcement, crime analysts, and intelligence agencies face formidable competition from incumbents; for example, big Beltway bandits like the one for which I used to work. Furthermore, when selling intelware (even with a name change and a flashy PowerPoint deck) corporate types are not comfortable buying from a company working closely with some of the badge-and-gun agencies. Intelware vendors can talk about big sales to commercial enterprises. True, the intelware vendor may land some deals. But the majority of leads just become money pits: Sales calls, presentations, meetings with shills for the firm’s lawyers, and similar human resources. Those foam chunks from the Alibaba dog bed are similar to some investors’ dreams of giant stakeholder paydays. Oh, well, there is recycling.
Stephen E Arnold, August 11, 2022
Is the New Era of Timesharing Winding Down?
August 11, 2022
What kind of question is that? Stupid for sure. The cloud is infinite. The earnings bright spots for Amazon, Google, and Microsoft are cloud revenue and services. Google wants to amp up its cloud because sitting in third place behind the dorky outfits Amazon and Microsoft is not part of the high school science club’s master plan. And Microsoft cannot cope with Amazon AWS. Accordingly Microsoft is chasing start ups in order to be in the front of the ChocoTaco line for the next big thing. And Amazon. Fancy moves like killing long-provided services like backup, making changes that will cause recoding of some applications, and thinking about ways to increase revenue from Fancy Dan billing thresholds.
The cloud is the big thing.
If the information in “Why AI and Machine Learning Are Drifting Away from the Cloud” is on the money, one of those odd ball Hegelian things may be gaining momentum. The reference is to the much loved and pretty obvious theory that sine waves operated in the biological world. I am referring to the old chestnut test question about thesis, antithesis, and synthesis. Stated another way: First there was a big computer. Then there was timesharing. Then there was the personal computer. Then there was client server. That begot the new version of the cloud. The future? Back to company-owned and controlled computers. Hegelian stuff, right?
The article presents this idea:
Cloud computing isn’t going anywhere, but some companies are shifting their machine learning data and models to their own machines they manage in-house. Adopters are spending less money and getting better performance.
Let’s follow this idea. If smart software becomes the next big thing as opposed to feeding people, the big clouds will face customer defection and maybe pushback about pricing, lock in, and restrictions on what can and cannot be done on the services. (Yep, some phishing outfits use the cloud to bedevil email users. Yes, some durable Dark Web sites host some of their data on big cloud services. Yep, some cloud services have “inspection” tools to prevent misuse which may not be as performant as the confections presented in marketing collateral.)
With more AI, perhaps there will be less cloud. Then what?
The write up points out:
Companies shifting compute and data to their own physical servers located inside owned or leased co-located data centers tend to be on the cutting edge of AI or deep-learning use, Robinson [vice president of strategic partnerships and corporate development at MLOps platform company Domino Data Lab] said. “[They] are now saying, ‘Maybe I need to have a strategy where I can burst to the cloud for appropriate stuff. I can do, maybe, some initial research, but I can also attach an on-prem workload.’”
Hegel? What’s he got to do with this rethinking of the cloud, today’s version of good old timesharing? Probably nothing. The sine wave theories are silly. Ask any Econ 101 or Poli Sci 101 student. And who does not enjoy surprise charges for cloud computing services which are tough to see through? I know I do.
Stephen E Arnold, August 11, 2022
Another Facebook Innovation: Imitating Twitch
August 11, 2022
I don’t know if the information in “Meta Is Testing a New Live Streaming Super Platform for Influencers Called Super” is accurate. I like the name of the alleged new Einsteinian-grade service. It’s super.
The article reports:
The new platform allows influencers to host live streams, earn revenue and engage with viewers. The company has reportedly paid influencers between $200 and $3,000 to use the platform for 30 minutes.
How is the Zuckster’s Super new Super going to lure those who produce Twitchy stuff? The write up says:
Meta has recently reached out to multiple creators asking them to try out the new project. The platform, which looks to have similar functionality to Twitch, is currently being tested with fewer than 100 creators, including tech influencer Andru Edwards and TikTok star Vienna Skye.
My hunch is that Zuckbucks are going to be needed to “lure” some talent. Microsoft demonstrated its ability to create a streaming service not too long ago. Remember that? Yeah, neither does anyone on my research team. I wonder if MSFT’s CFO has any records of the money paid to a certain game streamer. Nah. Of course not.
The creativity of the Zucksters is amazing. Super in fact.
Stephen E Arnold, August 11, 2022
Google Kicked Out of School in Denmark
August 11, 2022
Like its colleagues in the Netherlands and Germany, the Danish data protection authority has taken a stand against Google’s GDPR non-compliance. European secure-email firm Tutanota reports on its blog, “Denmark Bans Gmail and Co from Schools Due to Privacy Concerns.” Schools in the Helsingør Municipality have until August 3 to shift to a different cloud solution. We learn:
“In a statement published mid July, the Danish data protection agency expresses ‘serious criticism and bans … the use of Google Workspace’. Based on a risk assessment for the Helsingør Municipality, the data protection authority concluded that the processing of personal data of pupils does not meet the requirements of the GDPR and must, therefor, stop. The ban is effective immediately. Helsingør has until August 3 to delete pupil’s data and start using an alternative cloud solution. … This decision follows similar decisions by Dutch and German authorities. The issues that governmental institutions see themselves faced with has started with the invalidation of Privacy Shield back in 2020. Privacy Shield has been a data transferring agreement between the USA and the European Union and was supposed to make data transfers between the two legally possible. However, the agreement has been declared invalid by the European Court of Justice (ECJ) in 2020 due to privacy concerns. One major problem that the EU court pointed out is that data of foreigners is not protected in the USA. The protections that are there – even if limited – only apply to US citizens.”
So the NSA can gain unfettered access to the personal data of Europeans but not US citizens. We can see how authorities in the EU might have a problem with that. As the Danish agency notes, such a loophole violates rights considered fundamental in Europe. Not surprisingly, this Tutanota write-up emphasizes the advantages of a Europe-based email service like Tutanota. It is not wrong. It seems Denmark has woken up to the Google reality. Now what about Web-search tracking?
Cynthia Murrell, August 11, 2022
The Zuckbook Smart Chatbot: It May Say Bad Things Like the Delightful Tay.ai?
August 10, 2022
I read an amusing article called “Meta Warns Its New Chatbot May Not Tell You the Truth.” The write up states:
Meta warns that BlenderBot 3 is also capable of saying some bad things. It seems to be Meta’s main unresolved problem, despite having a model that can learn from feedback. “Despite all the work that has been done, we recognize that BlenderBot can still say things we are not proud of,” it says in a BlenderBot 3 FAQ page.
The write up contains other interesting statements; for example:
Google is aiming to improve the “factual groundedness” of chatbots and conversational AI through LaMDA or the “Language Models for Dialog Applications”, which it unveiled in mid-2021…. LaMDA is a 137 billion parameter model that took almost two months of running on 1,024 of Google’s Tensor Processing Unit chips to develop.
And:
Meta says BlenderBot 3 is a 175 billion parameter “dialogue model capable of open-domain conversation with access to the internet and a long-term memory.”
My reaction? You don’t Tay?
Stephen E Arnold, August 10, 2022
The Expanding PR Challenge for Cyber Threat Intelligence Outfits
August 10, 2022
Companies engaged in providing specialized services to law enforcement and intelligence entities have to find a way to surf on the building wave of NSO Group backlash.
What do I mean?
With the interest real journalists have in specialized software and services has come more scrutiny from journalists, financial analysts, and outfits like Citizen Lab.
The most recent example is the article which appeared in an online publication focused on gadgets. The write up is “These Companies Know When You’re Pregnant—And They’re Not Keeping It Secret. Gizmodo Identified 32 Brokers Selling Data on 2.9 Billion Profiles of U.S. Residents Pegged as Actively Pregnant or Shopping for Maternity Products.” The write up reports:
A Gizmodo investigation into some of the nation’s biggest data brokers found more than two dozen promoting access to datasets containing digital information on millions of pregnant and potentially pregnant people across the country. At least one of those companies also offered a large catalogue of people who were using the same sorts of birth control that’s being targeted by more restrictive states right now. In total, Gizmodo identified 32 different brokers across the U.S. selling access to the unique mobile IDs from some 2.9 billion profiles of people pegged as “actively pregnant” or “shopping for maternity products.” Also on the market: data on 478 million customer profiles labeled “interested in pregnancy” or “intending to become pregnant.”
To add some zest to the write up, the “real news” outfit provided a link to 32 companies allegedly engaged in such data aggregation, normalization, and provision. Here are companies from the gadget blog’s list. Note that “sic” means this is the actual company name, and “trendy” means very hip marketing.
123Push
Adprime Health
Adstra
Alike Audience
Anteriad (180byTwo)
Cross Pixel
Datastream Group
Dstillery (sic and trendy)
Epsilon
Experian
Eyeota (sic and trendy)
FieldTest
Fluent
Fyllo (sic)
LBDigital
Lighthouse (Ameribase Digital)
PurpleLab
Quotient
Reklaim (sic)
ShareThis
Skydeo
Stirista (Crosswalk) (sic)
TrueData
Valassis Digital
Weborama Inc
Ziff Davis
ZoomInfo (Clickagy)
How many of these do you recognize? Perhaps Experian, usually associated with pristine security practices and credit checks? What about Ziff Davis, the outfit which publishes blogs which reveal the inner workings of Microsoft and other “insider” information? Or ZoomInfo, an outfit once focused on executive information and now apparently identified as a source of information to make a pregnant teen fear the “parent talk”?
But the others? Most people won’t have a clue. Now keep in mind these are companies in the consumer information database business. Are there other firms with more imaginative sources of personal data than outfits poking around open source datasets, marketing companies with helpful log file data, and blossoming data scientists gathering information from retail outlets?
The answer is, “Yes, there are.”
That brings me to the building wave of NSO Group backlash. How does one bridge the gap between a government agency using NSO Group type tools and data?
The answer is that specialized software and services firms themselves are the building blocks, engineer-constructors, and architect-engineers of these important bridges.
So what’s the PR problem?
Each week interesting items of information surface. For example, cyber threat firms report new digital exploits. I read this morning about Cerebrate’s Redeemer. What’s interesting is that cyber threat firms provide software and services to block such malware, right? So the new threat appears to evade existing defense mechanisms. Isn’t this a circular proposition: Buy more cyber security. Learn about new threats. Ignore the fact that existing systems do not prevent the malware from scoring a home run? Iterate… iterate… iterate.
At some point, a “real news” outfit will identify the low profile engineers engaged in what might be called “flawed bridge engineering.”
Another PR problem is latent. People like the Kardashians are grousing about Instagram. What happens when influencers and maybe some intrepid “real journalists” push back against the firms collecting personal information very few people think of as enormously revelatory? Example: Who has purchased a “weapon” within a certain geofence? Or who has outfitted an RV with a mobile Internet rig? Or who has signed up for a Dark Web forum and accessed it with a made up user name?
Who provides these interesting data types?
The gadget blog is fixated on pregnancy because of the current news magnetism. Unfortunately the pursuit of clicks with what seems really significant does not provide much insight into the third party data businesses in the US, Israel, and other countries.
That’s the looming PR problem. Someone is going to step back and take a look at companies which do not want to become the subject of a gadget blog write up with a 30 plus word headline. In my opinion, that will happen, and that’s the reason certain third party data providers and specialized software and services firms face a crisis. These organizations have to sell to survive, except for a handful supported by their countries’ governments. If that marketing becomes too visible, then the gadget bloggers will out them.
What’s it mean when a cyber threat company hires a former mainstream media personality to bolster the company’s marketing efforts? I have some thoughts. Mine are colored by great sensitivity to the NSO Group and the allegations about its Pegasus specialized software. If these allegations are true, what better way to get personal data than suck it directly from a single target’s or group of targets’ mobile devices in real time?
Here are the chemical compounds in the data lab: The NSO Group-type technology which is increasingly understood and replicated. Gadget bloggers poking around data aggregators chasing ad and marketing service firms. Cyber threat companies trying to market themselves without being too visible.
The building wave is on the horizon, just moving slowly.
Stephen E Arnold, August 10, 2022
Oracle: Marketing Experience or MX = Zero?
August 10, 2022
How does one solve the problem MX = 0? One way is to set M to zero and X to zero and bingo! You have zero. If the information in the super select, restricted, juicy article called “Oracle Insiders Describe the Complete Chaos from Layoffs and Restructuring While Employees Brace for More” is accurate, the financially lucrative Oracle database system is unhappy with the firm’s marketing. Not just the snappy PowerPoint decks or the obedient database administrator documentation. Nope. Everything is apparently a bit of indigestion.
The write up, which is, as I have mentioned, super select, restricted, and juicy, is a bit jumbled. Nevertheless, I noted several observations I found interesting. Let me summarize the 1,100 word report this way: Lots of people from marketing and customer experience (whatever that is) have been fired. Okay. Now let’s look at the comments that struck me as significant. Keep in mind that I love Oracle. Yep, clients just pay those who can make the sleek, efficient, tightly integrated components hum like an electric motor on a fully functioning Ford F 150 Lightning. Here we go. (My comments appear in parentheses after each bullet.)
- “The common verb to describe ACX is that they were obliterated,” said a person who works at Oracle. (I quite liked the use of the word “obliterated.” Was Oracle using a Predator launched flying ginsu management bomb or just an email or maybe a Zoom call?)
- “There’s no marketing anymore…” (My question is, “Was there ever any marketing at Oracle?” Bombast, yes. Rah rah conferences. Jet flights after curfew at the San Jose airport. But marketing? In my opinion, no.)
- “There’s a sense among many at Oracle of impending doom…” (Yep, upbeat stuff.)
- “We’ve been kind of working like zombies the last couple of weeks because there’s just this sense of ‘What am I doing here?’” (The outfit on the former Sea World exit excels at management. Well, maybe it doesn’t? How does the Oracle hit above its weight? That’s a good question. Let’s ask Cerner about the electronic medical record business and its seamless functioning with the Oracle database, shall I? No I shall not.)
- “…Oracle’s code base is so complicated that it can take years before engineers are fully up to speed with how everything works, and workers with over a decade of experience were cut…” (Ah, ha, Oracle is weeding out the dinobabies. Useless deadwood. A 20 something engineer can figure out where an entire database is hiding.)
Net net: I hate to suggest this, but perhaps some database types think using AWS, the GOOG, or the super secure MSFT data management systems is better, faster, and cheaper. Pick two.
Stephen E Arnold, August 10, 2022
How about a Decade of Vulnerability? Great for Bad Actors
August 10, 2022
IT departments may be tired of dealing with vulnerabilities associated with Log4j, revealed late last year, but it looks like the problem will not die down any time soon. The Register reveals, “Homeland Security Warns: Expect Log4j Risks for ‘a Decade or Longer’.” Because the open-source tool is so popular, it can be difficult to track down and secure all instances of its use within an organization. Reporter Jessica Lyons Hardcastle tells us:
“Organizations can expect risks associated with Log4j vulnerabilities for ‘a decade or longer,’ according to the US Department of Homeland Security. The DHS’ Cyber Safety Review Board’s inaugural report [PDF] dives into the now-notorious vulnerabilities discovered late last year in the Java world’s open-source logging library. The bugs proved to be a boon for cybercriminals as Log4j is so widely used, including in cloud services and enterprise applications. And because of this, miscreants soon began exploiting the flaws for all kinds of illicit activities including installing coin miners, stealing credentials and data, and deploying ransomware.”
Fortunately, no significant attacks on critical infrastructure systems have been found. Yet. The write-up continues:
“‘ICS operators rarely know what software is running on their XIoT devices, let alone know if there are instances of Log4j that can be exploited,’ Thomas Pace, a former Department of Energy cybersecurity lead and current CEO of NetRise, told The Register. NetRise bills itself as an ‘extended IoT’ (xIoT) security firm. ‘Just because these attacks have not been detected does not mean that they haven’t happened,’ Pace continued. ‘We know for a fact that threat actors are exploiting known vulnerabilities across industries. Critical infrastructure is no different.'”
Security teams have already put in long hours addressing the Log4j vulnerabilities, often forced to neglect other concerns. We are told one unspecified US cabinet department has spent some 33,000 hours guarding its own networks, and the DHS board sees no end in sight. The report classifies Log4j as an “endemic vulnerability” that could persist for 10 years or more. That is a long time for one cyber misstep to potentially trip up so many organizations. See the article for suggestions on securing systems that use Log4j and other open-source software.
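Tracking down instances is the hard part. Here is a minimal sketch, in Python, of one way a team might begin an inventory: walk a directory tree and flag Java archives that bundle the JndiLookup class tied to the original Log4Shell flaw. The function name find_log4j_jars is my own invention, not something from the DHS report or The Register. The sketch also assumes the archives are ordinary zip files and does not open jars nested inside other jars, so a clean result proves nothing.

```python
import os
import zipfile

# Class file associated with the original Log4Shell issue in log4j-core.
JNDI_CLASS = "org/apache/logging/log4j/core/lookup/JndiLookup.class"

def find_log4j_jars(root):
    """Return sorted paths of .jar/.war/.ear files under root that contain JndiLookup.class.

    Hypothetical helper for illustration: it does not inspect nested archives
    and does not check which Log4j version is present.
    """
    hits = []
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if not name.lower().endswith((".jar", ".war", ".ear")):
                continue
            path = os.path.join(folder, name)
            try:
                with zipfile.ZipFile(path) as archive:
                    if JNDI_CLASS in archive.namelist():
                        hits.append(path)
            except (zipfile.BadZipFile, OSError):
                # Unreadable or corrupt archive: skip it rather than stop the scan.
                continue
    return sorted(hits)
```

Calling find_log4j_jars("/opt/apps") (the path is a made up example) returns a list of suspect archives. Each hit still needs a human to check the version and the patch status.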
Cynthia Murrell, August 10, 2022
Machine Learning: Cheating Is a Feature?
August 9, 2022
I read “MIT Boffins Make AI Chips 1 Million Times Faster Than the Synapses in the Human Brain. Plus: Why ML Research Is Difficult to Produce – and Army Lab Extends AI Contract with Palantir.” I dismissed the first item as some of the quantum supremacy stuff output by high school science club types. I ignored the Palantir Technologies’ item because the US Army has to make a distributed common ground system work and leave resolution to the next team rotation. Good or bad, Palantir has the ball. But the middle item in the club sandwich article contains a statement I found particularly interesting.
If you have followed our comments about smart software, we have taken a pragmatic view of getting “AI/ML” systems to work in the 80 to 95 percent confidence range in a consistent way even when new “content objects” are fed into the zeros and ones. To get off on the right foot, human subject matter experts assembled training data which reflected the content the system would be processing in the real world. The way smart software is expected to work is that it learns… on its own… sort of. It is very time consuming and very expensive to create hand crafted training sets and then “update” the system with the affected module. What if the prior content had to be reprocessed? Well, not too many have the funds, time, resources, and patience for that.
Thus, today’s AI/ML forward leaning cost conscious wizards want to use synthetic data, minimize the human SMEs’ cost and time, and do everything auto-magically. Sounds good. Yes, and the ideas make great PowerPoint decks too.
The sentence in the article which caught my attention is this one:
Data leakage occurs when the data used to train an algorithm can leak into its testing; when its performance is assessed the model seems better than it actually is because it has already, in effect, seen the answers to the questions. Sometimes machine learning methods seem more effective than they are because they aren’t tested in more robust settings.
Here’s the link to “Leakage and the Reproducibility Crisis in ML-Based Science” in which more details appear. Wowza if these experts are correct. Who goes swimming without a functioning snorkel? Maybe the Google?
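For readers who want to see leakage in action, here is a toy sketch in Python. It is my own illustration, not code from the paper. The data are random numbers with coin flip labels, so there is nothing to learn, and the "model" is a bare bones nearest neighbor lookup. The names nearest_label and accuracy are hypothetical helpers.

```python
import random

def nearest_label(train, point):
    """Return the label of the training example closest to point (1-nearest-neighbor)."""
    return min(train, key=lambda example: abs(example[0] - point))[1]

def accuracy(train, test):
    """Fraction of test examples whose label matches the nearest training example."""
    hits = sum(nearest_label(train, x) == y for x, y in test)
    return hits / len(test)

rng = random.Random(0)
# Features are random numbers and labels are coin flips: no real signal exists.
data = [(rng.random(), rng.randint(0, 1)) for _ in range(400)]
train, test = data[:300], data[300:]

clean_score = accuracy(train, test)          # test rows never seen in training
leaky_score = accuracy(train + test, test)   # test rows leaked into training

print(f"held-out accuracy: {clean_score:.2f}")
print(f"leaky accuracy:    {leaky_score:.2f}")
```

The leaky run scores a perfect 1.00 because every test question is already sitting in the training set. The held-out run has only noise to work with, so its score should hover near a coin flip. Same model, same data, very different PowerPoint slide.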
Stephen E Arnold, August 8, 2022
DARPA Works to Limit Open Source Security Threats
August 9, 2022
Isn’t it a little late? Open-source code has become an integral part of nearly every facet of modern computing, including military and critical infrastructure applications. Now, reports MIT Technology Review, “The US Military Wants to Understand the Most Important Software on Earth.” It seems military researchers have just realized there is no control over, or even accounting for, the countless contributors to open-source projects like the Linux kernel. That software alone underpins the operation of most computers. And yet the feature that makes open-source software free and, therefore, ubiquitous also makes it vulnerable to bad actors.
Since it cannot turn back the clock and consider security before open-source code got baked into critical software, DARPA will instead scrutinize the people and organizations behind open-source projects. The program, dubbed “SocialCyber,” will take 18 months and millions of dollars to implement. It will use a combination of the latest AI tech and good old-fashioned sociology to pinpoint potential threats. Reporter Patrick Howell O’Neill writes:
“The ultimate goal is to detect and counteract any malicious campaigns to submit flawed code, launch influence operations, sabotage development, or even take control of open-source projects. To do this, the researchers will use tools such as sentiment analysis to analyze the social interactions within open-source communities such as the Linux kernel mailing list, which should help identify who is being positive or constructive and who is being negative and destructive. The researchers want insight into what kinds of events and behavior can disrupt or hurt open-source communities, which members are trustworthy, and whether there are particular groups that justify extra vigilance. These answers are necessarily subjective. But right now there are few ways to find them at all. Experts are worried that blind spots about the people who run open-source software make the whole edifice ripe for potential manipulation and attacks. For Bratus, the primary threat is the prospect of ‘untrustworthy code’ running America’s critical infrastructure—a situation that could invite unwelcome surprises. …This kind of research also aims to find underinvestment—that is critical software run entirely by one or two volunteers.”
The program relies on partnerships between DARPA and several small cybersecurity research firms like New York’s Margin Research. These firms will ascertain who is working on what open-source projects. Margin will focus on Linux, considered the most urgent point of concern. Open-source programming language Python, which is often used in machine-learning projects, is another priority. SocialCyber is quite an undertaking—it is the pound of cure we could have avoided with an ounce of foresight several years ago.
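What might sentiment analysis of a mailing list look like in its crudest form? Here is a small Python sketch. The word lists, the sample messages, and the function names score and tone_by_author are all my assumptions for illustration. Nothing here reflects the actual tools DARPA or Margin Research use, which the article does not describe in detail.

```python
from collections import defaultdict

# Tiny hypothetical word lists; a real system would use a trained model.
POSITIVE = {"thanks", "great", "good", "helpful", "nice"}
NEGATIVE = {"broken", "stupid", "useless", "wrong", "garbage"}

def score(text):
    """Count positive words minus negative words in one message."""
    words = [w.strip(".,!?;:").lower() for w in text.split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def tone_by_author(messages):
    """Sum message scores per author from (author, text) pairs."""
    totals = defaultdict(int)
    for author, text in messages:
        totals[author] += score(text)
    return dict(totals)

# Made up messages, not real mailing list traffic.
sample = [
    ("alice", "Thanks, great patch. Very helpful."),
    ("bob", "This is broken and useless."),
    ("alice", "Nice fix, but the comment is wrong."),
]
tones = tone_by_author(sample)
print(tones)
```

A word count like this cannot tell a blunt but correct code review from sabotage, which is one reason the article calls the answers "necessarily subjective."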
Cynthia Murrell, August 9, 2022