Trust: Zuck, Meta, and Llama 4
April 17, 2025
Sorry, no AI used to create this item.
CNET published a very nice article that says to me: “Hey, we don’t trust you.” Navigate to “Meta Llama 4 Benchmarking Confusion: How Good Are the New AI Models?” The write up is like a wimpy version of the old PC Perspective podcast with Ryan Shrout. Before the embrace of Intel’s intellectual blanket, the podcast would raise questions about video card benchmarks. Most of the questions addressed: “Is this video card that fast?” In some cases, yes, the video card benchmarks were close to the real world. In other cases, video card manufacturers did what the butcher on Knoxville Avenue did in 1951. Mr. Wilson put his thumb on the scale. My grandmother watched friendly Mr. Wilson who drove a new Buick in a very, very modest neighborhood, closely. He did not smile as broadly when my grandmother and I would enter the store for a chicken.
Would someone subject an AI professional to this type of benchmark test? Of course not. But the idea has a certain charm. Plus, if the person dies, he was fooling. If the person survives, that individual is definitely a witch. This was a winning method to some enlightened leaders at one time.
The CNET story says about the Zuck’s most recent non-virtual reality investment:
Meta’s Llama 4 models Maverick and Scout are out now, but they might not be the best models on the market.
That’s a good way to say, “Liar, liar, pants on fire.”
The article adds:
the model that Meta actually submitted to the LMArena tests is not the model that is available for people to use now. The model submitted for testing is called “llama-4-maverick-03-26-experimental.” In a footnote on a chart on Llama’s website (not the announcement), in tiny font in the final bullet point, Meta clarifies that the model submitted to LMArena was “optimized for conversationality.”
Isn’t this a GenZ way to say, “You put your thumb on the scale, Mr. Wilson”?
Let’s review why one should think about the desire to make something better than it is:
- Meta’s decision is just marketing. Think about the self driving Teslas. Consequences? Not for fibbing.
- The Meta engineers have to deliver good news. Who wants to tell the Zuck that the Llama innovations are like making the VR thing a big winner? Answer: No one who wants to get a bonus and curry favor.
- Meta does not have the ability to distinguish good from bad. The model swap is what Meta is going to do anyway. So why not just use it? No big deal. Is this a moral and ethical dead zone?
What’s interesting is that from my point of view, Meta and the Zuck have a standard operating procedure. I am not sure that aligns with what some people expect. But as long as the revenue flows and meaningful regulation of social media remains a windmill for today’s Don Quixotes, Meta is the best — until another AI leader puts out a quantumly supreme news release.
Stephen E Arnold, April 17, 2025
Google AI: Invention Is the PR Game
April 17, 2025
Google was so excited to tout its AI’s great achievement: In under 48 hours, it solved a medical problem that vexed human researchers for a decade. Great! Just one hitch. As Pivot to AI tells us, "Google Co-Scientist AI Cracks Superbug Problem in Two Days!—Because It Had Been Fed the Team’s Previous Paper with the Answer In It." With that detail, the feat seems much less impressive. In fact, two days seems downright sluggish. Writer David Gerard reports:
"The hype cycle for Google’s fabulous new AI Co-Scientist tool, based on the Gemini LLM, includes a BBC headline about how José Penadés’ team at Imperial College asked the tool about a problem he’d been working on for years — and it solved it in less than 48 hours! [BBC; Google] Penadés works on the evolution of drug-resistant bacteria. Co-Scientist suggested the bacteria might be hijacking fragments of DNA from bacteriophages. The team said that if they’d had this hypothesis at the start, it would have saved years of work. Sounds almost too good to be true! Because it is. It turns out Co-Scientist had been fed a 2023 paper by Penadés’ team that included a version of the hypothesis. The BBC coverage failed to mention this bit. [New Scientist, archive]"
It seems this type of Googley AI over-brag is a pattern. Gerard notes the company claims Co-Scientist identified new drugs for liver fibrosis, but those drugs had already been studied for this use. By humans. He also reminds us of this bit of truth-stretching from 2023:
"Google loudly publicized how DeepMind had synthesized 43 ‘new materials’ — but studies in 2024 showed that none of the materials was actually new, and that only 3 of 58 syntheses were even successful. [APS; ChemrXiv]"
So the next time Google crows about an AI achievement, we have to keep in mind that AI often is a synonym for PR.
Cynthia Murrell, April 17, 2025
AI Impacts Jobs: But Just 40 Percent of Them
April 16, 2025
AI enthusiasts would have us believe workers have nothing to fear from the technology. In fact, they gush, AI will only make our jobs easier by taking over repetitive tasks and allowing time for our creative juices to flow. It is a nice vision. Far-fetched, but nice. Euronews reports, “AI Could Impact 40 Percent of Jobs Worldwide in the Next Decade, UN Agency Warns.” Writer Anna Desmarais cites a recent report as she tells us:
“Artificial intelligence (AI) may impact 40 per cent of jobs worldwide, which could mean overall productivity growth but many could lose their jobs, a new report from the United Nations Department of Trade and Development (UNCTAD) has found. The report … says that AI could impact jobs in four main ways: either by replacing or complementing human work, deepening automation, and possibly creating new jobs, such as in AI research or development.”
So it sounds like we could possibly reach a sort of net-zero on jobs. However, it will take deliberate action to get there. And we are not currently pointed in the right direction:
“A handful of companies that control the world’s advancement in AI ‘often favour capital over labour,’ the report continues, which means there is a risk that AI ‘reduces the competitive advantage’ of low-cost labour from developing countries. Rebeca Grynspan, UNCTAD’s Secretary-General, said in a statement that there needs to be stronger international cooperation to shift the focus away ‘from technology to people’.”
Oh, is that all? Easy peasy. The post notes it is not just information workers under threat—when combined with other systems, AI can also perform physical production jobs. Desmarais concludes:
“The impact that AI is going to have on the labour force depends on how automation, augmentation, and new positions interact. The UNCTAD said developing countries need to invest in reliable internet connections, making high-quality data sets available to train AI systems and building education systems that give them necessary digital skills, the report added. To do this, UNCTAD recommends building a shared global facility that would share AI tools and computing power equitably between nations.”
Will big tech and agencies around the world pull together to make it happen?
Cynthia Murrell, April 16, 2025
Google Wears a Necklace and Sneakers with Flashing Blue LEDs. Snazzy.
April 15, 2025
No AI. Just an old dinobaby pointing out some exciting developments in the world “beyond search.”
I can still see the flashing blue light in Aisle 7. Yes, there goes the siren. K-Mart in Central Illinois was running a big sale on underwear. My mother loved those “blue light specials.” She would tell me as I covered my eyes and ears, “I don’t want to miss out.” Into the scrum she would go, emerging with two packages of purple boxer shorts for my father. He sat in the car while my mother shopped. I accompanied her because that’s what sons in Central Illinois do. I wonder if procurement officials are familiar with blue light specials. The sirens in DC wail 24×7.
Thanks, OpenAI. You produced a good enough illustration. A first!
I thought about K-Mart when I read “Google Slashes Business Software Prices for US Federal Agencies.” I see that flickering blue light as I type this short blog post. The trusted “real” news source reports:
Google will offer steep discounts to U.S. federal agencies for its business apps package as the company looks to capitalize on the Trump administration’s cost-cutting push and chip away at Microsoft’s longstanding grip on the government software market.
Yep, discounts. Now Microsoft has some traction in the US government. I cannot imagine what life would be like for aides to a senior Pentagon official if he did not have nifty PowerPoint presentations. Perhaps offering a deal will get some Microsoft aficionados to learn to live without Excel and Word? I don’t know, but Google is giving the “discount” method a whirl.
What’s up with Google? I think someone told me that Gemini 2.5 was free. Now a discount on GSA listed services which could amount to $2 billion in savings … if — yes, that magic word — if the US government dumps the Softies’ outstanding products for the cloudy goodness of the Google’s way. Yep, “if.”
I have a cute anecdote about Google and the US government from the year 2000, but, alas, I cannot share it. Trust me. It is a knee slapper. And, no, it is not about Sergey wearing silver sparkle sneakers to meetings with US elected officials. Those were indeed eye catchers among shoes with toes that looked like potatoes.
Several observations:
- Google, like Amazon, is trying to obtain US government business. I think the flashing blue lights, if I were still working in the hallowed halls, would impair my vision. Price cutting seems to be the one true way right now.
- Will lower prices have an impact on US government procurement? I am not sure. The procurement process chugs along every day and in quite predictable ways. How long does it take to turn a battleship, assuming, of course, that the captain can pull off the maneuver without striking a small fishing boat?
- Google seems to think that slashing prices for its “products” will boost sales. My understanding of Google is that its sales to government agencies pivot on several characteristics; for example, [a] listening and understanding what government professionals say, [b] providing a modicum of customer support or at the very least answering a phone call from a government professional, and [c] delivering products that the aides, assistants, and contractors understand and can use to crank out documents with numbered lines, dense charts, and bullet points that mostly stay in place after a graphic is inserted.
To sum up, I find the idea of price cuts interesting. My initial reaction is that price cuts and procurement are not necessarily lined up procedurally. But I am a dinobaby. After 50 years of “government” work, I have a keen desire to see if the Google can shine enough blue lights to bedazzle people involved in purchasing software to keep the admirals happy. (I speak from a little experience working with the late Admiral Craig Hosmer, R-Calif. whom I thank for his service.)
Stephen E Arnold, April 15, 2025
Oracle: Pricked by a Rose and Still Bleeding
April 15, 2025
How disappointing. DoublePulsar documents a senior tech giant’s duplicity in, “Oracle Attempt to Hide Serious Cybersecurity Incident from Customers in Oracle SaaS Service.” Blogger Kevin Beaumont cites reporting by Bleeping Computer as he tells us someone going by rose87168 announced in March they had breached certain Oracle services. The hacker offered to remove individual companies’ data for a price. They also invited Oracle to email them to discuss the matter. The company, however, immediately denied there had been a breach. It should know better by now.
Rose87168 responded by releasing evidence of the breach, piece by piece. For example, they shared a recording of an internal Oracle meeting, with details later verified by Bleeping Computer and Hudson Rock. They also shared the code for Oracle configuration files, which proved to be current. Beaumont writes:
“In data released to a journalist for validation, it has now become 100% clear to me that there has been cybersecurity incident at Oracle, involving systems which processed customer data. … All the systems impacted are directly managed by Oracle. Some of the data provided to journalists is current, too. This is a serious cybersecurity incident which impacts customers, in a platform managed by Oracle. Oracle are attempting to wordsmith statements around Oracle Cloud and use very specific words to avoid responsibility. This is not okay. Oracle need to clearly, openly and publicly communicate what happened, how it impacts customers, and what they’re doing about it. This is a matter of trust and responsibility. Step up, Oracle — or customers should start stepping off.”
In an update to the original post, Beaumont notes some linguistic sleight-of-hand employed by the company:
“Oracle rebadged old Oracle Cloud services to be Oracle Classic. Oracle Classic has the security incident. Oracle are denying it on ‘Oracle Cloud’ by using this scope — but it’s still Oracle cloud services that Oracle manage. That’s part of the wordplay.”
However, it seems the firm finally admitted the breach was real to at least some users. Just not in black and white. We learn:
“Multiple Oracle cloud customers have reached out to me to say Oracle have now confirmed a breach of their services. They are only doing so verbally, they will not write anything down, so they’re setting up meetings with large customers who query. This is similar behavior to the breach of medical PII in the ongoing breach at Oracle Health, where they will only provide details verbally and not in writing.”
So much for transparency. Beaumont pledges to keep investigating the breach and Oracle’s response to it. He invites us to follow his Mastodon account for updates.
Cynthia Murrell, April 15, 2025
Blockchain: Adoption Lag Lies in the Implementation
April 15, 2025
Why haven’t cryptocurrencies taken over the financial world yet? The Observer shares some theories in, "Blockchain’s Billion-Dollar Blunder: How Finance’s Tech Revolution Became an Awkward Evolution." Writer Boris Bohrer-Bilowitzki believes the mistake was trying to reinvent the wheel, instead of augmenting it. He observes:
"For years, the strategy has been replacement rather than integration. We’ve attempted to create entirely new financial systems from scratch, expecting the world to abandon centuries of established infrastructure overnight. It hasn’t worked, and it won’t work. … This disconnect highlights our fundamental misunderstanding of how technological evolution works. Credit cards didn’t replace cash; they complemented it. They added a layer of convenience and security that made transactions easier while working within the existing financial framework. That’s the model blockchain has always needed to follow."
Gee, it makes sense when you put it that way. The write-up points to SWIFT as an organization that gets it. We learn:
"One of the most promising developments in this space is the international banking system SWIFT’s ongoing blockchain pilot program. In 2025, SWIFT will facilitate live trials, enabling central and commercial banks across North America, Europe and Asia to conduct digital asset transactions on its network. These trials aim to explore how blockchain can enhance payments, foreign exchange (FX), securities trading and trade finance without requiring banks to overhaul their systems."
That sounds promising. But what about consumers? Many are baffled by the very concept of cryptocurrency, never mind how to interact with it. Intuitive interfaces, Bohrer-Bilowitzki stresses, would remove that hurdle. After all, he notes, folks only embrace new technologies that make their lives easier.
Did the esteemed Observer overlook these blockchain downsides?
- Distributed autonomous organizations look more like Discord groups and club members
- Wonky reward layers
- Funding “hopes” is losing some traction
- Web3 seems to be filled with janky STARs.
Novelty alone is not enough to drive large-scale adoption. Imagine that.
Cynthia Murrell, April 15, 2025
Stanford AI Report: Credible or Just Marketing?
April 14, 2025
No AI. Just a dinobaby sharing an observation about younger managers and their innocence.
I am not sure I believe reports or much of anything from Stanford University. Let me explain my skepticism. Here’s one of the snips a quick search provided:
I think it was William James who said great things about Stanford University when he bumped into the distinguished outfit. If Billie was cranking out Substacks, he would probably be quite careful in using words like “leadership,” “ethical behavior,” and the moral sanctity of big thinkers. Presidents don’t get hired like a temporary worker in front of Home Depot. There is a process, and it is quite clear that the process, the people, and the culture at the university failed. Failed spectacularly.
Stanford hired and retained a cheater if the news reports are accurate.
Now let’s look at “The 2025 AI Index Report.”
The document’s tone is one of lofty pronouncements.
Stanford mixes comments about smart software with statements like:
Global AI optimism is rising—but deep regional divides remain.
Yep, I would submit that AI-equipped weapons are examples of “regional divides.”
I think this report is:
- Marketing for Stanford’s smart software activities
- A reminder that another country (China) is getting really capable in smart software and may zip right past the noodlers in the Gates Computer Science Building
- An effort to position Stanford as a thought leader, which helps the “image” of the school, the students, the faculty, and the wretches in fund raising who face a tough slog in the years ahead.
For me personally, I think the “report” should be viewed with skepticism. Why? A university which hires a cheater makes quite clear that the silly notions of William James are irrelevant.
I am not sure they are.
Stephen E Arnold, April 14, 2025
Ad Blockers and a Googley Consequence
April 11, 2025
Another dinobaby blog post. Eight decades and still thrilled when I point out foibles.
Motivated individuals are acting in a manner usually associated with Cloudflare-type outfits. The idea of a “man in the middle” is a good one. It works when one buys something from Amazon. The user wants convenience and does not take the time to hunt around for a better or cheaper version of a particular product.
“Block YouTube Ads on AppleTV by Decrypting and Stripping Ads from Profobuf” provides a recipe for dumping advertisements in some streaming services, but the spotlight is on the lovable Google and Apple’s streaming device. (Poor Apple. Like its misfiring AI and definitely interesting glasses, the company caught a bright person’s attention.)
Social media needs two things: Beacons that phone home and advertising because how else is a company going to push products and services? The write up provides step-by-step instructions for chopping out ads from two big outfits.
Here’s what I think will happen at the monopolies:
- At least two software people will tackle this “problem”: One from Apple and one from Google.
- One will come up with a “fix” to the work-around.
- The “fix” will be shared with the company that did not come up with an enhancement first.
- The modified method will be deployed.
- The game begins again.
The cat-and-mouse sequence is little more than that von Neumann game theory just in real life with money at stake. It’s too bad Johnny and his pals (some of whom were quite quirky) are not around to work on ad blocking instead of nuclear weapons.
Well, Johnny isn’t around, and I think that game theory does not work when one battles multibillion-dollar monopolies with lots of reasonably bright people around, providing they aren’t veterans of the Apple AI team or the original Google Glass product.
The write up is interesting. I admire the effort the author put into the blocking. How long will it persist? Good question, but the next iteration will probably be designed to preserve the money flow. Ads and user tracking are the means to the end: Big revenue.
Stephen E Arnold, April 11, 2025
Stamping Out Intelligence: Censorship May Work Wonders
April 9, 2025
Sorry, no AI used to create this item.
I live in a state which has some interesting ideas. One of them is that the students are well educated. At this time, I think the state in which I reside holds position 47 out of 50 in terms of reading skills or academic performance. Are the numbers accurate? Probably not, but they indicate that learning is not priority number one in some quarters.
A young student with a gift for mathematics is the class dunce. He has to write on the chalk board, “I will not do linear algebra in class.” Thanks, OpenAI. Know any budding Einsteins in Mississippi?
However, there is a state which performs less well than mine. That state is Mississippi. Should that state hold the rank of the 50th, the least academically slick entity in the US? Probably not, but the low ranking does say something to some people.
I thought about this notion of “low academic performance” when I read “Mississippi Libraries Ordered to Delete Academic Research in Response to State Laws.” The write up says:
A state commission scrubbed academic research from a database used by Mississippi libraries and public schools — a move made to comply with recent state laws changing what content can be offered in libraries. The Mississippi Library Commission ordered the deletion of two research collections that might violate state law, a March 31 internal memo obtained by Mississippi Today shows. One of the now deleted research collections focused on “race relations” and the other on “gender studies.”
So what?
I find it interesting that a state holding down the 50th spot in academic slickness assumes that its students will be reading research on these topics or any topics for that matter.
I did a very brief stint as a teacher. In fact, I invested one year teaching in a quite challenging high school environment about 100 miles south of Chicago. If my students read anything, I was quite happy. I suppose today that I would be terminated because I used the Sunday comics, gas station credit card application forms, job applications for the local Hunt’s Drive In, and a wide range of printed matter. My goal was to provide reading material that was different from the standard text book, a text book I used when I was in high school years before I showed up at my teaching job.
The goal is to get students reading. Today, I assume that removing books and research material is more informed than what I did.
Several observations:
- Taking steps to prevent reading is different from how I would approach the question, “What should be in the school library?”
- The message sent to students who actually learn that books and research materials are being removed from the library seems to me to be, “Hey, don’t read this academic garbage.”
- The anti-intellectualism which this removal seems to underscore means that Mississippi is working hard to nail down its number 50 spot.
I am a dinobaby. I am quite thrilled with this fact. I will probably fall over dead with a book in my hands. Remember: I used outside materials to try to engage my students in reading for that one year of high school teaching. I should have been killed when a library stack fell over when I was in grade school.
These types of decisions are going to get the job done for me I think.
Stephen E Arnold, April 9, 2025
AI: Job Harvesting
April 9, 2025
It is a question that keeps many of us up at night. Commonplace ponders, "Will AI Automate Away Your Job?" The answer: Probably, sooner or later. The when depends on the job. Some workers may be lucky enough to reach retirement age before that happens. Writer Jason Hausenloy explains:
"The key idea where the American worker is concerned is that your job is as automatable as the smallest, fully self-contained task is. For example, call center jobs might be (and are!) very vulnerable to automation, as they consist of a day of 10- to 20-minute or so tasks stacked back-to-back. Ditto for many forms of many types of freelancer services, or paralegals drafting contracts, or journalists rewriting articles. Compare this to a CEO who, even in a day broken up into similar 30-minute activities—a meeting, a decision, a public appearance—each required years of experiential context that a machine can’t yet simply replicate. … This pattern repeats across industries: the shorter the time horizon of your core tasks, the greater your automation risk."
See the post for a more detailed example that compares the jobs of a technical support specialist and an IT systems architect.
Naturally, other factors complicate the matter. For example, Hausenloy notes, blue-collar jobs may be safer longer because physical robots are more complex to program than information software. Also, the more data there is on how to do a job, the better equipped algorithms are to mimic it. That is one reason many companies implement tracking software. Yes, it allows them to micromanage workers. It also gathers the data needed to teach an LLM how to do the job. With every keystroke and mouse click, many workers are actively training their replacements.
Ironically, it seems those responsible for unleashing AI on the world may be some of the most replaceable. Schadenfreude, anyone? The article notes:
"The most vulnerable jobs, then, are not those traditionally thought of as threatened by automation—like manufacturing workers or service staff—but the ‘knowledge workers’ once thought to be automation-proof. And most vulnerable of all? The same Silicon Valley engineers and programmers who are building these AI systems. Software engineers whose jobs are based on writing code as discrete, well-documented tasks (often following standardized updates to a central directory) are essentially creating the perfect training data for AI systems to replace them."
In a section titled "Rethinking Work," Hausenloy waxes philosophical on a world in which all of humanity has been fired. Is a universal basic income a viable option? What, besides income, do humans get out of their careers? In what new ways will we address those needs? See the write-up for those thought exercises. Meanwhile, if you do want to remain employed as long as possible, try to make your job depend less on simple, repetitive tasks and more on human connection, experience, and judgement. With luck, you may just reach retirement before AI renders you obsolete.
Cynthia Murrell, April 9, 2025