Vendor Lock In: Good for Geese, Bad for Other Birds
December 21, 2023
This essay is the work of a dumb dinobaby. No smart software required.
TechCrunch examines the details behind the recent OpenAI CEO exodus and why vendors pose a clear and present danger to tech startups: “OpenAI Mess Exposes The Dangers Of Vendor Lock-In For Start-Ups.” When startups fundraise for seed capital, they attract investors that can influence the company’s future.
OpenAI’s leaders, including ex-CEO Sam Altman and cofounder Greg Brockman, accepted investments from Microsoft. This forced OpenAI into a vendor lock-in situation in which it relied on Microsoft both for its business model and as a vendor of OpenAI products. While there are many large language model products on the market, ChatGPT 3.5 and 4 were touted as the best. That reputation is due in no small part to Microsoft’s investment and to OpenAI’s association with the respected and powerful tech company.
As Sam Altman and his OpenAI friends head to Microsoft, it points to the importance of diversifying a company’s vendor portfolio. While ChatGPT was the most popular solution, many tech experts believe that it’s better to research all options instead of relying on one service:
“The companies that chose a flexible approach over depending on a single AI model vendor must be feeling pretty good today. If there is any object lesson to be learned from all this, even as the drama continues to play out in real time, it’s that it’s never, ever a good idea to go with a single vendor.
Founders who put all of their eggs in the OpenAI basket now find themselves suddenly in a very uncomfortable situation, as the uncertainty around OpenAI continues to swirl.”
This is what happens when you rely on one vendor to supply all technology and related services. It’s always best to research and have multiple backup options in case one doesn’t work out.
Whitney Grace, December 21, 2023
Google AI and Ads: Beavers Do What Beavers Do
December 20, 2023
This essay is the work of a dumb dinobaby. No smart software required.
Consider this. Take a couple of beavers. Put them in the Cloud Room near the top of the Chrysler Building in Manhattan. Shut the door. Come back in a day. What have the beavers done? The beavers start making a dam. Beavers do what beavers do. That’s a comedian’s way of explaining that some activities are hard wired into an organization. Therefore, beavers do what beavers do.
I read the paywalled article “Google Plans Ad Sales Restructuring as Automation Booms” and the other versions of the story on the shoulder of the Information Superhighway; for example, the trust outfit’s recycling of the Information’s story. The giant quantum supremacy, protein folding, and all-round advertising company is displaying beaver-like behavior. Smart software will be used to sell advertising.
That ad DNA? Nope, the beavers do what beavers do. Here’s a snip from the write up:
The planned reorganization comes as Google is relying more on machine-learning techniques to help customers buy more ads on its search engine, YouTube and other services…
Translating: Google wants fewer people to present information to potential and actual advertisers. The idea is to reduce costs and sell more advertising. I find it interesting that the quantum supremacy hoo-hah boils down to … selling ads and eliminating unreliable, expensive, vacation-taking, and latte consuming humans.
Two real beavers are surprised to learn that a certain large and dangerous creature also has DNA. Notice that neither of the beavers asks the large reptile to join them for lunch. The large reptile may, in fact, view the beavers as something else; for instance, lunch. Thanks, MSFT Copilot. Good enough.
Are there other ad-related changes afoot at the Google? “Google Confirms It is Testing Ad Copy Variation in Live Ads” points out:
Google quietly started placing headlines in ad copy description text without informing advertisers
No big deal. Just another “test”, I assume. Search Engine Land (a publication founded, nurtured, and shaped into the search engine optimization information machine by Dan Sullivan, now a Googler) adds:
Changing the rules without informing advertisers can make it harder for them to do their jobs and know what needs to be prioritized. The impact is even more significant for advertisers with smaller budgets, as assessing the changes, especially with responsive search ads, becomes challenging, adding to their workload.
Google wants to reduce its workload. In pursuing that noble objective, Google, if Search Engine Land is correct, may increase the workload of the advertisers. But never fear, the change is trivial, “a small test.”
What was that about beavers? Oh, right. Certain behaviors are hard wired into the DNA of a corporate entity, which under US law is a “person” someone once told me.
Let me share with you several observations based on my decades-long monitoring of the Google.
- Google does what Google wants and then turns over the explanation to individuals who say what is necessary to deflect actual intent, convert actions into fuzzy Google speech, and keep customer and user pushback to a minimum. (Note: The tactic does not work with 100 percent reliability as the recent loss to US state attorneys general illustrates.)
- Smart software is changing rapidly. What appears to be one application may (could) morph into more comprehensive functionality. Predicting the future of AI and Google’s actions is difficult. Google will play the odds which means what the “entity” does will favor its objective and goals.
- The quaint notion of a “small test” is the core of optimization for some methods. Who doesn’t love “quaint” as a method for de-emphasizing the significance of certain actions? The “small test” is often little more than one component of a larger construct. Dismissing the small is to ignore the larger component’s functionality; for example, data control and highly probable financial results.
Let’s flash back to the beavers in the Cloud Room. Imagine the surprise of someone who opens the door and sees gnawed off portions of chairs, towels, and a chunk of unidentifiable gook piled between two tables.
Those beavers and their beavering can create an unexpected mess. The beavers, however, are proud of their work because they qualify under an incentive plan for a bonus. Beavers do what beavers do.
Stephen E Arnold, December 20, 2023
FTC Enacts Investigative Process for AI Technology
December 20, 2023
This essay is the work of a dumb dinobaby. No smart software required.
Creative types and educational professionals are worried about the influence of AI-generated work. However, legal, finance, business operations, and other industries are worried about how AI will impact them. Aware of the upward trend in goods and services that are surreptitiously moving into the market, the Federal Trade Commission (FTC) took action. The FTC released a briefing on the new consumer AI protection: “FTC Authorizes Compulsory Process For AI-Related Products And Services.”
The FTC passed an omnibus resolution that authorizes a compulsory process in nonpublic investigations about products and services that use or claim to be made with AI or claim to detect it. The new omnibus resolution will increase the FTC’s efficiency with civil investigation demands (CIDs), a compulsory process like a subpoena. CIDs are issued to collect information, similar to legal discovery, for consumer protection and competition investigations. The new resolution will be in effect for ten years and the FTC voted to approve it 3-0.
The FTC defines AI as:
“AI includes, but is not limited to, machine-based systems that can, for a set of defined objectives, make predictions, recommendations, or decisions influencing real or virtual environments. Generative AI can be used to generate synthetic content including images, videos, audio, text, and other digital content that appear to be created by humans. Many companies now offer products and services using AI and generative AI, while others offer products and services that claim to detect content made by generative AI.”
AI can also be used for deception, privacy infringements, fraud, and other illegal activities. AI can cause competition problems, such as when a few companies monopolize algorithms or other AI-related technologies.
The FTC is taking preliminary steps to protect consumers from bad actors and their nefarious AI-generated deeds. However, what constitutes a violation in relation to AI? Will the data training libraries be examined along with the developers?
Whitney Grace, December 20, 2023
Why Humans Follow Techno Feudal Lords and Ladies
December 19, 2023
This essay is the work of a dumb dinobaby. No smart software required.
“Seduced By The Machine” is an interesting blend of humans’ willingness to follow the leader and Silicon Valley revisionism. The article points out:
We’re so obsessed by the question of whether machines are rising to the level of humans that we fail to notice how the humans are becoming more like machines.
I agree. The write up offers an explanation — it’s arriving a little late because the Internet has been around for decades:
Increasingly we also have our goals defined for us by technology and by modern bureaucratic systems (governments, schools, corporations). But instead of providing us with something equally rich and well-fitted, they can only offer us pre-fabricated values, standardized for populations. Apps and employers issue instructions with quantifiable metrics. You want good health – you need to do this many steps, or achieve this BMI. You want expertise? You need to get these grades. You want a promotion? Hit these performance numbers. You want refreshing sleep? Better raise your average number of hours.
A modern high-tech pied piper leads many to a sanitized Burning Man? Sounds like fun. Look at the funny outfit. The music is a TikTok hit. The followers are looking forward to their next “experience.” Thanks, MSFT Copilot. One try for this cartoon. Good enough again.
The idea is that technology offers a short cut. Who doesn’t like a short cut? Do you want to write music in the manner of Herr Bach or do you want to do the loop and sample thing?
The article explores the impact of metrics; that is, the idea of letting Spotify make clear what a hit song requires. Now apply that malleability and success incentive to getting fit, getting start up funding, or any other friction-filled task. Let’s find some Teflon, folks.
The write up concludes with this:
Human beings tend to prefer certainty over doubt, comprehensibility to confusion. Quantified metrics are extremely good at offering certainty and comprehensibility. They seduce us with the promise of what Nguyen calls “value clarity”. Hard and fast numbers enable us to easily set goals, justify decisions, and communicate what we’ve done. But humans reach true fulfilment by way of doubt, error and confusion. We’re odd like that.
Hot button alert! Uncertainty means risk. Therefore, reduce risk. Rely on an “authority,” “data,” or “research.” What if the authority sells advertising? What if the data are intentionally poisoned (a somewhat trivial task according to watchers of disinformation outfits)? What if the research is made up? (I am thinking of the Stanford University president and the Harvard ethics whiz. Both allegedly invented data; both found themselves in hot water. But no one seems to have cared.)
With smart software — despite its hyperbolic marketing and its role as the next really Big Thing — finding its way into a wide range of business and specialized systems, just trust the machine output. I went for a routine check up. One machine reported I was near death. The doctor was recommending a number of immediate remediation measures. I pointed out that the data came from a single somewhat older device. No one knew who verified its accuracy. No one knew if the device was repaired. I noted that I was indeed still alive and asked if the somewhat nervous looking medical professional would get a different device to gather the data. Hopefully that will happen.
Is it a positive when the new pied piper of Hamelin wants to have control in order to generate revenue? Is it a positive when education produces individuals who do not ask, “Is the output accurate?” Some day, dinobabies like me will indeed be dead. Will the willingness of humans to follow the pied piper be different?
Absolutely not. This dinobaby is alive and kicking, no matter what the aged diagnostic machine said. Gentle reader, can you identify fake, synthetic, or just plain wrong data? If you answer yes, you may be in the top tier of actual thinkers. Those who are gatekeepers of information will define reality and take your money whether you want to give it up or not.
Stephen E Arnold, December 19, 2023
AI: Are You Sure You Are Secure?
December 19, 2023
This essay is the work of a dumb dinobaby. No smart software required.
North Carolina State University published an interesting article. Are the data in the write up reproducible? I don’t know. I wanted to highlight the report in the hopes that additional information will be helpful to cyber security professionals. The article is “AI Networks Are More Vulnerable to Malicious Attacks Than Previously Thought.”
I noted this statement in the article:
Artificial intelligence tools hold promise for applications ranging from autonomous vehicles to the interpretation of medical images. However, a new study finds these AI tools are more vulnerable than previously thought to targeted attacks that effectively force AI systems to make bad decisions.
A corporate decision maker looks at a point of vulnerability. One of his associates moves a sign which explains that smart software protects the castle and its crown jewels. Thanks, MSFT Copilot. Numerous tries, but I finally got an image close enough for horseshoes.
What is the specific point of alleged weakness?
At issue are so-called “adversarial attacks,” in which someone manipulates the data being fed into an AI system in order to confuse it.
The example presented in the article is that a bad actor manipulates data provided to the smart software; for example, causing an image or content to be deleted or ignored. Another use case is that a bad actor could cause an X-ray machine to present altered information to the analyst.
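To make the idea of manipulated input concrete, here is a minimal sketch of an adversarial nudge against a toy linear classifier. It is an illustration under stated assumptions, not QuadAttacK and not any network from the cited study; the weights, the input, and the `epsilon` budget are all made up.

```python
# Toy illustration only: a hand-made linear classifier, not QuadAttacK
# and not a model from the cited study. All names and numbers are hypothetical.

def score(weights, bias, features):
    """Return the raw score of a linear classifier."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def predict(weights, bias, features):
    """Label is 1 when the score is positive, otherwise 0."""
    return 1 if score(weights, bias, features) > 0 else 0


def perturb(weights, features, epsilon):
    """Shift each feature by exactly epsilon in the direction that lowers the score."""
    def sign(value):
        return (value > 0) - (value < 0)

    return [x - epsilon * sign(w) for w, x in zip(weights, features)]


weights = [0.9, -0.4, 0.3]
bias = -0.1
clean = [0.5, 0.2, 0.1]

adversarial = perturb(weights, clean, epsilon=0.25)
clean_label = predict(weights, bias, clean)
adversarial_label = predict(weights, bias, adversarial)
```

In this toy case the clean input is labeled 1 and the nudged input is labeled 0, even though no feature moved by more than 0.25. Real attacks on deep networks are far more involved, but the shape of the problem is the same: small, targeted changes to the input produce a different decision.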
The write up includes a description of software called QuadAttacK. The idea is to test a network for “clean” data. Four different networks were tested. The report includes a statement from Tianfu Wu, co-author of a paper on the work and an associate professor of electrical and computer engineering at North Carolina State University. He allegedly said:
“We were surprised to find that all four of these networks were very vulnerable to adversarial attacks,” Wu says. “We were particularly surprised at the extent to which we could fine-tune the attacks to make the networks see what we wanted them to see.”
You can download the vulnerability testing tool at this link.
Here are the observations my team and I generated at lunch today (Friday, December 14, 2023):
- Poisoned data is one of the weak spots in some smart software.
- The free tool will give bad actors with access to certain smart systems a way to identify points of vulnerability.
- AI, at this time, may be better at marketing than protecting its reasoning systems.
Stephen E Arnold, December 19, 2023
Facing an Information Drought Tech Feudalists Will Innovate
December 18, 2023
This essay is the work of a dumb dinobaby. No smart software required.
The Exponential View (Azeem Azhar) tucked an item in his “blog.” The item is important, but I am not familiar with the cited source of the information in “LLMs May Soon Exhaust All Available High Quality Language Data for Training.” The main point is that the go-to method for smart software requires information in volume to [a] be accurate, [b] remain up to date, and [c] be sufficiently useful to pay for the digital plumbing.
Oh, oh. The water cooler is broken. Will the Pilates teacher ask the students to quench their thirst with synthetic water? Another option is for those seeking refreshment to rejuvenate tired muscles with more efficient metabolic processes. The students are not impressed with these ideas? Thanks, MSFT Copilot. Two tries and close enough.
One datum indicates / suggests that the Big Dogs of AI will run out of content to feed into their systems in either 2024 or 2025. The date is less important than the idea of a hard stop.
What will the AI companies do? The essay asserts:
OpenAI has shown that it’s willing to pay eight figures annually for historical and ongoing access to data — I find it difficult to imagine that open-source builders will…. here are ways other than proprietary data to improve models, namely synthetic data, data efficiency, and algorithmic improvements – yet it looks like proprietary data is a moat open-source cannot cross.
Several observations:
- New methods of “information” collection will be developed and deployed. Some of these will be “off the radar” of users by design. One possibility is mining the changes to draft content in certain systems. Changes or deltas can be useful to some analysts.
- The synthetic data angle will become a go-to method using data sources which, by themselves, are not particularly interesting. However, when cross correlated with other information, “new” data emerge. The new data can be aggregated and fed into other smart software.
- Rogue organizations will acquire proprietary data and “bitwash” the information. Like money laundering systems, the origin of the data is fuzzified or obscured, making figuring out what happened expensive and time consuming.
- Techno feudal organizations will explore new non commercial entities to collect certain data; for example, the non governmental organizations in a niche could be approached for certain data provided by supporters of the entity.
Net net: Running out of data is likely to produce one high probability event: Certain companies will begin taking more aggressive steps to make sure their digital water cooler is filled and working for their purposes.
Stephen E Arnold, December 18, 2023
Ignoring the Big Thing: Google and Its PR Hunger
December 18, 2023
This essay is the work of a dumb dinobaby. No smart software required.
I read “FunSearch: Making New Discoveries in Mathematical Sciences Using Large Language Models.” The main idea is that Google’s smart software is — once again — going where no mortal man has gone before. The write up states:
Today, in a paper published in Nature, we introduce FunSearch, a method to search for new solutions in mathematics and computer science. FunSearch works by pairing a pre-trained LLM, whose goal is to provide creative solutions in the form of computer code, with an automated “evaluator”, which guards against hallucinations and incorrect ideas. By iterating back-and-forth between these two components, initial solutions “evolve” into new knowledge. The system searches for “functions” written in computer code; hence the name FunSearch.
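The back-and-forth described in that passage can be sketched as a simple propose-and-evaluate loop. This is only a sketch under loud assumptions: a random tweak stands in for the LLM, a curve-fitting score stands in for the evaluator, and every name and number below is hypothetical. Google's actual proposer and evaluator are not described in the post.

```python
import random

# Hypothetical stand-ins: the real proposer is an LLM that writes code, and the
# real evaluator is not described in the post. This only shows the loop's shape.

DATA = [(x, 3 * x + 2) for x in range(-5, 6)]  # made-up target behavior


def evaluate(candidate):
    """Score a candidate (a, b) for y = a*x + b; higher is better, 0 is perfect."""
    a, b = candidate
    return -sum((a * x + b - y) ** 2 for x, y in DATA)


def propose(parent, rng):
    """Stand-in for the LLM: return a randomly tweaked copy of the parent."""
    a, b = parent
    return (a + rng.choice([-1, 0, 1]), b + rng.choice([-1, 0, 1]))


def search(rounds=2000, seed=0):
    """Keep proposing; only candidates the evaluator scores higher survive."""
    rng = random.Random(seed)
    best = (0, 0)
    best_score = evaluate(best)
    for _ in range(rounds):
        child = propose(best, rng)
        child_score = evaluate(child)
        if child_score > best_score:  # the evaluator decides, not the proposer
            best, best_score = child, child_score
    return best, best_score


best, best_score = search()
```

The point of the sketch is the division of labor: the proposer can suggest anything, including nonsense, because nothing is kept unless the evaluator scores it higher than what came before. That makes the evaluator, not the proposer, the component that defines what counts as progress.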
I like the idea of getting the write up in Nature, a respected journal. I like even better the idea of Google-splaining how a large language model can do mathy things. I absolutely love the idea of “new.”
“What’s with the pointed stick? I needed a wheel,” says the disappointed user of an advanced technology in days of yore. Thanks, MSFT Copilot. Good enough, which is a standard of excellence in smart software in my opinion.
Here’s a wonderful observation summing up Google’s latest development in smart software:
FunSearch is like one of those rocket cars that people make once in a while to break land speed records. Extremely expensive, extremely impractical and terminally over-specialized to do one thing, and do that thing only. And, ultimately, a bit of a show. YeGoblynQueenne via YCombinator.
My question is, “Is Google dusting a code brute force method with marketing sprinkles?” I assume that the approach can be enhanced with more tuning of the evaluator. I am not silly enough to ask if Google will explain the settings, threshold knobs, and probability levers operating behind the scenes.
Google’s prose makes the achievement clear:
This work represents the first time a new discovery has been made for challenging open problems in science or mathematics using LLMs. FunSearch discovered new solutions for the cap set problem, a longstanding open problem in mathematics. In addition, to demonstrate the practical usefulness of FunSearch, we used it to discover more effective algorithms for the “bin-packing” problem, which has ubiquitous applications such as making data centers more efficient.
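For readers who have not met bin-packing, here is a minimal sketch of the classic first-fit-decreasing heuristic. It is a long-established textbook baseline, not the heuristic FunSearch reportedly discovered, and the item sizes and capacity are made up for illustration.

```python
def first_fit_decreasing(sizes, capacity):
    """Pack items into bins using the classic first-fit-decreasing heuristic."""
    bins = []  # each bin is a list of item sizes
    for size in sorted(sizes, reverse=True):
        if size > capacity:
            raise ValueError(f"item of size {size} exceeds bin capacity {capacity}")
        for contents in bins:
            if sum(contents) + size <= capacity:
                contents.append(size)  # first bin with enough room wins
                break
        else:
            bins.append([size])  # no existing bin had room, so open a new one
    return bins


items = [4, 8, 1, 4, 2, 1]  # made-up item sizes
packed = first_fit_decreasing(items, capacity=10)
```

Here the six items fit into two bins of capacity 10. Heuristics like this one are fast but not guaranteed to be optimal, which is why a search for better ones has practical value.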
The search for more effective algorithms is a never-ending quest. Who bothers to learn how to get a printer to spit out “Hello, World”? Today I am pleased if my printer outputs a Gmail message. And bin-packing is now solved. Good.
As I read the blog post, I found the focus on large language models interesting. But that evaluator strikes me as something of considerable interest. When smart software discovers something new, who or what allows the evaluator to “know” that something “new” is emerging? That evaluator must be something that prevents hallucination (a fancy term for making stuff up) without blocking the innovation process. I won’t raise any Philosophy 101 questions, but I will say, “Google has the keys to the universe” with sprinkles too.
There’s a picture too. But where’s the evaluator? Simplification is one thing, but skipping over the system and method that prevents smart software hallucinations (falsehoods, mistakes, and craziness) is quite another.
Google is not a company to shy from innovation from its human wizards. If one thinks about the thrust of the blog post, will these Googlers be needed? Google’s innovativeness has drifted toward me-too behavior and being clever with advertising.
The blog post concludes:
FunSearch demonstrates that if we safeguard against LLMs’ hallucinations, the power of these models can be harnessed not only to produce new mathematical discoveries, but also to reveal potentially impactful solutions to important real-world problems.
I agree. But the “how” hangs above the marketing. But when a company has quantum supremacy, the grimness of the recent court loss, and assorted legal hassles — what is this magical evaluator?
I find Google’s deal to use facial recognition to assist the UK in enforcing what appears to be “stop porn” regulations more in line with what Google’s smart software can do. The “new” math? Eh, maybe. But analyzing every person trying to access a porn site and having the technical infrastructure to perform cross correlation? Now that’s something that will be of interest to governments and commercial customers.
The bin thing and a short cut for a Python script. Interesting, but it lacks the practical “big bucks now” potential of the facial recognition play. That, as far as I know, was not written up and ponied around to prestigious journals. To me, that was news, not the FUN as a cute reminder of a “function” search.
Stephen E Arnold, December 18, 2023
Google and Its Age Verification System: Will There Be a FAES Off?
December 18, 2023
This essay is the work of a dumb dinobaby. No smart software required.
Just in time for the holidays! Google’s user age verification system is ready for 2024. “Google Develops Selfie Scanning Software Ahead of Porn Crackdown” reports:
Google has developed face-scanning technology that would block children from accessing adult websites ahead of a crackdown on online porn. An artificial intelligence system developed by the web giant for estimating a person’s age based on their face has quietly been approved in the UK.
Thanks, MSFT Copilot. A good enough eyeball with a mobile phone, a pencil, a valise, stealthy sneakers, and data.
Facial recognition, although widely used in some countries, continues to make some people nervous. But in the UK, the Google method will allow the UK government to obtain data to verify one’s age. The objective is to stop those who are younger than 18 from viewing “adult Web sites.”
[Google] says the technology is 99.9pc reliable in identifying that a photo of an 18-year-old is under the age of 25. If users are believed to be under the age of 25, they could be asked to provide additional ID.
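The gap between 18 and 25 in that claim works as a safety buffer: an estimate can be off by several years and an 18-year-old will still land below the line. A minimal sketch of that decision rule follows. The function name, the return values, and the assumption that anyone estimated at 25 or older is waved through are hypothetical; the article does not say how Google's system actually behaves.

```python
def access_decision(estimated_age, challenge_age=25):
    """Decide what to do with a face-based age estimate.

    Assumption for illustration: estimates at or above challenge_age are
    allowed, and anything below triggers a request for additional ID.
    """
    if estimated_age >= challenge_age:
        return "allow"
    return "request_id"


# Made-up estimates for three visitors.
decisions = [access_decision(age) for age in (30.0, 21.5, 18.2)]
```

Under this rule a 21-year-old adult is also asked for ID. The buffer trades convenience for fewer underage visitors slipping through on a bad estimate.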
The phrase used to describe the approach is “face age estimation system.”
The cited newspaper article points out:
It is unclear what Google plans to use the system for. It could use it within its own services, such as YouTube and the Google Play app download store, or build it into its Chrome web browser to allow websites to verify that visitors are over 18.
Google is not the only outfit using facial recognition to allegedly reduce harm to individuals. Facebook and OnlyFans, according to the write up, are already deploying similar technology.
The news story says:
It is unclear what privacy protections Google would apply to the system.
I wonder what interesting insights would be available if data from the FAES were cross correlated with other information. That might have value to advertisers and possibly other commercial or governmental entities.
Stephen E Arnold, December 18, 2023
FTC Enacts Investigative Process On AI Products and Services
December 15, 2023
This essay is the work of a dumb dinobaby. No smart software required.
Creative types and educational professionals are worried about the influence of AI-generated work. However, legal, finance, business operations, and other industries are worried about how AI will impact them. Aware of the upward trend in goods and services that are surreptitiously moving into the market, the Federal Trade Commission (FTC) took action. The FTC released a briefing on the new consumer AI protection: “FTC Authorizes Compulsory Process For AI-Related Products And Services.”
The executive recruiter for a government contractor says, “You can earn great money with a side gig helping your government validate AI algorithms. Does that sound good?” Will American schools produce enough AI savvy people to validate opaque and black box algorithms? Thanks, MSFT Copilot. You hallucinated on this one, but your image was good enough.
The FTC passed an omnibus resolution that authorizes a compulsory process in nonpublic investigations about products and services that use or claim to be made with AI or claim to detect it. The new omnibus resolution will increase the FTC’s efficiency with civil investigation demands (CIDs), a compulsory process like a subpoena. CIDs are issued to collect information, similar to legal discovery, for consumer protection and competition investigations. The new resolution will be in effect for ten years and the FTC voted to approve it 3-0.
The FTC defines AI as:
“AI includes, but is not limited to, machine-based systems that can, for a set of defined objectives, make predictions, recommendations, or decisions influencing real or virtual environments. Generative AI can be used to generate synthetic content including images, videos, audio, text, and other digital content that appear to be created by humans. Many companies now offer products and services using AI and generative AI, while others offer products and services that claim to detect content made by generative AI.”
AI can also be used for deception, privacy infringements, fraud, and other illegal activities. AI can cause competition problems, such as when a few companies monopolize algorithms or other AI-related technologies.
The FTC is taking preliminary steps to protect consumers from bad actors and their nefarious AI-generated deeds. However, what constitutes a violation in relation to AI? Will the data training libraries be examined along with the developers? Where will the expert analysts come from? An online university training program?
Whitney Grace, December 15, 2023
Why Is a Generative System Lazy? Maybe Money and Lousy Engineering
December 13, 2023
This essay is the work of a dumb dinobaby. No smart software required.
Great post on the Xhitter. From @ChatGPT app:
we’ve heard all your feedback about GPT4 getting lazier! we haven’t updated the model since Nov 11th, and this certainly isn’t intentional. model behavior can be unpredictable, and we’re looking into fixing it
My experience with ChatGPT is that it responds like an intern working with my team between the freshman and sophomore years at college. Most of the information output is based on a “least effort” algorithm; that is, the shortest distance between A and B is vague promises.
An engineer at a “smart” software company leaps into action. Thanks, MSFT Copilot. Does this cartoon look like any of your technical team?
When I read about “unpredictable”, I wonder if people realize that probabilistic systems are wrong in a certain percentage of their outputs. The horse loses the race. Okay, a fact. The bet on that horse is a different part of the stall.
But the “lazier” comment evokes several thoughts in my dinobaby mind:
- Allocate less time per prompt to reduce the bottlenecks in a computationally expensive system; thus, laziness is a signal of crappy engineering
- Recognize that recycling results for frequent queries is a great way to give a user “something” close enough for horseshoes. If the user is clever, that user will use words like “give me more” or some similar rah rah to trigger another pass through what’s available
- The costs of the system are so great that the Sam AI-Man system is starved for cash for engineers, hardware, bandwidth, and computational capacity. Until there’s more dough, the pantry will be poorly stocked.
Net net: Lazy may be a synonym for more serious issues. How does one make AI perform? Fabrication and marketing seem to be useful.
Stephen E Arnold, December 13, 2023