LLM Unreliable? Probably Absolutely No Big Deal Whatsoever For Sure
July 19, 2023
Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.
My team and I are working on an interesting project. Part of that work requires that we grind through papers, journal articles, and self-published (and essentially unverifiable) comments about smart software.
“What do you mean the outputs from the smart software I have been using for my homework deliver the wrong answer?” says this disappointed user of a browser and word processor with artificial intelligence baked in. Is she damning recursion? MidJourney created this emotion-packed image of a person who has learned that she has been accused of plagiarism by her Sociology 215 professor.
Not surprisingly, we come across some wild and crazy information. On rare occasions we come across a paper, mostly ignored, which presents information that confirms many of our tests of smart software. When we do tests, we arrive with specific queries in mind. These relate to the behaviors of bad actors; for example, online services which front for cyber criminals, systems which are purpose built to make it time consuming to unmask a bad actor, and methods for determining what person owns a particular domain engaged in the sale of fullz.
You can probably guess that most of the smart and dumb online finding services are of little or no help. We have to check these, however, simply because we want to be thorough. At a meeting last week, one of my team members, who has a degree in library science, pointed out that the outputs from the services we use were becoming less useful than they were several months ago. I don’t spend too much time testing these services because I am a dinobaby and I run projects. My doing days are over. But I do listen to informed feedback. Her comment was one I had not seen in the Google PR onslaught about its method, the utterances of Sam AI-Man at OpenAI, or from the assorted LinkedIn gurus who post about smart software.
Then I spotted “How Is ChatGPT’s Behavior Changing over Time?”
I think the authors of the paper have documented what my team member articulated to me and others working on a smart software project. The paper states in polite academic prose:
Our findings demonstrate that the behavior of GPT-3.5 and GPT-4 has varied significantly over a relatively short amount of time.
The authors provide some data, a few diagrams, and some footnotes.
What is fascinating is that the most significant item in the journal article, in my opinion, is the use of the word “drifts.” Here’s the specific line:
Monitoring reveals substantial LLM drifts.
Yep, drifts.
What exactly is a drift in a numerical mélange like a large language model, its algorithms, and its probabilistic pulsing? In a nutshell, LLMs are formed by humans and use information to some degree created by humans. The idea is that sharp corners are created from decisions and data which may have rounded corners or be the equivalent of a wad of Play-Doh after a kindergartener manipulates the stuff. Layers of numerical recipes are hooked together to output information useful to a human or system.
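As I understand the cited paper, drift gets measured the obvious way: ask the same questions of model snapshots captured at different times and compare the scores. Here is a minimal sketch of that idea in Python. The benchmark questions, the snapshot names, and the canned ask_model() behavior are all invented for illustration; a real monitor would wrap an actual chat-completion client.

```python
# Minimal drift-monitoring sketch: hold a fixed benchmark, query two model
# snapshots, compare accuracy. ask_model() fakes its answers here so the file
# runs end to end; in practice it would call a real LLM service.

BENCHMARK = [
    ("Is 7919 a prime number? Answer yes or no.", "yes"),   # 7919 is prime
    ("Is 7917 a prime number? Answer yes or no.", "no"),    # 7917 = 3 x 2639
]

def ask_model(snapshot: str, prompt: str) -> str:
    """Placeholder for a real LLM call; answers are faked for the demo."""
    if snapshot == "model-2023-03":
        return "yes" if "7919" in prompt else "no"   # earlier snapshot: both right
    return "no"                                      # later snapshot: always "no"

def accuracy(snapshot: str) -> float:
    correct = sum(
        ask_model(snapshot, prompt).strip().lower().startswith(expected)
        for prompt, expected in BENCHMARK
    )
    return correct / len(BENCHMARK)

march = accuracy("model-2023-03")   # hypothetical snapshot identifiers
june = accuracy("model-2023-06")
print(f"Benchmark accuracy: {march:.0%} -> {june:.0%} (drift: {june - march:+.0%})")
```

Run something like this monthly against a production prompt set and the drift my librarian colleague noticed stops being an anecdote and becomes a number.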
Those who worked with early versions of the Autonomy Neuro Linguistic black box know about the Play-Doh effect. Train the system on a crafted set of documents (information). Run test queries. Adjust a few knobs and dials afforded by the Autonomy system. Turn it loose on the Word documents and other content for which filters were installed. Then let users run queries.
To be upfront, using the early version of Autonomy in 1999 or 2000 was pretty darned good. However, Autonomy recommended that the system be retrained every few months.
Why?
The answer, as I recall, is that as new data were encountered by the Autonomy Neuro Linguistic engine, the engine had to cope with new words, names of companies, and phrases. Without retraining, the system would use only what it had from its initial set up and tuning. Without retraining or recalibration, the Autonomy system would return results which were less useful in some situations. Operate a system without retraining, and the results would degrade over time.
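A toy illustration, invented for this post and not Autonomy's actual mechanism, shows why: a term index frozen at setup time simply cannot match the vocabulary that arrives later.

```python
# Toy illustration of vocabulary drift: an index built once never learns the
# new words, company names, and phrases that arrive after setup. The corpora
# and the query below are invented for the example.

training_docs_1999 = [
    "Autonomy ships a neuro linguistic retrieval engine",
    "Enterprise search vendors court Fortune 500 buyers",
]

def build_index(docs):
    """Record only the terms seen at setup time."""
    vocab = set()
    for doc in docs:
        vocab.update(doc.lower().split())
    return vocab

index = build_index(training_docs_1999)   # built once, never retrained

query_2001 = "fullz marketplace takedown"
matches = [term for term in query_2001.split() if term in index]
print(matches)   # [] -- the frozen index has never seen these terms
```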
Math types labor to make inference-hooked and probabilistic systems stay on course. The systems today use tricks that make a controlled vocabulary look like the tool of a dinobaby like me. Without getting into the weeds, the Autonomy system would drift.
And what does the cited paper say? “LLMs drift too.”
What does this mean? Here’s my dinobaby list of items to keep in mind:
- Smart software, if left to its own devices, will degrade over time; that is, outputs will drift from what the user wants. Feedback from users accelerates the drift because some feedback, from the smart software’s point of view, is spot on even if it is crazy or off the wall. Do this over a period of time and you get what the paper’s authors and my team member pointed out: Degradation.
- Users who know how to look at a system’s outputs and validate or identify off the mark results can take corrective action; that is, ignore the outputs or fix them up. This is not common, and it requires specialized knowledge, time, and mental sharpness. Those who depend on TikTok or a smart system may not have these qualities in equal amounts.
- Entrepreneurs want money, power, or a new Tesla. Bringing up issues about smart software growing increasingly crazy like the dinobaby down the street is not valued. Hence, substantive problems with smart systems will require time, money, and expertise to remediate. Who wants that? Smart software is designed to improve efficiency, reduce costs, and make money. The result is a group of individuals who do PR, not up-to-snuff software.
Will anyone pay attention to this cited journal article? Sure, a few interns and maybe a graduate student or two. But at this time, the trend is that AI works and AI applied to something delivers a solution. Is that solution reliable or is it just good enough? What if the outputs deteriorate in a subtle way over time? What’s the fix? Who is responsible? The engineer who fiddled with thresholds? The VP of product development who dismissed objections about inherent bias in outputs?
I think you may have an answer to these questions. As a dinobaby, I can say, “Folks, I don’t have a clue about fixing up the smart software juggernaut.” I am skeptical of those who say, “Hey, it just works.” Okay, I hope you are correct.
Stephen E Arnold, July 19, 2023
Smart Software: Good Enough Plus 18 Percent More Quality
July 19, 2023
Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.
Do I believe the information in “ChatGPT Can Turn Bad Writers into Better Ones”? No, I don’t. First, MIT is the outfit which had a special relationship with Jeffrey Epstein. Yep, that guy. Quite a pal. Second, academic outfits are known to house individuals who just make up or enhance research data. Does MIT have professors who do that? Of course not. But with Harvard professionals engaging in some ethical ballroom dancing with data, I want to be cautious. (And, please, navigate to the original write up and read the report. Subscribe too because Mr. Epstein is indisposed and unable to contribute to the academic keel of the scholarly steamboat.)
What counts, however, is perception, not reality. The write up fosters some Chemical Guys’s shine on information, so let’s take a look. It will be a shallow one because that is the spirit of some research today, and this dinobaby wants to get with the program. My writing may be lousy, but I do it myself, which seems to go against the current trend.
Here’s the core point in the write up from my point of view in rural Kentucky, a state known for its intellectual rigor and fine writing about basketball:
A new study by two MIT economics graduate students … suggests it could help reduce gaps in writing ability between employees. They found that it could enable less experienced workers who lack writing skills to produce work similar in quality to that of more skilled colleagues.
The point in my opinion is that cheaper workers can do what more expensive workers can do.
Just to drive home the point, the write up included this passage:
The writers who chose to use ChatGPT took 40% less time to complete their tasks, and produced work that the assessors scored 18% higher in quality than that of the participants who didn’t use it.
The MidJourney highly original art system produced this picture of an accountant, trained online by the once proud University of Phoenix, who manifests great joy upon discovering that smart software can produce marketing and PR collateral faster, cheaper, and better than a disgruntled English major wanting to rent a larger apartment in a big city. The accountant seems to be sitting in a modest thundershower of budget surplus.
For many, MIT has heft. Therefore, will this write up and the expert researchers’ data influence people; for instance, owners of marketing, SEO, reputation management, and PR companies?
Yep.
Observations:
- Layoffs will be accelerating.
- Good enough becomes outstanding when financial benefits are fungible.
- Assurances about employment security will be irrelevant.
And what about those MIT graduates? Better get a degree in math, computer science, engineering, or medieval English poetry. No, strike that medieval English poetry. Substitute “prompt engineer” or museum guide in Albania.
Stephen E Arnold, July 19, 2023
AI-Search Tool Talpa Burrows Into Library Catalogues
July 19, 2023
Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.
For a few years now, libraries have been able to augment their online catalogue with enrichment services from Syndetics Unbound, which adds details and imagery to each entry. Now the company is incorporating new AI capabilities, we learn from its write-up, “Introducing Talpa Search.” Talpa is still experimental and is temporarily available to libraries already using Syndetics Unbound.
A book lover in action. Thanks MidJourney. You made me more appealing than I was in 1951 when I got kicked out of the library for reading books for adults, not stuff about Freddy the Pig.
Participating libraries will get a year of the service for free. We cannot know just how much they will be saving, though, since the pricing remains a mystery. Writer Tim Spalding describes how Talpa works:
“First, Talpa queries large language models (from Claude AI and ChatGPT) for books and other media. Critically, every item is checked against true and authoritative bibliographic data, solving the problem of invented answers (called ‘hallucinations’) that such models can fall into. Second, Talpa uses the natural-language abilities of large language models to parse and understand queries, which are then answered using traditional library data. Thus a search for ‘novels about World War II in France’ is broken down into subjects and tags and answered with results from the library’s collection. Our authoritative book data comes from Syndetics Unbound, Bowker and LibraryThing. Surprisingly, Talpa’s ability to find books by their cover design isn’t powered by AI at all, but by the effort of thousands of book lovers who have played LibraryThing’s CoverGuess cover-tagging game since 2010!”
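Strip away the mole mascot and the safeguard described above is straightforward: let the language model propose candidates, then keep only what an authoritative catalogue confirms. Here is a minimal sketch of that pattern; the helper functions, the catalogue contents, and the invented second title are stand-ins, not Talpa's or Syndetics' actual code or data.

```python
from typing import Optional

def llm_suggest_titles(query: str) -> list:
    """Hypothetical call asking a language model for candidate titles."""
    # Faked for the demo: one real-looking hit and one invented title.
    return ["Suite Francaise", "The Phantom Resistance of Lyon"]

def catalogue_lookup(title: str) -> Optional[dict]:
    """Hypothetical lookup against authoritative bibliographic records."""
    catalogue = {
        "Suite Francaise": {"title": "Suite Francaise", "author": "Irene Nemirovsky"},
    }
    return catalogue.get(title)

def verified_search(query: str) -> list:
    results = []
    for title in llm_suggest_titles(query):
        record = catalogue_lookup(title)
        if record is not None:          # drop invented ("hallucinated") titles
            results.append(record)
    return results

print(verified_search("novels about World War II in France"))
```

The invented title never reaches the user because no authoritative record backs it up; that is the whole trick, as the write-up describes it.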
Interesting. If you don’t happen to be part of a library using Syndetics, you can try Talpa out at one of the three libraries linked to in the post. The tool sports a cute mole mascot and, to add a bit of personality, supplies mole facts beneath the search bar. As with many AI tools, the functionality has plenty of room to grow. For example, my search for “weaving velvet” did return a few loom-centered books scattered through the results but more prominently suggested works of fiction or philosophy that simply contained “velvet” in the title. (Including, adorably, several versions of “The Velveteen Rabbit.”) The write-up does not share when the tool will be available more widely, but we hope it will be more refined when it is. Is it AI? Isn’t everything?
Cynthia Murrell, July 19, 2023
When Wizards Flail: The Mysteries of Smart Software
July 18, 2023
Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.
How about that smart software stuff? VCs are salivating. Whiz kids are emulating Sam AI-man. Users are hoping there is a job opening for a Wal-Mart greeter. But there is a hitch in the git along; specifically, some bright experts are not able to understand what smart software does to generate output. The cloud of unknowing is thick and has settled over the Land of Obfuscation.
“Even the Scientists Who Build AI Can’t Tell You How It Works” has a particularly interesting kicker:
“We built it, we trained it, but we don’t know what it’s doing.”
A group of artificial intelligence engineers struggling with the question, “What the heck is the system doing?” A click of the slide rule for MidJourney for this dramatic depiction of AI wizards at work.
The write up (which is an essay-interview confection) includes some thought-provoking comments. Here are five; you can visit the cited article for more scintillating insights. A toy sketch of the nudge described in Item 1 appears after the list:
Item 1: “… with reinforcement learning, you say, “All right, make this entire response more likely because the user liked it, and make this entire response less likely because the user didn’t like it.”
Item 2: “… The other big unknown that’s connected to this is we don’t know how to steer these things or control them in any reliable way. We can kind of nudge them…”
Item 3: “We don’t have the concepts that map onto these neurons to really be able to say anything interesting about how they behave.”
Item 4: “… we can sort of take some clippers and clip it into that shape. But that doesn’t mean we understand anything about the biology of that tree.”
Item 5: “… because there’s so much we don’t know about these systems, I imagine the spectrum of positive and negative possibilities is pretty wide.”
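Item 1 is the one mechanism in the list the interviewee can actually describe: make a response the user liked more likely, make a disliked one less likely. Here is a toy sketch of that nudge, reduced to sampling weights in a lookup table. Real reinforcement learning from human feedback adjusts model parameters, not a table, so treat this only as an illustration of the direction of the update.

```python
import random

# Toy "make this response more likely" loop, invented for illustration only.

responses = ["response A", "response B", "response C"]
weights = {r: 1.0 for r in responses}

def sample_response() -> str:
    choices, w = zip(*weights.items())
    return random.choices(choices, weights=w, k=1)[0]

def record_feedback(response: str, liked: bool, step: float = 0.5) -> None:
    # Nudge the whole response up or down, as the interviewee describes.
    weights[response] *= (1 + step) if liked else (1 - step)

reply = sample_response()
record_feedback(reply, liked=True)    # this reply is now more likely next time
```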
For more of this type of “explanation,” please, consult the source document cited above.
Several observations:
- I like the nudge and watch approach. Humanoids learning about what their code does may be useful.
- The nudging is subjective (a human skill), as is the reference to growing a tree without knowing exactly how that works. Just do the bonsai thing. Interesting, but is it efficient? Will it work? Sure, or at least as far as Silicon Valley thinking permits.
- The wide spectrum of good and bad. My reaction is to ask the striking writers and actors what their views of the bad side of the deal are. What if the writers get frisky and start throwing spit balls or (heaven forbid) old IBM Selectric type balls? Scary.
Net net: Perhaps Google knows best? Tensors, big computers, need for money, and control of advertising — I think I know why Google tries so hard to frame the AI discussion. A useful exercise is to compare what Google’s winner in the smart software power struggle has to say about Google’s vision. You can find that PR emission at this link. Be aware that the interviewer’s questions are almost as long as the interview subject’s answers. Does either suggest downsides comparable to the five items cited in this blog post?
Stephen E Arnold, July 18, 2023
Financial Analysts, Lawyers, and Consultants Can See Their Future
July 17, 2023
It is the middle of July 2023, and I think it is time for financial analysts, lawyers, and consultants to spruce up their résumés. Why would a dinobaby make such a suggestion to millions of the beloved Millennials, GenXers, the adorable GenY folk, and the vibrant GenZ lovers of TikTok, BMWs, and neutral colors?
I read three stories helpfully displayed by my trusty news reader. Let’s take a quick look at each and offer a handful of observations.
The first article is “This CEO Replaced 90% of Support Staff with an AI Chatbot.” The write up reports:
The chief executive of an Indian startup laid off 90% of his support staff after the firm built a chatbot powered by artificial intelligence that he says can handle customer queries much faster than his employees.
Yep, better, faster, and cheaper. Pick all three, which is exactly what some senior managers will do. AI is now disrupting. But what about “higher skill” jobs than talking on the phone and looking up information for a clueless caller?
The second article is newsy or is it newsie? “Open AI and Associated Press Announce Partnership to Train AI on News Articles” reports:
[The deal] will see OpenAI licensing text content from the AP archives that will be used for training large language models (LLMs). In exchange, the AP will make use of OpenAI’s expertise and technology — though the media company clearly emphasized in a release that it is not using generative AI to help write actual news stories.
Will these stories become the property of the AP? Does Elon Musk have confidence in himself?
Young professionals learning that they are able to find their future elsewhere. In the MidJourney confection are a lawyer, a screenwriter, and a consultant at a blue chip outfit selling MBAs at five times the cost of their final year at university.
I think that the move puts Google in a bit of a spot if it processes AP content and a legal eagle can find that content in a Bard output. More significantly, hasta la vista reporters. Now the elimination of hard working, professional journalists will not happen immediately. However, from my vantage point in rural Kentucky, I hear the train a-rollin’ down the tracks. Whooo Whooo.
The third item is “Producers Allegedly Sought Rights to Replicate Extras Using AI, Forever, for Just $200.” The write up reports:
Hollywood’s top labor union for media professionals has alleged that studios want to pay extras around $200 for the rights to use their likenesses in AI – forever – for just $200.
Will the unions representing these skilled professionals refuse to cooperate? Does Elon Musk like Grimes’s music?
A certain blue chip consulting firm has made noises about betting $2 billion on smart software and Microsoft consulting. Oh, oh. Junior MBAs, it may not be too late to get an associate of arts degree in modern poetry so you can work as a prompt engineer. As a famous podcasting person says, “What say you?”
Several questions:
- Will trusted, reliable, research supporting real news organizations embrace smart software and say farewell to expensive humanoids?
- Will those making videos use computer generated entities?
- Will blue chip consulting firms find a way to boost partners’ bonuses standing on the digital shoulders of good enough software?
I sure hope you answered “no” to each of these questions. I have a nice two cruzeiro collectible from Brazil, circa 1952, to sell you. Make me an offer. Collectible currency is an alternative to writing prompts or becoming a tour guide in Astana. Oh, that’s in Kazakhstan.
Smart software is a cost reducer because humanoids [a] require salaries and health care, [b] take vacations, [c] create security vulnerabilities or are security vulnerabilities, and [d] require more than high school science club management methods related to sensitive issues.
Money and good enough will bring changes in news, Hollywood, and professional services.
Stephen E Arnold, July 17, 2023
AI Analyzed by a Human from Microsoft
July 14, 2023
Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.
“Artificial Intelligence Doesn’t Have Capability to Take Over, Microsoft Boss Says” provides some words of reassurance when Sam AI-Man’s team are suggesting annihilation of the human race. Here are two passages I found interesting in the article-as-interview write up.
This is an illustration of a Microsoft training program for its smart future employees. Humans will learn or be punished by losing their Microsoft 365 account. The picture is a product of the gradient surfing MidJourney.
First snippet of interest:
“The potential for this technology to really drive human productivity… to bring economic growth across the globe, is just so powerful, that we’d be foolish to set that aside,” Eric Boyd, corporate vice president of Microsoft AI Platforms told Sky News.
Second snippet of interest:
“People talk about how the AI takes over, but it doesn’t have the capability to take over. These are models that produce text as output,” he said.
Now what about this passage posturing as analysis:
Big Tech doesn’t look like it has any intention of slowing down the race to develop bigger and better AI. That means society and our regulators will have to speed up thinking on what safe AI looks like.
I wonder if anyone is considering that AI in the hands of Big Tech might have some interest in controlling some of the human race. Smart software seems ideal as an enabler of predatory behavior. Regulators thinking? Yeah, that’s a posture sure to deal with smart software’s applications. Microsoft, do you believe this colleague’s marketing hoo hah?
Stephen E Arnold, July 14, 2023
What, Google? Accuracy Through Plagiarism
July 14, 2023
Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.
Now that AI is such a hot topic, tech companies cannot afford to hold back due to small flaws. Like a tendency to spit out incorrect information, for example. One behemoth seems to have found a quick fix for that particular wrinkle: simple plagiarism. Eager to incorporate AI into its flagship Search platform, Google recently released a beta version to select users. Forbes contributor Matt Novak was among the lucky few and shares his observations in, “Google’s New AI-Powered Search Is a Beautiful Plagiarism Machine.”
The blacksmith says, “Oh, oh, I think I have set my shop on fire.” The image is the original work of the talented MidJourney system.
The author takes us through his query and results on storing live oysters in the fridge, complete with screenshots of the Googlebot’s response. (Short answer: you can for a few days if you cover them with a damp towel.) He highlights passages that were lifted from websites, some with and some without tiny tweaks. To be fair, Google does link to its source pages alongside the pilfered passages. But why click through when you’ve already gotten what you came for? Novak writes:
“There are positive and negative things about this new Google Search experience. If you followed Google’s advice, you’d probably be just fine storing your oysters in the fridge, which is to say you won’t get sick. But, again, the reason Google’s advice is accurate brings us immediately to the negative: It’s just copying from websites and giving people no incentive to actually visit those websites. Why does any of this matter? Because Google Search is easily the biggest driver of traffic for the vast majority of online publishers, whether it’s major newspapers or small independent blogs. And this change to Google’s most important product has the potential to devastate their already dwindling coffers. … Online publishers rely on people clicking on their stories. It’s how they generate revenue, whether that’s in the sale of subscriptions or the sale of those eyeballs to advertisers. But it’s not clear that this new form of Google Search will drive the same kind of traffic that it did over the past two decades.”
Might Google be like a blacksmith who accidentally sets fire to his workshop? Content is needed to make the fires of revenue burn brightly. No content, problem?
Cynthia Murrell, July 14, 2023
Refining Open: The AI Weak Spot during a Gold Rush
July 13, 2023
Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.
Nope, no reference will I make to selling picks and denim pants to those involved in a gold rush. I do want to highlight the essay “AI Weights Are Not Open Source.” There is a nifty chart with rows and columns setting forth some conceptual facets of smart software. Please, navigate to the cited document so you can read the text in the rows and columns.
For me, the most important sentence in the essay is this one:
Many AI weights with the label “open” are not open source.
How are these “weights” determined or contrived? Are these weights derived by proprietary systems and methods? Are these weights assigned by a subject matter expert, a software engineer using guess-timation, or low wage workers pressed into the task?
The answers to these questions reveal how models are configured to generate “good enough” results. Present models are prone to providing incomplete, incorrect, or pastiche information.
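A small sketch may make the “weights are not source” point concrete. Assuming a PyTorch-style state dict (the parameter names and shapes below are invented), the tensors can be loaded and inspected, but nothing in them reveals the data, the filtering, or the human judgments that produced the numbers.

```python
import torch

# Invented stand-in for a downloaded "open weights" checkpoint: a plain state
# dict of named tensors (a real file would arrive via torch.load(path)).
state = {
    "embed.weight": torch.zeros(10, 4),               # toy shapes, not a real model
    "layers.0.attention.weight": torch.zeros(4, 4),
}

# What the file gives you: parameter names, shapes, values.
# What it does not give you: the data, filters, and human judgments behind them.
for name, tensor in state.items():
    print(name, tuple(tensor.shape))
```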
Furthermore, the popularity of obtaining images of Mr. Trump in an orange jumpsuit illustrates how “censorship” is applied to certain requests for information. Try it yourself. Navigate to MidJourney. Jump through the Discord hoops. Input the command “President Donald Trump in an orange jumpsuit.” Get the improper request flag. Then ask yourself, “How does BoingBoing keep creating Mr. Trump in an orange jumpsuit?”
Net net: The power of AI rests with the weights and controls which allow certain information and disallow other types of information. “Open” does not mean open like “the door is open.” For AI, “open” is, in my opinion, a means to obtain power and exert control.
Stephen E Arnold, July 13, 2023
Understanding Reality: A Job for MuskAI
July 12, 2023
Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid
I read “Elon Musk Launches His Own xAI Biz to Understand Reality.” Upon reading this article, I was immediately perturbed. The name of the company should be MuskAI (pronounced mus-key, like the lovable muskox, Ovibos moschatus). This imposing and aromatic animal can tip the scales at up to 900 pounds. Take that to the cage match and watch the opposition wilt or at least scrunch up its nose.
I also wanted to interpret the xAI as AIX. IBM, discharger of dinobabies, could find that amusing. (What happens when AIX memory is corrupted? Answer: Aches in the posterior. Snort snort.)
Finally, my thoughts coalesced around the name Elon-AI, illustrated below by the affable MidJourney:
Bummer. Elon AI is the name of a “coin.” And the proper name Elonai means “a person who has the potential to attain spiritual enlightenment.” A natural!
The article reports:
Elon Musk is founding his own AI company with some lofty ambitions. According to the billionaire, his xAI venture is being formed “to understand reality.” Those hoping to get a better explanation than Musk’s brief tweet by visiting xAI’s website won’t find much to help them understand what the company actually plans to do there, either. “The goal of xAI is to understand the true nature of the universe,” xAI said of itself…
I have a number of questions. Let me ask one:
Will Elon AI go after the Zuck AI?
And another:
Will the two AIs power an unmanned fighter jet, each loaded with live ordnance?
And the must-ask:
Will the AIs attempt to kill one another?
The mano-a-mano fight in Las Vegas (maybe in that weird sphere appliquéd with itsy bitsy LEDs) is less interesting to me than watching two warbirds from the Dayton Air Museum gear up and dogfight.
Imagine a YouTube video, then some TikToks, and finally a Netflix original released to the few remaining old-fashioned theaters.
That’s entertainment. Sigh. I mean xAI.
Stephen E Arnold, July 12, 2023
Open AI and Its Alignment Pipeline
July 12, 2023
Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.
Yep, alignment pipeline. No, I have zero clue what that means. I came across this felicitous phrase in “OpenAI Co-Founder Warns Superintelligent AI Must Be Controlled to Prevent Possible Human Extinction.” The “real news” story focuses on the PR push for Sam AI-Man’s OpenAI outfit. The idea for the story strikes me as a PR confection, but I am a dinobaby. Dinobabies can be skeptical.
An OpenAI professional explains to some of his friends that smart software may lead to human extinction. Maybe some dogs and cockroaches will survive. He points out that his company may save the world with an alignment pipeline. The crowd seems to be getting riled up. Someone says, “What’s an alignment pipeline?” A happy honk from the ArnoldIT logo to the ever-creative MidJourney system. (Will it be destroyed too?)
The write up reports a quote from one of Sam AI-Man’s colleagues; to wit:
“Superintelligence will be the most impactful technology humanity has ever invented, and could help us solve many of the world’s most important problems. But the vast power of superintelligence could also be very dangerous, and could lead to the disempowerment of humanity or even human extinction,” Ilya Sutskever and head of alignment Jan Leike wrote in a Tuesday blog post, saying they believe such advancements could arrive as soon as this decade.
There you go. Global warming, the threat of nuclear discharges in Japan and Ukraine, post-Covid hangover, and human extinction. Okay.
What’s interesting to this dinobaby is that OpenAI decided to make the cloud service available. OpenAI hooked up with the thoughtful, kind, and humane Microsoft. OpenAI forced the somewhat lethargic Googzilla to shift into gear and respond.
The Murdoch article presents another OpenAI wizard output:
“Currently, we don’t have a solution for steering or controlling a potentially superintelligent AI, and preventing it from going rogue. Our current techniques for aligning AI, such as reinforcement learning from human feedback, rely on humans’ ability to supervise AI. But humans won’t be able to reliably supervise AI systems much smarter than us and so our current alignment techniques will not scale to superintelligence,” they wrote. “We need new scientific and technical breakthroughs.”
This type of jibber jabber is fascinating. I wonder why the OpenAI folks did not do a bit of that “what if” thinking before making the service available. Yeah, woulda, shoulda, coulda. It sounds to me like a driver saying to a police officer, “I didn’t mean to run over Grandma Wilson.”
How does that sound to the grand children, Grandma’s insurance company, and the judge?
Sounds good, but someone ran over Grandma Wilson, right, Mr. OpenAI wizards? Answer the question, please.
The OpenAI geniuses have an answer, and I quote:
To solve these problems, within a period of four years, they said they’re leading a new team and dedicating 20% of the compute power secured to date to this effort. “While this is an incredibly ambitious goal and we’re not guaranteed to succeed, we are optimistic that a focused, concerted effort can solve this problem,” they said.
Now the capstone:
Its goal is to devise a roughly human-level automated alignment researcher, using vast amounts of compute to scale it and “iteratively align superintelligence.” In order to do so, OpenAI will develop a scalable training method, validate the resulting model and then stress test its alignment pipeline.
Yes, the alignment pipeline. What a crock of high school science club yip yap. Par for the course today. Nice thinking, PR people. One final thought: Grandma is dead. CYA words may not impress some people. To a high school science club type, the logic and the committee make perfect sense. Good work, Mr. AI-Men.
Stephen E Arnold, July 12, 2023