Rapid Change: The Technological Meteor Causing Craziness
September 6, 2024
This essay is the work of a dumb dinobaby. No smart software required.
The mantra “Move fast and break things” creates opportunities for entrepreneurs and mental health professionals. “Eminent Scientist Richard Dawkins Reveals Fascinating Theory Behind West’s Mental Health Crisis” quotes Dr. Dawkins:
‘Certainly, the rate at which we are evolving genetically is minuscule compared to the rate at which we are evolving non-genetically, culturally,’ Dawkins told the hosts of the TRIGGERnometry podcast. ‘And much of the mental illness that afflicts people may be because we are in a constantly changing unpredictable environment,’ the biologist added, ‘in a way that our ancestors were not.’
Thanks, Microsoft Copilot. Is that a Windows Phone doing the flame out thing?
The write up reports:
Dawkins expressed more direct concerns with other aspects of human technology’s impact on evolution: climate change and basic self-reliance in the face of a new Dark Age. ‘The internet is a huge change, it’s gigantic change,’ he noted. ‘We’ve become adapted to it with astonishing rapidity.’ ‘If we lost electricity, if we suddenly lost the technology we’re used to,’ Dawkins worried, humanity might not be able to even ‘begin’ to adapt in time, without great social upheaval and death… ‘Man-made extinction,’ he said, ‘it’s just as bad as the others. I think it’s tragic.’
There you go, death.
I know that brilliant people often speak carefully. Experts take time to develop their knowledge base and put words together that make complex ideas easy to understand.
From my redoubt in rural Kentucky, I have watched the panoply of events parading across my computer monitor. Among the notable moments were:
- Images from US cities showing homeless people slumped over, either scrolling on their mobile phones or feeling the impact of certain compounds on their bodies
- Young people looting stores and noting similar items offered for sale on Craigslist.com-type sites
- Graphs of US academic performance illustrating the winners and losers of educational achievement tests
- The number of people driving around at times I associated with being in an office at “work” when I was younger
- Advertisements for prescription drugs with peculiar names and high-resolution images of people with smiles and contented lives but for the unnamed disease plaguing the otherwise cheerful folk.
What are the links between these unrelated situations and online access? I think I have a reasonably good idea. Why have experts, parents, and others required decades to figure out that flows of information are similar to sand-blasting systems? Provide electronic information to an organization, and it begins to decompose. The “bonds” which hold the people, processes, and products together are weakened. Then some break. Pump electronic information into younger people. They begin to come apart too. Give college students a tool to write their essays. Like lemmings, many take the AI solution and watch TikToks.
I am pleased that Dr. Dawkins has identified a problem. Now what’s the fix? The digital meteor has collided with human civilization. Can the dinosaurs be revivified?
Stephen E Arnold, September 6, 2024
Google and Search: A Fix or a Pipe Dream?
September 6, 2024
This essay is the work of a dumb dinobaby. No smart software required.
I read “Dawn of a New Era in Search: Balancing Innovation, Competition, and Public Good.”
Don’t get me wrong. I think multiple search systems are a good thing. The problem is that search (both enterprise and Web) is a difficult problem, and difficult problems are expensive to solve. After working more than 50 years in electronic information, I have seen search systems come and go. I have watched systems morph from search into weird products that hide the search plumbing beneath fancy words like business intelligence and OSINT tools, among others. In 2006 or 2007, one of my financial clients published some of our research. The bank received an email from an “expert” (formerly at Verity) claiming that his firm had better technology than Google. In that conversation, that “expert” said, “I can duplicate Google search for $300 million.” The person who said these incredibly uninformed words is now head of search at Google. Ed Zitron has characterized the individual as the person who killed Google search. Well, that fellow and Google search are still around. This suggests that baloney and high school reunions provide a career path for some people. But search is not understood particularly well at Google at this time. It is not surprising, therefore, that the problems of search remain unknown to judges, search engine marketing experts, developers of metasearch systems which recycle Bing results, and most of the poohbahs writing about search in blogs like Beyond Search.
The poor search kids see the rich guy with lots of money. The kids want it. The situation is not fair to those with little or nothing. Will the rich guy share the money? Thanks, Microsoft Copilot. Good enough. Aren’t you one of the poor Web search vendors?
After five decades of arm wrestling with finding on point information for myself, my clients, and for the search-related start ups with whom I have worked, I have an awareness of how much complexity the word “search” obfuscates. There is a general perception that Google indexes the Web. It doesn’t. No one indexes the Web. What’s indexed are publicly exposed Web pages which a crawler can access. If the response is slow (like many government and underfunded personal / commercial sites), spiders time out. The pages are not indexed. The crawlers have to deal successfully with changes in how Web pages are presented. Upon encountering something for which the crawler is not configured, the Web page is skipped. Certain Web sites are dynamic. The crawler has to cope with these. Then there are Web pages which are not composed of text. The problems are compounded by the vagaries of intermediaries’ actions; for example, what’s being blocked or filtered today? The answer is that the crawler skips them.
Without revealing information I am not permitted to share, I want to point out that crawlers have a list which contains bluebirds, canaries, and dead ducks. The bluebirds are indexed by crawlers on an aggressive schedule, maybe multiple times every hour. The canaries are the index-on-a-normal-cycle, maybe once every day or two. The dead ducks are crawled when time permits. Some US government Web sites may not be updated in six or nine months. The crawler visits the site once every six months or even less frequently. Then there are forbidden sites which the crawler won’t touch. These are on the open Web, but their URLs are passed around via private messages. In terms of a Web search, these sites don’t exist.
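To make the tiering idea concrete, here is a minimal sketch in Python. It is not any vendor’s actual scheduler; the tier names come from the paragraph above, while the recrawl intervals, host names, and forbidden list are assumptions for illustration. The point is simply that a priority queue plus a per-tier interval reproduces the bluebird / canary / dead duck behavior, and that a site on the forbidden list never enters the index at all.

```python
import heapq
import time

# Illustrative recrawl intervals per tier (assumed values, not any search vendor's).
RECRAWL_INTERVALS = {
    "bluebird": 30 * 60,             # high-value sites: multiple visits per hour
    "canary": 2 * 24 * 60 * 60,      # normal sites: once every day or two
    "dead_duck": 180 * 24 * 60 * 60, # rarely updated sites: roughly every six months
}

# Sites the crawler will not touch; in terms of Web search they do not exist.
FORBIDDEN = {"private-forum.example"}


class CrawlScheduler:
    def __init__(self):
        self._queue = []  # heap of (next_crawl_time, host, tier)

    def add(self, host, tier):
        if host in FORBIDDEN:
            return  # skipped entirely, so the site never appears in the index
        heapq.heappush(self._queue, (time.time(), host, tier))

    def due(self, now=None):
        """Pop every host whose recrawl time has arrived and reschedule it for its tier."""
        now = now or time.time()
        ready = []
        while self._queue and self._queue[0][0] <= now:
            _, host, tier = heapq.heappop(self._queue)
            ready.append(host)
            heapq.heappush(self._queue, (now + RECRAWL_INTERVALS[tier], host, tier))
        return ready


scheduler = CrawlScheduler()
scheduler.add("news.example.com", "bluebird")              # hypothetical hosts
scheduler.add("smalltown-agency.example.gov", "dead_duck")
scheduler.add("private-forum.example", "bluebird")         # silently dropped
print(scheduler.due())  # both non-forbidden hosts are due on the first pass
```

Run it once and both accessible hosts come back; the forbidden one never does, which is the practical meaning of “these sites don’t exist” for a Web search engine.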
How much does this cost? The answer is, “At scale, a lot. Indexing a small number of sites is really cheap.” The problem is that in order to pull lots of clicks, one has to have the money to scale or a niche no one else is occupying. Those are hard to find, and when one does, it makes sense to slap a subscription fee on them; for example, POISINDEX.
Why am I running through what strikes me as basic information about searching the Web? “Dawn of a New Era in Search: Balancing Innovation, Competition, and Public Good” is interesting and does a good job of expressing a specific view of Web search and Google’s content and information assets. I want to highlight the section of the write up titled “The Essential Facilities Doctrine.” The idea is that Google’s search index should be made available to everyone. The idea is interesting, and it might work after legal processes in the US are exhausted. The gating factor will be money and the political climate.
From a competitor’s point of view, the index blended with new ideas about how to answer a user’s query would level the playing field. From Google’s point of view, it would mean a loss of intellectual property.
Several observations:
- The hunger to punish Big Tech seems to demand satisfaction. Something will come from the judicial decision that Google is a monopoly. It took a couple of decades to arrive at what was obvious to some after the Yahoo ad technology settlement prior to the IPO, but most people didn’t and still don’t get “it.” So something will happen. What that something is remains unknown.
- Wide access to the complete Google index could threaten the national security of the US. Please, think about this statement. I can’t provide any color, but it is a consideration among some professionals.
- An appeal could neutralize some of the “harms,” yet allow the indexing business to continue. Specific provisions might be applied to the decision of Judge Mehta. A modified landscape for search could be created, but online services tend to coalesce into efficient structures. Consider the break up of AT&T: the seven Baby Bells and Bell Labs have become AT&T and Verizon. The same could happen if “ads” were severed from Web search. After a period of time, any break up runs into one of the Arnold Laws of Online: a single monopoly is more efficient and tends to re-emerge.
To sum up, the time for action came and, like a train in Switzerland, left on time. Undoing Google is going to be more difficult than fiddling with Standard Oil or the railroad magnates.
Stephen E Arnold, September 6, 2024
Is Open Source Doomed?
September 6, 2024
Open source cheerleaders may need to find a new team to root for. Web developer and blogger Baldur Bjarnason describes “The Slow Evaporation of the Free/Open Source Surplus.” He notes he is joining a conversation begun by Tara Tarakiyee with the post “Is the Open Source Bubble about to Burst?” and continued by Ben Werdmuller.
Bjarnason begins by specifying what has made open source software possible up until now: surpluses in both industry (high profit margins) and labor (well-paid coders with plenty of free time). Now, however, both surpluses are drying up. The post lists several reasons for this. First, interest rates remain high. Next, investment dollars are going to AI, which “doesn’t really do real open source.” There were also waves of tech layoffs and cost-cutting after post-pandemic overspending. Severe burnout from a thankless task does not help. We are reminded:
“Very few FOSS projects are lucky enough to have grown a sustainable and supportive community. Most of the time, it seems to be a never-ending parade of angry demands with very little reward.”
Good point. A few other factors, Bjarnason states, make organizations less likely to invest in open source:
- “Why compete with AWS or similar services that will offer your own OSS projects at a dramatically lower price?
- Why subsidise projects of little to no strategic value that don’t contribute anything meaningful to the bottom line?
- Why spend time on OSS when other work is likely to have higher ROI?
- Why give your work away to an industry that treats you as disposable?”
Finally, Bjarnason suspects even users are abandoning open source. One factor: developers who increasingly reach for AI generated code instead of searching for related open source projects. Ironically, those LLMs were trained on open source software in the first place. The post concludes:
“Best case scenario, seems to me, is that Free and Open Source Software enters a period of decline. After all, that’s generally what happens to complex systems with less investment. Worst case scenario is a vicious cycle leading to a collapse:
- Declining surplus and burnout leads to maintainers increasingly stepping back from their projects.
- Many of these projects either bitrot into serious bugs or get taken over by malicious actors who are highly motivated because they can’t rely on pervasive memory bugs anymore for exploits.
- OSS increasingly gets a reputation (deserved or not) for being unsafe and unreliable.
- That decline in users leads to even more maintainers stepping back.”
Bjarnason notes it is possible some parts of the Open Source ecosystem will not crash and burn. Overall, though, the outlook seems bleak.
Cynthia Murrell, September 6, 2024
Hey, Alexa, Why Does Amazon AI Flail?
September 5, 2024
This essay is the work of a dumb dinobaby. No smart software required.
Amazon has its work cut out for itself. The company has those pesky third-party vendors shipping “interesting” products to customers and then ignoring complaints. Amazon is on the radar of some legal eagles in the EU and the US. Now the company has found itself in an unusual situation: Its super duper smart software does not work. The fix, if the information in “Gen AI Alexa to Use Anthropic Tech After It Struggled for Words with Amazon’s” is correct, is to use Anthropic AI technology. Hey, why not? Amazon allegedly invested $5 billion in the company. Maybe that implementation of Google technology will do the trick?
The mother is happy with Alexa’s answers. The weird sounds emitted from the confused device surprise her daughter. Thanks, MSFT Copilot. Good enough.
The write up reports:
Amazon demoed a generative AI version of Alexa in September 2023 and touted it as being more advanced, conversational, and capable, including the ability to do multiple smart home tasks with simpler commands. Gen AI Alexa is expected to come with a subscription fee, as Alexa has reportedly lost Amazon tens of billions of dollars throughout the years. Earlier reports said the updated voice assistant would arrive in June, but Amazon still hasn’t confirmed an official release date.
A year later, Amazon is punting and giving the cash furnace Alexa more brains courtesy of Anthropic. Will the AI wizards working on Amazon’s own AI have a chance to work in one of the Amazon warehouses?
Ars Technica says without a trace of irony:
The previously announced generative AI version of Amazon’s Alexa voice assistant “will be powered primarily by Anthropic’s Claude artificial intelligence models,” Reuters reported today. This comes after challenges with using proprietary models, according to the publication, which cited five anonymous people “with direct knowledge of the Alexa strategy.”
Amazon has a desire to convert the money-losing Alexa into a gold mine, or at least a modest one.
This report, if accurate, suggests some interesting sparkles on the Bezos bulldozer’s metal flake paint; to wit:
- The two pizza team approach to technology did not work for either Alexa (the money loser) or the home grown AI money spinner. What other Amazon technologies are falling short of the mark?
- How long will it take to get a money-generating Alexa working and into the hands of customers eager for a better Alexa experience and a monthly or annual subscription for the new Alexa? A year has been lost already, and Alexa users continue to ask for the weather and a timer for cooking broccoli.
- What happens if the product, its integration with smart TVs, and the Ring doorbell turn out to be like a Pet Rock, a fad that has come and gone, replaced by smart watches and mobile phones? The answer: Collectibles!
Why am I questioning Amazon’s technology competency? The recent tie up between Microsoft and Palantir Technologies makes clear that Amazon’s cloud services don’t have the horsepower to pull government sales. When these pieces are shifted around, the resulting puzzle says to me, “Amazon is flailing.” Consider this: AI was beyond the reach of a big money outfit like Amazon. There’s a message in that factoid.
Stephen E Arnold, September 5, 2024
Uber Leadership May Have to Spend Money to Protect Drivers. Wow.
September 5, 2024
This essay is the work of a dumb dinobaby. No smart software required.
Senior managers — now called “leadership” — care about their employees. I added a wonderful example about corporate employee well being and co-worker sensitivity when I read “Wells Fargo Employee Found Dead in Her Cubicle 4 Days After She Clocked in for Work.” One of my team asked me, “Will leadership at that firm check her hours of work so she is not overpaid for the day she died?” I replied, “You will make a wonderful corporate leader one day.” Another analyst asked, “Didn’t the cleaning crew notice?” I replied, “Not when they come once every two weeks.”
Thanks, MSFT Copilot. Good enough given your filters.
A similar approach to employee care popped up this morning. My newsreader displayed this headline: “Ninth Circuit Rules Uber Had Duty to Protect Washington Driver Murdered by Passengers.” The write up reported:
The estate of Uber driver Cherno Ceesay sued the rideshare company for negligence and wrongful death in 2021, arguing that Uber knew drivers were at risk of violent assault from passengers but neglected to install any basic safety measures, such as barriers between the front and back seats of Uber vehicles or dash cameras. They also claimed Uber failed to employ basic identity-verification technology to screen out the two customers who murdered Ceesay — Olivia Breanna-Lennon Bebic and Devin Kekoa Wade — even though they opened the Uber account using a fake name and unverified form of payment just minutes before calling for the ride.
Hold it right there. The reason behind the alleged “failure” may be the cost of barriers, dash cams, and identity verification technology. Uber is a Big Dog high technology company. Its software manages rides, maps, payments, and the outstanding Uber app. If you want to know where your driver is, text the professional. Want to know the percentage of requests matched to drivers from a specific geographic point? Forget that, gentle reader. Request a ride and wait for a confirmation. Oh, what if a pick up is cancelled after a confirmation? Fire up Lyft, right?
The cost of providing “basic” safety for riders is what helps make old fashioned taxi rides slightly more “safe.” At one time, Uber was cheaper than a weirdly painted taxi with a snappy phone number like 666 6666 or 777 7777 painted on the side. Now that taxis have been stressed by Uber, the Uber rides have become more expensive. Thanks to surge pricing, Uber in some areas is more expensive than taxis and some black car services if one can find one.
Uber wants cash and profits. “Basic” safety may add the friction of additional costs for staff, software licenses, and tangibles like plastic barriers and dash cams. The write up explains by quoting the legalese of the court decision; to wit:
“Uber alone controlled the verification methods of drivers and riders, what information to make available to each respective party, and consistently represented to drivers that it took their safety into consideration. Ceesay relied entirely on Uber to match him with riders, and he was not given any meaningful information about the rider other than their location,” the majority wrote.
Now what? I am no legal eagle. I think Uber “leadership” will have meetings. Appropriate consultants will be retained to provide action plan options. Then staff (possibly AI assisted) will figure out how to reduce the probability of a murder in or near an Uber contractor’s vehicle.
My hunch is that the process will take time. In the meantime, I wonder if the Uber app autofills the “tip” section and then intelligently closes out that specific ride? I am confident that universities offering business classes will incorporate one or both of these examples in a class about corporate “leadership” principles. Tip: The money matters. Period.
Stephen E Arnold, September 5, 2024
What are the Real Motives Behind the Zuckerberg Letter?
September 5, 2024
Adam Clarke Estes, senior correspondent at Vox, considers the motives behind Mark Zuckerberg’s recent letter to Rep. Jim Jordan. He believes “Mark Zuckerberg’s Letter About Facebook Censorship Is Not What it Seems.” For those who are unfamiliar: The letter presents no new information, but reminds us the Biden administration pressured Facebook to stop the spread of Covid-19 misinformation during the pandemic. Zuckerberg also recalls his company’s effort to hold back stories about Hunter Biden’s laptop after the FBI warned they might be part of a Russian misinformation campaign. Now, he insists, he regrets these actions and vows never to suppress “freedom of speech” due to political pressure again.
Naturally, Republicans embrace the letter as further evidence of wrongdoing by the Biden-Harris administration. Many believe it is evidence Zuckerberg is kissing up to the right, even though he specifies in the missive that his goal is to be apolitical. Estes believes there is something else going on. He writes:
“One theory comes from Peter Kafka at Business Insider: ‘Zuckerberg very carefully gave Jordan just enough to claim a political victory — but without getting Meta in any further trouble while it defends itself against a federal antitrust suit.’ To be clear, Congress is not behind the antitrust lawsuit. The case, which dates back to 2021, comes from the FTC and 40 states, which say that Facebook illegally crushed competition when it acquired Instagram and WhatsApp, but it must be top of mind for Zuckerberg. In a landmark antitrust case less than a month ago, a federal judge ruled against Google, and called it a monopoly. So antitrust is almost certainly on Zuckerberg’s mind. It’s also possible Zuckerberg was just sick of litigating events that happened years ago and wanted to close the loop on something that has caused his company massive levels of grief. Plus, allegations of censorship have been a distraction from his latest big mission: to build artificial general intelligence.”
So is it coincidence this letter came out during the final weeks of an extremely close, high-stakes presidential election? Perhaps. An antitrust ruling like the one against Google could be inconvenient for Meta. Curious readers can navigate to the article for more background and more of Estes’ reasoning.
Cynthia Murrell, September 5, 2024
Accountants: The Leaders Like Philco
September 4, 2024
This essay is the work of a dumb dinobaby. No smart software required.
AI or smart software has roiled the normal routine of office gossip. We have shifted from “What is it?” to “Who will be affected next?” The integration of AI into work processes, however, is not a new thing. Most people don’t know or don’t recall that when a consultant could do a query from a clunky device like the Texas Instruments Silent 700, AI was already affecting jobs. Whose? Just ask a special librarian who worked when an intermediary was no longer needed to retrieve information from an online database.
A nervous smart robot running state-of-the-art tax software is sufficiently intelligent to be concerned about the meeting with an IRS audit team. Thanks, MSFT Copilot. How’s that security push coming along? Oh, too bad.
I read “Why America’s Most Boring Job Is on the Brink of Extinction.” I think the story was crafted by a person who received either a D or an F in Accounting 100. The lingo links accountants with being really dull people and the nuking of an entire species. No meteor is needed; just smart software, the silent killer. By the way, my two accountants are quite sporty. I rarely fall asleep when they explain life from their point of view. I listen, and I urge you to be attentive as well. Smart software can do some excellent things, but not everything related to tax, financial planning, and keeping inside the white lines of the quite fluid governmental rules and regulations.
Nevertheless, the write up cited above states:
Experts say the industry is nearing extinction because the 150-hour college credit rule, the intense entry exam and long work hours for minimal pay are unappealing to the younger generation.
The “real” news article includes some snappy quotes too. Here’s one I circled: “’The pay is crappy, the hours are long, and the work is drudgery, and the drudgery is especially so in their early years.’”
I am not an accountant, so I cannot comment on the accuracy of this statement. My father was an accountant, and he was into detail work and was able to raise a family. None of us ended up in jail or in the hospital after a gang fight. (I was and still am a sissy. Imagine that: An 80 year old dinobaby sissy with the DNA of an accountant. I am definitely exciting.)
With fewer people entering the field of accounting, the write up makes a remarkable statement:
… Accountants are becoming overworked and it is leading to mistakes in their work. More than 700 companies cited insufficient staff in accounting and other departments as a reason for potential errors in their quarterly earnings statements…
Does that mean smart software will become the accountants of the future? Some accountants may hope that smart software cannot do accounting. Others will see smart software as an opportunity to improve specific aspects of accounting processes. The problem, however, is not the accountants. The problem with AI is the companies or entrepreneurs who over promise and under deliver.
Will smart software replace the insight and timeline knowledge of an experienced numbers wrangler like my father or the two accountants upon whom I rely?
Unlikely. It is the smart software vendors and their marketers who are most vulnerable to the assertions about Philco, the leader.
Stephen E Arnold, September 4, 2024
Salesforce Disses Microsoft Smart Software
September 4, 2024
This essay is the work of a dumb dinobaby. No smart software required.
Senior managers can be frisky at times. A good example appears in the Fortune online service write up “Salesforce CEO Marc Benioff Says Microsoft Copilot Has Disappointed Many Customers.” I noted this statement in the article:
Marc Benioff said Microsoft’s Copilot AI hasn’t lived up to the hype… unimpressive.
The old fish comparison works for smart software, it seems. Thanks, MSFT Copilot. Good enough, just not tastier.
Consider the number of organizations which use Microsoft and its smart software. Will those organizations benefit from “unimpressive” programs and services? What about the US government, which might struggle to operate without Microsoft software? What if the US government operates in a way which delivers unimpressive performance? What about companies relying on Microsoft technology? Will these organizations struggle to deliver high-octane performance?
The article reported that the Big Dog of Salesforce opined:
“So many customers are so disappointed in what they bought from Microsoft Copilot because they’re not getting the accuracy and the response that they want,” Benioff said. “Microsoft has disappointed so many customers with AI.”
“Disappointed” — That’s harsh.
True to its rich history of business journalism, the article included a response from Microsoft, a dominant force in enterprise and consumer software (smart or otherwise). I noted this Microsoft comment:
Jared Spataro, Microsoft’s corporate vice president for AI at work, said in a statement to Fortune that the company was “hearing something quite different,” from its customers. The company’s Copilot customers also shot up 60% last quarter and daily users have more than doubled, Spataro added.
From Microsoft’s point of view, this is evidence that Microsoft is delivering high-value smart software. From Salesforce’s point of view, Microsoft is creating customers for Salesforce’s smart software. The problem is that Salesforce is not exactly the same type of software outfit as Microsoft. Nevertheless, the write up included this suggestive comment from the Big Dog of Salesforce:
“With our new Agentforce platform, we’re going to make a quantum leap for AI,” he said.
I like the use of the word “quantum.” It suggests uncertainty to me. I remain a bit careful when it comes to discussions of “to be” software. Marketing-type comments are far easier to create than a functional, reliable, and understandable system infused with smart software.
But PR and marketing are one thing. Software which does not hallucinate or output information that cannot be verified given an organization’s resources is different. Who cares? That’s a good question. Stakeholders, those harmed by AI outputs, and unemployed workers replaced by more “efficient” systems maybe?
Content marketing, sales hyperbole, and PR, the common currency of artificial intelligence, make life interesting.
Stephen E Arnold, September 4, 2024
Indifference or Carelessness: The Security Wrecks from Georgia Tech
September 4, 2024
DOJ Sues Georgia Tech for DOD-Related Cybersecurity Violations
The Justice Department takes cybersecurity standards for our military very seriously. Just ask Georgia Tech. Nextgov/FCW reports, “DOJ Suit Claims Georgia Tech ‘Knowingly Failed’ to Meet Cyber Standards for DOD Contracts.” The suit began in 2022 with a whistleblower lawsuit filed by two members of the university’s cybersecurity compliance team. They did so under the DOJ’s Civil Cyber-Fraud Initiative. Now the DOJ has joined the fray. Reporter Edward Graham tells us:
“In a press release, DOJ alleged that the institutions committed numerous violations of the Department of Defense’s cybersecurity policy in the years prior to the whistleblower complaint. Among the most serious allegations was the claim that ‘Georgia Tech and [Georgia Tech Research Corporation] submitted a false cybersecurity assessment score to DOD for the Georgia Tech campus’ in December 2020. … The lawsuit also asserted that the Astrolavos Lab at Georgia Tech previously ‘failed to develop and implement a system security plan, which is required by DOD cybersecurity regulations.’ Once the security document was finally implemented in February 2020, the complaint said the university ‘failed to properly scope that plan to include all covered laptops, desktops and servers.’ Additionally, DOJ alleged that the Astrolavos Lab did not use any antivirus or antimalware programs on its devices until December 2021. The university reportedly allowed the lab to refuse the installation of the software ‘in violation of both federal cybersecurity requirements and Georgia Tech’s own policies’ at the request of its director.”
Georgia Tech disputes the charges. It claims there was no data breach or data leak, the information involved was not confidential anyway, and the government had stated this research did not require cybersecurity restrictions. Really? Then why the (allegedly) falsified cybersecurity score? The suit claims the glowing self-reported score for the Georgia Tech campus:
“… was for a ‘fictitious’ or ‘virtual’ environment and did not apply to any covered contracting system at Georgia Tech that could or would ever process, store or transmit covered defense information.”
That one will be hard to explain away. Other entities with DOD contracts will want to pay attention; Graham states the DOJ is cracking down on contractors that lie about their cyber protections.
Cynthia Murrell, September 4, 2024
Google Synthetic Content Scaffolding
September 3, 2024
This essay is the work of a dumb dinobaby. No smart software required.
Google posted what I think is an important technical paper on the arXiv service. The write up is “Towards Realistic Synthetic User-Generated Content: A Scaffolding Approach to Generating Online Discussions.” The paper has six authors and presumably has the grade of “A,” a mark not awarded to the stochastic parrot write up about Google-type smart software.
For several years, Google has been exploring ways to make software that would produce content suitable for different use cases. One of these has been an effort to use transformer and other technology to produce synthetic data. The idea is that a set of real data is mimicked by AI so that “real” data does not have to be acquired, intercepted, captured, or scraped from systems in the real-time, highly litigious real world. I am not going to slog through the history of smart software and the research and application of synthetic data. If you are curious, check out Snorkel and the work of the Stanford Artificial Intelligence Lab or SAIL.
The paper I referenced above illustrates that Google is “close” to having a system which can generate allegedly realistic and good enough outputs to simulate the interaction of actual human beings in an online discussion group. I urge you to read the paper, not just the abstract.
Consider this diagram (which I know is impossible to read in this blog format so you will need the PDF of the cited write up):
The important point is that the process for creating synthetic “human” online discussions requires a series of steps. Notice that the final step is “fine tuned.” Why is this important? Most smart software is “tuned” or “calibrated” so that the signals generated by a synthetic content set are made to be “close enough” to those from a non-synthetic content set. In simpler terms, smart software is steered or shaped to match signals. When the match is “good enough,” the smart software is good enough to be deployed either for a test, a research project, or some use case.
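Here is a minimal sketch, in Python, of what “close enough” signal matching can look like in practice. It is not the paper’s actual method; the signals (average reply count, average comment length), the thread data structure, and the tolerance are assumptions used purely for illustration.

```python
import statistics

def thread_signals(threads):
    """Compute crude per-corpus signals from a list of {'comments': [...]} threads."""
    replies = [len(t["comments"]) for t in threads]
    lengths = [len(c) for t in threads for c in t["comments"]]
    return {
        "avg_replies": statistics.mean(replies),
        "avg_comment_len": statistics.mean(lengths),
    }

def close_enough(real_threads, synthetic_threads, tolerance=0.2):
    """True when every synthetic signal is within `tolerance` (relative) of the real signal."""
    real_sig = thread_signals(real_threads)
    synth_sig = thread_signals(synthetic_threads)
    return all(
        abs(synth_sig[key] - real_sig[key]) / real_sig[key] <= tolerance
        for key in real_sig
    )

# Toy data: one real thread and one generated thread.
real = [{"comments": ["I agree.", "Not sure about that.", "Source?"]}]
synthetic = [{"comments": ["Agreed!", "Hmm, maybe not.", "Any link?"]}]
print(close_enough(real, synthetic))
```

In a real calibration loop, the generation step would be adjusted and re-run until a check like this passes; a production system would match far richer signals than two averages.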
Most of the AI write ups employ steering, directing, massaging, or weaponizing (yes, weaponizing) outputs to achieve an objective. Many jobs will be replaced or supplemented with AI. But the jobs for specialists who can curve fit smart software components to produce “good enough” content to achieve a goal or objective will remain in demand for the foreseeable future.
The paper states in its conclusion:
While these results are promising, this work represents an initial attempt at synthetic discussion thread generation, and there remain numerous avenues for future research. This includes potentially identifying other ways to explicitly encode thread structure, which proved particularly valuable in our results, on top of determining optimal approaches for designing prompts and both the number and type of examples used.
The write up is a preliminary report. It takes months to get data and approvals for this type of public document. How far has Google come between the idea to write up results and this document becoming available on August 15, 2024? My hunch is that Google has come a long way.
What’s the use case for this project? I will let younger, more optimistic minds answer this question. I am a dinobaby, and I have been around long enough to know a potent tool when I encounter one.
Stephen E Arnold, September 3, 2024