Yet Another Way to Spot AI Generated Content
July 21, 2023
Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.
The dramatic emergence of ChatGPT has people frantically searching for ways to distinguish AI-generated content from writing by actual humans. Naturally, many are turning to AI solutions to solve an AI problem. Some tools have been developed that detect characteristics of dino-baby writing, like colloquialisms and emotional language. Unfortunately for the academic community, these methods work better on Reddit posts and Wikipedia pages than academic writings. After all, research papers have employed a bone-dry writing style since long before the emergence of generative AI.
Which tea cup is worth thousands and which is a fabulous fake? Thanks, MidJourney. You know your cups or you are in them.
Cell Reports Physical Science details the development of a niche solution in the article, “Distinguishing Academic Science Writing from Humans or ChatGPT with Over 99% Accuracy Using Off-the-Shelf Machine Learning Tools.” We learn:
“In the work described herein, we sought to achieve two goals: the first is to answer the question about the extent to which a field-leading approach for distinguishing AI- from human-derived text works effectively at discriminating academic science writing as being human-derived or from ChatGPT, and the second goal is to attempt to develop a competitive alternative classification strategy. We focus on the highly accessible online adaptation of the RoBERTa model, GPT-2 Output Detector, offered by the developers of ChatGPT, for several reasons. It is a field-leading approach. Its online adaptation is easily accessible to the public. It has been well described in the literature. Finally, it was the winning detection strategy used in the two most similar prior studies. The second project goal, to build a competitive alternative strategy for discriminating scientific academic writing, has several additional criteria. We sought to develop an approach that relies on (1) a newly developed, relevant dataset for training, (2) a minimal set of human-identified features, and (3) a strategy that does not require deep learning for model training but instead focuses on identifying writing idiosyncrasies of this unique group of humans, academic scientists.”
One of these idiosyncrasies, for example, is a penchant for equivocal terms like “but,” “however,” and “although.” Developers used the open source XGBoost software library for this project. The write-up describes the tool’s development and results at length, so navigate there for those details. But what happens, one might ask, the next time ChatGPT levels up? And the next? And so on? We are assured developers have accounted for this game of cat and mouse and will release updated tools quickly each time the chatbot evolves. What a winner, for the marketing team, that is.
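For the curious, the "equivocal terms" idea can be sketched in a few lines of Python. This is only an illustration, not the researchers' code: the function name `equivocal_features` and the per-100-words rate are my assumptions, and the three terms are the only ones named above. In the project described, numbers like these would presumably be handed to an XGBoost classifier along with other features; that step is not shown here.

```python
import re

# Assumption: only the three terms named in the write up are counted;
# the paper's full feature set is not reproduced here.
EQUIVOCAL_TERMS = ("but", "however", "although")


def equivocal_features(text: str) -> dict:
    """Count each equivocal term and compute a combined rate per 100 words."""
    # Lowercase the passage and split it into simple word tokens.
    words = re.findall(r"[a-z']+", text.lower())
    counts = {term: words.count(term) for term in EQUIVOCAL_TERMS}
    total = sum(counts.values())
    # Guard against empty input so the rate is always defined.
    rate = 100.0 * total / len(words) if words else 0.0
    return {**counts, "equivocal_per_100_words": rate}


# Hypothetical sample sentence, written for this sketch.
sample = "The effect was large. However, the sample was small, but the trend held."
features = equivocal_features(sample)
```

A real detector would need many labeled passages and more than one feature; the point is only that the inputs are simple counts a human could check by hand.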
Cynthia Murrell, July 21, 2023
Threads and Twitter: A Playground Battle for the Ages
July 18, 2023
Twitter helped make some people famous. No big name publisher needed. Just an algorithm and a flow of snappy comments. Fame. Money. A platformer, sorry, I meant platform.
Is informed, objective analysis of Facebook and Twitter needed? Sure, but the approach taken by some is more like an argument at a school picnic over the tug-of-war teams. Which team will end up with grass stains? Which will get the ribbon with the check mark? MidJourney developed this original art object.
Now that Twitter has gone Musky, those who may perceive themselves as entitled to a blue check, algorithmic love, and a big, free megaphone are annoyed. At least that’s how I understand “Five Reasons Threads Could Still Go the Distance.” This essay is about the great social media dust up between those who love Teslas and those who can find some grace in the Zuck.
Wait, wasn’t the Zuck the subject of some criticism? Cambridge Analytica-type activities and possibly some fancy dancing with the name of the company, the future of the metaverse, and expanding land holdings in Hawaii? Forget that.
I learned in the article, which is flavored with some business consulting advice from a famous social media personality:
It’s always a fool’s errand to judge the prospects of a new social network a couple weeks into its history.
So what is the essay about? Exactly.
I learned from the cited essay:
Twitter’s deterioration continues to accelerate. Ad revenue is down by 50 percent, according to Musk, and — despite the company choosing not to pay many of its bills — the company is losing money. Rate limits continue to make the site unusable to many free users, and even some paid ones. Spam is overwhelming users’ direct messages so much that the company disabled open DMs to free users. The company has lately been reduced to issuing bribe-like payouts to a handful of hand-picked creators, many of whom are aligned with right-wing politics. If that’s not a death spiral, what is?
Wow, a death spiral at the same time Threads may be falling in love with “rate limits.”
Can the Zuck kill off Twitter? Here’s hoping. But there is only one trivial task to complete, according to the cited article:
To Zuckerberg, the concept has been proved out. The rest is simply an execution problem. [Emphasis added]
As that lovable influencer, social media maven, and management expert Peter Drucker observed:
What gets measured, gets managed.
Isn’t it early days for measurement? Instagram was a trampoline for Threads. The Musk management modifications seem to be working exactly as the rocket scientist planned them to function. What do billions in losses mean to a person whose rockets don’t blow up too often?
Several observations:
- Analyzing Threads and Twitter is a bit like a school yard argument, particularly when the respective big dogs want to fight in a cage in Las Vegas
- The possible annoyance or mild outrage from those who loved the good old free Twitter is palpable
- Social media remains an interesting manifestation of human behavior.
Net net: I find social media a troubling innovation. But it does create news which some find as vital as oxygen, water, and clicks. Yes, clicks. The objective, I believe.
Stephen E Arnold, July 18, 2023
Step 1: Test AI Writing Stuff. Step 2: Terminate Humanoids. Will Outrage Prevent the Inevitable?
July 5, 2023
I am fascinated by the information (allegedly actual factual) in “Gizmodo and Kotaku Staff Furious After Owner Announces Move to AI Content.” Part of my interest is the subtitle:
God, this is gonna be such a f***ing nightmare.
Ah, for whom, pray tell? Probably not for the owners, who may see a pot of gold at the end of the smart software rainbow; for example, Costs Minus Humans Minus Health Care Minus HR Minus Miscellaneous Humanoid costs like latte makers, office space, and salaries / bonuses. What do these produce? More money (value) for the lucky most senior managers and selected stakeholders. Humanoids lose; software wins.
A humanoid writer sits at desk and wonders if the smart software will become a pet rock or a creature let loose to ruin her life by those who want a better payoff.
For the humanoids, it is hasta la vista. Assume the quality is worse? Then the analysis requires quantifying “worse.” Software will be cheaper over a time interval; expensive humans lose. Quality is like love and ethics. Money matters; quality becomes good enough.
Will fury or outrage or protests make a difference? Nope.
The write up points out:
“AI content will not replace my work — but it will devalue it, place undue burden on editors, destroy the credibility of my outlet, and further frustrate our audience,” Gizmodo journalist Lin Codega tweeted in response to the news. “AI in any form, only undermines our mission, demoralizes our reporters, and degrades our audience’s trust.” “Hey! This sucks!” tweeted Kotaku writer Zack Zwiezen. “Please retweet and yell at G/O Media about this! Thanks.”
Much to the delight of her significant others, the “f***ing nightmare” is from the creative, imaginative humanoid Ashley Feinberg.
An ideal candidate for early replacement by a software system and a list of stop words.
Stephen E Arnold, July 5, 2023
Two Creatures from the Future Confront a Difficult Puzzle
June 15, 2023
I was interested in a suggestion a colleague made to me at lunch. “Check out the new printed World Book encyclopedia.”
I replied, “A new one. Printed? Doesn’t information change quickly today?”
My lunch colleague said, “That’s what I have heard.”
I offered, “Who wants printed, hard-to-change content objects? Where’s the fun in sneaky or sockpuppet edits? Do you really want to go back to non-fluid information?”
My hungry debate opponent said, “What? Do you mean misinformation is good?”
I said, “It’s a digital world. Get with the program.”
Navigate to World Book.com and check out the 10 page sample about dinosaurs. When I scanned the entry, there was no information about dinobabies. I was disappointed because the dinosaur segment is bittersweet for these reasons:
- The printed encyclopedia is a dinosaur of sorts, an expensive one to produce and print at that
- As a dinobaby, I was expecting an IBM logo or maybe an illustration of a just-RIF’ed IBM worker talking with her attorney about age discrimination
- Those who want to fill a bookshelf can buy books at a second hand bookstore or connect with a zippy home designer to make the shelf tasteful. I think there is wallpaper of books on a shelf as an alternative.
Two aliens are trying to figure out what a single volume of a World Book encyclopedia contains. I assume the creatures will be holding the volume 6 “I”, the one with information about the Internet. The image comes from the creative bits at MidJourney.
Let me dip into my past. Ah, you are not interested? Tough. Here we go down memory lane:
In 1953 or 1954, my father had an opportunity to work in Brazil. Off our family went. One of the must-haves was a set of World Book encyclopedias. The covers were brown; the pictures were mostly black and white; and the information was, according to my parents, accurate.
The schools in Campinas, Brazil, at that time used one language. Portuguese. No teacher spoke English. Therefore, after I failed every class except mathematics, my parents decided to get me a tutor. The course work was provided by something called Calvert in Baltimore, Maryland. My teacher would explain the lesson, watch me read, ask me a couple of questions, and bail out after an hour or two. That lasted about as long as my stint in the Campinas school near our house. My tutor found himself on the business end of a snake. The snake lived; the tutor died.
My father — a practical accountant — concluded that I should read the World Book encyclopedia. Every volume. I think there were about 20 plus a couple of annual supplements. My mother monitored my progress and made me write summaries of the “interesting” articles. I recall that interesting or not, I did one summary a day and kept my parents happy.
I hate World Books. I was in the fourth or fifth grade. Campinas had great weather. There were many things to do. Watch the tarantulas congregate in our garage. Monitor the vultures circling my mother when she sunbathed on our deck. Kick a soccer ball when the students got out of school. (I always played. I sucked, but I had a leather, size five ball. Prior to our moving to the neighborhood, the kids my age played soccer with a rock wrapped in rags. The ball was my passport to an abuse free stint in rural Brazil.)
But a big chunk of my time was gobbled by the yawning white maw of a World Book.
When we returned to the US, I entered the seventh grade. No one at the public school in Illinois asked about my classes in Brazil. I just showed up in Miss Soape’s classroom and did the assignments. I do know one thing for sure: I was the only student in my class who did not have to read the assigned work. Reading the World Book granted me a free ride through grade school, high school, and the first couple of years at college.
Do I recommend that grade school kids read the World Book cover to cover?
No, I don’t. I had no choice. I had no teacher. I had no radio because the electricity was on several hours a day. There was no TV because there were no broadcasts in Campinas. There was no English-language anything. Thus, the World Book, which I hate, was the only game in town.
Will I buy the print edition of the 2023 World Book? Not a chance.
Will other people? My hunch is that sales will be a slog outside of library acquisitions and a few interior decorators trying to add color to a client’s book shelf.
I may be a dinobaby, but I have figured out how to look up information online.
The book thing: I think many young people will be as baffled about an encyclopedia as the two aliens in the illustration.
By the way, the full set is about $1,200. A cheap smartphone can be had for about $250. What will kids use to look up information? If you said, the printed encyclopedia, you are a rare bird. If you move to a remote spot on earth, you will definitely want to lug a set with you. Starlink can be expensive.
Stephen E Arnold, June 14, 2023
The TikTok Addition: Has a Fortune Magazine Editor Been Up Swiping?
June 2, 2023
A colleague called my attention to the Fortune Magazine article boldly titled “Gen Z Teens Are So Unruly in Malls, Fed by Their TikTok Addition, That a Growing Number Are requiring Chaperones and Supervision.” A few items I noted in this headline:
- Malls. I thought those were dead horses. There is a YouTube channel devoted to these real estate gems; for example, Urbex Offlimits and a creator named Brandon Moretti’s videos.
- Gen Z. I just looked up how old Gen Zs are. According to Mental Floss, these denizens of empty spaces are 11 to 26 years old. Hmmm. For what purpose are 21 to 25 year olds hanging out in empty malls? (Could that be a story for Fortune?)
- The “TikTok addition” gaffe. My spelling checker helps me out too. But I learned from a super-duper former Fortune writer whom I shall label Peter V, “Fortune is meticulous about its thorough research, its fact checking, and its proofreading.” Well, super-duper Peter, not in 2023. Please, explain in 25 words or less this image from the write up:
I did notice several factoids and comments in the write up; to wit:
Interesting item one:
“On Friday and Saturdays, it’s just been a madhouse,” she said on a recent Friday night while shopping for Mother’s Day gifts with Jorden and her 4-month-old daughter.
A madhouse, according to the Cambridge dictionary, is “a place of great disorder and confusion.” I think of malls as places of no people. But Fortune does the great fact checking, according to the attestation of Peter V.
Interesting item two:
Even a Chick-fil-A franchise in southeast Pennsylvania caused a stir with its social media post earlier this year that announced its policy of banning kids under 16 without an adult chaperone, citing unruly behavior.
I thought Chick-fil-A was a saintly, reserved institution with restaurants emulating Medieval monasteries. No longer. No wonder so many cars line up for a chickwich.
Interesting item three:
Cohen [a mall expert] said the restrictions will help boost spending among adults who must now accompany kids but they will also likely reduce the number of trips by teens, so the overall financial impact is unclear.
What these snippets tell me is that there is precious little factual data in the write up. The headline leading “TikTok addiction” is not the guts of the write up. Maybe the idea that kids who can’t go to the mall will play online games? I think it is more likely that kids and those lost little 21 to 25 year olds will find other interesting things to do with their time.
But malls? Kids can prowl Snapchat and TikTok, but those 21 to 25 year olds? Drink or other chemical activities?
Hey, Fortune, let’s get addicted to the Peter V. baloney: “Fortune is meticulous about its thorough research, its fact checking, and its proofreading.”
Stephen E Arnold, June 2, 2023
The Death of Digital News Upstarts: Woohoo!
May 31, 2023
When I worked at a “real” newspaper, I learned that obituaries were cooked; that is, the newspaper reports of death were written whilst the subject was still alive and presumably buying advertisements in the paper or at least subscribing. The Guardian ran its obituary for upstart digital news outfits. No, the opinion writer did not include the word “woohoo.” I just picked up the Hopf vibration with my spidey sense.
The essay is “Vice Is Going Bankrupt, BuzzFeed News Is Dead. What Does It Mean?” I don’t want to be picky, but these are two separate entities and each, as far as I know, is still breathing. There may be life support equipment involved, but neither entity’s online presence delivers a cheerful 404 message… yet.
The essay sails forward with no interest in my online check or the fact that two separate entities do not in my mind comprise an “it”. I am not going to differentiate because if the Guardian sees two identical Lego blocks, that’s the reality.
The write up says via a quote from the “brilliant” Clay Shirky, author and meme generator:
“This is what real revolutions are like. The old stuff gets broken faster than the new stuff is put in its place,” Shirky wrote. And, amid the ensuing chaos, it’s extremely hard to see what’s going next: “The importance of any given experiment isn’t apparent at the moment it appears, big changes stall, small changes spread.”
There are some bright spots; for example, ProPublica, the Gray Lady of Wordle fame, the Bezos news service, and most important, The Guardian, “owned by the Scott Trust and sustained by its endowment” and supported by readers who roll over for the jazzy pop ups in blue and yellow saying, “Give cash.”
Too bad the write up did not include the woohoo.
Stephen E Arnold, May 31, 2023
The Gray Lady: Objective Gloating about Vice
May 15, 2023
Do you have dreams about the church lady on Saturday Night Live? That skit frightened me. A flashback shook my placid mental state when I read “Vice, Decayed Digital Colossus, Files for Bankruptcy.” I conjured up, without the assistance of smart software, the image of Dana Carvey talking about the pundit spawning machine named Vice with the statement, “Well, isn’t that special?”
The New York Times’s article reported:
Vice Media filed for bankruptcy on Monday, punctuating a years long descent from a new-media darling to a cautionary tale of the problems facing the digital publishing industry.
The write up omits any reference to the New York Times’s failure with its own online venture under the guidance of Jeff Pemberton, the flame out with its LexisNexis play, the fraught effort to index its own content, and the misadventures which have become the Wordle success story. The past Don Quixote-like sallies into the digital world are either irrelevant or unknown to the current crop of Gray Lady “real” news hounds, I surmise.
The article states:
Investments from media titans like Disney and shrewd financial investors like TPG, which spent hundreds of millions of dollars, will be rendered worthless by the bankruptcy, cementing Vice’s status among the most notable bad bets in the media industry. [Emphasis added.]
Well, isn’t that special? Perhaps similar to the Times’s first online adventure in the late 1970s?
The article includes a quote from a community journalism company too:
“We now know that a brand tethered to social media for its growth and audience alone is not sustainable.”
Perhaps like the desire for more money than the Times’s LexisNexis deal provided? Perhaps?
Is Vice that special? I think the story is a footnote to the Gray Lady’s own adventures in the digital realm.
Isn’t that special too?
Stephen E Arnold, May 15, 2023
Publishers: Why Not Replace Authors with ChatGPT and Raise Subscription Rates?
May 11, 2023
I read another article about professional publishers. Nature Magazine reports that 40 editors have bailed from medical journals due to the fees Elsevier imposes on authors. The individuals who publish peer reviewed journal articles are often desperate to get their names in what one supposes is a prestigious journal. I remember hearing at the Cornell Theory Center years ago that online “free” publications would not be considered for tenure evaluation or for certain grant applications. Why? Hey, that’s what universities want: Old school scholarship, thank you. Professional publishers cheerfully support the scheme. Libraries have to pay big subscription fees; commercial database producers are hamstrung due to restrictions on certain content; and the aspiring PhD student or starving adjunct professor is supposed to pay hundreds of dollars for output to proof. Yeah, that’s a great approach.
Now some professors (presumably with tenure) are doing a bit of the crawfish thing; that is, backing up and getting away from what is now viewed as a bit of a scam. I used to review articles for publishers. Guess what? I did not get paid. I was improving the quality of the publication. Yeah, right. As soon as I rejected papers written in incomprehensible English with statistics which actually did not add up, I learned via a friendly chat that I should not reject so many papers.
Oh, right. I quit. What baloney.
If you want to read about Elsevier’s explanation of the fees in today’s Word to typeset page fees, check out the original. I am not an academic, a fact I happily share with crazy publishers who want me to write for their “prestigious” journals. I write stuff and have for decades. Now I post information in my blog and I write monographs which I make available to those in my lectures.
Publishers are not for me. Most are dead tree types, snared in the craziness of slicing and dicing non reproducible research results, specious cross references to legal and accounting content, and pretending that their industry is essential to the smooth running of the knowledge centric world.
Nope. Too bad it has taken decades for a handful of editors to wake up and smell the ersatz which passes for real coffee.
Stephen E Arnold, May 11, 2023
Digital Tech Journalism Killed by a Digital Elephant
May 4, 2023
I read a labored explanation, analysis, and rhetorical howl from Slate.com. The article is “Digital Media’s Original Sin: The Big Tech Bubble Burst and the News Industry Got Splattered with Shrapnel.” The article states:
For years, the tech industry has propped up digital journalism with advertising revenue, venture capital injections, and far-reaching social platforms.
My view is that the reason for the problem in digital tech journalism is the elephant. When electronic information flows, it acts in a way similar to water eroding soil. In short, flows of electronic information have what I call a “deconstructive element.” The “information business” once consisted of discrete platforms, essentially isolated by choice and by accident. Who in your immediate locale pays attention to the information published in the American Journal of Mathematics? Who reads Craigslist for listings of low-ball vacation rentals near Alex Murdaugh’s “estate”?
Convert this content to digital form and dump the physical form of the data. Then live in a dream world in which those who want the information will flock to a specific digital destination and pay big money for the one story or the privilege of browsing information which may or may not be accurate. Slate points out that it did not work out.
But what’s the elephant? Digital information to people today is like water to the goldfish in a bowl. It is just there.
The elephant was spawned by a few outfits which figured out that advertisers would pay money to put content in front of eyeballs. The elephant grew and developed new capabilities; for example, the “pay to play” model of GoTo.com morphed into Overture.com and became something Yahoo.com thought would be super duper. However, the Google was inspired by “pay to play” and had the technical ability to create a system for creating a market from traffic, charging people to put content in front of the eyeballs, and charging anyone in the enabling chain money to use the Google system.
The combination of digital flows’ deconstructive operation plus the quasi-monopolization of online advertising dealt lethal blows to the crowd Slate addresses. Now the elephant has morphed again, and it is stomping around in the space defined by TikTok. A visual medium with advertising poses a threat to the remaining information producers as well as to Google itself.
The elephant is not immortal. But right now no group is armed with Mossberg Patriot Laminate Marinecotes and the skill to kill the elephant. Electronic information gulping advertising revenue may prove to be harder to kill than a cockroach. Maybe that’s why most people ask, “What elephant?”
Stephen E Arnold, May 4, 2023
Libraries: Who Needs Them? Perhaps Everyone
May 3, 2023
How dare libraries try to make the works they purchase more easily accessible to their patrons! The Nation ponders, “When You Buy a Book, You Can Loan It to Anyone. This Judge Says Libraries Can’t. Why Not?” The case was brought before the U.S. District Court in Manhattan by four publishers unhappy with the Internet Archive’s (IA) controlled digital lending (CDL) program. We learn the IA does plan to appeal the decision. Writer Michelle M. Wu explains:
“At issue was whether a library could legally digitize the books it already owned and lend the digital copies in place of the print. The IA maintained that it could, as long as it lent only the same number of copies it owned and locked down the digital copies so that a borrower could not copy or redistribute them. It would be doing what libraries had always done, lend books—just in a different format. The publishers, on the other hand, asserted that CDL infringed on authors’ copyrights, making unauthorized copies and sharing these with libraries and borrowers, thereby depriving the authors and publishers of rightful e-book sales. They viewed CDL as piracy. While Judge John G. Koeltl’s opinion addressed many issues, all his reasoning was based on one assumption: that copyright primarily is about authors’ and publishers’ right to profit. Despite the pervasiveness of this belief, the history of copyright tells us something different.”
Wu recounts copyright’s evolution from a means to promote the sharing of knowledge to a way for publishers to rake in every possible dime. The shift was driven by a series of developments in technology. In the 1980s, the new ability to record content to video tape upset Hollywood studios. Apparently, being able to (re)watch a show after its initial broadcast was so beyond the pale a lawsuit was required. Later, Internet-based innovations prompted more legal proceedings. On the other hand, tools evolved that enabled publishers to enforce their interpretation of copyright, no judicial review required. Wu asserts:
“Increasing the impact on the end user, publishers—not booksellers or authors—now control prices and access. They can charge libraries multiple times what they charge an individual and bill them repeatedly for the same content. They can limit the number of copies a library buys, or even refuse to sell e-books to libraries at all. Such actions ultimately reduce the amount of content that libraries can provide to their readers.”
So that is how the original intention of copyright law has been turned on its head. And how publishers are undermining the whole purpose of libraries, which are valiantly trying to keep pace with technology. Perhaps the IA will win its appeal and the valuable CDL program will be allowed to continue. Either way, their litigious history suggests publishers will keep fighting for control over content.
Cynthia Murrell, May 3, 2023