Pragmatic AI: Individualized Monitoring
August 15, 2024
This essay is the work of a dinobaby. Unlike some folks, no smart software improved my native ineptness.
In June 2024 at the TechnoSecurity & Digital Forensics conference, one of the cyber investigators asked me, “What are some practical uses of AI in law enforcement?” I told the person that I would send him a summary of my earlier lecture called “AI for LE.” He said, “Thanks, but what should I watch to see some AI in action?” I told him to pay attention to the Kroger pricing methods. I had heard that Kroger was experimenting with altering prices based on certain signals. The example I gave is that if a Kroger store is located in a certain zip code, then the Kroger stores in that specific area would use dynamic pricing. The example was similar to Coca-Cola’s tests of a vending machine that charged more when the temperature was hot. In the Kroger example, a hot day would trigger a change in the price of a frozen dessert. He replied, “Kroger?” I said, “Yes, Kroger is experimenting with AI in order to detect specific behaviors and modify prices to reflect those signals.” What Kroger is doing will be coming to law enforcement and intelligence operations. Smart software monitors the behavior of a prisoner, for example, and automatically notifies an investigator when a certain signal is received. I recall mentioning that smart software, signals, and behavior change or direct action will become key components of a cyber investigator’s tool kit. He said, laughing, “Kroger. Interesting.”
Thanks, MSFT Copilot. Good enough.
I learned that Kroger’s surveillance concept is no longer a rumor discussed at a neighborhood get-together. “‘Corporate Greed Is Out of Control’: Warren Slams Kroger’s AI Pricing Scheme” reveals that elected officials and probably some consumer protection officials may be aware of the company’s plans for smart software. The write up reports:
Warren (D-Mass.) was joined by Sen. Bob Casey (D-Pa.) on Wednesday in writing a letter to the chairman and CEO of the Kroger Company, Rodney McMullen, raising concerns about how the company’s collaboration with AI company IntelligenceNode could result in both privacy violations and worsened inequality as customers are forced to pay more based on personal data Kroger gathers about them “to determine how much price hiking [they] can tolerate.” As the senators wrote, the chain first introduced dynamic pricing in 2018 and expanded to 500 of its nearly 3,000 stores last year. The company has partnered with Microsoft to develop an Electronic Shelving Label (ESL) system known as Enhanced Display for Grocery Environment (EDGE), using a digital tag to display prices in stores so that employees can change prices throughout the day with the click of a button.
My view is that AI orchestration will allow additional features and functions. Some of these may be appropriate for use in policeware and intelware systems. Kroger makes an effort to get individuals to sign up for a discount card. Also, Kroger wants users to install the Kroger app. The idea is that discounts or other incentives may be “awarded” to the customer who takes advantage of the services.
However, I am speculating that AI orchestration will allow Kroger to implement a chain of actions like this:
- Customer with a mobile phone enters the store
- The store “acknowledges” the customer
- The customer’s spending profile is accessed
- The customer is “known” to purchase upscale branded ice cream
- The price for that item automatically changes as the customer approaches the display
- The system records the item bar code and the customer ID number
- At checkout, the customer is charged the higher price.
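The chain above can be sketched in a few lines of code. This is a minimal, purely hypothetical illustration of the orchestration pattern; every name, rule, and markup figure here is my invention, not anything Kroger or its vendors have disclosed:

```python
# Hypothetical sketch of the individualized pricing chain described above.
# All identifiers, profiles, and the 15% markup are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class Customer:
    customer_id: str
    buys_premium_ice_cream: bool

def acknowledge(customer_id, profiles):
    """Steps 1-3: the store 'recognizes' the phone and pulls the spending profile."""
    return profiles.get(customer_id)

def price_for(customer, base_price, markup=0.15):
    """Steps 4-5: adjust the displayed price as a known customer approaches."""
    if customer and customer.buys_premium_ice_cream:
        return round(base_price * (1 + markup), 2)
    return base_price

def checkout(customer, base_price, ledger):
    """Steps 6-7: record the item and customer ID, charge the adjusted price."""
    charged = price_for(customer, base_price)
    ledger.append((customer.customer_id if customer else "anonymous", charged))
    return charged

profiles = {"C123": Customer("C123", buys_premium_ice_cream=True)}
ledger = []
shopper = acknowledge("C123", profiles)
paid = checkout(shopper, base_price=5.00, ledger=ledger)
print(paid)  # 5.75 for the profiled premium buyer; 5.00 for an unknown shopper
```

The point of the sketch is how little machinery the chain requires once the profile lookup and the digital shelf label exist: the “AI” part is simply a signal-to-action mapping.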
Is this type of AI orchestration possible? Yes. Is it practical for a grocery store to deploy? Yes, because Kroger uses third parties to provide its systems and technical capabilities for many applications.
How does this apply to law enforcement? Kroger’s use of individualized tracking may provide some ideas for cyber investigators.
As large firms with the resources deploy state-of-the-art technology to boost sales, know the customer, and adjust prices at the individual shopper level, the benefits of smart software become increasingly visible. Some specialized software systems lag behind commercial systems. Among the reasons are budget constraints and the often complicated procurement processes.
But what is happening at the grocery store is going to become a standard function in many specialized software systems. These will range from security monitoring systems which can follow a person of interest in a specific area to automatically updating a person of interest’s location on a geographic information module.
If you are interested in watching smart software and individualized “smart” actions, just pay attention at Kroger or a similar retail outfit.
Stephen E Arnold, August 15, 2024
AI Safety Evaluations, Some Issues Exist
August 14, 2024
Ah, corporate self regulation. What could go wrong? Well, as TechCrunch reports, “Many Safety Evaluations for AI Models Have Significant Limitations.” Writer Kyle Wiggers tells us:
“Generative AI models … are coming under increased scrutiny for their tendency to make mistakes and generally behave unpredictably. Now, organizations from public sector agencies to big tech firms are proposing new benchmarks to test these models’ safety. Toward the end of last year, startup Scale AI formed a lab dedicated to evaluating how well models align with safety guidelines. This month, NIST and the U.K. AI Safety Institute released tools designed to assess model risk. But these model-probing tests and methods may be inadequate. The Ada Lovelace Institute (ALI), a U.K.-based nonprofit AI research organization, conducted a study that interviewed experts from academic labs, civil society, and vendors who are producing models, as well as audited recent research into AI safety evaluations. The co-authors found that while current evaluations can be useful, they’re non-exhaustive, can be gamed easily and don’t necessarily give an indication of how models will behave in real-world scenarios.”
There are several reasons for the gloomy conclusion. For one, there are no established best practices for these evaluations, leaving each organization to go its own way. One approach, benchmarking, has certain problems. For example, for time or cost reasons, models are often tested on the same data they were trained on. Whether they can perform in the wild is another matter. Also, even small changes to a model can make big differences in behavior, but few organizations have the time or money to test every software iteration.
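The contamination point in the paragraph above can be made concrete with a toy example. Everything here is invented for illustration; no real model or benchmark is involved:

```python
# Why "testing on the training data" misleads: a model that merely memorizes
# its training set looks perfect on that set and useless on fresh inputs.
# The "model" below is an extreme caricature, but the evaluation flaw is real.

train = {"2+2": "4", "3+3": "6"}  # hypothetical training pairs

def memorizer(prompt):
    """A 'model' that only recalls exact training examples."""
    return train.get(prompt, "unknown")

def accuracy(cases):
    """Fraction of question/answer pairs the model gets right."""
    return sum(memorizer(q) == a for q, a in cases.items()) / len(cases)

print(accuracy(train))         # 1.0 -- a flawless score on the training set
print(accuracy({"4+4": "8"}))  # 0.0 -- nothing transfers to unseen data
```

A benchmark score earned on training data measures recall, not capability, which is exactly why the study calls such evaluations easy to game.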
What about red-teaming: hiring someone to probe the model for flaws? The low number of qualified red-teamers and the laborious nature of the method make it costly, out of reach for smaller firms. There are also few agreed-upon standards for the practice, so it is hard to assess the effectiveness of red-team projects.
The post suggests all is not lost—as long as we are willing to take responsibility for evaluations out of AI firms’ hands. Good luck prying open that death grip. Government regulators and third-party testers would hypothetically fill the role, complete with transparency. What a concept. It would also be good to develop standard practices and context-specific evaluations. Bonus points if a method is based on an understanding of how each AI model operates. (Sadly, such understanding remains elusive.)
Even with these measures, it may never be possible to ensure any model is truly safe. The write-up concludes with a quote from the study’s co-author Mahi Hardalupas:
“Determining if a model is ‘safe’ requires understanding the contexts in which it is used, who it is sold or made accessible to, and whether the safeguards that are in place are adequate and robust to reduce those risks. Evaluations of a foundation model can serve an exploratory purpose to identify potential risks, but they cannot guarantee a model is safe, let alone ‘perfectly safe.’ Many of our interviewees agreed that evaluations cannot prove a model is safe and can only indicate a model is unsafe.”
How comforting.
Cynthia Murrell, August 14, 2024
Sakana: Can Its Smart Software Replace Scientists and Grant Writers?
August 13, 2024
This essay is the work of a dumb dinobaby. No smart software required.
A couple of years ago, merging large language models seemed like a logical way to “level up” in the artificial intelligence game. The notion of intelligence aggregation implied that if competitor A was dumb enough to release models and other digital goodies as open source, an outfit in the proprietary software business could squish the other outfits’ LLMs into the proprietary system. The costs of building one’s own super-model could be reduced to some extent.
Merging is a very popular way to whip up pharmaceuticals. Take a little of this and a little of that and bingo, one has a new drug to flog through the approval process. Another example is taking five top consultants from Blue Chip Company I and five top consultants from Blue Chip Company II and creating a smarter, higher knowledge value Blue Chip Company III. Easy.
A couple of Xooglers (former Google wizards) are promoting a firm called Sakana.ai. The purpose of the firm is to allow smart software (based on merging multiple large language models and proprietary systems and methods) to conduct and write up research (I am reluctant to use the word “original”, but I am a skeptical dinobaby.) The company says:
One of the grand challenges of artificial intelligence is developing agents capable of conducting scientific research and discovering new knowledge. While frontier models have already been used to aid human scientists, e.g. for brainstorming ideas or writing code, they still require extensive manual supervision or are heavily constrained to a specific task. Today, we’re excited to introduce The AI Scientist, the first comprehensive system for fully automatic scientific discovery, enabling Foundation Models such as Large Language Models (LLMs) to perform research independently. In collaboration with the Foerster Lab for AI Research at the University of Oxford and Jeff Clune and Cong Lu at the University of British Columbia, we’re excited to release our new paper, The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery.
Sakana does not want to merge the “big” models. Its approach for robot generated research is to combine specialized models. Examples which came to my mind were drug discovery and providing “good enough” blue chip consulting outputs. These are both expensive businesses to operate. Imagine the payoff if the Sakana approach delivers high value results. Instead of merging big, the company wants to merge small; that is, more specialized models and data. The idea is that specialized data may sidestep some of the interesting issues facing Google, Meta, and OpenAI among others.
Sakana’s Web site provides this schematic to help the visitor get a sense of the mechanics of the smart software. The diagram is Sakana’s, not mine.
I don’t want to let science fiction get in the way of what today’s AI systems can do in a reliable manner. I want to make some observations about smart software making discoveries and writing useful original research papers or posts for BearBlog.dev.
- The company’s Web site includes a link to a paper written by the smart software. With a sample of one, I cannot see much difference between it and the baloney cranked out by the Harvard medical group or Stanford’s former president. If software did the work, it is a good deep fake.
- Should the software be able to assemble known items of information into something “novel,” the company has hit a home run in the AI ballgame. I am not a betting dinobaby. You make your own guess about the firm’s likelihood of success.
- If the software works to some degree, quite a few outfits looking for a way to replace people with a Sakana licensing fee will sign up. Will these outfits renew? I have no idea. But “good enough” may be just what these companies want.
Net net: The Sakana.ai Web site includes a how-it-works section, more papers about items “discovered” by the software, and a couple of engineers-do-philosophy-and-ethics write ups. A “full scientific report” is available at https://arxiv.org/abs/2408.06292. I wonder if the software invented itself, wrote the documents, and did the marketing which caught my attention. Maybe?
Stephen E Arnold, August 13, 2024
The Upside of the Google Olympics Ad
August 13, 2024
This essay is the work of a dinobaby. Unlike some folks, no smart software improved my native ineptness.
I learned that Google’s AI advertisements “feel bad for a reason.” And what is that reason? The answer appears in the write up “Those Olympics AI Ads Feel Bad for a Reason. It’s Not Just Google’s ‘Dear Sydney’ Commercial That Feels Soulless and Strange.” (I want to mention that this headline seems soulless and strange, but I won’t.)
The write up reveals the “secret” of the Googler using Google AI to write to his Google progeny:
The latest spate of AI ad campaigns, for their part, have thus far failed to highlight how its products assist what the majority of Americans actually want to use AI for — namely, help with household chores — and instead end up showing how AI will be used for the things that most of us don’t want it to interfere with: our job prospects, our privacy, and experiences and skills that feel uniquely human. If the world already thinks of AI as menacing, wasteful, and yet another example of market overhype, these ads are only confirming our worst fears. No wonder they come off as so thoroughly insufferable.
I apologize for skipping the somewhat ho hum recitation of AI marketing gaffes. I bravely waded through the essay to identify the reason that AI ads make people “feel bad.” Am I convinced?
Nope.
I watched a version of the ad on my laptop. Based on my experience, I thought it was notable that the alleged Googley user remembered he had a family. I was impressed that the Googley father remembered where his Googley child was. I liked the idea of using AI to eliminate the need to craft, unaided, a message with words that connoted interest, caring, and familial warmth.
Let’s face it. The ad was more satisfying than a news story like the one about a dead Google VP on a yacht.
How would Google’s smart software tell this story? I decided to find out. Here is what Gemini 1.5 Pro provided to me. Remember. I am a nerd dinobaby with a reputation for lacking empathy and noted for my work in certain specialized sectors:
It’s been a long time since Dean’s passing, but I wanted to reach out because I was thinking about him and his family. I came across an article about the woman who was with him when he passed. I know this might be a difficult thing to hear about, and I am so very sorry for your loss. Dean was such a bright light in this world, and I know how much he meant to you. Thinking of you during this time.
Amazing. The Googler’s drug death in the presence of a prostitute has been converted to a paragraph I could not possibly write. I would use a phrase like “nuked by horse” instead of “passed.” The phrase “I am so very sorry” is not what I would have been able to craft. My instinct is to say something like “The Googler tried to have fun and screwed up big time.” Finally, never would a nerd dinobaby like me write “thinking of you.” I would write, “Get to your attorney pronto.”
I know that real Googlers are not like nerd dinobabies. Therefore, it is perfectly understandable that the ad presents a version of reality which is not aspirational. It is a way for certain types of professionals to simulate interest and norm-core values.
Let’s praise Google and its AI.
Stephen E Arnold, August 13, 2024
Some Fun with Synthetic Data: Includes a T Shirt
August 12, 2024
This essay is the work of a dumb dinobaby. No smart software required.
Academics and researchers often produce bogus results, fiddle images (remember the former president of Stanford University), or just make up stuff. Despite my misgivings, I want to highlight what appear to be semi-interesting assertions about synthetic data. For those not following the nuances of using real data, doing some mathematical cartwheels, and producing made-up data which are just as good as “real” data, synthetic data for me is associated with Dr. Chris Ré and the Stanford Artificial Intelligence Laboratory (remember the ex-president of Stanford U., please). The term or code word for this approach to information suitable for training smart software is Snorkel. Snorkel became a company. Google embraced Snorkel. The looming litigation and big dollar settlements may make synthetic data a semi big thing in a tech dust devil called artificial intelligence. The T Shirt should read, “Synthetic data are write” like this:
I asked an AI system provided by the global leaders in computer security (yep, that’s Microsoft) to produce a T shirt for a synthetic data team. Great work and clever spelling to boot.
The “research” report appeared in Live Science. “AI Models Trained on Synthetic Data Could Break Down and Regurgitate Unintelligible Nonsense, Scientists Warn” asserts:
If left unchecked, “model collapse” could make AI systems less useful, and fill the internet with incomprehensible babble.
The unchecked term is a nice way of saying that synthetic data are cheap and less likely to become a target for copyright cops.
The article continues:
AI models such as GPT-4, which powers ChatGPT, or Claude 3 Opus rely on the many trillions of words shared online to get smarter, but as they gradually colonize the internet with their own output they may create self-damaging feedback loops. The end result, called “model collapse” by a team of researchers that investigated the phenomenon, could leave the internet filled with unintelligible gibberish if left unchecked.
People who think alike and create synthetic data will prove that “fake” data are as good as or better than “real” data. Why would anyone doubt such glib, well-educated people? Not me! Thanks, MSFT Copilot. Have you noticed similar outputs from your multitudinous AI systems?
In my opinion, the Internet, when compared to commercial databases produced with actual editorial policies, has been filled with “unintelligible gibberish” from the days I showed up at conferences to lecture about how hypertext was different from Gopher and Archie. When Mosaic sort of worked, I included that and left my Next computer at the office.
The write up continues:
As the generations of self-produced content accumulated, the researchers watched their model’s responses degrade into delirious ramblings.
After the data were fed into the system a number of times, the output presented was like this example from the researchers’ tests:
“architecture. In addition to being home to some of the world’s largest populations of black @-@ tailed jackrabbits, white @-@ tailed jackrabbits, blue @-@ tailed jackrabbits, red @-@ tailed jackrabbits, yellow @-.”
The output might be helpful to those interested in church architecture.
Here’s the wrap up to the research report:
This doesn’t mean doing away with synthetic data entirely, Shumailov said, but it does mean it will need to be better designed if models built on it are to work as intended. [Note: Ilia Shumailov, a computer scientist at the University of Oxford, worked on this study.]
I must admit that the write up does not make clear what data were “real” and what data were “synthetic.” I am not sure how the test moved from Wikipedia to synthetic data. I have no idea where the headline originated. Was it synthetic?
Nevertheless, I think one might conclude that using fancy math to make up data that’s as good as real life data might produce some interesting outputs.
Stephen E Arnold, August 12, 2024
Copilot and Hackers: Security Issues Noted
August 12, 2024
This essay is the work of a dinobaby. Unlike some folks, no smart software improved my native ineptness.
The online publication Cybernews ran a story I found interesting. Its title suggests something about Black Hat USA 2024 attendees I had not considered. Here’s the headline:
Black Hat USA 2024: Microsoft’s Copilot Is Freaking Some Researchers Out
Wow. Hackers (black, gray, white, and multi-hued) are “freaking out.” As defined by the estimable Urban Dictionary, “freaking” means:
Obscene dancing which simulates sex by the grinding the of the genitalia with suggestive sounds/movements. often done to pop or hip hop or rap music
No kidding? At Black Hat USA 2024?
Thanks, Microsoft Copilot. Freak out! Oh, your dance moves are good enough.
The article reports:
Despite Microsoft’s claims, cybersecurity researcher Michael Bargury demonstrated how Copilot Studio, which allows companies to build their own AI assistant, can be easily abused to exfiltrate sensitive enterprise data. We also met with Bargury during the Black Hat conference to learn more. “Microsoft is trying, but if we are honest here, we don’t know how to build secure AI applications,” he said. His view is that Microsoft will fix vulnerabilities and bugs as they arise, letting companies using their products do so at their own risk.
Wait. I thought Microsoft has tied cash to security work. I thought security was Job #1 at the company which recently accused Delta Airlines of using outdated technology and failing its customers. Is that the Microsoft that Mr. Bargury is suggesting has zero clue how to make smart software secure?
With MSFT Copilot turning up in places that surprise me, perhaps Microsoft’s great AI push is creating more problems. The SolarWinds glitch was exciting for some, but if Mr. Bargury is correct, cyber security life will become more and more interesting.
Stephen E Arnold, August 12, 2024
Can AI Models Have Abandonment Issues?
August 9, 2024
Gee, it seems the next big thing may now be just … the next thing. Citing research from Gartner, Windows Central asks, “Is GenAI a Dying Fad? A New Study Predicts 30% of Investors Will Jump Ship by 2025 After Proof of Concept.” This is on top of a Reuters Institute report released in May that concluded public “interest” in AI is all hype and very few people are actually using the tools. Writer Kevin Okemwa specifies:
“[Gartner] suggests ‘at least 30% of generative AI (GenAI) projects will be abandoned after proof of concept by the end of 2025.’ The firm attributes its predictions to poor data quality, a lack of guardrails to prevent the technology from spiraling out of control, and high operation costs with no clear path to profitability.”
For example, the article reminds us, generative AI leader OpenAI is rumored to be facing bankruptcy. Gartner Analyst Rita Sallam notes that, while executives are anxious for the promised returns on AI investments, many companies struggle to turn these projects into profits. Okemwa continues:
“Gartner’s report highlights the challenges key players have faced in the AI landscape, including their inability to justify the substantial resources ranging from $5 million to $20 million without a defined path to profitability. ‘Unfortunately, there is no one size fits all with GenAI, and costs aren’t as predictable as other technologies,’ added Sallam. According to Gartner’s report, AI requires ‘a high tolerance for indirect, future financial investment criteria versus immediate return on investment (ROI).’“
That must come as a surprise to those who banked on AI hype and expected massive short-term gains. Oh well. So, what will the next next big thing be?
Cynthia Murrell, August 9, 2024
DeepMind Explains Imagination, Not the Google Olympic Advertisement
August 8, 2024
This essay is the work of a dinobaby. Unlike some folks, no smart software improved my native ineptness.
I admit it. I am suspicious of Google “announcements,” ArXiv papers, and revelations about the quantumly supreme outfit. I keep remembering the Google VP dead on a yacht with a special contract worker. I know about the Googler who tried to kill herself because a dalliance with a Big Time Google executive went off the rails. I know about the baby making among certain Googlers in the legal department. I know about the behaviors which the US Department of Justice described as “monopolistic.”
When I read “What Bosses Miss about AI,” I thought immediately about Google’s recent mass market televised advertisement about uses of Google artificial intelligence. The set up is that a father (obviously interested in his progeny) turned to Google’s generative AI to craft an electronic message to the humanoid. I know “quality time” is often tough to accommodate, but an email?
The Googler who allegedly wrote the cited essay has a different take on how to use smart software. First, most big-time thinkers are content with AI performing cost-reduction activities. AI is less expensive than a humanoid. These entities require health care, retirement, a shoulder upon which to cry (a key function for personnel in the human relations department), and time off.
Another type of big-time thinker grasps the idea that smart software can make processes more efficient. The write up describes this as the “do what we do, just do it better” approach to AI. The assumption is that the process is neutral, and it can be improved. Imagine the value of AI to Vlad the Impaler!
The third category of really Big Thinker is the leader who can use AI for imagination. I like the idea of breaking a chaotic mass of use cases into categories anchored to the Big Thinkers who use the technology.
However, I noted what I think is unintentional irony in the write up. This chart shows the non-AI approach to doing what leadership is supposed to do:
What happens when a really Big Thinker uses AI to zip through this type of process? The acceleration is delivered by AI. In this Googler’s universe, I think one can assume Google’s AI plays a modest role. Here’s the payoff paragraph:
Traditional product development processes are designed based on historical data about how many ideas typically enter the pipeline. If that rate is constant or varies by small amounts (20% or 50% a year), your processes hold. But the moment you 10x or 100x the front of that pipeline because of a new scientific tool like AlphaFold or a generative AI system, the rest of the process clogs up. Stage 1 to Stage 2 might be designed to review 100 items a quarter and pass 5% to Stage 2. But what if you have 100,000 ideas that arrive at Stage 1? Can you even evaluate all of them? Do the criteria used to pass items to Stage 2 even make sense now? Whether it is a product development process or something else, you need to rethink what you are doing and why you are doing it. That takes time, but crucially, it takes imagination.
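The funnel arithmetic in the quoted paragraph is easy to check. A two-line sketch, using only the numbers from the quote itself:

```python
# The quoted paragraph's pipeline math: a review stage designed for 100 ideas
# per quarter with a 5% pass rate, then hit with an AI-driven surge of ideas.

def stage2_intake(ideas_per_quarter, pass_rate=0.05):
    """How many items Stage 1 passes along to Stage 2."""
    return int(ideas_per_quarter * pass_rate)

print(stage2_intake(100))      # 5 -- the load the process was designed to handle
print(stage2_intake(100_000))  # 5000 -- the load after a 1000x surge at the front
```

The pass rate does not change, but the absolute volume reaching Stage 2 jumps a thousandfold, which is the “clogged pipeline” the Googler describes.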
Let’s think about this advice and consider the imagination component of the Google Olympics’ advertisement.
- Google implemented a process, spent money, did “testing,” ran the advert, and promptly withdrew it. Why? The ad was annoying to humanoids.
- Google’s “imagination” did not work. Perhaps this is a failure of the Google AI and the Google leadership? The advert succeeded in making Google the focal point of some good, old-fashioned, quite humanoid humor. Laughing at Google AI is certainly entertaining, but it appears to have been something that Google’s leadership could not “imagine.”
- The Google AI obviously reflects Google engineering choices. The parent who must turn to Google AI to demonstrate love, parental affection, and support to one’s child is, in my opinion, quite Googley. Whether the action is human or not might be an interesting topic for a coffee shop discussion. For non-Googlers, the idea of talking about what many perceived as stupid, insensitive, and inhumane is probably a non-starter. Just post on social media and move on.
Viewed in a larger context, the cited essay makes it clear that Googlers embrace AI. Googlers see others’ reaction to AI as ranging from doltish to informed. Google liked the advertisement well enough to pay other companies to show the message.
I suggest the following: Google leadership should ask several AI systems if proposed advertising copy can be more economical. That’s a Stage 1 AI function. Then Google leadership should ask several AI systems how the process of creating the ideas for an advertisement can be improved. That’s a Stage 2 AI function. And, finally, Google leadership should ask, “What can we do to prevent bonkers problems resulting from trying to pretend we understand people who know nothing and care less about the three ‘stages’ of AI understanding?”
Will that help out the Google? I don’t need to ask an AI system. I will go with my instinct. The answer is, “No.”
That’s one of the challenges Google faces. The company seems unable to help itself do anything other than sell ads, promote its AI system, and cruise along in quantumly supremeness.
Stephen E Arnold, August 8, 2024
Train AI on Repetitive Data? Sure, Cheap, Good Enough, But, But, But
August 8, 2024
We already know that AI algorithms are only as smart as the data that trains them. If the data models are polluted with bias such as racism and sexism, the algorithms will deliver polluted results. We’ve also learned that some of these models are biased because of innocent ignorance. Nature has revealed that AI algorithms have yet another weakness: “AI Models Collapse When Trained On Recursively Generated Data.”
Generative text AI, aka large language models (LLMs), is already changing the global landscape. While generative AI is still in its infancy, AI developers are already designing the next generation. There’s one big problem: the training data. The first versions of ChatGPT were trained on data models that scraped content from the Internet. GPT continues to train on models using the same scraping methods, but it’s creating a problem:
“If the training data of most future models are also scraped from the web, then they will inevitably train on data produced by their predecessors. In this paper, we investigate what happens when text produced by, for example, a version of GPT forms most of the training dataset of following models. What happens to GPT generations GPT-{n} as n increases? We discover that indiscriminately learning from data produced by other models causes ‘model collapse’—a degenerative process whereby, over time, models forget the true underlying data distribution, even in the absence of a shift in the distribution over time.”
The generative AI algorithms are learning from copies of copies. Over time the integrity of the information fails. The research team behind the Nature paper discovered that model collapse is inevitable even under the most ideal conditions. The team did identify two possible explanations for model collapse: intentional data poisoning and task-free continual learning. Those do not explain recursive data collapse in models free of those events.
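The degenerative loop can be caricatured in a few lines of code. This is a deliberately crude stand-in for the paper’s setup, not the researchers’ actual experiment: the “model” here simply underweights the rare tail of its training data, and each generation trains on the previous generation’s output:

```python
# Toy caricature of "model collapse": each generation is "trained" on the
# previous generation's output, and the learner keeps only its most frequent
# tokens -- a crude stand-in for a model that loses the tails of the true
# data distribution. All tokens and counts are invented for illustration.

from collections import Counter

def retrain(corpus, vocab_cap):
    """A 'model' that retains only its vocab_cap most common tokens."""
    keep = {tok for tok, _ in Counter(corpus).most_common(vocab_cap)}
    return [tok for tok in corpus if tok in keep]

# "Real" data: mostly common words plus a small tail of rare ones.
corpus = ["the"] * 50 + ["cat"] * 30 + ["quokka"] * 15 + ["axolotl"] * 5

for generation in range(3):
    corpus = retrain(corpus, vocab_cap=3)

print(sorted(set(corpus)))  # ['cat', 'quokka', 'the'] -- the rarest token is gone
```

The rare token never returns once dropped, which is the one-way street the paper describes: information lost in one generation is unavailable to every generation after it.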
The team concluded that the best way for generative text AI algorithms to learn was continual interaction learning from humans. In other words, the LLMs need constant, new information created by humans to replicate their behavior. It’s simple logic when you think about it.
Whitney Grace, August 8, 2024
Publishers Perplexed with Perplexity
August 7, 2024
In an about-face, reports Engadget, “Perplexity Will Put Ads in Its AI Search Engine and Share Revenue with Publishers.” The ads part we learned about in April, but this revenue sharing bit is new. Is it a response to recent accusations of unauthorized scraping and plagiarism? Nah, the firm insists, the timing is just a coincidence. While Perplexity won’t reveal how much of the pie they will share with publishers, the company’s chief business officer Dmitry Shevelenko described it as a “meaningful double-digit percentage.” Engadget Senior Editor Pranav Dixit writes:
“‘[Our revenue share] is certainly a lot more than Google’s revenue share with publishers, which is zero,’ Shevelenko said. ‘The idea here is that we’re making a long-term commitment. If we’re successful, publishers will also be able to generate this ancillary revenue stream.’ Perplexity, he pointed out, was the first AI-powered search engine to include citations to sources when it launched in August 2022.”
Defensive much? Dixit reminds us Perplexity redesigned that interface to feature citations more prominently after Forbes criticized it in June.
Several AI companies now have deals to pay major publishers for permission to scrape their data and feed it to their AI models. But Perplexity does not train its own models, so it is taking a piece-work approach. It will also connect advertisements to searches. We learn:
“‘Perplexity’s revenue-sharing program, however, is different: instead of writing publishers large checks, Perplexity plans to share revenue each time the search engine uses their content in one of its AI-generated answers. The search engine has a ‘Related’ section at the bottom of each answer that currently shows follow-up questions that users can ask the engine. When the program rolls out, Perplexity plans to let brands pay to show specific follow-up questions in this section. Shevelenko told Engadget that the company is also exploring more ad formats such as showing a video unit at the top of the page. ‘The core idea is that we run ads for brands that are targeted to certain categories of query,’ he said.”
The write-up points out the firm may have a tough time breaking into an online ad business dominated by Google and Meta. Will publishers hand over their content in the hope Perplexity is on the right track? Launched in 2022, the company is based in San Francisco.
Cynthia Murrell, August 7, 2024