The Zuck: Limited by Regulation. Is This a Surprise?

September 25, 2024

Privacy laws in the EU are having an effect on Meta’s actions in that region. That’s great. But what about the rest of the world? When pressed by Australian senators, the company’s global privacy director Melinda Claybaugh fessed up. “Facebook Admits to Scraping Every Australian Adult User’s Public Photos and Posts to Train AI, with No Opt-Out Option,” reports ABC News. Journalist Jake Evans writes:

“Labor senator Tony Sheldon asked whether Meta had used Australian posts from as far back as 2007 to feed its AI products, to which Ms Claybaugh responded ‘we have not done that’. But that was quickly challenged by Greens senator David Shoebridge. Shoebridge: ‘The truth of the matter is that unless you have consciously set those posts to private since 2007, Meta has just decided that you will scrape all of the photos and all of the texts from every public post on Instagram or Facebook since 2007, unless there was a conscious decision to set them on private. That’s the reality, isn’t it?’ Claybaugh: ‘Correct.’ Ms Claybaugh added that accounts of people under 18 were not scraped, but when asked by Senator Sheldon whether public photos of his own children on his account would be scraped, Ms Claybaugh acknowledged they would. The Facebook representative could not answer whether the company scraped data from previous years of users who were now adults, but were under 18 when they created their accounts.”

Why do users in Australia not receive the same opt-out courtesy those in the EU enjoy? Simple, responds Ms. Claybaugh—their government has not required it. Not yet, anyway. But Privacy Act reforms are in the works there, a response to a 2020 review that found laws to be outdated. The updated legislation is expected to be announced in August—four years after the review was completed. Ah, the glacial pace of bureaucracy. Better late than never, one supposes.

Cynthia Murrell, September 25, 2024

Written by Stephen E. Arnold · Filed Under Facebook, Government, News | 1 Comment

Consistency Manifested by Mr. Musk and the Delightfully Named X.com

September 25, 2024

This essay is the work of a dumb dinobaby. No smart software required.

You know how to build credibility: Be consistent, be sort of nice, be organized. I found a great example of what might be called anti-credibility in “Elon Rehires lawyers in Brazil, Removes Accounts He Insisted He Wouldn’t Remove.” The write up says:

Elon Musk fought the Brazilian law, and it looks like the Brazilian law won. After making a big show of how he was supposedly standing up for free speech, Elon caved yet again.

The article interprets the show of inconsistency and the abrupt about face this way:

So, all of this sounds like Elon potentially realizing that he did his “oh, look at me, I’m a free speech absolutist” schtick, it caused ExTwitter to lose a large chunk of its userbase, and now he’s back to playing ball again. Because, like so much that he’s done since taking over Twitter, he had no actual plan to deal with these kinds of demands from countries.

I agree, but I think the action illustrates a very significant point about Mr. Musk and possibly sheds light on how other US tech giants who get in regulatory trouble and lose customers will behave. Specifically, they knock off the master of the universe attitude and adopt the “scratch my belly” demeanor of a French bulldog wanting to be liked.

The failure to apply sanctions on companies which willfully violate a nation state’s laws has been one key to the rise of the alleged monopolies spawned in the US. Once a country takes action, the trilling from the French bulldog signals a behavioral change.

Now flip this around. Why do some regulators have an active dislike for some US high technology firms? The lack of respect for the law and the attitude of US super moguls might help answer the question.

I am certain many government officials find the delightfully named X.com and the mercurial Mr. Musk a topic of conversation. No wonder some folks love X.com so darned much. The approach used in Brazil and France hopefully signals consequences for those outfits who believe no mere nation state can do anything significant.

Stephen E Arnold, September 25, 2024

Written by Stephen E. Arnold · Filed Under Government, Legal matters, News, Social Media | Leave a Comment

Amazon Has a Better Idea about Catching Up with Other AI Outfits

September 25, 2024

AWS Program to Bolster 80 AI Startups from Around the World

Can boosting a roster of little-known startups help AWS catch up with Google’s and Microsoft’s AI successes? Amazon must hope so. It just tapped 80 companies from around the world to receive substantial support in its AWS Global Generative AI Accelerator program. Each firm will receive up to $1 million in AWS credits, expert mentorship, and a slot at the AWS re:Invent conference in December.

India’s CXOtoday is particularly proud of the seven recipients from that country. It boasts, “AWS Selects Seven Generative AI Startups from India for Global AWS Generative AI Accelerator.” We learn:

“The selected Indian startups— Convrse, House of Models, Neural Garage, Orbo.ai, Phot.ai, Unscript AI, and Zocket, are among the 80 companies selected by AWS worldwide for their innovative use of AI and their global growth ambitions. The Indian cohort also represents the highest number of startups selected from a country in the Asia-Pacific region for the AWS Global Generative AI Accelerator program.”

The post offers this stat as evidence India is now an AI hotspot. It also supplies some more details about the Amazon program:

“Selected startups will gain access to AWS compute, storage, and database technologies, as well as AWS Trainium and AWS Inferentia2, energy-efficient AI chips that offer high performance at the lowest cost. The credits can also be used on Amazon SageMaker, a fully managed service that helps companies build and train their own foundation models (FMs), as well as to access models and tools to easily and securely build generative AI applications through Amazon Bedrock. The 10-week program matches participants with both business and technical mentors based on their industry, and chosen startups will receive up to US$1 million each in AWS credits to help them build, train, test, and launch their generative AI solutions. Participants will also have access to technology and technical sessions from program presenting partner NVIDIA.”

See the write-up to learn more about each of the Indian startups selected, or check out the full roster here.

The question is, “Will this help Amazon which is struggling to make Facebook, Google, and Microsoft look like the leaders in the AI derby?”

Cynthia Murrell, September 25, 2024

Written by Stephen E. Arnold · Filed Under AI, Amazon, Business strategy, News | Leave a Comment

Open Source Dox Chaos: An Opportunity for AI

September 24, 2024

It is a problem as old as the concept of open source itself. ZDNet laments, “Linux and Open-Source Documentation Is a Mess: Here’s the Solution.” We won’t leave you in suspense. Writer Steven Vaughan-Nichols’ solution is the obvious one—pay people to write and organize good documentation. Less obvious is who will foot the bill. Generous donors? Governments? Corporations with their own agendas? That question is left unanswered.

But there is no doubt. Open-source documentation, when it exists at all, is almost universally bad. Vaughan-Nichols recounts:

“When I was a wet-behind-the-ears Unix user and programmer, the go-to response to any tech question was RTFM, which stands for ‘Read the F… Fine Manual.’ Unfortunately, this hasn’t changed for the Linux and open-source software generations. It’s high time we addressed this issue and brought about positive change. The manuals and almost all the documentation are often outdated, sometimes nearly impossible to read, and sometimes, they don’t even exist.”

Not only are the manuals that have been cobbled together outdated and hard to read, they are often so disorganized it is hard to find what one is looking for. Even when it is there. Somewhere. The post emphasizes:

“It doesn’t help any that kernel documentation consists of ‘thousands of individual documents’ written in isolation rather than a coherent body of documentation. While efforts have been made to organize documents into books for specific readers, the overall documentation still lacks a unified structure. Steve Rostedt, a Google software engineer and Linux kernel developer, would agree. At last year’s Linux Plumbers conference, he said, ‘when he runs into bugs, he can’t find documents describing how things work.’ If someone as senior as Rostedt has trouble, how much luck do you think a novice programmer will have trying to find an answer to a difficult question?”

This problem is no secret in the open-source community. Many feel so strongly about it they spend hours of unpaid time working to address it. Until they just cannot take it anymore. It is easy to get burned out when one is barely making a dent and no one appreciates the effort. At least, not enough to pay for it.

Here at Beyond Search we have a question: Why can’t Microsoft’s vaunted Copilot tackle this information problem? Maybe Copilot cannot do the job?

Cynthia Murrell, September 24, 2024

Written by Stephen E. Arnold · Filed Under AI, Microsoft, News, Open source | 1 Comment

Guess What? Most Conferences Leak High Value Information

September 24, 2024

This essay is the work of a dumb dinobaby. No smart software required.

I read the Wired “real news” article titled “Did a Chinese University Hacking Competition Target a Real Victim?” The main idea of the article is that a conference attracted security professionals. To spice up the person-talking approach to conferences, “games” were organized. The article makes clear that the conference and the activities could have been, and maybe were, a way for some people involved with and at the conference to obtain high-value information.

News flash! A typical conference setting. Everyone is listening for hot info. Thanks, MSFT Copilot. Good enough.

I have a “real news” flash for the folks at Wired. Any conference — including those with restricted attendance or special security checks — can be vectors for exfiltration of high-value information. After one lecture I delivered at a flashy public conference, a person who identified himself as a business professional wanted to invite me to give lectures in a country not in the EU. I listened. I asked questions. I received only fuzzy wuzzy answers. I did hear all expenses paid and an honorarium. I explained that I was a dinobaby. I wanted more details before I could say yes or no. I told the gentleman I had a meeting and had to get to that commitment. How often has that happened to me? At one conference I attended for six or seven years, a similar conversation took place with me and a business professional every time I gave a lecture.

Within the last 12 months, one of my talks was converted into an email from someone in the audience and a “real” journalist. Some of my team’s findings appeared without attribution in one of the few remaining big name online publications. Based on my experience alone, I think attending conferences related to any “hot” technical subject is going to be like a freshly grilled Trader Joe’s veggie burger to a young-at-heart member of the Diptera clan (that’s a house fly, but you probably know that).

Let me offer several observations which may be of use to people speaking at public, semi-public, or restricted events:

Make darned sure you are not providing high-value actionable information. If one is not self aware, speakers get excited and do a core dump. The people seeking information for a purpose the speaker has not intended just write it down and snap mobile phone pix of the visuals. If a speaker says something of utility, that information is gone and can make its way into the hands of competitors, bad actors, or enemies of one nation state or another. The burden is on the attendee. Period.
If handouts are provided, make certain these do not contain the complete information payload. If I prepare what I call a feuilles détachées, these are sanitized by omitting specific details. The general idea is expressed, but the good stuff is omitted. In short, neuter what is publicly available.
Research the conference. Know before you go. If the conference is “secure,” you will have to chase down one of the disorganized and harried organizers and ask them to read you the names of the companies or agencies which sent representatives.
Find out who the exhibitors are. Often some names appear on the conference Web site, but others — often some interesting outfits — don’t want any publicity. The conference is a way to learn what competitors are doing, identify prospects, pick up high value information, and recruit people to do work that can get them in some interesting conversations. Who knows? Maybe that consulting job dangled in front of a clueless attendee is a way to penetrate an organization?
Leveraging conferences for intelligence is standard operating procedure.

Net net: Answer the question, “What’s the difference between high-value information and marketing baloney?” Here’s my response: “A failure to know or anticipate what the other person knows and needs.” This is not news. It is common sense.

Stephen E Arnold, September 24, 2024

Written by Stephen E. Arnold · Filed Under Business intelligence, Conferences, News | Leave a Comment

Open Podcast Index Lists Many

September 24, 2024

Podcasters who wish to be indexed by Apple or Spotify must abide by certain guidelines, some of which appear arbitrary or self-serving to some. Enter the Podcast Index, introduced by long-time broadcaster turned “podfather,” Adam Curry. The site follows the open-source tradition, promising:

“The Podcast Index is here to preserve, protect and extend the open, independent podcasting ecosystem. We do this by enabling developers to have access to an open, categorized index that will always be available for free, for any use. … Podcast Index LLC is a software developer focused partnership that provides tools and data to anyone who aspires to create new and exciting Podcast experiences without the heavy lifting of indexing, aggregation and data management.”

Funded by its founders and by donations, the site aims to list every available podcast so would-be listeners need not rely on commercial firms to discover them. This goal is emphasized by a running tally on the homepage, which counts over four million (!) podcasts listed as of this writing. One can filter and browse the many supporting apps, directories, and hosting companies here. Developers can sign up to use the API here. And, of course, donations can be made through the red button at the foot of the home page. For anyone wondering how to put content from around the world in their ears, this is a good place to start.

Cynthia Murrell, September 24, 2024

Written by Stephen E. Arnold · Filed Under News, Rich media | Leave a Comment

Zapping the Ghost Comms Service

September 23, 2024

This essay is the work of a dumb dinobaby. No smart software required.

Europol generated a news release titled “Global Coalition Takes Down New Criminal Communication Platform.” One would think that bad actors would have learned a lesson from the ANOM operation and from the take downs of other specialized communication services purpose built for bad actors. The Europol announcement explains:

Europol and Eurojust, together with law enforcement and judicial authorities from around the world, have successfully dismantled an encrypted communication platform that was established to facilitate serious and organized crime perpetrated by dangerous criminal networks operating on a global scale. The platform, known as Ghost, was used as a tool to carry out a wide range of criminal activities, including large-scale drug trafficking, money laundering, instances of extreme violence and other forms of serious and organized crime.

Eurojust, as you probably know, is the EU’s agency responsible for dealing with judicial cooperation in criminal matters among agencies. The entity was set up in 2002 and concerns itself with serious crime and with cutting through the red tape to bring alleged bad actors to court. The dynamic of Europol and Eurojust is to investigate and prosecute with efficiency.

Two cyber investigators recognize that the bad actors can exploit the information environment to create more E2EE systems. Thanks, MSFT Copilot. You do a reasonable job of illustrating chaos. Good enough.

The marketing-oriented name of the system is or rather was Ghost. Here’s how Europol describes the system:

Users could purchase the tool without declaring any personal information. The solution used three encryption standards and offered the option to send a message followed by a specific code which would result in the self-destruction of all messages on the target phone. This allowed criminal networks to communicate securely, evade detection, counter forensic measures, and coordinate their illegal operations across borders. Worldwide, several thousand people used the tool, which has its own infrastructure and applications with a network of resellers based in several countries. On a global scale, around one thousand messages are being exchanged each day via Ghost.

With law enforcement compromising certain bad actor-centric systems like Ghost, what are the consequences of these successful shutdowns? Here’s what Europol says:

The encrypted communication landscape has become increasingly fragmented as a result of recent law enforcement actions targeting platforms used by criminal networks. Following these operations, numerous once-popular encrypted services have been shut down or disrupted, leading to a splintering of the market. Criminal actors, in response, are now turning to a variety of less-established or custom-built communication tools that offer varying degrees of security and anonymity. By doing so, they seek new technical solutions and also utilize popular communication applications to diversify their methods. This strategy helps these actors avoid exposing their entire criminal operations and networks on a single platform, thereby mitigating the risk of interception. Consequently, the landscape of encrypted communications remains highly dynamic and segmented, posing ongoing challenges for law enforcement.

Nevertheless, some entities want to create secure apps designed to allow criminal behaviors to thrive. These range from “me too” systems like one allegedly in development by a known bad actor to knock offs of sophisticated hardware-software systems which operate within the public Internet. Are bad actors more innovative than the whiz kids at the largest high-technology companies? Nope. Based on my team’s research, notable sources of ideas to create problems for law enforcement include:

Scanning patent applications for nifty ideas. Modern patent search systems make the identification of novel ideas reasonably straightforward
Hiring one or more university staff to identify and get students to develop certain code components as part of a normal class project
Using open source methods and coming up with ad hoc ways to obfuscate what’s being done. (Hats off to the open source folks, of course.)
Buying technology from middle “men” who won’t talk about their customers. (Is that too much information, Mr. Oligarch’s tech expert?)

Like much in today’s digital world or what I call the datasphere, each successful takedown provides limited respite. The global cat-and-mouse game between government authorities and bad actors is what some at the Santa Fe Institute might call “emergent behavior” at the boundary between entropy and chaos. That’s a wonderful insight despite suggesting another consequence of living at the edge of chaos.

Stephen E Arnold, September 23, 2024

Written by Stephen E. Arnold · Filed Under cybercrime, law enforcement, News | 1 Comment

Microsoft Explains Who Is at Fault If Copilot Smart Software Does Dumb Things

September 23, 2024

This essay is the work of a dumb dinobaby. No smart software required.

Those Windows Central experts have delivered a Dusie of a write up. “Microsoft Says OpenAI’s ChatGPT Isn’t Better than Copilot; You Just Aren’t Using It Right, But Copilot Academy Is Here to Help” explains:

Avid AI users often boast about ChatGPT’s advanced user experience and capabilities compared to Microsoft’s Copilot AI offering, although both chatbots are based on OpenAI’s technology. Earlier this year, a report disclosed that the top complaint about Copilot AI at Microsoft is that “it doesn’t seem to work as well as ChatGPT.”

I think I understand. Microsoft uses OpenAI, other smart software, and home brew code to deliver Copilot in apps, the browser, and Azure services. However, users have reported that Copilot doesn’t work as well as ChatGPT. That’s interesting. Hallucination-capable software processed by the Microsoft engineering legions is allegedly inferior to ChatGPT.

Enthusiastic young car owners replace individual parts. But the old car remains an old, rusty vehicle. Thanks, MSFT Copilot. Good enough. No, I don’t want to attend a class to learn how to use you.

Who is responsible? The answer certainly surprised me. Here’s what the Windows Central wizards offer:

A Microsoft employee indicated that the quality of Copilot’s response depends on how you present your prompt or query. At the time, the tech giant leveraged curated videos to help users improve their prompt engineering skills. And now, Microsoft is scaling things a notch higher with Copilot Academy. As you might have guessed, Copilot Academy is a program designed to help businesses learn the best practices when interacting and leveraging the tool’s capabilities.

I think this means that the user is at fault, not Microsoft’s refactored version of OpenAI’s smart software. The fix is for the user to learn how to write prompts. Microsoft is not responsible. But OpenAI’s implementation of ChatGPT is perceived as better. Furthermore, training to use ChatGPT is left to third parties. I hope I am close to the pin on this summary. OpenAI just puts Strawberries in front of hungry users and lets them gobble up ChatGPT output. Microsoft fixes up ChatGPT and users are allegedly not happy. Therefore, Microsoft puts the burden on the user to learn how to interact with the Microsoft version of ChatGPT.

I thought smart software was intended to make work easier and more efficient. Why do I have to go to school to learn Copilot when I can just pound text or a chunk of data into ChatGPT, click a button, and get an output? Not even a Palantir boot camp will lure me to the service. Sorry, pal.

My hypothesis is that Microsoft is a couple of steps away from creating something designed for regular users. In its effort to “improve” ChatGPT, the experience of using Copilot makes the user’s life more miserable. I think Microsoft’s own engineering practices act like a stuck brake on an old Lada. The vehicle has problems, so installing a new master cylinder does not improve the automobile.

Crazy thinking: That’s what the write up suggests to me.

Stephen E Arnold, September 23, 2024

Written by Stephen E. Arnold · Filed Under AI, Business process, Microsoft, News | Leave a Comment

Losing Knowledge: Yep and No One Does Much Except Sue to Prevent Archiving

September 23, 2024

Archives are bastions of history. What’s great about archives is that they physically store items for historical preservation and researchers can visit them. When the Internet popped up, there wasn’t a digital archive to preserve everything on the World Wide Web. True, there’s the Internet Archive and other independent organizations, but according to the BBC there’s trouble brewing: “We’re Losing Our Digital History. Can The Internet Archive Save It?”

The Internet Archive has been around since 1996 and has done a phenomenal job archiving defunct Web sites, but external threats such as financial issues, technical challenges, legal battles with IP owners, and cyberattacks are big problems. There’s an even bigger problem for the Internet Archive. Most organizations and individuals keep their content in digital environments and those are fragile. With a single button press or a solar flare, that content can disappear forever.

The Internet should be archived so we understand its evolution. It is also the most widely used resource in the world. Information on the Internet is a reflection of humanity like newspapers, magazines, radio, television, and movies. Despite all the backups and servers, its fragility is worse than that of past mediums. Preserving the Internet is an uphill battle and individuals are usually better at it than organizations:

“ ‘If you have to keep everything, it becomes very expensive,’ says Jackson of the Digital Preservation Coalition. ‘There’s often older content or less compelling content [that] gets lost by the wayside,’ he says. ‘We’re not capturing the non-Western world well,’ admits Jackson. ‘There are gaps now around incompleteness in different cultural domains.’ And while many of those organisations work to fight against their biases and prejudices, they’re often left to carry the weight of the task while governments and the companies that run the platforms and websites sit by. ‘Independent groups of people, who are just caring about it and are willing to spend their free time doing it, are better resourced and more highly skilled than the institutions which are formally responsible,’ says Jackson.”

Are they doomed? Maybe.

Who will the heroes be? The digital hoarders. They’re like physical hoarders who have OCD, except they keep digital records. I’m sensing the foundation of an Internet Archive Museum if lawyers permit such an activity.

Whitney Grace, September 23, 2024

Written by Stephen E. Arnold · Filed Under Legal matters, News | Leave a Comment

DAIS: A New Attempt to Make AI Play Nicely with Humans

September 20, 2024

This essay is the work of a dumb dinobaby. No smart software required.

How about a decentralized artificial intelligence “association”? One has been set up by Michael Casey, the former chief content officer at Coindesk. (Coindesk reports about the bright, sunny world of crypto currency and related topics.) I learned about this society in, you guessed it, Coindesk’s online information service called Coindesk. The article “Decentralized AI Society Launched to Fight Tech Giants Who ‘Own the Regulators’” is interesting. I like the idea that “tech giants” own the regulators. This is an observation with which Apple and Google might not agree. Both “tech giants” have been facing some unfavorable regulatory decisions. If these regulators are “owned,” I think the “tech giants” need to exercise their leadership skills to make the annoying regulators go away. One resigned in the EU this week, but as Shakespeare said of lawyers, let’s drown them. So far the “tech giants” have been bumbling along, growing bigger as a result of feasting on data and amplifying allegedly monopolistic behaviors which just seem to pop up, rules or no rules.

Two experts look at what emerged from a Petri dish of technological goodies. Quite a surprise I assume. Thanks, MSFT Copilot. Good enough.

The write up reports:

Industry leaders have launched a non-profit organization called the Decentralized AI Society (DAIS), dedicated to tackling the probability of the monopolization of the artificial intelligence (AI) industry.

What is the DAIS outfit setting out to do? Here’s what Coindesk reports and this is a quote of the bullets from the write up:

Bringing capital to the decentralized AI world in what has already become an arms race for resources like graphical processing units (GPUs) and the data centers that compute together.

Shaping policy to craft AI regulations.

Education and promotion of decentralized AI.

Engineering to create new algorithms for learning models in a distributed way.

These are interesting targets. I want to point out that “decentralization” is the opposite of what the “tech giants” have already put in place; that is, concentration of money, talent, and infrastructure. Even old dogs like Oracle are now hopping on the centralized bandwagon. Even newcomers want to get as many cattle into the killing chute before the glamor of AI begins to lose some sparkles.

Several observations:

DAIS has some crypto roots. These may become positive or negative. Right now regulators are interested in crypto as are other enforcement entities.
One of the Arnold Laws of Online is that centralization, consolidation, and concentration are emergent behaviors for online products and services. Countering this “law” and its “emergent” functionality is going to take more than conferences, a Web site, and some “logical” ideas which any “rational” person would heartily endorse. But emergent is tough to stop based on my experience.
Singapore has become a hot spot for certain financial and technical activities. The problem is that nation-states may not want to be inhibited in their AI ambitions. Some may find the notion of “education” a problem as well because curricula must conform to pre-defined frameworks. Distributed is not a pre-defined anything; it is the opposite of controlled and, therefore, likely to be a bit of a problem.

Net net: Interesting idea. But Amazon, Google, Facebook, Microsoft, and some other outfits may want to talk about “distributed” but really mean the technological notion is okay, but we want as much of the money as we can get.

Stephen E Arnold, September 20, 2024

Written by Stephen E. Arnold · Filed Under AI, News, Technology | Leave a Comment

Stephen E. Arnold monitors search, content processing, text mining and related topics from his high-tech nerve center in rural Kentucky. He tries to winnow the goose feathers from the giblets. He works with colleagues worldwide to make this Web log useful to those who want to go "beyond search". Contact him at sa [at] arnoldit.com. His Web site with additional information about search is arnoldit.com.