Directories Have Value
November 29, 2024
Why would one build an online directory—to create a helpful reference? Or for self-aggrandizement? Maybe both. HackerNoon shares a post by developer Alexander Isora, “Here’s Why Owning a Directory = Owning a Free Infinite Marketing Channel.”
First, he explains why users are drawn to a quality directory on a particular topic: because humans are better than Google’s algorithm at determining relevant content. No argument here. He uses his own directory of Stripe alternatives as an example:
“Why my directory is better than any of the top pages from Google? Because in the SERP [Search Engine Results Page], you will only see articles written by SEO experts. They have no idea about billing systems. They never managed a SaaS. Their set of links is 15 random items from Crunchbase or Product Hunt. Their article has near 0 value for the reader because the only purpose of the article is to bring traffic to the company’s blog. What about mine? I tried a bunch of Stripe alternatives myself. Not just signed up, but earned thousands of real cash through them. I also read 100s of tweets about the experiences of others. I’m an expert now. I can even recognize good ones without trying them. The set of items I published is WAY better than any of the SEO-optimized articles you will ever find on Google. That is the value of a directory.”
Okay, so that is why others would want a subject-matter expert to create a directory. But what is in it for the creator? Why, traffic, of course! A good directory draws eyeballs to one’s own products and services, the post asserts, or one can sell ads for a passive income. One could even sell a directory (to whom?) or turn it into its own SaaS if it is truly popular.
Perhaps ironically, Isora’s next step is to optimize his directories for search engines. Sounds like a plan.
Cynthia Murrell, November 29, 2024
FOGINT: Security Tools Over Promise & Under Deliver
November 22, 2024
While the United States and the rest of the world have been obsessed with the fallout of the US presidential election, bad actors planned terrorist plots. I24 News reports that after a soccer/football match in Amsterdam, there was a preplanned attack on Israeli fans: “Evidence From WhatsApp, Telegram Groups Shows Amsterdam Pogrom Was Organized.”
The Daily Telegraph located screenshots from WhatsApp and Telegram that displayed messages calling for a “Jew Hunt” after the game. The message writers were identified as pro-Palestinian supporters. The bad actors also called Jews “cancer dogs,” a vile slur in Dutch, and told co-conspirators to bring fireworks to the planned attack. Dutch citizens and other observers were underwhelmed with the response of the Netherlands’ law enforcement. Even King Willem-Alexander noted that his country failed to protect the Jewish community when he spoke with Israeli President Isaac Herzog:
“Dutch King Willem-Alexander reportedly said to Israel’s President Isaac Herzog in a phone call on Friday morning that ‘we failed the Jewish community of the Netherlands during World War II, and last night we failed again.’”
This is an unfortunate example of the failure of cyber security tools that monitor social media. If this was a preplanned attack and the Daily Telegraph located the messages, then a cyber security company should have as well. These policeware and intelware systems failed to alert authorities. Is this another confirmation that cyber security and threat intelligence tools over promise and under deliver? Well, T-Mobile is compromised again, and there is that minor lapse in Israel in October 2023.
Whitney Grace, November 22, 2024
The Bezos Bulldozer Could Stall in a Nuclear Fuel Pool
November 11, 2024
Sorry to disappoint you, but this blog post is written by a dumb humanoid. The art? We used MidJourney.
Microsoft is going to flip a switch, and one of Three Mile Island’s nuclear units will blink on. Yeah. Google is investing in small nuclear power units. Buy one, haul it to the data center of your choice, and plug it in. Shades of Tesla thinking. Amazon has also been fascinated by Cherenkov radiation, which is blue like Jack Benny’s eyes.
A physics amateur learned about 880 volts by reading books on his Kindle. Thanks, MidJourney. Good enough.
Are these PR-tinged information nuggets for real? Sure, absolutely. The big tech outfits are able to do anything, maybe not well, but everything. Almost.
The “trusted” real news outfit (Thomson Reuters) published “US Regulators Reject Amended Interconnection Agreement for Amazon Data Center.” The story reports this allegedly accurate information:
U.S. energy regulators rejected an amended interconnection agreement for an Amazon data center connected directly to a nuclear power plant in Pennsylvania, a filing showed on Friday. Members of the Federal Energy Regulatory Commission said the agreement to increase the capacity of the data center located on the site of Talen Energy’s Susquehanna nuclear generating facility could raise power bills for the public and affect the grid’s reliability.
Amazon was not inventing a functional modular nuclear reactor using the better option, thorium. No. Amazon just wanted to run a few of those innocuous high-voltage transmission lines, plug in a converter readily available from one of Amazon’s third-party merchants, and power up a data center chock full of dolphin-loving servers, storage devices, and other gizmos. What’s the big deal?
The write up does not explain what “reliability” and “national security” mean. Let’s just accept these as words which roughly translate to “unlikely.”
Is this an issue that will go away? My view is, “No.” Nuclear engineers are not widely represented among the technical professionals engaged in selling third-party vendors’ products, figuring out how to make Alexa into a barn burner of a product, or forcing Kindle users to smash their devices in frustration when trying to figure out what’s on their Kindle and what’s in Amazon’s increasingly bizarro cloud system.
Can these companies become nuclear adepts? Sure. Will that happen quickly? Nope. Why? Nuclear is a specialized field and involves a number of quite specific scientific disciplines. But Amazon can always ask Alexa and point to its Ring doorbell system as the solution to security concerns. The approach will impress regulatory authorities.
Stephen E Arnold, November 11, 2024
Consistency Manifested by Mr. Musk and the Delightfully Named X.com
September 25, 2024
This essay is the work of a dumb dinobaby. No smart software required.
You know how to build credibility: Be consistent, be sort of nice, be organized. I found a great example of what might be called anti-credibility in “Elon Rehires lawyers in Brazil, Removes Accounts He Insisted He Wouldn’t Remove.” The write up says:
Elon Musk fought the Brazilian law, and it looks like the Brazilian law won. After making a big show of how he was supposedly standing up for free speech, Elon caved yet again.
The article interprets the show of inconsistency and the abrupt about face this way:
So, all of this sounds like Elon potentially realizing that he did his “oh, look at me, I’m a free speech absolutist” schtick, it caused ExTwitter to lose a large chunk of its userbase, and now he’s back to playing ball again. Because, like so much that he’s done since taking over Twitter, he had no actual plan to deal with these kinds of demands from countries.
I agree, but I think the action illustrates a very significant point about Mr. Musk and possibly sheds light on how other US tech giants who get in regulatory trouble and lose customers will behave. Specifically, they knock off the master of the universe attitude and adopt the “scratch my belly” demeanor of a French bulldog wanting to be liked.
The failure to apply sanctions on companies which willfully violate a nation state’s laws has been one key to the rise of the alleged monopolies spawned in the US. Once a country takes action, the trilling from the French bulldog signals a behavioral change.
Now flip this around. Why do some regulators have an active dislike for some US high technology firms? The lack of respect for the law and the attitude of US super moguls might help answer the question.
I am certain many government officials find the delightfully named X.com and the mercurial Mr. Musk a topic of conversation. No wonder some folks love X.com so darned much. The approach used in Brazil and France hopefully signals consequences for those outfits who believe no mere nation state can do anything significant.
Stephen E Arnold, September 25, 2024
When Egos Collide in Brazil
September 10, 2024
Why the Supreme Federal Court of Brazil Has Suspended X
It all started when Brazilian Supreme Court judge Alexandre de Moraes issued a court order requiring X to block certain accounts for spewing misinformation and hate speech. Notably, these accounts belonged to right-wing supporters of former Brazilian President Jair Bolsonaro. After taking his ball and going home, Musk responded with some misinformation and hate speech of his own. He published some insulting AI-generated images of de Moraes, because apparently that is a thing he does now. He has also blatantly refused to pay the fines and appoint the legal representative required by the court. Musk’s tantrums would be laughable if his colossal immaturity were not matched by his dangerous wealth and influence.
But De Moraes seems to be up for the fight. The judge has now added Musk to an ongoing investigation into the spread of fake news and has launched a separate probe into the mogul for obstruction of justice and incitement to crime. We turn to Brazil’s Globo for de Moraes’ perspective in the article, “Por Unanimidade, 1a Turma do STF Mantém X Suspenso No Brasil.” Or in English, “Unanimously, 1st Court of the Supreme Federal Court Maintains X Suspension in Brazil.” Reporter Márcio Falcão writes (in Google Translate’s interpretation):
“Moraes also affirmed that Elon Musk confuses freedom of expression with a nonexistent freedom of aggression and deliberately confuses censorship with the constitutional prohibition of hate speech and incitement to antidemocratic acts. The minister said that ‘the criminal instrumentalization of various social networks, especially network X, is also being investigated in other countries.’ I quote an excerpt from the opinion of Attorney General Paulo Gonet, who agrees with the decision to suspend In this sixth edition. Alexandre de Moraes also affirmed that there have been ‘repeated, conscious, and voluntary failures to comply with judicial orders and non-implementation of daily fines applied, in addition to attempts not to submit to the Brazilian legal system and Judiciary, to ‘Instituting an environment of total impunity and ‘terra sem lei’ [‘lawless land’] in Brazilian social networks, including during the 2024 municipal elections.’”
“A nonexistent freedom of aggression” is a particularly good burn. Chef’s kiss. The article also shares viewpoints from the four other judges who joined de Moraes to suspend X. The court also voted to impose huge fines for any Brazilians who continue to access the platform through a VPN, though The Federal Council of Advocates of Brazil asked de Moraes to reconsider that measure. (Here’s Google’s translation of that piece.) What will be next in this dramatic standoff? And what precedent(s) will be set?
Cynthia Murrell, September 10, 2024
Stop Indexing! And Pay Up!
July 17, 2024
This essay is the work of a dinobaby. Unlike some folks, no smart software improved my native ineptness.
I read “Apple, Nvidia, Anthropic Used Thousands of Swiped YouTube Videos to Train AI.” The write up appears in two online publications, presumably to make an already contentious subject more clicky. The assertion in the title is the equivalent of someone in Salem, Massachusetts, pointing at a widow and saying, “She’s a witch.” Those willing to take the statement at face value would take action, as the “trials” held in colonial Massachusetts demonstrated. My high school history teacher was a witchcraft trial buff. (I think his name was Elmer Skaggs.) I thought about his descriptions of the events. I recall his graphic depictions and analysis of what I remember as “dunking.” The idea was that if a person was a witch, then that person could survive being immersed one or more times. I think the idea had been popular in medieval Europe, but it was not a New World innovation. Me-too is a core way to create novelty. A witch could survive being immersed for a period of time. With that proof, hanging or burning was the next step. The accused who died was obviously not a witch. That’s Boolean logic in a pure form, in my opinion.
The Library in Alexandria burns in front of people who wanted to look up information, learn, and create more information. Tough. Once the cultural institution is gone, just figure out the square root of two yourself. Thanks, MSFT Copilot. Good enough.
The accusations and evidence in the article depict companies building large language models as candidates for a test to prove that they have engaged in an improper act. The crime is processing content available on a public network, indexing it, and using the data to create outputs. Since the late 1960s, digitizing information and making it more easily accessible has been perceived as an important and necessary activity. The US government supported indexing and searching of technical information. Other fields of endeavor recognized that as the volume of information expanded, the traditional method of sitting at a table, reading a book or journal article, making notes, analyzing the information, and then conducting additional research or writing a technical report was simply not fast enough. What worked in a medieval library was not a method suited to putting a satellite in orbit or performing other knowledge-value tasks.
Thus, online became a thing. Remember, we are talking punched cards, mainframes, and clunky line printers; then one day there was the Internet. The interest in broader access to online information grew, and by 1985, people recognized that online access was useful for many tasks, not just looking up information about nuclear power technologies, a project I worked on in the 1970s. Flash forward 50 years, and we are upon the moment one can read about the “fact” that Apple, Nvidia, and Anthropic used thousands of swiped YouTube videos to train AI.
The write up says:
AI companies are generally secretive about their sources of training data, but an investigation by Proof News found some of the wealthiest AI companies in the world have used material from thousands of YouTube videos to train AI. Companies did so despite YouTube’s rules against harvesting materials from the platform without permission. Our investigation found that subtitles from 173,536 YouTube videos, siphoned from more than 48,000 channels, were used by Silicon Valley heavyweights, including Anthropic, Nvidia, Apple, and Salesforce.
I understand the surprise some experience when they learn that a software script visits a Web site, processes its content, and generates an index. (A buzzy term today is large language model, but I prefer the simpler word index.)
I want to point out that for decades those engaged in making information findable and accessible online have processed content so that a user can enter a query and get a list of indexed items which match that user’s query. In the old days, one used Boolean logic which we met a few moments ago. Today a user’s query (the jazzy term is prompt now) is expanded, interpreted, matched to the user’s “preferences”, and a result generated. I like lists of items like the entries I used to make on a notecard when I was a high school debate team member. Others want little essays suitable for a class assignment on the Salem witchcraft trials in Mr. Skaggs’s class. Today another system can pass a query, get outputs, and then take another action. This is described by the in-crowd as workflow orchestration. Others call it, “taking a human’s job.”
My point is that for decades, the index and searching process has been without much innovation. Sure, software scripts can know when to enter a user name and password or capture information from Web pages that are transitory, disappearing in the blink of an eye. But it is still indexing over a network. The object remains to find information of utility to the user or another system.
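To make that decades-old mechanism concrete, here is a minimal sketch in Python of the pattern the essay describes: build an inverted index that maps each term to the documents containing it, then answer a Boolean AND query by intersecting the postings. The three-document corpus is invented for illustration; real systems layer stemming, ranking, and query expansion on top, and an LLM swaps the index for learned parameters, but the plumbing serves the same purpose of matching a question to stored content.

```python
from collections import defaultdict

# Invented toy corpus for illustration: document id -> text.
DOCS = {
    1: "stripe alternatives for saas billing",
    2: "nuclear reactor alloy specifications",
    3: "saas billing and payment processors",
}

def build_index(docs):
    """Map each term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.split():
            index[term].add(doc_id)
    return index

def boolean_and(index, terms):
    """Classic Boolean AND: documents containing every query term."""
    postings = [index.get(t, set()) for t in terms]
    return set.intersection(*postings) if postings else set()

index = build_index(DOCS)
print(boolean_and(index, ["saas", "billing"]))  # -> {1, 3}
```

That is all an index is: a map from terms to locations. The argument over LLM training is, at bottom, an argument over whether building such a map from public content is still permissible.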
The write up reports:
Proof News contributor Alex Reisner obtained a copy of Books3, another Pile dataset and last year published a piece in The Atlantic reporting his finding that more than 180,000 books, including those written by Margaret Atwood, Michael Pollan, and Zadie Smith, had been lifted. Many authors have since sued AI companies for the unauthorized use of their work and alleged copyright violations. Similar cases have since snowballed, and the platform hosting Books3 has taken it down. In response to the suits, defendants such as Meta, OpenAI, and Bloomberg have argued their actions constitute fair use. A case against EleutherAI, which originally scraped the books and made them public, was voluntarily dismissed by the plaintiffs. Litigation in remaining cases remains in the early stages, leaving the questions surrounding permission and payment unresolved. The Pile has since been removed from its official download site, but it’s still available on file sharing services.
The passage does a good job of making clear that most people are not aware of what indexing does, how it works, and why the process has become a fundamental component of many, many modern knowledge-centric systems. The idea is to find information of value to a person with a question, present relevant content, and enable the user to think new thoughts or write another essay about dead witches being innocent.
The challenge today is that anyone who has written anything wants money. The way online works is that for any single user’s query, the useful information constitutes a tiny, minuscule fraction of the information in the index. The cost of indexing and responding to the query is high, and those costs are difficult to control.
But everyone has to be paid for the information that individual “created.” I understand the idea, but the reality is that the reason indexing, search, and retrieval were invented, refined, and given numerous life extensions was to perform a core function: answer a question or enable learning.
The write up makes it clear that “AI companies” are witches. The US legal system is going to determine who is a witch just like the process in colonial Salem. Several observations are warranted:
- A fundamental mechanism for information retrieval may be difficult to replace or re-invent in a quick, cost-efficient, and satisfactory manner. Digital information is loosey goosey; that is, it moves, slips, and slides either by an individual’s actions or a mindless system’s.
- Slapping fines and big price tags on what remains an access service will take time to have an impact. As the implications of the impact become better known to those who are aggrieved, they may find that their own information is altered in a fundamental way. How many research papers are “original”? How many journalists recycle as a basic work task? How many children’s lives are lost when the medical reference system does not have the data needed to treat the kid’s problem?
- Accusing companies of behaving improperly is definitely easy to do. Many companies do ignore rules, regulations, and cultural norms. Engineering Index’s publisher learned that bootleg copies of printed Compendex indexes were available in China. What was Engineering Index going to do when I learned this almost 50 years ago? The answer was give speeches, complain to those who knew what the heck a Compendex was, and talk to lawyers. What happened to the Chinese content pirates? Not much.
I do understand the anger the essay expresses toward large companies doing indexing. These outfits are, to some, witches. However, if the indexing of content is derailed, I would suggest there are downstream consequences. Some of those consequences will make zero difference to anyone. A government worker at a national lab won’t be able to find details of an alloy used in a nuclear device. Who cares? Make some phone calls? Ask around. Yeah, that will work until the information is needed immediately.
A student accustomed to looking up information on a mobile phone won’t be able to find something. The document is a 404 or the information returned is an ad for a Temu product. So what? The kid will have to go to the library, which one hopes will be funded, have printed material or commercial online databases, and a librarian on duty. (Good luck, traditional researchers.) A marketing team eager to get information about the number of Telegram users in Ukraine won’t be able to find it. The fix is to hire a consultant and hope those bright men and women have a way to get a number, a single number, good, bad, or indifferent.
My concern is that as the intensity of the objections about a standard procedure for building an index escalates, the entire knowledge environment is put at risk. I have worked in online since 1962. That’s a long time. It is amazing to me that the plumbing of an information economy has been ignored for so long. What happens when the companies doing the indexing go away? What happens when those producing the government reports, the blog posts, or the “real” news cannot find the information needed to create information? And once some information is created, how is another person going to find it? Ask an eighth grader how to use an online catalog to find a fungible book. Let me know what you learn. Better yet, do you know how to use a Remac card retrieval system?
The present concern about information access troubles me. There are mechanisms to deal with online. But the reason content is digitized is to find it, to enable understanding, and to create new information. Digital information is like gerbils. Start with a couple of journal articles, and one ends up with more journal articles. Kill this access and you get what you wanted. You know exactly who the Salem witch is.
Stephen E Arnold, July 17, 2024
Can the Bezos Bulldozer Crush Temu, Shein, Regulators, and AI?
June 27, 2024
This essay is the work of a dumb dinobaby. No smart software required.
The question, to be fair, should be, “Can the Bezos-less bulldozer crush Temu, Shein, Regulators, Subscriptions to Alexa, and AI?” The article, which appeared in the “real” news online service Venture Beat, presents an argument suggesting that the answer is, “Yes! Absolutely.”
Thanks, MSFT Copilot. Good bulldozer.
The write up “AWS AI Takeover: 5 Cloud-Winning Plays They’re [sic] Using to Dominate the Market” depends upon an Amazon Big Dog named Matt Wood, VP of AI products at AWS. The article strikes me as something drafted by a small group at Amazon and then polished to PR perfection. The reasons the bulldozer will crush Google, Microsoft, Hewlett Packard’s on-premises play, and the keep-on-searching IBM Watson, among others, are:
- Covering the numbers or logos of the AI companies in the “game”; for example, Anthropic, AI21 Labs, and other whale players
- Hitting up its partners, customers, and friends to get support for the Amazon AI wonderfulness
- Engineering AI to be itty bitty pieces one can use to build a giant AI solution capable of dominating D&B industry sectors like banking, energy, commodities, and any other multi-billion sector one cares to name
- Skipping the Google folly of dealing with consumers. Amazon wants the really big contracts with really big companies, government agencies, and non-governmental organizations.
- Amazon is just better at security. Those leaky S3 buckets are not Amazon’s problem. The customers failed to use Amazon’s stellar security tools.
Did these five points convince you?
If you did not embrace the spirit of the bulldozer, the Venture Beat article states:
Make no mistake, fellow nerds. AWS is playing a long game here. They’re not interested in winning the next AI benchmark or topping the leaderboard in the latest Kaggle competition. They’re building the platform that will power the AI applications of tomorrow, and they plan to power all of them. AWS isn’t just building the infrastructure, they’re becoming the operating system for AI itself.
Convinced yet? Well, okay. I am not on the bulldozer yet. I do hear its engine roaring, and I smell the no-longer-green emissions from the bulldozer’s data centers. Also, I am not sure Google, IBM, and Microsoft are ready to roll over and let the bulldozer crush them into the former rain forest’s red soil. I recall researching Sagemaker, which had some AI-type jargon applied to that “smart” service. Ah, you don’t know Sagemaker? Yeah. Too bad.
The rather positive, Amazon-leaning write up points out that, as nifty as those five points about Amazon’s supremacy in the AI jungle are, the company also has vision. Okay, it is not the customer-first idea from 1998 or so. But it is interesting. Amazon will have infrastructure. Amazon will provide model access (I want to ask, “For how long?” but I won’t), and Amazon will have app development.
The article includes a table providing detail about these three legs of the stool in the bulldozer’s cabin. There is also a rundown of Amazon’s recent media and prospect-directed announcements. Too bad the article does not include hyperlinks to these documents. Oh, well.
And after about 3,300 words about Amazon, the article includes about 260 words about Microsoft and Google. That’s a good balance. Too bad IBM. You did not make the cut. And HP? Nope. You did not get an “Also participated” certificate.
Net net: Quite a document. And no mention of Sagemaker. The Bezos-less bulldozer just smashes forward. Success is in crushing. Keep at it. And that “they” in the Venture Beat article title: Shouldn’t “they” be an “it”?
Stephen E Arnold, June 27, 2024
Generative AI and College Application Essays: College Presidents Cheat Too
February 19, 2024
This essay is the work of a dumb dinobaby. No smart software required.
The first college application season since ChatGPT hit it big is in full swing. How are admissions departments coping with essays that may or may not have been written with AI? It depends on which college one asks. Forbes describes various policies in, “Did You Use ChatGPT on your School Applications? These Words May Tip Off Admissions.” The paper asked over 20 public and private schools about the issue. Many dared not reveal their practices: as a spokesperson for Emory put it, “it’s too soon for our admissions folks to offer any clear observations.” But the academic calendar will not wait for clarity, so schools must navigate these murky waters as best they can.
Reporters Rashi Shrivastava and Alexandra S. Levine describe the responses they did receive. From “zero tolerance” policies to a little wiggle room, approaches vary widely. Though most refused to reveal whether they use AI detection software, a few specified they do not. A wise choice at this early stage. See the article for details from school to school.
Shrivastava and Levine share a few words considered most suspicious: Tapestry. Beacon. Comprehensive curriculum. Esteemed faculty. Vibrant academic community. Gee, I think I used one or two of those on my college essays, and I wrote them before the World Wide Web even existed. On a typewriter. (Yes, I am ancient.) Will earnest, if unoriginal, students who never touched AI get caught up in the dragnets? At least one admissions official seems confident they can tell the difference. We learn:
“Ben Toll, the dean of undergraduate admissions at George Washington University, explained just how easy it is for admissions officers to sniff out AI-written applications. ‘When you’ve read thousands of essays over the years, AI-influenced essays stick out,’ Toll told Forbes. ‘They may not raise flags to the casual reader, but from the standpoint of an admissions application review, they are often ineffective and a missed opportunity by the student.’ In fact, GWU’s admissions staff trained this year on sample essays that included one penned with the assistance of ChatGPT, Toll said—and it took less than a minute for a committee member to spot it. The words were ‘thin, hollow, and flat,’ he said. ‘While the essay filled the page and responded to the prompt, it didn’t give the admissions team any information to help move the application towards an admit decision.’”
That may be the key point here—even if an admissions worker fails to catch an AI-generated essay, they may reject it for being just plain bad. Students would be wise to write their own essays rather than leave their fates in algorithmic hands. As Toll put it:
“By the time a student is filling out their application, most of the materials will have already been solidified. The applicants can’t change their grades. They can’t go back in time and change the activities they’ve been involved in. But the essay is the one place they remain in control until the minute they press submit on the application. I want students to understand how much we value getting to know them through their writing and how tools like generative AI end up stripping their voice from their admission application.”
Disqualified or underwhelming—either way, relying on AI to write one’s application essay could spell rejection. Best to buckle down and write it the old-fashioned way. (But one can skip the typewriter.)
Cynthia Murrell, February 19, 2024
Problematic Smart Algorithms
December 12, 2023
This essay is the work of a dumb dinobaby. No smart software required.
We already know that AI is fundamentally biased if it is trained with bad or polluted data models. Most of these biases are unintentional, due to ignorance on the part of the developers, i.e., a lack of diverse or vetted information. In order to improve the quality of AI, developers are relying on educated humans to help shape the data models. Not all of the AI projects are looking to fix their polluted data, and ZD Net says it’s going to be a huge problem: “Algorithms Soon Will Run Your Life-And Ruin It, If Trained Incorrectly.”
Our lives are saturated with technology that has incorporated AI. Everything from an application used on a smartphone to a digital assistant like Alexa or Siri uses AI. The article tells us about another type of biased data, and it’s due to an ironic problem. The science team of Aparna Balagopalan, David Madras, David H. Yang, Dylan Hadfield-Menell, Gillian Hadfield, and Marzyeh Ghassemi worked on an AI project that studied how AI algorithms justified their predictions. The data model contained information from human respondents who provided different responses when asked to give descriptive or normative labels for data.
Descriptive data concentrates on hard facts while normative data focuses on value judgments. The team noticed the pattern, so they conducted another experiment with four data sets to test different policies. The study asked the respondents to judge an apartment complex’s policy about aggressive dogs against images of canines with normative or descriptive tags. The results were astounding and scary:
"The descriptive labelers were asked to decide whether certain factual features were present or not – such as whether the dog was aggressive or unkempt. If the answer was "yes," then the rule was essentially violated — but the participants had no idea that this rule existed when weighing in and therefore weren’t aware that their answer would eject a hapless canine from the apartment.
Meanwhile, another group of normative labelers were told about the policy prohibiting aggressive dogs, and then asked to stand judgment on each image.
It turns out that humans are far less likely to label an object as a violation when aware of a rule and much more likely to register a dog as aggressive (albeit unknowingly ) when asked to label things descriptively.
The difference wasn’t by a small margin either. Descriptive labelers (those who didn’t know the apartment rule but were asked to weigh in on aggressiveness) had unwittingly condemned 20% more dogs to doggy jail than those who were asked if the same image of the pooch broke the apartment rule or not.”
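The arithmetic behind that margin is easy to reproduce. Here is a toy sketch, using invented counts rather than the study’s actual data, of how the same images yield two different violation rates depending only on which question the labelers were asked:

```python
# Hypothetical counts for the same 100 dog images, labeled two ways.
# These numbers are invented to mirror the reported 20-point gap.
descriptive_flagged = 45  # asked "is the dog aggressive?" and said yes
normative_flagged = 25    # asked "does this image violate the policy?" and said yes
total_images = 100

descriptive_rate = descriptive_flagged / total_images  # 0.45
normative_rate = normative_flagged / total_images      # 0.25

# A model trained on descriptive labels inherits the higher rate:
# 20 more dogs per 100 end up in "doggy jail."
print(f"gap: {descriptive_rate - normative_rate:.0%}")  # gap: 20%
```

The takeaway for developers is that the labeling instructions, not just the raw data, get baked into the model.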
The conclusion is that AI developers need to spread the word about this problem and find solutions. Or this could be another fear-mongering tactic like the Y2K implosion. What happened with that? Nothing. Yes, this is a problem, but it will probably be solved before society meets its end.
Whitney Grace, December 12, 2023
Amazon Offers AI-Powered Review Consolidation for Busy Shoppers
September 6, 2023
I read the reviews for a product. I bought the product. Reality was — how shall I frame it — different from the word pictures. Trust those reviews? Hmmm. So far, Amazon’s generative AI focus has been on supplying services to developers on its AWS platform. Now, reports ABC News, “Amazon Is Rolling Out a Generative AI Feature that Summarizes Product Reviews.” Writer Haleluya Hadero tells us:
“The feature, which the company began testing earlier this year, is designed to help shoppers determine at a glance what other customers said about a product before they spend time reading through individual reviews. It will pick out common themes and summarize them in a short paragraph on the product detail page.”
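Amazon has not said how the summarizer is built. As a hedged illustration only, the “pick out common themes” step could in principle start with something as simple as counting recurring phrases across a product’s reviews before a language model writes the summary paragraph. The reviews, helper names, and output below are all invented for this sketch:

```python
from collections import Counter
from itertools import islice

# Invented reviews; a real pipeline would pull these from the product page.
reviews = [
    "battery life is great but the strap broke",
    "great battery life, comfortable strap",
    "strap broke after a week, battery life is great",
]

def bigrams(text):
    """Yield adjacent two-word phrases from a review."""
    words = text.replace(",", "").split()
    return zip(words, islice(words, 1, None))

# Count two-word phrases across all reviews and keep the recurring ones.
counts = Counter(phrase for r in reviews for phrase in bigrams(r))
themes = [" ".join(p) for p, n in counts.most_common() if n > 1]
print(themes)  # e.g. ['battery life', 'life is', 'is great', 'strap broke']
```

Whatever Amazon actually uses, the principle holds: the summary can only be as honest as the reviews feeding it.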
A few mobile shoppers have early access to the algorithmic summaries while Amazon tweaks the tool with user feedback. Eventually, the company said, shoppers will be able to surface common themes in reviews. Sounds nifty, but there is one problem: Consolidating reviews that are fake, generated by paid shills, or just plain wrong does nothing to improve their accuracy. But Amazon is more eager to jump on the AI bandwagon than to perform quality control on its reviews system. We learn:
“The Seattle-based company has been looking for ways to integrate more artificial intelligence into its product offerings as the generative AI race heats up among tech companies. Amazon hasn’t released its own high-profile AI chatbot or imaging tool. Instead, it’s been focusing on services that will allow developers to build their own generative AI tools on its cloud infrastructure AWS. Earlier this year, Amazon CEO Andy Jassy said in his letter to shareholders that generative AI will be a ‘big deal’ for the company. He also said during an earnings call with investors last week that ‘every single one’ of Amazon’s businesses currently has multiple generative AI initiatives underway, including its devices unit, which works on products like the voice assistant Alexa.”
Perhaps one day Alexa will recite custom poetry or paint family portraits for us based on the eavesdropping she’s done over the years. Heartwarming. One day, sure.
Cynthia Murrell, September 6, 2023