A Xoogler May Question the Google about Responsible and Ethical Smart Software

December 2, 2021

Write a research paper. Get colleagues to provide input. Well, ask colleagues to do that work and what do you get? How about “Looks good.” Or “Add more zing to that chart.” Or “I’m snowed under so it will be a while but I will review it…” Then the paper wends its way to publication and a senior manager type reads the paper on a flight from one whiz kid town to another whiz kid town and says, “This is bad. Really bad because the paper points out that we fiddle with the outputs. And what we set up is biased to generate the most money possible from clueless humans under our span of control.” Finally, the paper is blocked from publication and the offending PhD is fired or sent signals that their future lies elsewhere.


Will this be a classic arm wrestling match? The winner may control quite a bit of conceptual territory along with knobs and dials to shape information.

Could this happen? Oh, yeah.

“Ex Googler Timnit Gebru Starts Her Own AI Research Center” documents the next step, which may mean that some wizards’ undergarments will be sprayed with eau de poison oak for months, maybe years. Here’s one of the statements from the Wired article:

“Instead of fighting from the inside, I want to show a model for an independent institution with a different set of incentive structures,” says Gebru, who is founder and executive director of Distributed Artificial Intelligence Research (DAIR). The first part of the name is a reference to her aim to be more inclusive than most AI labs—which skew white, Western, and male—and to recruit people from parts of the world rarely represented in the tech industry. Gebru was ejected from Google after clashing with bosses over a research paper urging caution with new text-processing technology enthusiastically adopted by Google and other tech companies.

The main idea, which Wired and Dr. Gebru delicately sidestep, is that there are allegations of an artificial intelligence or machine learning cabal drifting around some conference hall chatter. On one side is the push for what I call the SAIL approach. This cost-effective, speedy, and clever shortcut approach is illustrated in some of the work of Dr. Christopher Ré, the captain of the objective craft SAIL. Oh, is the acronym unfamiliar to you? SAIL is short for Stanford Artificial Intelligence Laboratory. SAIL fits on the Snorkel content diving gear I think.
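For readers unfamiliar with the Snorkel shortcut, the core idea is weak supervision: instead of paying humans to label training data, one writes cheap heuristic rules and combines their noisy votes into labels. Below is a minimal sketch of that idea; the spam/ham task and the rules are invented for illustration and are not Snorkel’s actual API.

```python
# Minimal sketch of weak supervision via labeling functions (hypothetical
# rules; not Snorkel's actual API). Each rule votes SPAM (1), HAM (0),
# or abstains (-1); non-abstaining votes are combined by simple majority.

ABSTAIN, HAM, SPAM = -1, 0, 1

def lf_contains_free(text):
    return SPAM if "free" in text.lower() else ABSTAIN

def lf_contains_meeting(text):
    return HAM if "meeting" in text.lower() else ABSTAIN

def lf_many_exclamations(text):
    return SPAM if text.count("!") >= 3 else ABSTAIN

LABELING_FUNCTIONS = [lf_contains_free, lf_contains_meeting, lf_many_exclamations]

def weak_label(text):
    votes = [lf(text) for lf in LABELING_FUNCTIONS]
    votes = [v for v in votes if v != ABSTAIN]
    if not votes:
        return ABSTAIN  # no rule fired; the item stays unlabeled
    return SPAM if votes.count(SPAM) >= votes.count(HAM) else HAM

print(weak_label("FREE prizes!!! Click now!"))  # 1 (spam, by two votes)
```

The appeal is obvious: labels arrive at machine speed and near-zero cost. The catch is that the labels are only as good as the heuristics, which is exactly where the good-enough-versus-precise tension comes from.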

On the other side of the ocean are Dr. Timnit Gebru’s fellow travelers. The difference is that Dr. Gebru believes that smart software should not reflect the wit, wisdom, biases, and general bro-ness of the high school science club culture. This culture, in my opinion, has contributed to the fraying of the social fabric in the US, caused harm, and erodes behaviors that are supposed to be subordinated to “just what people do to make a social system function smoothly.”

Does the Wired write up identify the alleged cabal? Nope.

Does the write up explain that the Ré / Snorkel methods sacrifice some precision in the rush to generate good enough outputs? (Good enough can be framed in terms of ad revenue, reduced costs, and faster time to market testing in my opinion.) Nope.

Does Dr. Gebru explain how insidious the short cut training of models is and how it will create systems which actively harm those outside the 60 percent threshold of certain statistical yardsticks? Heck, no.

Hopefully some bright researchers will explain what’s happening with a “deep dive”? Oh, right, Deep Dive is the name of a content access company which uses Dr. Ré’s methods. Ho, ho, ho. You didn’t know?

Beyond Search believes that Dr. Gebru has important contributions to make to applied smart software. Just hurry up already.

Stephen E Arnold, December 2, 2021

Want High-Value Traffic? Buy Ads

November 15, 2021

I read a number of stories pivoting on “Apple Quietly Buying Ads Via Google For High-Value Subscription Apps To Capture App Publisher Revenue.” The main idea is that someone — maybe Apple — is buying ads for hot services; for example, Bumble and HBO. The user clicks on the ad and is promptly delivered to the Apple App Store. Who is buying the ads? The information is not the type that convinces me that Apple is punching buttons directly.

The main point for me is that “organic” traffic and the baloney about search engine optimization is filled with ground up pine cones. If Apple can’t generate “organic” traffic from its walled garden orchard, who can?

Answer: Those with the money to buy ads and make them look as if the Bumble and HBO wizards were behind the ads. The customer gets conveyed to the Apple store where those tasty fruits are marked up.

How does one find “objective” ads? The same way one finds “objective” information. One doesn’t unless one invests considerable time and money in research and analysis.

Disinformation, misinformation, and reformation are the currency of success perhaps?

Stephen E Arnold, November 15, 2021

Learning about Advertising Executives: A Google Lesson

October 27, 2021

I spotted a story about Google’s systems and methods for capturing advertising revenue. “Ad Execs Dismayed, But Not Surprised by Tactics Google Allegedly Used to Control Digital Ad Dollars.” The information about Google was not particularly interesting. The company has been operating in ways which make it difficult for those who just love free services and the Googley glitz to discern what’s shakin’ in the management meetings.

The write up states one point which I found intriguing:

Trade bodies are quiet while industry insiders shrug as if to say “what did you expect.” They’ve long accepted the harsh truths of online advertising in the platform era.

Notice the words “insiders,” “shrug,” “harsh truths,” and “platform.”

I interpreted these two sentences to suggest ad execs know the game is rigged. Why, pray tell? Commissions, the value of being Google certified, and getting the insider scoop on opportunities to help ad execs’ customers sell their products (at least one hopes something besides ad inventory sells).

This article adds little to the Google ad lore, but it says quite a bit about the brokers or facilitators of ad sales.

Commissions, consulting fees, and the lure of search engine optimization runways to for-fee Google ads — yep, the ad execs are in the game.

Perhaps the hot topic of ad fraud will be discussed? Perhaps not?

Stephen E Arnold, October 27, 2021

A Sporty Allegation: One Person Is Two on the Zuckmetabook Thing?

October 25, 2021

If you are interested in an “indie hacker’s” view of Zuckbook (ooops, sorry, I meant Facebook), you will want to read “Facebook Will Count One Person as Two on Its Platform.” I found the write up interesting. Darko has a way with words.

Here’s the statement from the Zuckbook which caught his attention:

Starting today, if someone does not have their Facebook and Instagram accounts linked in Accounts Center, we will consider those accounts as separate people for ads planning and measurement.

Darko then clarifies this corporate Zuck speak:

Essentially, Facebook will count one person as two on its platform for advertisers, unless the users have explicitly linked their accounts in “Account Center”. [Emphasis in the original text]

The write up identifies other murkiness; for example, the machinations of the “Account Center” and how the Zuckbook presents some ad effectiveness data.

Darko points out that the Zuckbook may be doing the Darwin adaptation to the Tim Apple privacy play. Plus, Zuckbook ad rates are “skyrocketing” to use Darko’s term.

What’s the impact of the Zuckbook’s new ad finery? Darko says:

Fortunately, there are new channels that are emerging and some founders already started having success with them. These recent interviews I did on using TikTok influencers to grow a SaaS and using Reddit outreach are just some examples. Decentralized social networking is also on the way, according to people like Naval, and is just waiting for its Satoshi moment.

I think I understand. Bad news for the Zuckbook. Maybe.

Stephen E Arnold, October 25, 2021

Useless Search Results? Thank Advertising

September 17, 2021

We thought this was obvious. The Conversation declares, “Google’s ‘Pay-Per-Click’ Ad Model Makes it Harder to Find What You’re Looking For.” Writers Mohiuddin Ahmed and Paul Haskell-Dowland begin by pointing out “to google” has literally become synonymous with searching online via any online search platform. Indeed, Google has handily dominated the online search business, burying some competitors and leaving the rest in the dust. Not coincidentally, the company also rules the web browser and online advertising markets. As our dear readers know, Google is facing pushback from competition and antitrust regulators in assorted countries. However, this article addresses the impact on search results themselves. The authors report:

“More than 80% of Alphabet’s revenue comes from Google advertising. At the same time, around 85% of the world’s search engine activity goes through Google. Clearly there is significant commercial advantage in selling advertising while at the same time controlling the results of most web searches undertaken around the globe. This can be seen clearly in search results. Studies have shown internet users are less and less prepared to scroll down the page or spend less time on content below the ‘fold’ (the limit of content on your screen). This makes the space at the top of the search results more and more valuable. In the example below, you might have to scroll three screens down before you find actual search results rather than paid promotions. While Google (and indeed many users) might argue that the results are still helpful and save time, it’s clear the design of the page and the prominence given to paid adverts will influence behavior. All of this is reinforced by the use of a pay-per-click advertising model which is founded on enticing users to click on adverts.”

We are reminded Google-owned YouTube is another important source of information for billions of users, and it is perhaps the leading platform for online ads. In fact, these ads now intrude on videos at a truly annoying rate. Unless one pays for a Premium subscription, of course. Ahmed and Haskell-Dowland remind us alternatives to Google Search exist, with the usual emphasis on privacy-centric DuckDuckGo. They conclude by pointing out other influential areas in which Google plays a lead role: AI, healthcare, autonomous vehicles, cloud computing, computing devices, and the Internet of Things. Is Google poised to take over the world? Why not?

Cynthia Murrell, September 17, 2021

Ad-Rich User Tracking GPS Maps: Can You Be Guided Off a Cliff?

September 3, 2021

It is easy to go on autopilot when using a GPS device like Waze or Google Maps, but never Apple Maps because it is never accurate. Unfortunately, even trusted apps are not error free, and Auto Evolution explains why in “Don’t Trust Google Maps And Waze, Colorado Officials Say.” While many GPS apps are reliable, they are notorious for containing inaccurate data, especially in rural areas or foreign countries.

GPS horror stories haunt the Internet like old MySpace accounts. Two Russians blindly followed Google Maps to a location, where they lost their signal. They spent the night in frigid temperatures, and one of them died. Other drivers use roads that are meant only for off-road vehicles or tractors. Local authorities end up rescuing these drivers because they are often stranded.

In Colorado, drivers are trusting Waze, Google Maps, and other GPS apps way too much. The Colorado Department of Transportation even issued a statement telling drivers not to use these apps because they could take drivers down dangerous or dead-end roads.

“‘Don’t trust your cell phones, they are really getting people into trouble,’ Amber Barrett, the Eagle County Sheriff’s Office Public Information Officer, has been quoted as saying.

Trusting apps like Google Maps and Waze is a big issue, she said, though, in theory, all these solutions should be updated by map editors or volunteers with accurate data. But of course, no app is bulletproof, yet we wouldn’t go as far as not using these apps at all.”

The department advises people not to take GPS directions as set in stone. If a road appears that it is not meant for regular travel, then do not follow it. The GPS will prompt drivers to make a U-turn or turn around at the next possible place.

Ad supported map makers have an incentive to keep their users from killing themselves by following incorrect directions. Dead people cannot buy advertisers’ products nor can they deliver useful real time data to the map providers.

Whitney Grace, September 3, 2021

Big Data, Algorithmic Bias, and Lots of Numbers Will Fix Everything (and Your Check Is in the Mail)

August 20, 2021

We must remember, “The check is in the mail” and “I will always respect you” and “You can trust me.” Ah, great moments in the University of Life’s chapbook of factoids.

I read “Moving Beyond Algorithmic Bias Is a Data Problem”. I was heartened by the essay. First, the document has a document object identifier and a link to make checking updates easy. Very good. Second, the focus of the write up is the inherent problem of most of the Fancy Dan, baloney-charged big data marketing to which I have been subjected in the last six or seven years. Very, very good.

I noted this statement in the essay:

Why, despite clear evidence to the contrary, does the myth of the impartial model still hold allure for so many within our research community? Algorithms are not impartial, and some design choices are better than others.

Notice the word “myth”. Notice the word “choices.” Yep, so much for the rock solid nature of big data, models, and predictive silliness based on drag-and-drop math functions.

I also starred this important statement by Donald Knuth:

Donald Knuth said that computers do exactly what they are told, no more and no less.

What’s the real world behavior of smart anti-phishing cyber security methods? What about the autonomous technology in some nifty military gear like the Avenger drone?

Google may not be thrilled with the information in this essay nor thrilled about the nailing of the frat bros’ tail to the wall; for example:

The belief that algorithmic bias is a dataset problem invites diffusion of responsibility. It absolves those of us that design and train algorithms from having to care about how our design choices can amplify or curb harm. However, this stance rests on the precarious assumption that bias can be fully addressed in the data pipeline. In a world where our datasets are far from perfect, overall harm is a product of both the data and our model design choices.

Perhaps this explains why certain researchers’ work is not zipping around Silicon Valley at the speed of routine algorithm tweaks? The statement could provide some useful insight into why Facebook does not want pesky researchers at NYU’s Ad Observatory digging into how Facebook manipulates perception and advertisers.

The methods for turning users and advertisers into puppets are not too difficult to figure out. That’s why certain companies obstruct researchers and manufacture baloney, crank up the fog machine, and offer free jargon stew to everyone including researchers. These are the same entities which insist they are not monopolies. Do you believe that these are mom-and-pop shops with a part time mathematician and data wrangler coming in on weekends? Gee, I do.
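The essay’s claim that design choices, not just data, distribute harm is easy to demonstrate with a toy example. All numbers below are invented: the very same scores and labels produce very different subgroup error rates depending on a single design choice, the decision threshold.

```python
# Toy illustration (invented numbers) of the essay's point: with identical
# data, one design choice -- the decision threshold -- shifts how error
# is distributed across groups.

# (group, model_score, true_label): 1 = should be approved
samples = [
    ("A", 0.90, 1), ("A", 0.80, 1), ("A", 0.40, 0),
    ("B", 0.65, 1), ("B", 0.55, 1), ("B", 0.35, 0),
]

def false_negative_rate(group, threshold):
    """Fraction of a group's true positives wrongly rejected at a threshold."""
    positives = [(s, y) for g, s, y in samples if g == group and y == 1]
    misses = sum(1 for s, y in positives if s < threshold)
    return misses / len(positives)

for threshold in (0.5, 0.7):
    fnr_a = false_negative_rate("A", threshold)
    fnr_b = false_negative_rate("B", threshold)
    print(f"threshold={threshold}: group A misses {fnr_a:.0%}, group B misses {fnr_b:.0%}")
```

At a threshold of 0.5 both groups are served equally; raise it to 0.7 and group B is rejected every time while group A is untouched. The data did not change. The design choice did.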

The “Moving beyond” article ends with a snappy quote:

As Lord Kelvin reflected, “If you cannot measure it, you cannot improve it.”

Several observations are warranted:

  1. More thinking about algorithmic bias is helpful. The task is to get people to understand what’s happening and has been happening for decades.
  2. The interaction of math most people don’t understand and very simple objectives like make more money or advance this agenda is a destabilizing force in human behavior. Need an example? The Taliban and its use of WhatsApp is interesting, is it not?
  3. The fix to the problems associated with commercial companies using algorithms as monetary and social weapons requires control. The question is from whom and how.

Stephen E Arnold, August 20, 2021

More Ad-Citement: Juicing Video Piracy

August 13, 2021

I read “Pirated-Entertainment Sites Are Making Billions From Ads.” My immediate reaction: “What? Bastions of ad integrity helping out video pirates? Impossible?”

According to the pay walled write up, the flagships of integrity seem to be unfurling the jib to speed toward this type of revenue. I learned something I did not know and which may be semi-accurate:

Websites and apps featuring pirated movies and TV shows make about $1.3 billion from advertising each year, including from major companies like Amazon.com Inc., according to a study.

The write up noted:

The piracy operations are also a key source of malware, and some ads placed on the sites contain links that hackers use to steal personal information or conduct ransomware attacks…

Some of these video services provide links to interesting online gambling sites as well.

This quote, attributed to the founder of White Bullet (an anti-piracy outfit), is thought provoking:

“Failure to choose tools that assess piracy risk in real-time means advertisers fund criminals – and it’s a billion-dollar problem,” said Peter Szyszko, CEO and Founder of White Bullet, in an email. “At best, this is negligent. At worst, this is deliberate funding of IP crime.”

Just one question: Aren’t filters available to block this type of activity in the ad systems of estimable firms?

Apparently that’s just too darned difficult.

Stephen E Arnold, August 13, 2021

DuckDuckGo Produces Privacy Income

August 10, 2021

DuckDuckGo advertises that it protects user privacy and does not have targeted ads in search results.  Despite its small size, protecting user privacy makes DuckDuckGo a viable alternative to Google.  TechRepublic delves into DuckDuckGo’s profits and how privacy is a big money maker in the article, “How DuckDuckGo Makes Money Selling Search, Not Privacy.”  DuckDuckGo has had profitable margins since 2014 and made over $100 million in 2020.

Google, Bing, and other companies interested in selling personal data say that it is a necessary evil in order for search and other services to work.  DuckDuckGo says that’s not true and the company’s CEO Gabriel Weinberg said:

“It’s actually a big myth that search engines need to track your personal search history to make money or deliver quality search results. Almost all of the money search engines make (including Google) is based on the keywords you type in, without knowing anything about you, including your search history or the seemingly endless amounts of additional data points they have collected about registered and non-registered users alike. In fact, search advertisers buy search ads by bidding on keywords, not people….This keyword-based advertising is our primary business model.”

Weinberg continued that search engines do not need to track as much personal information as they do to personalize customer experiences or make money.  Search engines and other online services could limit the amount of user data they track and still generate a profit.
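Weinberg’s keyword-based model is simple enough to sketch. The advertisers, bids, and the simplified second-price rule below are illustrative inventions; the point is that the only input is the query text, with no user profile anywhere.

```python
# Sketch of keyword-based ad selection as Weinberg describes it: ads are
# matched on the query's words, not on who the user is. Advertisers and
# bids are invented; real auctions also weigh quality scores, budgets, etc.

bids = {
    "mortgage": [("BankCo", 4.50), ("Loanly", 3.75)],
    "sneakers": [("ShoeHub", 1.20)],
}

def select_ad(query):
    """Return (winner, price) for the highest bid on any query word,
    charging the runner-up's bid (a simplified second-price rule)."""
    candidates = []
    for word in query.lower().split():
        for advertiser, bid in bids.get(word, []):
            candidates.append((bid, advertiser))
    if not candidates:
        return None  # no keyword sold; no ad shown
    candidates.sort(reverse=True)
    top_bid, winner = candidates[0]
    price = candidates[1][0] if len(candidates) > 1 else top_bid
    return winner, round(price, 2)

print(select_ad("best mortgage rates"))  # ('BankCo', 3.75)
```

Note what the function never touches: a search history, a cookie, a profile. That is Weinberg’s argument in eighteen lines.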

Google made over $147 billion in 2020, but DuckDuckGo’s $100 million is not a small number either.  DuckDuckGo’s market share is greater than Bing’s and, if limited to the US market, its market share is second to Google.  DuckDuckGo is like the Little Engine That Could.  It is a hard working marketing operation and it keeps chugging along while batting the privacy beach ball along the Madison Avenue sidewalk.

Whitney Grace, August 10, 2021

Google Search: An Intriguing Observation

August 9, 2021

I read “It’s Not SEO: Something Is Fundamentally Broken in Google Search.” I spotted this comment:

Many will remember how remarkably accurate searches were at initial release c. 2017; songs could be found by reciting lyrics, humming melodies, or vaguely describing the thematic or narrative thrust of the song. The picture is very different today. It’s almost impossible to get the system to return even slightly obscure tracks, even if one opens YouTube and reads the title verbatim. 

The idea is that the issue resides within Google’s implementation of search and retrieval. I want to highlight this comment offered in the YCombinator Hacker News thread:

While the old guard in Google’s leadership had a genuine interest in developing a technically superior product, the current leaders are primarily concerned with making money. A well-functioning ranking algorithm is only one small part of the whole. As long as the search engine works well enough for the (money-making) main-stream searches, no one in Google’s leadership perceives a problem.

I have a different view of Google search. Let me offer a handful of observations from my shanty in rural Kentucky.

To begin, the original method for determining precision and recall is like a page of text photocopied, with that copy then photocopied. After a couple of hundred photocopies, the image of the page has degraded. Photocopy for a couple of decades and the document copy is less than helpful. Degradation in search subsystems is inevitable, and it takes place in search as layers or wrappers have been added around systems and methods.
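For readers who have forgotten the antiquated yardsticks, here is what precision and recall actually measure, computed on an invented toy result set:

```python
# Precision and recall for a toy query. "retrieved" is what the engine
# returned; "relevant" is what actually answers the query (invented ids).

retrieved = {"d1", "d2", "d3", "d4"}
relevant = {"d2", "d4", "d7"}

hits = retrieved & relevant
precision = len(hits) / len(retrieved)  # fraction of what came back that is on point
recall = len(hits) / len(relevant)      # fraction of the on-point docs that came back

print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=0.50 recall=0.67
```

Ad-driven engines have little incentive to report either number to users, which is rather the point.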

Second, Google must generate revenue; otherwise, the machine will lose velocity, maybe suffer cash deprivation. The recent spectacular financial payoffs are not directed at what I call “precision and recall search.” What’s happening, in my opinion, is that accelerated relaxation of queries makes it easier to “match” an ad. More — not necessarily more relevant — matching provides quicker depletion of the ad inventory, more revenue, more opportunities for Google sales partners to pitch ads, and more users believing Google results are the cat’s pajamas. To “go back” to antiquated ideas like precision and recall, relevance, and old-school Boolean breaks the money flow, adds costs, and forces distasteful steps for those who want big paydays, bonuses, and the cash to solve death and other childish notions.
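What I mean by “relaxation of queries” can be sketched as progressively dropping terms until something, anything, matches; more matches mean more chances to place an ad. The toy index below is invented:

```python
# Sketch of query relaxation: require all terms first, then drop terms
# until some document -- and thus some ad -- can be matched. The index
# and document ids are invented.

index = {
    "doc1": {"cheap", "flights", "paris"},
    "doc2": {"cheap", "hotels"},
    "doc3": {"flights"},
}

def relaxed_match(query_terms):
    """Return (terms actually used, matching docs), relaxing as needed."""
    terms = list(query_terms)
    while terms:
        hits = [d for d, words in index.items() if set(terms) <= words]
        if hits:
            return terms, hits  # matched at this level of strictness
        terms.pop()  # relax: quietly drop a term the user typed
    return [], list(index)  # fully relaxed: everything "matches"

print(relaxed_match(["cheap", "flights", "oslo"]))
```

A query for cheap flights to Oslo silently becomes a query for cheap flights, and an ad inventory that could not be matched before now can be. The user is never told which of their terms was discarded.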

Third, this comment from Satellite2 is on the money:

Power users as a proportion of Internet’s total user count probably followed an inverted zipf distribution over time. At the begining 100%, then 99, 90%, 9% and now less than one percent. Assuming power users formulate search in ways that are irreconcilable from those of the average user, and assuming Google adapted their models, metrics to the average user and retrained them at each step,then, we are simply no longer a target market of Google.

I interpret this as implying that Google is no longer interested in delivering on point results. I now run the same query across a number of Web search systems and hunt for results which warrant direct inspection. I use, for example, iseek.com, swisscows.ch, yandex.ru, and a handful of other systems.

Net net: The degradation of Google began around 2005 and 2006. In the last 15 years, Google has become a golden goose for some stakeholders. The company’s search systems — where is that universal search baloney, please? — are going to be increasingly difficult to refine so that a user’s query is answered in a user-useful way.

Messrs. Brin and Page bailed, leaving a consultant-like management team. Was there a link between increased legal scrutiny, friskiness in the Google legal department, antics involving hard drugs and death on a Googler’s yacht, and “efficiency oriented” applied technologies which have accelerated the cancer of relevance-free content? Facebook takes bullets for its high school management approach. Google, in my view, may be the pinnacle of the ethos of high school science club activities.

What’s the fix? Maybe a challenger from left field will displace the Google? Maybe a for-fee outfit like Infinity will make it to the big time? Maybe Chinese style censorship will put content squabbles in the basement? Maybe Google will simply become more frustrating to users?

The YouTube search case in the essay in Hacker News is spot on. But Google search — both basic and advanced search — is a service which poses risks to users. Where’s a date sort? A key word search? File type search? A federated search across blogs and news? Yada yada yada.

Like the long-dead dinosaurs, Googzilla is now watching the climate change. Snow is beginning to fall because the knowledge environment is changing. Hello, Darwin!

Stephen E Arnold, August 9, 2021
