Google Stop Words: Close Enough for the Mom and Pop Online Ad Vendor

April 15, 2021

I remember from a statistics lecture, given by a fellow named Dr. Peplow maybe, that fuzziness is one of the main characteristics of statistics. The idea is that a percentage is not a real entity; for example, the average number of lions in a litter is three, give or take a couple, since cubs are magnets for hunters and poachers. Depending upon the data set, the “real” number may be 3.2 cubs in a litter. Who has ever seen a fractional lion? Certainly not me.

Why am I thinking fuzzy? Google is into data. The company collects, counts, and transforms “real” data into actions. Whip in some smart software, and the company has processes which match an advertiser’s need to reach eyeballs with some statistically validated interest in whatever the Mad Ave folks are trying to sell.

“Google Has a Secret Blocklist that Hides YouTube Hate Videos from Advertisers—But It’s Full of Holes” suggests that some of the Google procedures are fuzzy. The uncharitable might suggest that Google wants to get close enough to collect ad money. Horseshoe aficionados use the phrase “close enough for horseshoes” to indicate a toss which scores a point or blocks an opponent’s effort. That seems to be one possible message from The Markup article.

I noted this passage in the essay:

If you want to find YouTube videos related to “KKK” to advertise on, Google Ads will block you. But the company failed to block dozens of other hate and White nationalist terms and slogans, an investigation by The Markup has found. Using a list of 86 hate-related terms we compiled with the help of experts, we discovered that Google uses a blocklist to try to stop advertisers from building YouTube ad campaigns around hate terms. But less than a third of the terms on our list were blocked when we conducted our investigation.

What seems to be happening is that Google’s method for taking a term and then “broadening” it so that related terms are identified is not working. The idea is that related terms with a higher “score” are more directly linked to the original term. Words and phrases with lower “scores” are not closely related. The article uses the example of the term KKK.
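The scoring idea can be illustrated with a toy sketch. Assume hypothetical embedding vectors for each term and an arbitrary similarity threshold; candidates whose score clears the cutoff count as “related.” This is an illustration of threshold-based broadening in general, not Google’s actual method or data:

```python
# Toy sketch of threshold-based term broadening: candidates whose
# similarity score to a seed term clears a cutoff are treated as
# "related." The vectors and the 0.8 threshold are invented for
# illustration.
import math

def cosine(a, b):
    # standard cosine similarity between two vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Hypothetical embedding vectors for a seed term and two candidates.
vectors = {
    "term_a": [0.9, 0.1, 0.2],
    "term_b": [0.85, 0.15, 0.25],  # nearly parallel to term_a: high score
    "term_c": [0.1, 0.9, 0.3],     # points elsewhere: low score
}

def broaden(seed, threshold=0.8):
    # return every other term whose score meets the threshold
    seed_vec = vectors[seed]
    return {t: round(cosine(seed_vec, v), 3)
            for t, v in vectors.items()
            if t != seed and cosine(seed_vec, v) >= threshold}

# term_b clears the threshold; term_c does not
print(broaden("term_a"))
```

Lower the threshold and more loosely related terms are swept in; raise it and only near duplicates survive. That knob is where the “fuzzy” lives.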

I learned:

Google Ads suggested millions upon millions of YouTube videos to advertisers purchasing ads related to the terms “White power,” the fascist slogan “blood and soil,” and the far-right call to violence “racial holy war.” The company even suggested videos for campaigns with terms that it clearly finds problematic, such as “great replacement.” YouTube slaps Wikipedia boxes on videos about the “the great replacement,” noting that it’s “a white nationalist far-right conspiracy theory.” Some of the hundreds of millions of videos that the company suggested for ad placements related to these hate terms contained overt racism and bigotry, including multiple videos featuring re-posted content from the neo-Nazi podcast The Daily Shoah, whose official channel was suspended by YouTube in 2019 for hate speech.

It seems to me that Google is filtering specific words and phrases on a stop word list. Then the company is not identifying related terms, particularly words which are synonyms for the word on the stop list.

Is it possible that Google is controlling how it does fuzzification? In order to keep clicks and advertising revenue, does Google block the specific terms but dial back the term expansion and synonym identification settings which would catch the words and phrases identified by The Markup’s investigative team?

These references to synonym expansion and query expansion are likely to be unfamiliar to some people. Nevertheless, fuzzy is in the hands of those who set statistical thresholds.
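For readers new to the terminology, here is a minimal sketch of the difference between exact-match blocking and synonym-expanded blocking. The blocklist and the synonym table are invented placeholders, not Google’s actual lists:

```python
# Sketch of why exact string matching on a blocklist misses synonyms.
# BLOCKLIST and SYNONYMS are hypothetical placeholders.
BLOCKLIST = {"blocked term"}

SYNONYMS = {
    # a real system might maintain a table mapping each blocked term
    # to known coded variants and equivalent slang
    "blocked term": {"equivalent slang", "coded variant"},
}

def blocked_exact(query: str) -> bool:
    # exact string matching: only the literal term is caught
    return query.lower() in BLOCKLIST

def blocked_expanded(query: str) -> bool:
    # synonym identification: also catch known variants of blocked terms
    q = query.lower()
    if q in BLOCKLIST:
        return True
    return any(q in variants for variants in SYNONYMS.values())

print(blocked_exact("equivalent slang"))     # exact matching misses it
print(blocked_expanded("equivalent slang"))  # expansion catches it
```

The Markup’s finding, in these terms, is that the first function appears to be running while the second either is not or is tuned very conservatively.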

Fuzzy is not real, but the search results are. Ad money is a powerful force in some situations. The article seems to have uncovered a couple of enlightening examples. String matching coupled with synonym expansion seem to be out of step. Some fuzzification may be helpful in the hate speech methods.

Stephen E Arnold, April 12, 2021

India May Use AI to Remove Objectionable Online Content

April 7, 2021

India’s Information Technology Act, 2000 provides for the removal of certain unlawful content online, like child pornography, private images of others, or false information. Of course, it is difficult to impossible to keep up with identifying and removing such content using just human moderators. Now we learn from the Orissa Post that the “Govt Mulls Using AI to Tackle Social Media Misuse.” The write-up states:

“This step was proposed after the government witnessed widespread public disorder because of the spread of rumours in mob lynching cases. The Ministry of Home Affairs has taken up the matter and is exploring ways to implement it. On the rise in sharing of fake news over social media platforms such as Facebook, Twitter and WhatsApp, Minister of Electronics and Information Technology Ravi Shankar Prasad had said in Lok Sabha that ‘With a borderless cyberspace coupled with the possibility of instant communication and anonymity, the potential for misuse of cyberspace and social media platforms for criminal activities is a global issue.’ Prasad explained that cyberspace is a complex environment of people, software, hardware and services on the internet. He said he is aware of the spread of misinformation. The Information Technology (IT) Act, 2000 has provisions for removal of objectionable content. Social media platforms are intermediaries as defined in the Act. Section 79 of the Act provides that intermediaries are required to disable/remove unlawful content on being notified by the appropriate government or its agency.”

The Ministry of Home Affairs has issued several advisories related to real-world consequences of online content since the Act passed, including one on the protection of cows, one on the prevention of cybercrime, and one on lynch mobs spurred on by false rumors of child kidnappings. The central government hopes the use of AI will help speed the removal of objectionable content and reduce its impact on its citizens. And cows.

Cynthia Murrell, April 7, 2021

Historical Revisionism: Twitter and Wikipedia

March 24, 2021

I wish I could recall the name of the slow talking, wild-eyed professor who lectured about Mr. Stalin’s desire to have the history of the Soviet Union modified. The tendency was evident early in his career: Ioseb Besarionis dze Jughashvili became Stalin, so fiddling with received wisdom verified by Ivory Tower types should come as no surprise.

Now we have Google and the right to be forgotten. As awkward as deleting pointers to content may be, digital information invites “reeducation”.

I learned in “Twitter to Appoint Representative to Turkey” that the extremely positive social media outfit will interact with the country’s government. The idea is to make sure content is just A-Okay. Changing tweets for money is a pretty good idea. Coordinating the filtering of information with a nation state is even better. Apple and China seem to be finding a path forward. Maybe Apple in Russia will be a similar success.

A much more interesting approach to shaping reality is alleged in “Non-English Editions of Wikipedia Have a Misinformation Problem.” Wikipedia has a stellar track record of providing fact-rich, neutral information, I believe. This “real news” story states:

The misinformation on Wikipedia reflects something larger going on in Japanese society. These WWII-era war crimes continue to affect Japan’s relationships with its neighbors. In recent years, as Japan has seen an increase in the rise of nationalism, then–Prime Minister Shinzo Abe argued that there was no evidence of Japanese government coercion in the comfort women system, while others tried to claim the Nanjing Massacre never happened.

I am interested in these examples because each provides some color to one of my information “laws”. I have dubbed these “Arnold’s Precepts of Online Information.” Here’s the specific law which provides a shade tree for these examples:

Online information invites revisionism.

Stated another way, when “facts” are online, these are malleable, shapeable, and subjective.

When one runs a query on swisscows.com and then the same query on bing.com, ask:

Are these services indexing the same content?

The answer for me is, “No.” Filters, decisions about what to index, and update calendars shape the reality depicted online. Primary sources are a fine idea, but when those sources are shaped as well, what does one do?
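One rough way to see that two services are not indexing the same content is to compare the overlap of result URLs for the same query. The result sets below are hypothetical placeholders; a real comparison would fetch live results from each engine:

```python
# Sketch: measuring how much two search services' results for the same
# query overlap. The URL sets are invented placeholders.
results_a = {"example.com/1", "example.com/2", "example.com/3"}
results_b = {"example.com/2", "example.com/4", "example.com/5"}

# shared results and Jaccard similarity (intersection over union)
overlap = results_a & results_b
jaccard = len(overlap) / len(results_a | results_b)

print(f"shared: {sorted(overlap)}, Jaccard similarity: {jaccard:.2f}")
```

A low Jaccard score for identical queries is a quick signal that filters, indexing decisions, and update calendars differ between the services.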

The answer is like one of those Borges stories. Deleting and shaping content is more environmentally friendly than burning written records. A python script works with less smoke.

Stephen E Arnold, March 24, 2021

Social Audio Service Clubhouse Blocked in Oman

March 15, 2021

Just a quick note to document Oman’s blocking of the social audio service Clubhouse. The story “Oman Blocks Clubhouse, App Used for Free Debates in Mideast” appeared on March 15, 2021. The invitation-only service has hosted Silicon Valley luminaries and those who wrangled an invitation via connections or social engineering. The idea is similar to the CB radio chats popular with over-the-road truckers in the United States. There’s no motion picture dramatizing the hot service, but a “Smokey and the Bandit” remake starring the hot stars in the venture capital game and the digital movers and shakers could be in the works. Elon Musk’s character could be played by Brad Pitt. Instead of a Pontiac Firebird, the Tesla is the perfect vehicle for movers and shakers in the Clubhouse.

Stephen E Arnold, March 15, 2021

DarkCyber for February 23, 2021 Is Now Available

February 23, 2021

DarkCyber, Series 3, Number 4 includes five stories. The first summarizes the value of an electronic game’s software. Think millions. The second explains that Lokinet is now operating under the brand Oxen. The idea is that the secure services’ offerings are “beefier.” The third story provides an example of how smaller cyber security startups can make valuable contributions in the post-SolarWinds’ era. The fourth story highlights a story about the US government’s getting close to an important security implementation, only to lose track of the mission. And the final story provides some drone dope about the use of unmanned aerial systems on Super Bowl Sunday as FBI agents monitored an FAA imposed no fly zone. You could download the video at this url after we uploaded it to YouTube.

But…

YouTube notified Stephen E Arnold that his interview with Robert David Steele, a former CIA professional, was removed from YouTube. The reason was “bullying.” Mr. Arnold is 76 or 77, and he talked with Mr. Steele about the Jeffrey Epstein allegations. Mr. Epstein was on Mr. Steele’s radar because the legal allegations were of interest to an international tribunal about human trafficking and child sex crime. Mr. Steele is a director of that tribunal. Bullying about a deceased person allegedly involved in a decades-long criminal activity? What?

What’s even more interesting is that the DarkCyber videos, which appear every 14 days, focus on law enforcement, intelligence, and cyber crime issues. One law enforcement professional told Mr. Arnold after his Dark Web lecture at the National Cyber Crime Conference in 2020, “You make it clear that investigators have to embrace new technology and not wait for budgets to accommodate more specialists.”

Mr. Arnold told me that he did not click the bright red button wanting Google / YouTube to entertain an appeal. I am not certain about his reasoning, but I assume that Mr. Arnold, who was an advisor to the world’s largest online search system, was indifferent to the censorship. My perception is that Mr. Arnold recognizes that Alphabet, Google, and YouTube are overwhelmed with management challenges, struggling to figure out how to deal with copyright violations, hate content, and sexually related information. Furthermore, Alphabet, Google, and YouTube face persistent legal challenges, employee outcries about discrimination, and ageing systems and methods.

What does this mean? In early March 2021, we will announce other video services which will make the DarkCyber video programs available.

The DarkCyber team is composed of individuals who are not bullies. If anything, the group is more accurately characterized as researchers and analysts who prefer the libraries of days gone by to the zip zip world of thumbtypers, smart software, and censorship of content related to law enforcement and intelligence professionals.

Mr. Arnold will be discussing online click fraud at lunch next week. Would that make an interesting subject for a DarkCyber story? With two firms controlling more than two thirds of online advertising, click fraud is a hot potato topic. How does it happen? What’s done to prevent it? What’s the cost to advertisers? What are the legal consequences of the activity?

Kenny Toth, February 23, 2021

YouTube Censors a Government Hearing in Ohio

February 2, 2021

It is a strange world we live in. Google’s efforts to curb misinformation on YouTube have led it to take down footage of legislative testimony in Ohio. Cincinnati’s WLWT5 News reports, “YouTube Removes Ohio Committee Video, Citing Misinformation.” We are not surprised the misinformation at hand relates to COVID-19. Digital editor Brian Wiechert writes:

“The video showed Thomas Renz, an attorney for Ohio Stands Up, a citizen group, make the opening testimony during a House committee hearing on a bill that would allow lawmakers to vote down public health orders during the pandemic. In the more than 30-minute testimony, Renz made a number of debunked or baseless claims, including that no Ohioans under the age of 19 have died from COVID-19 – a claim that has been debunked by state data. … “The removal, first reported by Ohio Capital Journal, comes days after the Republican lawmakers in the Senate passed a bill that would establish ‘checks and balances’ on fellow GOP Gov. Mike DeWine’s ability to issue and keep in place executive action during the coronavirus pandemic. Proponents of the bills in the House and Senate believe DeWine and the state health department have issued orders during the last 11 months of the pandemic that have remained enacted for longer than necessary and, as a result, have unduly damaged small businesses and the state’s economy. Opponents called it unconstitutional and warned it would decentralize the state’s response during an emergency and cost lives in the process.”

Checks and balances on lifesaving measures during a pandemic—I am sure this is not what our founders had in mind. Good move, Google. Ohio is a flyover state, so maybe it is devalued because it is not intellectually as capable as the Left and Right coasts of the USA? If residents of the state disagree with that assessment, they may wish to do something about the current occupants of their Senate chamber.

Can we blame it on the Google artificial intelligence software?

Cynthia Murrell, March 2, 2021

Bitchute: Still Powering Those Ultra Bits

January 28, 2021

Republicans view Democrats with suspicion. Democrats stare back at Republicans. Both political parties have media outlets that support their political ideologies. The only problem for either party is the extremists (and conspiracy theorists) who haunt their ranks. That being said, welcome to BitChute, a conservative video streaming platform that allows frisky speech, conspiracy theorists, and Web 3.0 thinkers.

Mashable deep dives into the platform in “BitChute Welcomes The Dangerous Hate Speech That YouTube Bans.” BitChute has not received as much attention as other alternative social media Web sites. British citizen Ray Vahey, a Web developer, founded BitChute as a free speech platform when Google banned certain contentious speech and extremist content on YouTube. Vahey lives in Thailand, and he actively supports conspiracy theories.

BitChute is funded by donations and will start running ads from the advertising company Criteo. Most of BitChute’s content comes from YouTube and is not owned by the uploaders. Reuters, for example, has a channel, but Reuters does not own it. There have been takedown requests, although they involved copyright infringement rather than community guidelines violations.

“As HOPE not hate’s report puts it: ‘BitChute exists to circumvent the moderation of mainstream platforms.’ BitChute really seems like the Wild West. The company lists basic community guidelines on the site, but users can easily find videos that violate them. And it’s not like there’s so much content that BitChute couldn’t moderate it all.”

There are fewer uploaders on BitChute than YouTube enjoys, but that does not limit the depth of unusual factoids shared in videos. BitChute’s guidelines state that terrorism recruitment videos are not allowed, yet many are available, as are mass shooting videos.

BitChute may be poised for growth.

Whitney Grace, January 28, 2021

China: Control and Common Sense. Common Sense?

November 25, 2020

I must admit that I saw some darned troubling things when I last visited China and Hong Kong. However, I spotted an allegedly accurate factoid in “China Bans Spending by Teens in New Curbs on Livestreaming.” In one of my lectures about the Dark Web I pointed out livestreaming sites which permitted gambling, purchase of merchandise which is now called by the somewhat jarring term “merch,” and buying “time” with an individual offering “private sessions.” I pointed out examples on Amazon Twitch and on a service called ManyVids, an online outfit operating from Canada. (Yep, dull, maple crazed Canada.)

Here’s the passage of particular significance in my opinion:

Livestreaming platforms now must limit the amount of money a user can give hosts as a tip. Users must register their real names to buy the virtual gifts, in addition to the ban on teens giving such gifts. The administration also asked the platforms to strengthen training for employees who screen content and encouraged the companies to hire more censors, who also will need to register with regulators. The media regulator will create a blacklist of hosts who frequently violate the rules, and ban them from hosting livestreaming programs on any platform. [Emphasis added by Beyond Search]

Okay, spending controls will force buyers (sometimes known as “followers”) to be more creative in the buying time function.

But the killer point is “real names.”

No doubt there are online consumers who will bristle at censorship, registration, and blacklisting. Nevertheless, “real names” might be a useful concept for online services not under the watchful eye of party faithful grandmas in a digital hutong. What a quaint idea for outfits like Facebook, Twitter, YouTube, and other online content outputters to consider.

Stephen E Arnold, November 25, 2020

Facebook: Slipslidin’ Away from the Filterin’ Thing

May 28, 2020

Censorship, flagged tweets, and technology companies trying to be nervous parents? Sound familiar? DarkCyber finds the discussion interesting. One of the DarkCyber team spotted “Facebook’s Mark Zuckerberg Says Platform Policing Should Be Limited To Avoiding Imminent Harm.” The main point of the write up contains this statement:

… the platform’s criteria for removing content remains “imminent harm” — not harm “down the line.”

The article provides some training wheels for the DarkCyber researcher:

Zuckerberg said several times that, in the balance, he thinks of himself “as being on the side of giving people a voice and pushing back on censorship.”

Some of the companies powering the digital economy appear to be willing to make decisions about what the product (those who use the services) or the customers (advertisers) can access.

The article provides a context for Facebook’s “imminent harm”; for example:

Facebook’s 2.6 billion users give it unprecedented reach, noted Susan Perez, a portfolio manager at Harrington Investments, who brought up the issue of political interference and fraudulent content on the platform. “Society’s risk is also the company’s risk,” she said.

The article includes a “Yes, but…”; to wit:

Nick Clegg, Facebook’s president of global affairs and communications, said during a question and answer session that the company doesn’t think a private tech company “should be in the position of vetting what politicians say. We think people should be allowed to hear what politicians say so they can make up their own mind and hold the politician to account.”

As censorship becomes an issue in the datasphere, is Facebook “slip sliding away”? Is the senior management of Facebook climbing a rock face using an almost invisible path, a path that other digital climbers have not discerned?

But wait? Didn’t that pop song say?

You know the nearer your destination
The more you’re slip slidin’ away

Sure, but what if Facebook’s slip slidin’ is movin’ closer?

Stephen E Arnold, May 28, 2020

Big Tech: Adulting Arrives But A Global Challenge Proved Stronger Than Silicon Shirkers

March 29, 2020

Interesting item from NBC News: “Coronavirus Misinformation Makes Neutrality a Distant Memory for Tech Companies.” DarkCyber thinks the write up should have used the phrase “finally adulting,” but, alas, the real news story states:

Most major consumer technology platforms embraced the idea that they were neutral players, leaving the flow of information up to users. Now, facing the prospect that hoaxes or misinformation could worsen a global pandemic, tech platforms are taking control of the information ecosystem like never before. It’s a shift that may finally dispose of the idea that Big Tech provides a “neutral platform” where the most-liked idea wins, even if it’s a conspiracy theory.

The recursive nature of the click loops creates some interesting phenomena. Among the outcomes is the myth of Silicon Valley bros, the mantra “Ask for forgiveness, not permission,” and the duplicity of executives explaining how their ad-fueled money systems have chopped through the fabric of society like a laser cutter in an off shore running shoe factory.

The write up includes some good quotes; for example:

“Neutrality — there’s no such thing as that, because taking a neutral stance on an issue of public health consequence isn’t neutral,” said Whitney Phillips, a professor of communication at Syracuse University who researches online harassment and disinformation. “Choosing to be neutral is a position,” she said. “It’s to say, ‘I am not getting involved because I do not believe it is worth getting involved.’ It is internally inconsistent. It is illogical. It doesn’t work as an idea.” “So these tech platforms can claim neutrality all they want, but they have never been neutral from the very outset,” she added.

Okay, interesting. One question:

Why has it taken a real news outfit such a long time to focus on a problem?

Answer:

We wanted a free mouse pad.

The problem is that undoing the digital damage may be a more difficult job than some anticipate.

Adulting permits a number of behaviors. Example: Falling off the responsibility wagon. Perhaps a recovery program is needed for dataholics?

Stephen E Arnold, March 29, 2020
