Google Accused of Favoritism by an Outfit with Google Envy?

August 10, 2019

I read in the Jeff Bezos owned Washington Post this story: “YouTube’s Arbitrary Standards: Stars Keep Making Money Even after Breaking the Rules.” The subtitle is a less than subtle dig at what WaPo perceives as the soft, vulnerable underbelly of Googzilla:

Moderators describe a chaotic workplace where exceptions for lucrative influencers are the norm.

What is the story about? The word choice in the headline and subhead makes the message clear: Google is a corrupt Wild West. The words in the headline and subhead I noted are:

arbitrary

money

breaking

chaotic

exceptions

lucrative

norm.

Is it necessary to work through the complete write up? I have the frame. This is “real news”, which may be as problematic as the high school management methods in operation at Google.

Let’s take a look at a couple of examples of “real news”:

Here’s the unfair angle:

With each crisis, YouTube has raced to update its guidelines for which types of content are allowed to benefit from its powerful advertising engine — depriving creators of those dollars if they break too many rules. That also penalizes YouTube, which splits the advertising revenue with its stars.

Nifty word choice: crisis, race, powerful, dollars, break, and the biggie “advertising revenue.”

That’s it. Advertising revenue. Google has. WaPo doesn’t. Perhaps, just perhaps, Amazon wants. Do you think?

Now the human deciders. Do they decide? WaPo reports the “real news” this way:

But unlike at rivals like Facebook and Twitter, many YouTube moderators aren’t able to delete content themselves. Instead, they are limited to recommending whether a piece of content is safe to run ads, flagging it to higher-ups who make the ultimate decision.

The words used are interesting:

unlike

Facebook

Twitter

aren’t

limited

recommending

higher ups

Okay, that’s enough for me. I have the message.

What if WaPo compared and contrasted YouTube with Twitch, an Amazon owned gaming platform? In my lectures at the TechnoSecurity & Digital Forensics Conference, I showed LE and intel professionals Twitch’s:

online gambling

soft porn

encoded messages

pirated first run motion pictures

streaming US television programs

Twitch talent can be banned; for example, SweetSaltyPeach. But this star resurfaced with ads a few days later as RachelKay. Same art. Same approach, which is designed to appeal to the Twitch audience. How do I know? Well, those pre roll ads and the prompt removal of the ban. Why put RachelKay back on the program? Maybe ad revenue?

My question is, “Why not dive into the toxic gaming culture and the failure of moderation on Twitch?” The focus on Google is interesting, but so is the suggestion that these problems are particular to Google.

One thing is certain: The write up is so blatantly anti Google that it is funny.

Why not do a bit of research into the online streaming service of the WaPo’s owner?

Oh, right, that’s not “real news.”

What’s my point? Amazon is just as Googley as Google. Perhaps an editor at the WaPo should check out Twitch before attacking what is not much different than Amazon’s own video service.

Stephen E Arnold, August 10, 2019

Arolsen Archives

May 22, 2019

The collection of documents from concentration camps has been expanded. The Arolsen Archives (the new name of the International Tracing Service) make available 13 million documents pertaining to more than two million people, according to the Daily Beast (Newsweek). This is “the world’s most comprehensive archive on the Holocaust’s victims and survivors.” You can explore the collection at this link.

Stephen E Arnold, May 22, 2019

Google: History? Backfiles Do Not Sell Ads

April 29, 2019

We spotted a very interesting article in Tablix: “Google Index Coverage”. We weren’t looking for the article, but it turned up in a list of search results and one of the DarkCyber researchers called it to my attention.

Background: Years ago we did a bit of work for a company engaged in data analysis related to the health and medical sectors. We had to track down the names of the companies who were hired by the US government to do some outsourced fraud investigation. We were able to locate the government statements of work and even some of the documents related to investigations. We noticed a couple of years ago that our bookmarks to some government documents did not resolve. With USA.gov dependent on Bing, we checked that index. We tried US government Web sites related to the agencies involved. Nope. The information had disappeared, but in one case we did locate documents on a US government agency’s Web site. The data were “there” but the data were not in Bing, Exalead, Google, or Yandex. We also checked the recyclers of search results: Startpage, the DuckDuck thing, and MillionShort.

We had other information about content disappearing from sites like the Wayback Machine too. From our work for assorted search companies and our own work years ago on ThePoint.com, which we sold to Lycos, we had considerable insight into the realities of paying for indexing that did not generate traffic or revenue. The conclusion we had reached, and which we assumed other vendors would reach, was:

Online search is not a “free public library.”

A library is/was/should be an archiving entity; that is, someone has to keep track and store physical copies of books and magazines.

Online services are not libraries. Online services sell ads, as we did to Zima, which wanted its drink in front of our users. This means one thing:

Web indexes dump costs.

The Tablix article makes clear that some data are expendable. Delete them.

Our view is:

Get used to it.

There are some knock on effects from the simple logic of reducing costs and increasing the efficiency of the free Web search systems. I have written about many of these, and you can search the 12,000 posts on this blog or pay to search commercial indexes for information in my more than 100 published articles related to search. You may even have a copy of one of my more than a dozen monographs; for example, the original Enterprise Search Reports or The Google Legacy.

  1. Content is disappearing from indexes on commercial and government Web sites. Examples range from the Tablix experience to the loss of the MIC contracts which detail exclusives for outfits like Xerox.
  2. Once the content is not findable, it may cease to exist for those dependent on free search and retrieval services. Sorry, Library of Congress, you don’t have the content, nor does the National Archives. The situation is worse in countries in Asia and Eastern Europe.
  3. Individuals — particularly the annoying millennials who want me to provide information for free — do not have the tools at hand to locate high value information. There are services which provide some useful mechanisms, but these are often affordable only by certain commercial enterprises, some academic research organizations, and law enforcement and intelligence agencies. This means that most people are clueless about the “accuracy”, “completeness,” and “provenance” of certain information.

Net net: If data generate revenue, it may be available online and findable. If the data do not, hasta la vista. The situation is one that gives me and my research team considerable discomfort.

Imagine how smart software trained on available data will behave. Probably in a pretty stupid way. Information is not what people believe it to be. Now we have a generation or two of people who think research is looking something up on a mobile device. Quite a combo: Ill informed humans and software trained on incomplete data.

Yeah, that’s just great.

Stephen E Arnold, April 28, 2019

The EU, the Internet Archive Dust Up: One Fact Overlooked

April 11, 2019

I read “EU Tells Internet Archive That Much Of Its Site Is ‘Terrorist Content’.” The main point is that Europol’s European Union Internet Referral Unit pointed out that the Internet Archive contains problematic information. According to the article, the Internet Archive explains:

there’s simply no way that (1) the site could have complied with the Terrorist Content Regulation had it been law last week when they received the notices, and (2) that they should have blocked all that obviously non-terrorist content. [emphasis in the original]

DarkCyber wants to point out a fact that may be of interest to the EUIRU and the Internet Archive; to wit: The site has information, but the site’s search system and interface make it very difficult to locate information. For EUIRU, the inadequate search system makes finding the potentially harmful information a challenge. For the Internet Archive, the findability system makes it equally difficult for IA staff to locate items so each can be reviewed.

What will the Internet Archive do? The options are limited and some are unpalatable: Fight the EU? Ignore the request? Block access from Europe? Go out of business? Address the issue head on? Worth watching how this develops.

Stephen E Arnold, April 11, 2019

Centralizing and Concentrating: Works Great Until It Does Not

April 1, 2019

No joke or joke? Let’s assume the story is true.

US airlines are proving that centralizing and concentrating online services works great until the system fails. I read “Computer Outage Affecting Major US Airlines including Southwest, Delta and United Causes Hundreds of Flight Delays Nationwide.” (I first saw the news in a UK stream from the Daily Mail, a British newspaper.) As I write this at 9:10 am US Eastern (April 1, 2019), the story is now appearing in other feeds. The problem appears to be one with software called Aerodata. By 8:40 am US Eastern time, more than 700 flights were affected.

What seems to be lousy systems administration, engineering, or business processes has turned April 1, 2019, into a day of unpleasant anecdotes, not frothy jokes.

Aerodata’s Web site cheerfully reports my public IP address which, not surprisingly, is not what my IP address is. The Web site requires Flash, a super unsecure software in my opinion. I was not able to locate current news from the company. I noticed that VMWare mentions that the company uses VSAN to power a modern software defined data center. You can read the marketing inspired explanation at this link or you could at 9:17 am US Eastern on April 1, 2019.

According to a Chicago NBC outlet, all is well again. You can get this take at this link.

What happens if a cyber attack takes down a concentrated service?

Stephen E Arnold, April 1, 2019

Hashing Videos and Images Explained

March 17, 2019

A quite lucid explanation of video and image identification appears in “How Hashing Could Stop Violent Videos from Spreading.” Here’s one passage from the article:

Video hashing works by breaking down a video into key frames and giving each a unique alphanumerical signature, or hash. That hash is collected into a central database, where every video or photo that is uploaded to a platform is then compared against that dataset. The system requires a database of images and doesn’t use artificial intelligence to identify what is in an image — it only identifies a match between images and videos.

CNN emphasizes Microsoft’s PhotoDNA technology. Information about that system may be found at this link. The write up points out that Facebook and Google use “this technology.”
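
For readers who want to see the matching idea in miniature, here is a rough sketch. It is not PhotoDNA, which is proprietary. It assumes a simple "average hash" as a stand-in signature, tiny grayscale grids in place of real key frames, and a Python dictionary in place of the central database. The names average_hash, hamming, find_match, and the label "flagged-clip-001" are invented for illustration.

```python
from typing import Dict, List, Optional

# Assumption: a key frame is a small grayscale grid with pixel values 0-255.
Frame = List[List[int]]


def average_hash(frame: Frame) -> int:
    """Build a bit signature: 1 where a pixel is brighter than the frame mean."""
    pixels = [p for row in frame for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bit positions where two signatures differ."""
    return bin(a ^ b).count("1")


def find_match(frame: Frame, database: Dict[int, str],
               max_distance: int = 2) -> Optional[str]:
    """Return the label of a known signature close to this frame, if any."""
    signature = average_hash(frame)
    for known_signature, label in database.items():
        if hamming(signature, known_signature) <= max_distance:
            return label
    return None


# Hypothetical "known bad" key frame and the database built from it.
known = [
    [10, 200, 10, 200],
    [200, 10, 200, 10],
    [10, 200, 10, 200],
    [200, 10, 200, 10],
]
database = {average_hash(known): "flagged-clip-001"}

# A slightly brightened copy still matches; an unrelated frame does not.
reencoded = [[p + 5 for p in row] for row in known]
unrelated = [[0] * 4, [0] * 4, [255] * 4, [255] * 4]

print(find_match(reencoded, database))
print(find_match(unrelated, database))
```

The point of the sketch is the one the passage makes: the system only compares signatures against a list of known items. It recognizes nothing that is not already in the database.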

One question is, “If the technology is available and in use, why are offensive videos and images finding their way into public facing, easily accessible systems?”

The answer according to an expert quoted in the CNN story is:

The decision not to do this [implement more effective hashing filter methods] is a question of will and policy—not a question of technology.

The answer is that platforms are one way to avoid the editorial responsibility associated with old school methods of communication; for example, wire services, newspapers, and magazines. These types of communication were not perfect, but in many cases, an editorial process prevented certain types of information from appearing in certain publications. So far, the hands off approach of some digital channels and the over hyped use of smart software have not been as effective as the hopelessly old fashioned processes used by some traditional media outlets.

So will? Policy?

Nah, money, expediency, and the high school science club approach to management.

Stephen E Arnold, March 17, 2019

GPS: Ubiquitous and Helpful in Surprising Ways

March 6, 2019

Here’s a little write-up that highlights the power of GPS and WiFi tracking. Digital Trends reports, “It Turns Out That Find My iPhone Is Really Good at Finding a Stolen Car, Too.” Writer Andy Boxall relates:

“After stopping at an intersection, Chase Richardson was carjacked by an armed man who shouted for him to get out of the vehicle. Sensibly complying, Richardson got out, but at the same time left his work-issued Apple iPhone in the car. The criminal also demanded Richardson’s wallet and his own personal phone, then got in the car and drove away. The police arrived after Richardson called 911 at a Walgreens store, which is when the Find My iPhone feature was called into action. The service uses GPS to generally locate a registered device, which in this case was the work phone. The police apparently used Find My iPhone in real time to track down the stolen car. A police helicopter was called in to assist after the car was located, as the thief tried to evade arrest.”

We are pleased to learn Mr. Richardson was not hurt during the carjacking. Boxall mentions other cases where Find My iPhone has led to arrests, and notes similar tools like Google’s Find My Device, Samsung’s device location service, and third-party companies like Cerberus Anti-Theft. Such tools can be a huge help if someone makes off with your phone—or your car. Just remember that tracking software can have unintended consequences; the article closes with this kind wish:

“Whichever you choose, we hope it will only ever be used to find your phone down the back of a couch, and nothing more serious.”

We agree.

Cynthia Murrell, March 6, 2019

Moving the Google: Right to Be Forgotten Has an Impact

January 23, 2019

I have heard that it can be difficult to reach a human at Google. It appears that a Dutch surgeon and her attorneys were successful. “Right to Be Forgotten Used to Force Google to Remove Medical Negligence Link” states:

Amsterdam’s district court has forced Google to remove search results relating to a Dutch surgeon’s past medical suspension…

The difference between printed information and digital information is becoming discernible. Print can exist in multiple copies in tangible form in many places; for example, university libraries, archives, and personal information collections. Making a change to a printed document can be tricky, but it can be done.

However, changing the digital record is a bit easier; for example, deleting a pointer in an index.
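
A toy illustration of "deleting a pointer in an index" follows. It is a sketch under stated assumptions, not a description of how any real search engine works: the documents, the build_index, search, and delist functions, and the whitespace tokenizer are all hypothetical.

```python
from collections import defaultdict

# Hypothetical stored documents; delisting never touches this dictionary.
documents = {
    1: "archived notice about a past suspension",
    2: "announcement of a new clinic opening",
}


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.split():
            index[term].add(doc_id)
    return index


def search(index, term):
    """Return the ids of documents the index points to for this term."""
    return sorted(index.get(term, set()))


def delist(index, doc_id):
    """Remove every pointer to a document; the document itself is untouched."""
    for postings in index.values():
        postings.discard(doc_id)


index = build_index(documents)
before = search(index, "suspension")
delist(index, 1)
after = search(index, "suspension")

print(before, after, 1 in documents)
```

After the delist call the document is still stored, but no query can reach it through the index. For anyone who depends on search, that is the same as the document not existing.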

The question becomes, “What happens when a person wants to reconstruct the details of a particular matter?”

The answer is that information is relative. Figuring out what happened becomes a bit more difficult and expensive.

What happens?

I can’t look up the answer online, but I could ask IBM Watson. For these types of questions, Silly Putty information may have to suffice.

A court decision may leave behind a paper trail. But the actions of a single system administrator may be impossible to identify and verify.

Epistemology may be due for a renascence when setting the record straight.

Stephen E Arnold, January 23, 2019

About Those VPNs

December 26, 2018

News and chatter about VPNs are plentiful. We noted a flurry of stories about Chinese ownership of VPNs. We receive incredible deals for VPNs which are almost too good to be true. We noted this write up from AT&T (a former Baby Bell) and its Alienvault unit: “The Dangers of Free VPNs.”

The idea behind a VPN is hiding traffic from those able to gain access to that traffic. But there is a VPN provider in the mix. From that classic man in the middle position, the VPN may not be as secure as the user thinks.

The AT&T Alienvault viewpoint is slightly different: VPNs are the cat’s pajamas as long as the VPN is AT&T’s.

We learned from the write up:

Technically, VPN providers have the capacity to see everything you do while connected. If it really wanted to, a VPN company could see what videos you watched, read emails you send, or monitor your search history.

The write up points out, without reference to lawful intercept orders, national security letters, and the ho hum everyday work in cheerful Ashburn, Virginia:

Thankfully, reputable providers don’t do this. A good provider shouldn’t take any logs of your activity, which means that although they could theoretically access your data, they discard it instead. These “no-log” companies don’t keep copies of your data, so even if they get subpoenaed by a government agency, they have no data that they can hand over. VPN providers may take different types of logs, so you need to be careful when reading the fine print of any potential provider. These logs can include your traffic, DNS requests, timestamps, bandwidth and IP address.

The write up includes a “How do I love thee” approach to the dangers of free VPNs.

Net net: Be scared. Just navigate to this link. AT&T provides VPN service with the goodness one expects.

By the way, note the reference to “logs.” Many gizmos in a data center offering VPN services maintain logs. Processing these auto generated files can yield quite useful information. Perhaps that’s why there are free and low cost services.
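
To make the "quite useful information" concrete, here is a minimal sketch. The log layout is an assumption: a CSV with the fields the AT&T write up lists (timestamp, IP address, DNS request, bandwidth), with rows in time order. The sample rows use documentation-only addresses and example domains, and the profile_clients function is invented for illustration.

```python
import csv
import io
from collections import defaultdict

# Hypothetical log extract; real VPN and network gear logs vary by vendor.
SAMPLE_LOG = """timestamp,client_ip,dns_query,bytes
2018-12-26T09:00:01,203.0.113.7,video.example.com,52000
2018-12-26T09:00:09,203.0.113.7,mail.example.org,1800
2018-12-26T09:01:30,198.51.100.4,search.example.net,950
2018-12-26T09:02:12,203.0.113.7,video.example.com,61000
"""


def profile_clients(log_text):
    """Summarize, per client IP, the domains looked up, bytes moved, and time span."""
    profiles = defaultdict(
        lambda: {"domains": set(), "bytes": 0, "first_seen": None, "last_seen": None}
    )
    for row in csv.DictReader(io.StringIO(log_text)):
        profile = profiles[row["client_ip"]]
        profile["domains"].add(row["dns_query"])
        profile["bytes"] += int(row["bytes"])
        if profile["first_seen"] is None:
            profile["first_seen"] = row["timestamp"]
        # Assumes rows arrive in chronological order.
        profile["last_seen"] = row["timestamp"]
    return dict(profiles)


profiles = profile_clients(SAMPLE_LOG)
for ip, profile in profiles.items():
    print(ip, sorted(profile["domains"]), profile["bytes"],
          profile["first_seen"], profile["last_seen"])
```

A few lines of code turn routine log entries into a per-user picture of what was visited, when, and how heavily.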

Zero logs strikes Beyond Search as something that is easy to say but undesirable and possibly difficult to achieve.

Are VPNs secure? Is Tor?

In January 2019, Beyond Search will cover more dark cyber related content. More news is forthcoming. Let’s face it: enterprise search is a done deal. The Beyond Search goose is migrating to search related content plus adjacent issues like AT&T promoting its cheerful, unmonitored, we’re really great approach to online.

Stephen E Arnold, December 26, 2018

Deep Fakes: Technology Is Usually Neutral

December 18, 2018

Ferreting out fake news has become an obsession for search and AI jockeys around the globe. However, those jobs are nothing compared to the wave of fake photos and videos that grow increasingly convincing as technology helps to iron out the wrinkles. That’s a scary prospect to more than a few experts, as we discovered in a recent MIT Technology Review article, “Deepfake Busting Apps Can Spot Even A Single Pixel Out of Place.”

According to the story:

“That same technology is creating a growing class of footage and photos, called “deepfakes,” that have the potential to undermine truth, confuse viewers, and sow discord at a much larger scale than we’ve already seen with text-based fake news.”

Deepfakes are fun and possibly threatening to some. The “experts” at high tech firms will use their management expertise to reduce any anxieties the deepfakes spark. But some Luddites think these videos and images have the potential to disrupt governments and elections in countries where online is pervasive. Beyond Search is comforted by the knowledge that bright, objective, ethical minds are on the case. One question: What if these whiz kids are angling for a more selfish outcome?

Patrick Roland, December 18, 2018
