Search and the Bezos Bulldozer

April 13, 2021

For the last three years, I have been giving lectures about the lock in methods implemented by Amazon. I refer to the company as the online bookstore in order to remind those in my audiences that Amazon has a friendly facet. That’s exemplified by the smile logo. Amazon also has a Wall Street persona which is built upon the precepts of MBAism.

I will be talking about Amazon and its policeware strategy at the 2021 National Cyber Crime Conference. If you want a similar presentation tailored to commercial interests, let me know. I can be reached via benkent2020 at yahoo dot com. My LE and intel work is pro bono; commercial work incurs a fee.

I want to mention a subject I won’t be addressing directly in my upcoming lecture later this month. The subject is an Amazon blog post titled “Introducing OpenSearch.” I would also direct your attention to the comments submitted to the Ycombinator discussion of the announcement. You can find those interesting and varied remarks from hundreds of people at this link.

The news is that Amazon is taking quite predictable steps to recast search and retrieval so that it becomes another of the hundreds of functions, services, and features of Amazon Web Services. AWS hired people from Lucid Imagination (now LucidWorks) years ago. Many have forgotten that Amazon also operated A9, a Web search system with a street view function. There are other findability functions embedded in Amazon as well; for example, the “search” function in Amazon’s blockchain inventions. (Yes, I have a for fee lecture about that technology as well. Because money laundering is a growing problem, the Amazon methods are likely to become increasingly important to certain government agencies in the future.)

The little secret about open source software, which many overlook, is that the strongest supporters of FOSS and community supported code are large companies. I did a series of reports for the IDC outfit, and I am not sure what that now dismantled organization did with the data. A couple of chapters were sold on Amazon for $3,000, but the topic was not a magnet when we assembled the information six or seven years ago.

Since Amazon is engaged in a battle for one part of the “enterprise” with Microsoft, the online bookstore is actively seeking ways to attract large organizations as customers, lock them in, and then implement the tactics which benefit from Amazon’s knowledge of its customers’ behavior. The use of the “retail” tactic (watch, duplicate, and leverage house brands) is documented in the reports from vendors who have had their toes nipped by the bulldozer’s steel caterpillar traction system.

Why’s this germane to “search”? Here are the reasons:

  • Search and retrieval is an essential utility for modern work. Amazon wants to generate revenue and other business benefits by having a “better” and (if possible) community supported software base. Search will become part of the lubricant for other Amazon enterprise services; for example, locating tax avoiders.
  • Search becomes the glue and the circulatory system for information analysis and use. No search; no high value outputs. Machine learning is little more than a supporting technology to finding needed information. Many disagree with me, but marketing clouds many experts’ thinking. Search is a core function and requires many subsystems and technical methods.
  • Once users become habituated to search, change is difficult. Amazon is one of the few outfits to have undermined Google search. Product searches are increasingly under Amazon’s control. The ElasticSearch “play” is going to become the vehicle for a broader utility attack.

I have quipped that Amazon has targeted Elastic and the ElasticSearch “system” because it has the same name as some of Amazon’s services. If Amazon is successful in its search maneuver, Shay Banon’s findability play will be marginalized.

There are larger implications quite beyond a comment made to elicit a laugh at a reception at an enterprise search conference. These include:

  • Seamless integration with SageMaker and other advanced functionalities available from Amazon
  • A lever for technical and financial leverage for innovators who use Amazon as the plumbing for their start ups, not Microsoft technology
  • A model for Amazon and maybe other companies to use for shifting open source software into a variation on the FUD (fear, uncertainty, and doubt) approach to closing deals. The mantra could become “Nobody ever got fired for buying AWS.”

For the companies generating scorecards for enterprise search vendors, significant change is likely. The numerous vendors of proprietary enterprise search will have to make some changes in their approach to Amazon. Many of these Elastic alternatives use AWS for certain functions. What happens if the pricing structure, the legalese, or the access to certain AWS services “evolve”? What will start ups and Amazon partners do if access to search functions becomes free or requires contributions to the AWS version of open source?

Worth watching, right? The answer is, “Nah, you are way off base.” Yep, just as I was in my analysis of Google for BearStearns many years ago. I have a track record of getting thrown out as I head for second base.

Stephen E Arnold, April 13, 2021

An Amazon Easter Egg: Ethical Waste Inside a Plastic Shell

April 5, 2021

I read a “real news” story called “Amazon Apologizes for Lying about Pee — And Attempts to Shift the Blame.” The main point of the article is that a large, essentially unregulated company creates working conditions which deny employees the right to deal with bodily functions. You can plug in the words which would get an ordinary person banned by smart software designed to protect one’s sensibilities.

The write up, however, focuses on Amazon’s mendacious behavior. Is it a surprise that a giant corporation distorts information so that it can continue to behave in a way that would make the author of Nicomachean Ethics slug down hemlock before finishing research for his revered tome? [Note: I included the MIT version because MIT is so darned ethical in its willingness to accept Jeffrey Epstein’s blandishments and cash.] “Gimme that cup of poison. There’s no hope,” he might have said.

The write up states:

Amazon only apologizes for not being “accurate” enough, too — not for actually creating and contributing to situations where workers [censored] in bottles. In fact, Amazon goes so far as to suggest the whole pee bottle thing is simply a regrettable status quo, pointing out a handful of times when other companies’ delivery drivers were also caught peeing in bottles, as well as embedding a handful of random comments on Twitter that happen to support Amazon’s views. You can almost hear Jeff Bezos saying “Why aren’t these people blaming UPS and FedEx? Let’s get more people thinking about them instead.”

There are several issues within this Amazon Easter egg; for example:

  1. The ethical posture of exploitation. The US makes a big deal about China’s alleged exploitation of ethnic groups, yet seems okay with the Amazon and other US entities’ behavior. Interesting or hypocrisy?
  2. The failure of regulation. Is Amazon the exception or the norm? The answer is that Amazon like other big tech outfits is the norm. I want to suggest that the norm applies to sugar cane workers in Brazil as well as to street vendors in Zagreb. Look behind the day-to-day misery and there is a Boss type, probably with an MBA. Governments have failed to protect workers. In fact large companies are the government just as Boss types are the forces which matter in many countries. No elections, lobbyists, or education needed.
  3. Slow response to abuse. Many large tech companies have been chugging along for more than 20 years. Suddenly people care? What’s this say about journalists, union organizers, and non government organizations allegedly “into” fair and equal treatment?

Dropping the Amazon Easter egg on a Friday of a holiday weekend produces what?

Here’s my answer: Visible evidence that the Easter egg is made of the same indestructible plastic in which many Amazon products are packaged. Those eggs are unbreakable, and the behaviors are going to continue.

My suggestion? Stock up on large mouth plastic bottles or start innovating by modifying a Rocket Man beverage pack so it can be used as a Porta Potty with an indwelling urinary catheter instead of an outdwelling beer nozzle. Less hassle and an opportunity to become an entrepreneur selling the “innovation” on Etsy.

Stephen E Arnold, April 5, 2021

No Joke: Amazon and Social Media

April 1, 2021

A change at the top and Amazon gets wonky. Coincidence or just a new crew of social media advisors? Who knows. “Amazon Wanted Twitter Warriors with Great Sense of Humor, Leaked Doc Shows” reveals some allegedly accurate information about the humble online bookseller.

The write up states:

Amazon sought out warehouse staffers with a “great sense of humor” to build a squad of Twitter warriors to knock down criticism of its fulfillment centers, a leaked document reveals.

Yep, warehouse workers with the skills of a tweet master like the real Borat.

The story adds:

While Amazon wanted the workers to speak for themselves, the memo shows company officials wanted a standardized format for their Twitter handles and usernames. They mulled adding an emoji to the username to “give personality, for example a small box emoji…

A happy Amazon worker surrounded with positive tweets is running with an “emergency necessary bottle” from the automated Bezos bulldozer and its business processes.

The write up includes what may be an April Fool joke:

“I work for Amazon and not sure about other facilities but I’ve never felt pressured to pee in a trash can,” one trainee wrote in a draft tweet. “My managers understand when you gotta’ go you gotta’ go.”

You can read the allegedly accurate document at this link.

Stephen E Arnold, April 1, 2021

No Joke: Academics Cheat Big Time

April 1, 2021

It looks as though academic journals are finally addressing the scourge of fraudulent studies that plague their pages. Heck, that publisher Royal Society of Chemistry now publicly acknowledges the problem is a big step. Nature examines “The Fight Against Fake-Paper Factories that Churn Out Sham Science.” Prompted by outside investigations, journals have retracted hundreds of fraudulent papers since last January, with more under investigation. Nature has assembled a list of over 1,300 articles identified as possible paper-mill products over that time. Many of these suspect papers come from authors at Chinese hospitals, but China is not the only place fake research is churned out. Iran and Russia are also home to paper mills, for example. However, writers Holly Else and Richard Van Noorden report:

“China has long been known to have a problem with firms selling papers to researchers, says Xiaotian Chen, a librarian at Bradley University in Peoria, Illinois. As far back as 2010, a team led by Shen Yang, a management-studies researcher then at Wuhan University in China, warned of websites offering to ghostwrite papers on fictional research, or to bypass peer-review systems for payment. In 2013, Science reported on a market for authorships on research papers in China. In 2017, China’s Ministry of Science and Technology (MOST) said it would crack down on misconduct after a scandal in which 107 papers were retracted at the journal Tumor Biology; their peer reviews had been fabricated and a MOST investigation concluded that some had been produced by third-party companies. Physicians in China are a particular target market because they typically need to publish research articles to gain promotions, but are so busy at hospitals that they might not have time to do the science, says Chen. Last August, the Beijing municipal health authority published a policy stipulating that an attending physician wanting to be promoted to deputy chief physician must have at least two first-author papers published in professional journals; three first-author papers are required to become a chief physician.”

Here’s a thought—maybe remove these requirements. We’re told reports produced by physicians in these positions are already widely suspect and not taken seriously, so where is the value in maintaining such hoops? See the lengthy article for more details on how the pros detect fraudulent papers and what the industry is planning to do about it.

Cynthia Murrell, April 1, 2021

The Google and Web Indexing: An Issue of Control or Good, Old Fear?

March 29, 2021

I read “Google’s Got A Secret.” No kidding, but Google has many, many secrets. Most of them are unknown to today’s Googlers. After 20 plus years, even Xooglers are blissfully unaware of the “big idea,” the logic of low profiling data slurping, how those with the ability to make “changes” to search from various offices around the world can have massive consequences for those who “trust” the company, and the increasing pressure to control Googzilla’s appetite for cash. But enough of these long-ignored issues.

The “secret” in the article is that Google actively pushes as many buttons and pulls as many levers as its minions can to make it tough for competitors to “index” the publicly accessible Web. The write up states:

Only a select few crawlers are allowed access to the entire web, and Google is given extra special privileges on top of that.

The write up adds:

This isn’t illegal and it isn’t Google’s fault, but this monopoly on web crawling that has naturally emerged prevents any other company from being able to effectively compete with Google in the search engine market.

I am not in total agreement with these assertions. For example, consider the world of public relations distribution agencies. Do a search for “news release distribution” and you get a list of outfits. Now write a news release which reveals previously unknown and impactful information about a publicly traded company. Firms of the PR Underground type will explain that a one-day or longer review process is needed. The “reason” is that these firms have “rules of the road.” Is it possible that these distribution outlets are conforming to some vague guidelines imposed by Google? Crossing a blurry line means that the releases won’t be indexed. No indexing in Google means the agency failed and, of course, the information is effectively censored.

What about a company which publishes information on a consumer-type topic like automobiles? What if that Web site operator uses advertising from sources not linked to the Google combine? That Web site is indexed on a less frequent basis by the friendly Google crawler. Then those citations are suppressed for some unknown reason by a Google algorithm. (These are written by humans, but the Google never talks much about the capabilities of a person in a Google office laboring on core search to address issues.)

The ideas of the Knuckleheads’ Club are interesting. Implementing them, however, is going to require some momentum to overcome the Google habit which has become part of the online user’s DNA over the last 20 years.

The real questions remain. Is Google in control of the public Web? Are people fearful of irritating Mother Google (It’s not nice to anger Mother Google)?

Stephen E Arnold, March 29, 2021

How Is Your Web Traffic? Ah, What Web Traffic?

March 29, 2021

I read an interesting article called “In 2020, Two Thirds of Google Searches Ended Without a Click.” Where are those clicks? The clicks belong to the Google. And your Web traffic? Not so good unless one “pays to play”; that is, buy ads and, usually, get traffic.

The write up explains that data from one outfit which has failed and another called SimilarWeb suggest that more than 60 percent of searches do not end with a click to another Web property.

There is fancy search engine optimization jargon for this situation, but in simple terms, it means that once a query’s results are displayed, Google answers the question, the user looks at Google’s in-results “cards,” or the user clicks on a phone number. Hasta la vista Web site of yore.

One colorful way to explain this Google strategy of becoming the Web is “click cannibalization.” I like the term. Google is eating the clicks. The long term consequence is that Web sites disappear, and then the Google will die. But that is unlikely in the short term.

What is happening is that many Web sites no longer get traffic, flail away with search engine optimization craziness, and end up buying some ads. The challenge is for a small outfit to generate enough cash to buy sufficient ads to make it worth the GOOG’s while to direct some traffic to these tiny fish.

The SEO slant in the write up does not offer a solution. The reason? There isn’t one in my opinion. Googzilla likes cannibalization. Yum yum yum.

Stephen E Arnold, March 29, 2021

Amazon: Where Does It Get AI Technology?

March 26, 2021

I saw an interesting table from Global Data Financial Deals Database. What’s interesting is that Apple, Facebook, Google, and Microsoft were active purchasers of AI companies. I understand that “taking something off the table” is a sound business tactic. Even if the AI technology embodied in a takeover is wonky, a competitor cannot take advantage of the insights or the people in a particular firm.

I found the inclusion of Accenture in the table interesting. The line between “consulting” and “smart software” seems to be permeable. One wonders how other big dog consulting firms will address what appears to be their smart software gaps. I have long believed that blue chip consulting firms were 21st century publishing companies. The combination of renting smart people and providing smart technology to clients is the type of amalgamation which appears to meet certain needs of Fortune 1000 firms and major government entities and some non governmental organizations.

What jumped out at me as I looked at the data and scanned the comments about it was the absence or failure to include one key question:

What’s Amazon doing to get its artificial intelligence technology?

Buying AI technology to leverage it or take it away from competitors is one method. Has Amazon found another way? Which approach is “better” in terms of intellectual property and real world applications?

I will address this question and a couple of other equally obscure facts about what may be one of the smartest companies in the world in my lecture at the upcoming 2021 National Cyber Crime Conference.

Buying AI capabilities is, it seems, the go to method for some high profile outfits. But is it the only path to smart software? No, it is not.

Stephen E Arnold, March 26, 2021

Insights into Google AMP: A Glimpse Inside the Walled Garden

March 25, 2021

Google is talking privacy. Google is pushing accelerated Web pages. Google is doing what it can to stifle third party cookies. The datasphere will be a better, more tidy place, right?

Navigate to “Google AMP. A 70% Drop in Our Conversion Rate.” Notice these points:

  1. A technical adept tried to follow the Google rules and experienced a decline in conversion rates
  2. The procedure outlined in the write up will be a challenge for many online publishers to follow
  3. The write up does not explain why these positive initiatives of the Google have turned out to have a couple of negatives.

Google is moving access to content produced by certain third parties into its walled garden. The goal is to obtain control and extract maximum information. The silliness about relevance and consistency should be placed in the wooden shed in the corner of the walled garden behind the statue of Googzilla.

For many Facebook is the Internet. That Facebook content is what users generate, others want, and Facebook monetizes.

Google wants this set up too. Get amped up on that, gentle reader. Plus, with increased legal scrutiny, the mom and pop online ad company has to hustle along.

Stephen E Arnold, March 25, 2021

Google and Obfuscation: Who Cares?

March 24, 2021

Google began the process of obfuscating the source of Web pages years ago. There were services which would convert a weird Google version of a PDF’s url into something one could use in a footnote. I have one tool which performs this function now, but I am reluctant to identify it. Why? Google will kill it.

I noted the GitHub item “Addon Unavailable on Google Chrome” which explains that ClearURLs was blocked by Google seven hours ago. The short item explains:

ClearURLs has made it to its mission to prevent tracking via URLs and that’s how Google makes money.

Yep, this is a good statement.

However, let me add several observations:

  • Google, like Facebook, IS the Internet for many people. Both companies want to keep users within the sheep pen. The reasons include tracking, monetization, and, oh, did I mention, tracking?
  • Looking up search results adds computational cost; therefore, serving Google identified as relevant content from caches near a user makes economic sense. How does one know the “provenance” of a Google item? Well, it’s from Google, so it has to be good.
  • The walled garden has been part of the Google system and method for many years. I wrote about this years ago in the Google Legacy and expanded on some of the ideas in Google Version 2.0.

Net net: Thumbtypers, seniors in a haze, and most online “users” are blissfully unaware of the power of obfuscation. In the good old days, one had a url which provided the source domain and a mostly human readable string pointing to the page.

Hasta la vista. The corral is a clean, well lighted place for those who own the ranch. Who cares? Hey, visit a sheep ranch and let me know how many out of flock sheep there are. What happens to those sheep? Sheep dogs chase them back to the flock or wolves just eat the recalcitrant.

Why not think in terms of a digital delicacy?

Stephen E Arnold, March 24, 2021

Microsoft: Your Computer, Your Data. That Is a Good One

March 23, 2021

The online news stream is chock full of information about Microsoft’s swing-for-the-fences PR push for Discord. If you are not familiar with the service, I am not going to explain this conduit for those far more youthful than I. Like GitHub, Discord is going to be an interesting property if the Redmond crowd does the deal. If we anticipate Discord becoming part of the Xbox and Teams family, the alleged censorship of software posted to GitHub will be a glimpse of the content challenges in Microsoft’s future.

The more interesting development is the “real” news story “Microsoft Edge Could Soon Share Browsing Data with Windows 10.” The idea is that a person’s computer and the authorized users of the computing device will become one big, happy data family.

The article states:

Called share browsing data with other Windows features, it is designed to share data from Edge, such as Favorites or visited sites, with other Windows components. Search is a prime target, and highlighted by Microsoft at the time of writing. Basically, what this means is that users who run searches using the built-in search feature may get Edge results as well.

And what does Microsoft get? Possibilities include:

  • Federated, fine grained user behavior data
  • Click stream data matched to content on the user’s personal computer
  • Real-time information flows
  • Opportunities to share data with certain entities.

What happens to the user’s computer if said user does not accept such integration? The options range from loss of access to certain data to pro-active interaction to alter the functioning of the user’s computing device.

Why is this such a good idea? Microsoft, like Amazon, Facebook, and Google, realizes that the days of the Wild West are coming to an end. There are new sheriffs with new ideas about right and wrong.

Thus, get what one can while the gittin’ is good, as the old timers used to say.

But “What about security and privacy?” you ask? One response is, “That’s a good one.” Why not try stand up?

Stephen E Arnold, March 23, 2021
