Google Finds That Times Change: Privacy Redefined

October 21, 2016

I read “Google Has Quietly Dropped Ban on Personally Identifiable Web Tracking.” The main idea is that an individual can be mapped to just about anything in the Google-verse. The write up points out that in 2007, one of the chief Googlers said that privacy was a “number one priority when we [the Google] contemplate new kinds of advertising products.”

That was before Facebook saddled up with former Googlers (aka Xooglers) and started to ride the ad pony, detailed user information, and the interstellar beast of user generated content. Googlers knew that social was a big deal, probably more important than offering Boolean operators and time stamp metadata for users of its index. But that was then and this is now.

The write up reveals:

But this summer, Google quietly erased that last privacy line in the sand – literally crossing out the lines in its privacy policy that promised to keep the two pots of data separate by default. In its place, Google substituted new language that says browsing habits “may be” combined with what the company learns from the use of Gmail and other tools. The change is enabled by default for new Google accounts. Existing users were prompted to opt-in to the change this summer.

I must admit that when I saw the information, I ignored it. I don’t use too many Google services, and I am not one of the cats in the bag that Google is carrying to and fro. I am old (73), happy with my BlackBerry, and I don’t use mobile search. But the shift is an important part of the “new” Alphabet Google thing.

Tracking users 24×7 is the new black in Sillycon Valley. The yip yap about privacy, ethics, and making explicit what data are gathered is noise. Buy a new Pixel phone and live the dream, gentle reader.

You can work through the story cited above for more details. My thoughts went a slightly different direction:

  1. Facebook poses a significant challenge to Google, and today Google does not have a viable option to offer its users.
  2. The shift to mobile means that Google has to — note the phrase “has to” — find a way to juice up ad revenues. Sure, these are okay, but to keep the Loon balloons aloft more dough is needed.
  3. Higher value data boils down to detailed information about specific users, their cohorts, their affinity groups, and their behaviors. As the economy continues to struggle, the Alphabet Google thing will have data to buttress the Google ad sales professionals’ pitches to customers.
  4. Offering nifty data to nation states like China-type countries may allow Google to enter a new market with the Pixel and mobile search as Trojan horses.

In my monograph “Google Version 2.0: The Calculating Predator,” I described some of the technical underpinnings of Google’s acquisitions and inventors. With more data, the value of these innovations may begin to pay off. If the money does not flow, Google Version 3.0 may be a reprise of the agonies of the Yahooligans. Those Guha and Halevy “inventions” are fascinating in their scope and capabilities. Think about an email for which one can know who wrote it, who received it, who read it, who changed it, what the changes were, who the downstream recipients were, and other assorted informational gems.

Allow me to leave you with a single question:

Do you think the Alphabet Google thing was not collecting fine grained data prior to the official announcement?

Although the Google 2.0 monograph is out of print, I have a pre publication copy available as a PDF. If you want a copy, write my intrepid sales manager, Ben Kent, at benkent2020 at yahoo dot com. Yep, Yahoo. Inept as it may be, Yahoo is not the GOOG. The Facebook, however, remains the Facebook, and that’s one of Google’s irritants.

Stephen E Arnold, October 21, 2016

Google and the Mobile Traffic Matter

October 20, 2016

I read a couple of writes up about “Google May Be Stealing Your Mobile Traffic.” Quite surprisingly there was a response to these “stealing” articles by Google. You can read the explanation in a comment by Malte Ubl in the original article (link here).

I noted these comments in the response to the stealing article:

  • Mr. Ubl says, ““stealing traffic” is literally the opposite of what AMP is for.”
  • Mr. Ubl says, “there are audience measurement platforms that attribute traffic to publishers. They might in theory wrongly attribute AMP traffic to the AMP Cache (not Google) rather than to a publisher because they primarily use referrer information. That is why we worked with them in worldwide outreach to get this corrected (where it was a problem), so that traffic is correctly attributed to the publisher. If this is still a problem anywhere, AMP treats it as a highest priority to get it resolved.”
  • Mr. Ubl says, “AMP supports over 60 ad networks (2 of them are owned by Google) with 2-3 coming on board every week and makes absolutely no change to business terms whatsoever. There is no special revenue share for AMP.”
  • Mr. Ubl says, “The Android users might have already noticed that it is now scrolling out of the way and the same is coming soon for iOS (we’re just fighting a few jank issues in Safari).”

AMP is, therefore, not stealing traffic.
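The attribution problem Mr. Ubl describes can be shown with a rough sketch. The sketch assumes that an AMP Cache referrer looks like `https://cdn.ampproject.org/c/s/<publisher host>/<path>`. That URL shape, the function name `attribute_publisher`, and the example hosts are my assumptions for illustration. They do not come from the response or from any measurement vendor’s code. A tool that reads only the referrer host would credit the cache. Pulling the publisher host out of the path credits the publisher.

```python
from urllib.parse import urlparse

# Assumed cache domain; subdomains such as www-example-com.cdn.ampproject.org
# are treated the same way.
AMP_CACHE_DOMAIN = "cdn.ampproject.org"


def attribute_publisher(referrer: str) -> str:
    """Return the host that should be credited for a visit.

    Hypothetical helper: if the referrer points at the AMP Cache, credit the
    publisher host embedded in the path instead of the cache itself.
    """
    parsed = urlparse(referrer)
    host = parsed.hostname or ""
    is_cache = host == AMP_CACHE_DOMAIN or host.endswith("." + AMP_CACHE_DOMAIN)
    if not is_cache:
        # Ordinary referrer: the host is already the publisher.
        return host

    segments = [part for part in parsed.path.split("/") if part]
    # Assumed layout: optional "c" (content) marker, optional "s" (TLS) marker,
    # then the publisher host.
    if segments and segments[0] == "c":
        segments = segments[1:]
    if segments and segments[0] == "s":
        segments = segments[1:]
    # Fall back to the cache host if the path carries no publisher.
    return segments[0] if segments else host
```

A naive reading of `https://cdn.ampproject.org/c/s/www.example.com/story.html` credits `cdn.ampproject.org`. The helper credits `www.example.com`, which is the correction the response says was worked out with the measurement outfits.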

I went back to my 2007 monograph “Google Version 2.0: The Calculating Predator,” and pulled out this diagram from a decade ago:

[Diagram: goog container 2007]

The user interacts with the Google, not the Internet for certain types of content. The filtering is far from perfect, but it is an attempt to gain control over the who, what, why, when, and where of information access and delivery. © Stephen E Arnold, 2007, All rights reserved.

I offer this diagram as a way to summarize my understanding of the architecture which Google had spelled out in its patent documents and open source technical documents. (Yep, the GOOG did pay me a small amount of money, but that is supposed to be something you cannot know.) However, my studies of Google (The Google Legacy, Google Version 2.0: The Calculating Predator, and Google: The Digital Gutenberg) were written with open source content only.

Now back to the diagram. My research suggested that Google, like Facebook, envisioned that it would be the “Internet” for most people. In order to reduce latency and derive maximum efficiency from its global infrastructure, users would interact with Google via services like search. The content or information would be delivered from Google’s servers. In its simplest form, there is a Google cache which serves content. The company understood the cost of passing every query back to data centers, running each query, and then serving the content. Common sense said, “Hey, let’s store this stuff and knock out unnecessary queries.” In a more sophisticated form, the inventions of Ramanathan Guha and others illustrated a system and method for creating a sliced-and-diced archive of factoids. A user query for digital cameras would be handled by pulling factoids from a semantic database. (I am simplifying here.)
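The “store this stuff” idea in its simplest form can be sketched in a few lines. This is a toy under stated assumptions, not a description of Google’s plumbing. The class `QueryCache`, the stand-in `slow_backend` function, and the whitespace and case normalization are all hypothetical names and choices of mine.

```python
from typing import Callable, Dict


class QueryCache:
    """Toy cache: answer repeated queries from memory instead of the backend."""

    def __init__(self, backend: Callable[[str], str]) -> None:
        self._backend = backend            # the expensive data center round trip
        self._store: Dict[str, str] = {}   # normalized query -> stored result
        self.backend_calls = 0             # how many queries reached the backend

    def lookup(self, query: str) -> str:
        # Assumed normalization: fold case and collapse runs of whitespace so
        # trivially different spellings of a query share one cache entry.
        key = " ".join(query.lower().split())
        if key not in self._store:
            self.backend_calls += 1
            self._store[key] = self._backend(key)
        return self._store[key]


def slow_backend(query: str) -> str:
    """Hypothetical stand-in for running the query against the full index."""
    return f"results for {query}"
```

Two lookups for “Digital Cameras” and “digital   cameras” reach `slow_backend` once. The second user never leaves the cache, which is the efficiency point and the control point in one move.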

In one of my unpublished presentations, I show a mobile phone user interacting with Google’s caches in order to eliminate the need to send the user to the source of the factoid.

Perhaps I misunderstood the technical information my researchers and I analyzed.

I don’t think Google is doing anything different today. The “stealing” idea comes from a person who finally takes a look at how the Google systems maximize efficiency and control the users. In order to sell ads, Google has to know who does what, when, where, and under what circumstances.

Today’s Google is now a legacy system. I know this is heretical, but Google is not a search company. The firm is using its legacy platform to deliver revenue and maximize that revenue. Facebook (which has lots of Xooglers running around) is doing essentially the same thing but with plumbing variations.

I am probably wildly out of step with youthful Googlers and the zippy mobile AMPers. But from my vantage point, Google has been delivering a closed garden solution for a long time.

My Google trilogy is now out of print. I can provide a fair copy with some production glitches for $250. If you are interested, write my intrepid marketer, Benny Kent at

Stephen E Arnold, October 20, 2016

McPaper Broils Marissa Mayer: The Yahoo Saga Continues

October 19, 2016

I read “Marissa Mayer’s Diminishing Legacy at Yahoo.” My first reaction was, “What legacy?” I know that Yahoo, like Hewlett Packard, will become fodder for business school case studies. But legacy? The write up surprised me too. The write up includes some juicy quotes from “experts” about the firm; for example:

The most recent revelations (of spying) “are just kind of piling on,” says Rita McGrath, professor of management at Columbia Business School, who, like other management experts, concede Mayer’s failure to turn around Yahoo will shadow her. “I don’t think it’s like she was a goddess and now these revelations have destroyed her. It’s almost along the lines of, ‘We almost expected that.'”

Okay, a Xoogler fails. But “We almost expected that.” I knew Yahoo was struggling when the outfit hired a person with a questionable academic past. The Yahooligans have, in fact, had management issues for years. Anyone remember Terry Semel, who wanted to make Yahoo into a “media company”? I still don’t know what a “media company” means.


The write up states:

Nearly 50 members of Congress on Friday asked the Obama administration for more information “as soon as possible” on Yahoo’s cooperation with the government. Yahoo, in turn, has called itself a “law-abiding company.”

In today’s fractious political environment, getting 50 politicos to agree on anything suggests that the Yahoo thing is a big deal.

I found this statement fascinating because it assumes [a] that the Verizon deal will actually take place and [b] that Ms. Mayer is performing in an above average manner, which does not match up with my analysis. Anyway, here’s the expert’s sunny statement:

“She’ll be remembered as the CEO who sold Yahoo to Verizon,” says Greg Sterling, a contributing editor at Search Engine Land, a site that covers the search industry. He gives Mayer a “B” for her stint at Yahoo. “Her legacy will be judged, in part, on what Verizon does with Yahoo.”

I love the “B.”

A good turn of phrase is “suicide mission.” The idea that no manager could survive Yahoo is one that probably resonates with some Yahooligans. For me, I think of the company as YaHOOT: More of a comedy of craziness than an outfit ready for the 21st century.

The legacy notion caps the write up. The point, it seems to me, is that USA Today is happy with Ms. Mayer because she is a female CEO in the often testosterone fueled Sillycon Valley scene. I highlighted the following statement in apologetic purple:

Elizabeth Ames, senior vice president of alliances, marketing and programs at the Anita Borg Institute: “With so few women in these high-profile positions, it is a test case — and that’s a pretty big burden for anyone. And it holds true for minorities in the same situation.” Mayer also brought buzz, appeal and interest to Yahoo after two of her predecessors — Carol Bartz and Scott Thompson — damaged the company’s brand, according to Search Engine’s Sterling. “The burden of expectations was too great,” he says. “She herself couldn’t revive that company. She did as well as anybody can, but she couldn’t get the rock all the way up the hill.”

About that “B”: Were the acquisitions given a pass?

Stephen E Arnold, October 19, 2016

Artificial Intelligence: Time to Surf, Folks

October 17, 2016

I read a remarkable article in Fortune Magazine: “Google Artificial Intelligence Guru Says AI Won’t Kill Jobs.” I had a Dilbert moment mixed with a glimpse of bizarro world.

The main point of the write up is that smart software is the next big thing. Unlike other big things such as outsourcing work from the US to other countries with lower cost labor, work will not be “killed.” Strong word.

I highlighted this statement from the prognosticating write up:

humanity is still “many decades away from encountering that sort of labor replacement at scale.” Instead, the technology is best used to help humans with work-related tasks rather than replace them outright.

Sounds great. Zooming to the subject of Google, the write up reported:

Google has “developed techniques to safely deploy these systems in a controllable way,” countering fears that A.I. systems are left to run on their own accord.

I assume that’s the reason a consortium of folks are going to gather together to figure out how to make artificial intelligence work just right.

I spoke with a person who drives a truck for a living. He was interested in robot driven trucks. He said, “There won’t be much demand for guys like me, right?”

I reassured him. The truth is that “guys like him” are definitely going to lose their jobs. The same full time equivalent compression will operate in law firms, health care delivery, and dozens of other areas where labor is one of the biggest expenses, if not the biggest. Leasing a system able to work without taking vacations, calling in sick, or demanding a pension will be embraced. Cost control, not work for humans, is the driving factor.

Online may benefit. Think of those folks who lose their jobs and the free time they have. These people will be able to surf the Web, talk to Alexa, and binge watch.

Informationization (a word I first heard in the early 1990s at a conference in Japan) means disruption. Work processes will change. There will be more online consumers. I am not sure what these folks will do for a living.

Unlike the individuals who work in certain types of companies, the guys like the trucker, the legal researcher, the librarian, etc. are going to have plenty of time to be social on Facebook.

Fortune Magazine seems to buy into the baloney that “A.I. will help humans with their jobs, not replace them.” How’s that working out in traditional publishing?

Stephen E Arnold, October 17, 2016

Google: Fragmentation and the False Universal Search

October 14, 2016

I read “Within Months, Google to Divide Its Index, Giving Mobile Users Better & Fresher Content.” Let’s agree to assume that this write up is spot on. I learned that Google plans “on releasing a separate mobile search index, which will become the primary one.”

The write up states:

The most substantial change will likely be that by having a mobile index, Google can run its ranking algorithm in a different fashion across “pure” mobile content rather than the current system that extracts data from desktop content to determine mobile rankings.

The news was not really news here in Harrod’s Creek. Since 2007, the utility of Google’s search system has been in decline for the type of queries the Beyond Search goslings and I typically run. On rare occasion we need to locate a pizza joint, but the bulk of our queries require old fashioned relevance ranking with results demonstrating high precision and on point recall.


Time may be running out for Google Web search.

Several observations:

  1. With the volume of queries from mobile surpassing desktop queries, why would Google spend money to maintain two indexes? Perhaps Google will have a way to offer advertisers messaging targeted to mobile users and then sell ads for the old school desktop users? If the ad revenue does not justify the second index, well, why would an MBA continue to invest in desktop search? Kill it, right?
  2. What happens to the lucky Web sites which did not embrace AMP and other Google suggestions? My hunch is that traffic will drop and probably be difficult to regain. Sure, an advertiser can buy ads targeted at desktop users, but Google does not put much wood behind that which becomes a hassle, an annoyance, or a drag on the zippy outfit’s aspirations.
  3. What will the search engine optimization crowd do? Most of the experts will become instant and overnight experts in mobile search. There will be a windfall of business from Web sites addressed to business customers and others who use mobile but need an old fashioned boat anchor computing device. Then what? Answer: An opportunity to reinvent themselves. Data scientist seems like a natural fit for dispossessed SEO poobahs.

If the report is not accurate, so what? Here’s an idea. Relevance will continue to be eroded as Google tries to deal with the outflow of ad dollars to social outfits pushing grandchildren lovers and the folks who take snaps of everything.

The likelihood of a separate mobile index is high. Remember universal search? I do. Did it arrive? No. If I wanted news, I had to search Google News. Same separate index for Scholar, Maps, and other Google content. The promise of universal search was PR fluff.

Fragmentation is the name of the game in the world of Alphabet Google. And fragmented services have to earn their keep or get terminated with extreme prejudice. Just like Panoramio (I know. You are asking, “What’s Panoramio?”), Google Web search could very well be on the digital glide way to the great beyond.

Stephen E Arnold, October 14, 2016

Alphabet Google: From Search to Stuff in Market Nooks

October 12, 2016

I worked through some of the write ups about the Google tangible products. One excellent point appeared in a British tabloid in “Why Google May Have a Big Problem with Its New Pixel Phone.” The Pixel phone is expensive.

Here in Harrod’s Creek, we noticed that the blue model emulates the colors of the University of Kentucky.

Is Alphabet Google having a Barnes & Noble Nook moment? Building gizmos to take on market leaders seems so easy. How did that Nook thing work out for the bookstore company? Right. Bookstores are chugging along. The Nook? Hmmm.

None of the articles we scanned pointed out that Alphabet Google is trying to become something other than search. Sure, a Pixel phone helps generate search traffic, but we think there are other forces at work; for example:

  • Apple envy. Those margins on high end software are as tasty as a hot apple pie.
  • Control. The GOOG has been struggling to maintain its “ah, shucks” approach to lock in with digital services. Maybe gizmos are a way to achieve control without so much oversight from regulators.
  • A lack of ideas. The Alphabet Google thing is not doing too well in the Facebook space. Amazon invented a new hardware category so the kids in Mountain View can do a me too, not a “Eureka.” Ah, imitation.

As the dust settles from the Google stuff blast off, one of the goslings asked, “Do you think Alphabet Google is falling into the Barnes & Noble Nook pitfall?”

Great question. Shifting from search advertising to making products is a bit of a change. Leopards can, I suppose, change their spots. Easy in Photoshop. Tough in the real world.

Stephen E Arnold, October 12, 2016

More about Good Enough Search

October 10, 2016

I have concluded that finding information is entering a mini Dark Ages. The evidence I have gathered suggests that young folks will speak to their mobile devices to get pizza and information for their PhD research projects. I have a folder of examples of applications of smart software which produces remarkable marketing assertions and black box outputs.

I have added to my collection the write up “Postgres Full Text Search Is Good Enough.” I learned that “good enough” search has these features:

  • Stemming
  • Ranking / Boost Support
  • Multiple languages
  • Fuzzy search for misspelling
  • Accent support

I assume, which is risky, that keywords are part of the basic feature set. But in a world of “good enough”, who knows?

The write up provides code snippets for and details regarding the implementation of Postgres’ search function. The explanation of Postgres’ internal methods may require that you keep some Postgres manuals handy and have a browser pointed at Bing or Google to chase down some of the jargon; for instance:

A tsvector value is a sorted list of distinct lexemes which are words that have been normalized to make different variants of the same word look alike. For example, normalization almost always includes folding upper-case letters to lower-case and often involves removal of suffixes (such as ‘s’, ‘es’ or ‘ing’ in English). This allows searches to find variant forms of the same word without tediously entering all the possible variants.
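The lexeme idea in the quoted passage is easier to see in a toy than in a manual. The sketch below is my illustration only. It is not how the database’s text search dictionaries work internally. The suffix list comes from the examples in the quote, and the names `normalize`, `to_lexemes`, and the three letter minimum stem length are hypothetical choices of mine.

```python
import re
from typing import List

# Suffixes named in the quoted passage, longest first so "ing" and "es"
# are tried before the bare "s".
SUFFIXES = ("ing", "es", "s")

# Assumed guard so short words such as "is" or "sing" are not gutted.
MIN_STEM_LENGTH = 3


def normalize(word: str) -> str:
    """Fold case and strip at most one known suffix."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LENGTH:
            return word[: -len(suffix)]
    return word


def to_lexemes(text: str) -> List[str]:
    """Return the sorted list of distinct normalized words in the text."""
    words = re.findall(r"[A-Za-z]+", text)
    return sorted({normalize(word) for word in words})
```

Feed it “Searching searches Search” and the three variants collapse to the single entry `search`. That is the behavior the quote describes: one stored form, many matching queries. A real stemmer handles far more than three suffixes, so treat this as “good enough” for explanation and nothing more.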

The section on optimization and indexation provides some useful guidelines. Trouble may result from mismatching one’s data with the types of indices Postgres offers.

If you are using Postgres and interested in “good enough” search, you will find the write up helpful. If you are an entrepreneur and want to tap into an underserved market for a graphical administrative interface for Postgres “good enough” search, you will find that the write up provides a checklist for you to follow.

For me in rural Kentucky, I marvel at the happy acceptance of “good enough” search. Once “good enough” takes hold, where does one find the impetus to deliver outstanding search? Do I look to dtSearch? IBM OmniFind (aka, Watson)? A whizzy cloud service like Amazon’s?

I suppose I can ask Siri or Cortana. Good enough search and good enough answers. Except when the answers are off point or just incorrect. A “C” is good enough for today’s business and technical approaches.

Stephen E Arnold, October 10, 2016

Autonomy Founder Lynch Snags Bloomberg Profile

October 7, 2016

I read “Former Autonomy CEO Mike Lynch Is Running a Hands-on Fund While He Battles HP in Court.” I have a modest file of open source information about the Hewlett Packard purchase of Autonomy, the company which sparked a shift in search and information access. Years ago, I even met some Autonomy professionals and performed a small task for which I was compensated in a modest way. Autonomy and its Intelligent Data Operating Layer / Dynamic Reasoning Engine remain important milestones. Autonomy was one of the first firms to apply fancy math to thorny problems in making sense of unstructured information. Today, Dr. Lynch’s approach informs many of the smart software companies capturing headlines about everything from online translation to predicting which team will win a football game.

Bloomberg Businessweek’s write up focuses on Dr. Michael Lynch’s life after his split from HP. That Sillycon Valley icon bought Autonomy and seems to have sold the property on again. American real estate television programs call this process a “flip.” The idea is to buy low and sell high. HP’s approach to a “flip” is a bit different. HP bought high and seems to have sold low. That’s why I use the term “Sillycon Valley” to describe the executive methods associated with San Francisco and environs.

The write up tells me:

His $1 billion Europe-focused firm, Invoke Capital, looks like a cross between the Carlyle Group and Y Combinator.

His business approach is described this way:

“We want an unfair advantage. The internal saying is, always take a gun to a knife fight,” Lynch says. As for Invoke’s pitch to startups: “There’s a little bit of an element of the Spice Girls. We can bring people together.”

The article focuses on Luminance, a company with smart software for legal eagles. The company I find most interesting is Dr. Lynch’s Darktrace.

HP Enterprise is suing Dr. Lynch. The legal shoot out will take place in 2017. The write up correctly points out that Dr. Lynch may show up in court to explain that he generated more revenue, more value, and more saleable software than HP did with Autonomy’s technology.

The Bloomberg report does not make these points:

  • Dr. Lynch’s products tap his knowledge of numerical recipes. When applied in an informed manner, these methods are useful and have applicability to a number of business problems. HP bought Autonomy and seems to have lacked the expertise or motivation or business acumen to output big money.
  • HP decided to buy Autonomy and then suffered buyer’s remorse in the midst of management turmoil and efforts to prevent Hewlett Packard from becoming an also-ran in the Sillycon Valley growth race.
  • HP wants to prove Dr. Lynch fooled them even though HP’s professionals and consultants performed due diligence prior to buying Autonomy. Perhaps HP should look at its systems and methods?

Worth monitoring even if an observer does not understand Bayesian, Laplacian, Monte Carlo, and Markovian methods. My hunch is that HP’s lawyers may find these mathematical methods helpful in determining the odds for certain events. When one feeds data about Dr. Lynch’s current hot start ups into the model, the predictive output suggests more revenue and innovation for Dr. Lynch. HP, on the other hand, may receive a different set of probabilities.

If you are not familiar with the Autonomy information access system, I have made available a free report about Autonomy. You may access it at my archive.

Stephen E Arnold, October 7, 2016

Five Years in Enterprise Search: 2011 to 2016

October 4, 2016

Before I shifted from worker bee to Kentucky dirt farmer, I attended a presentation in which a wizard from Findwise explained enterprise search in 2011. In my notes, I jotted down the companies the maven mentioned (love that alliteration) in his remarks:

  • Attivio
  • Autonomy
  • Coveo
  • Endeca
  • Exalead
  • Fabasoft
  • Google
  • IBM
  • ISYS Search
  • Microsoft
  • Sinequa
  • Vivisimo

There were nodding heads as the guru listed the key functions of enterprise search systems in 2011. My notes contained these items:

  • Federation model
  • Indexing and connectivity
  • Interface flexibility
  • Management and analysis
  • Mobile support
  • Platform readiness
  • Relevance model
  • Security
  • Semantics and text analytics
  • Social and collaborative features

I recall that I was confused about the source of the information in the analysis. Then the murky family tree seemed important. Five years later, I am less interested in who sired what child than the interesting historical nuggets in this simple list and collection of pretty fuzzy and downright crazy characteristics of search. I am not too sure what “analysis” and “analytics” mean. The notion that an index is required is okay, but the blending of indexing and “connectivity” seems a wonky way of referencing file filters or a network connection. With the Harvard Business Review pointing out that collaboration is a bit of a problem, it is an interesting footnote to acknowledge that a buzzword can grow into a time sink.


There are some notable omissions; for example, open source search options do not appear in the list. That’s interesting because Attivio was, I heard at that time, poking its toe into open source search. IBM was a fan of Lucene five years ago. Today the IBM marketing machine beats the Watson drum, but inside the Big Blue system resides that free and open source Lucene. I assume that the gurus and the mavens working on this list ignored open source because what consulting revenue results from free stuff? What happened to Oracle? In 2011, Oracle still believed in Secure Enterprise Search only to recant with purchases of Endeca, InQuira, and RightNow. There are other glitches in the list, but let’s move on.

Microsoft and Both Hewlett Packards Are Chums

September 20, 2016

I read “Microsoft Beats Out Rivals for HP Software Deal.” The write up does not answer the following questions:

  1. Did Microsoft or HP’s public relations advisers bring this story to Fortune Magazine?
  2. How much will HP save by using Microsoft’s sales management and database software instead of Oracle’s and Salesforce’s software?
  3. How much will the transition from the Oracle and Salesforce systems to the Microsoft system cost?
  4. Why couldn’t HP use its hardware with the Oracle and Salesforce systems?
  5. Why did HP choose a proprietary solution when there are satisfactory open source options available?
  6. Whose back was injured after the frenzy of scratching ended?

What the write up reveals is that Oracle and Salesforce lost a big customer. I also highlighted this passage:

This deal adds another dimension to HP-Microsoft partnerships. HP is a huge and longtime hardware partner—its PCs ship with Microsoft Windows and often with its Office applications as well. There is significant overlap between the two companies’ reseller partners. And since most Microsoft partners run Dynamics CRM already, HP’s use of the product could simplify collaboration and data exchange. HP claims about 100,000 partners worldwide.

I will not comment about the “claims” about partners. Let’s see. HPQ buys hardware from HPE. Microsoft is a partner for HPQ and HPE. Looks like a friendly group. Add one person and the companies have a golf foursome. Will Google get asked to join the group? We know Oracle and Salesforce won’t.

Stephen E Arnold, September 20, 2016
