PageRank Revealed with a Superficial Glance

December 24, 2016

I read “19 Confirmed Google Ranking Factors.” The table below comes from my 2004 monograph The Google Legacy. You will be able to view a seven minute summary on December 20, 2016. The table in The Google Legacy consists of more than 100 factors used in the Google relevance system. Each of the PageRank elements was extracted from open source information; for example, journal articles, Google technical papers which were once easily available from Google, patents, various speeches, and blog posts. We estimated that the factors are tuned and modified to deal with hacks, tricks, and new developments. Here is an extract from the tables in The Google Legacy:


Imagine my surprise when I worked through the 19 factors in the article “19 Confirmed Google Ranking Factors.” My research suggested that by 2004, Google had layered on and inserted many factors which the company did not document. These adjustments have continued since 2004 when production of The Google Legacy began and changes could no longer be made to the book text.

The idea that one can influence PageRank by paying attention to a handful of content, format, update, and technical requirements is interesting for two reasons:

  1. It continues the simplification of the way people think about Google and its methods.
  2. Google faces “inertia” when it comes to making changes in its core relevance methods; that is, it is easier for Google to “wrap” the core with new features than it is to make certain changes. That’s the reason there is an index for mobile search and an index for desktop search.

Here’s an example of the current thinking about Google’s relevance ranking methods from the article cited in this blog post: Links. Yep, PageRank relies on links. Think about IBM Almaden Clever and you get a good idea how this works. What Google has added are methods which pay attention to less crude signals. Google also pays attention to signals which “deduct” or “down check” a page or site. Transgressions include duplicate content and crude tricks to fool Google’s algorithms; for example, you click on a link that says “White House” and you see porn. This issue and thousands of others have been “fixed” by Google engineers. My 2004 listing of 100 factors is a fraction of the elements the Google relevance systems process.
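For readers who want to see what "relies on links" means in practice, here is a minimal sketch of the textbook link-based calculation as described in the early published PageRank papers. It is not Google's production system, which layers many undocumented factors on top. The function name `pagerank`, the `damping` value of 0.85, the iteration count, and the four-page `toy_web` are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Textbook power-iteration PageRank (illustrative sketch only).

    links maps each page name to the list of pages it links to.
    damping and iterations are assumed defaults, not Google's settings.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    # Start with the score spread evenly across all pages.
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped score among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outbound links spreads its score evenly.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical four-page web: a, b, and d all link to c; c links back to a.
toy_web = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(toy_web)
for page, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(page, round(score, 4))
```

In this toy web, page c collects the most inbound links and ends up with the top score, and page a benefits from being the only page c links to. The sketch contains none of the "deduct" signals or the other factors discussed in this post, which is the point: links alone are the crude starting place, not the whole system.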

Another example of relevance simplification appears in “10 Google Search Ranking Factors Content Marketers Should Prioritize (And 3 You Shouldn’t).” Yep, almost 20 years of relevance tweaks boil down to a dozen rules. Hey, if these worked, why isn’t everyone in the SEO game generating oodles of traffic? Answer: The Google system is a bit more slippery and requires methods with more rubber studs on the SEO gym shoe.

The problem with boiling down Google’s method to a handful of checkpoints is that the simplification can impart false confidence. Do this and the traffic does NOT materialize. What happened? The answer is that a misstep has been introduced while doing the “obvious” search engine optimization tweak. To give one example, consider making changes to one’s site. Google notes frequency and type of changes to a Web site. How about those frequent and radical redesigns? How does Mother Google interpret that information?

Manipulating relevance in order to boost a site’s ranking in a results list can have some interesting consequences. Over the years, I have stated repeatedly that if a webmaster wants traffic, buy AdWords. The other path is to concentrate on producing content which other people want to read. Shortcuts and tricks can lead to some fascinating consequences and, of course, work for the so-called search engine optimization experts.

Matt, Matt, where are you now? Oh, that’s right…

Stephen E Arnold, December 24, 2016


