BERT: It Lives

November 2, 2020

I wrote about good old BERT before.

I was impressed with the indexing and context cues in BERT. The acronym does not refer to the interesting Sesame Street puppet. This BERT is Bidirectional Encoder Representations from Transformers. If you want more information about this approach to making sense of text, just navigate to the somewhat turtle-like Stanford University site and retrieve the 35-page PDF.
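
For the curious, here is a minimal sketch of what "bidirectional encoder representations" looks like in practice. This is my illustration, not anything from the paper or the article; it assumes the open source Hugging Face transformers library and the public bert-base-uncased model:

```python
# A minimal sketch, assuming the Hugging Face "transformers" library and
# the public bert-base-uncased checkpoint; neither appears in the article.
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

# "bank" gets a different vector in each sentence because BERT reads the
# full context on both sides of the token (the "bidirectional" part).
for sentence in ["She sat on the river bank.", "The bank raised its fees."]:
    inputs = tokenizer(sentence, return_tensors="pt")
    embeddings = model(**inputs).last_hidden_state  # one vector per token
    print(sentence, embeddings.shape)
```

The same word gets a different representation depending on what surrounds it, which is the trick behind the context cues.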

BERT popped up again in a somewhat unusual search engine optimization (SEO) context, one obviously recognized by Google’s system at least seven percent of the time: “Could Google Passage Indexing Be Leveraging BERT?”

I worked through the write-up twice. It was, one might say, somewhat challenging to understand. I think I figured it out:

Google is trying to index the context in which an “answer” to a user’s query resides. Via Google parsing magic, the answer may be presented to the lucky user.
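
If I have decoded the write-up correctly, the mechanics amount to scoring chunks of a page against the query instead of scoring the page as a whole. Here is a toy sketch. It is mine, not Google’s, and score_passage is a deliberately simple lexical-overlap stand-in for whatever BERT-style relevance model the real system presumably uses:

```python
# A toy illustration of passage-level retrieval, not Google's pipeline.
# score_passage() is a hypothetical stand-in: a crude lexical-overlap
# scorer where a production system would use a BERT-style relevance model.
def score_passage(query: str, passage: str) -> float:
    q_terms = set(query.lower().split())
    p_terms = set(passage.lower().split())
    return len(q_terms & p_terms) / len(q_terms) if q_terms else 0.0

def best_passage(query: str, document: str, window: int = 2) -> str:
    # Split the page into overlapping sentence windows ("passages") and
    # return the one that best matches the query.
    sentences = [s.strip() for s in document.split(".") if s.strip()]
    passages = [". ".join(sentences[i:i + window])
                for i in range(max(1, len(sentences) - window + 1))]
    return max(passages, key=lambda p: score_passage(query, p))

doc = ("Our shop sells vintage radios. Repairs take two weeks. "
       "Shipping to Canada costs twenty dollars. We close on Sundays.")
print(best_passage("how much is shipping to canada", doc))
```

The upshot: a page can surface because one passage matches the query, even if the rest of the page is about something else entirely.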

I pulled out several gems from the article, which is designed to be converted into manipulations to fool Google’s indexing system. SEO focuses on eroding relevance so that a page appears in a Google search results list whether or not its content answers the user’s query.

The gems:

  • BERT does not always mean the ‘BERT’. Ah, ha. A paradox. That’s helpful.
  • Former Verity and Yahoo search wizard Prabhakar Raghavan allegedly said: “Today we’re excited to share that BERT is now used in almost every query in English, helping you get higher quality results for your questions.” And what percentage of Google queries is “almost every”? And what percentage of Google queries are in English? Neither the Googler nor the author of the article answers these questions.
  • It’s called passage indexing, but not as we know it. The “passage indexing” announcement caused some confusion in the SEO community with several interpreting the change initially as an “indexing” one. Confusion. No kidding?
  • And how about this statement about “almost every”? “Whilst only 7% of queries will be impacted in initial roll-out, further expansion of this new passage indexing system could have much bigger connotations than one might first suspect. Without exaggeration, once you begin to explore the literature from the past year in natural language research, you become aware this change, whilst relatively insignificant at first (because it will only impact 7% of queries after all), could have potential to actually change how search ranking works overall going forward.”

That’s about it because the contradictions and fascinating logic of the article have stressed my 76-year-old brain’s remaining neurons. The write-up concludes with this statement:

Whilst there are currently limitations for BERT in long documents, passages seem an ideal place to start toward a new ‘intent-detection’ led search. This is particularly so, when search engines begin to ‘Augment Knowledge’ from queries and connections to knowledge bases and repositories outside of standard search, and there is much work in this space ongoing currently. But that is for another article.

Plus, there’s a list of references. Oh, did I mention that this essay/article in its baffling wonderfulness is only 15,000 words long? Another article? Super.

Stephen E Arnold, November 2, 2020
