A Famed Author Talks about Semantic Search
February 24, 2017
I read “An Interview with Semantic Search and SEO Expert David Amerland.” Darned fascinating. I enjoyed the content marketing aspect of the write up. I found the explanation of semantic search intriguing as well.
This is the famed author. Note the biceps and the wrist gizmos.
The background of the “famed author” is, according to the write up:
David Amerland, a chemical engineer turned semantic search and SEO expert, is a famed author, speaker and business journalist. He has been instrumental in helping startups as well as multinational brands like Microsoft, Johnson & Johnson, BOSCH, etc. create their SMM and SEO strategies. Davis writes for high-profile magazines and media organizations such as Forbes, Social Media Today, Imassera and journalism.co.uk. He is also part of the faculty in Rutgers University, and is a strategic advisor for Darebee.com.
Darebee.com is a workout site. Since I don’t work out, I was unaware of the site. You can explore it at Darebee.com. I think the name means that a person can “dare to be muscular” or “dare to be physically imposing.” I ran a query for Darebee.com on Gibiru, Mojeek, and Unbubble. I learned that the name “Darebee” does come up in the index. However, the pointers in Unbubble are interesting because the links identify other sites which are using the “darebee” string to get traffic. Here’s the Unbubble results screen for my query “darebee.”
What I found interesting is that the system administrator for Darebee.com is none other than David Amerland, whose email is listed in the Whois record as email@example.com. Darebee is apparently a part of Amerland Enterprises Ltd. in Hertfordshire, UK. The traffic graph for Darebee.com is listed by Alexa. It shows about 26,000 “visitors” per month, which is at variance with the monthly traffic figure of 3.2 million on W3Snoop.com.
When I see this type of search result, I wonder if the sites have been working overtime to spoof the relevance components of Web search and retrieval systems.
I noted these points in the interview, which appeared on the prestigious site Kamkash.com.
On relevance: Data makes zero sense if you can’t find what you want very quickly and then understand what you are looking for.
On semantic search’s definition: Semantic search essentially is trying to understand at a very nuanced level, and then it is trying to give us the best possible answer to our query at that nuanced level of our demands or our intent.
On Boolean search: Boolean search essentially looks at something probabilistically.
On Google’s RankBrain: [Google RankBrain] has nothing to do with ranking.
On participating in Google Plus: Google+ actually allows you to be pervasively enough very real in a very digital environment where we are synchronously connected with lot of people from all over the world and yet the connection feels very…very real in terms of that.
I find these statements interesting.
There is, however, the unfortunate factoid that Boolean is probabilistic. I am not sure I would agree with this statement. Boolean, as I understand it, matches strings. When strings are connected with a Boolean operator, the system displays only the results in which the two terms appear; for example, apples AND oranges. This is not how I understand probabilistic systems. But I am not an expert of the ilk of the wizard whose statements I have cited.
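To make the distinction concrete, here is a minimal sketch of Boolean AND retrieval as I understand it. The three tiny documents and the function names (build_index, boolean_and) are my own illustrations, not anything from the interview or from a real search system. Note that the result is a plain set: a document either contains both terms or it does not, and no probability or score is computed anywhere.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def boolean_and(index, *terms):
    """Return ids of documents containing every term. No scores, no ranking."""
    if not terms:
        return set()
    # index.get avoids creating empty entries for unseen terms.
    postings = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*postings)


# Hypothetical documents, made up for this illustration only.
DOCS = {
    1: "apples and oranges are fruit",
    2: "apples are red",
    3: "oranges are orange",
}

INDEX = build_index(DOCS)
RESULT = boolean_and(INDEX, "apples", "oranges")
print(RESULT)
```

A probabilistic system would instead assign each document an estimated likelihood of relevance and sort by it. The sketch above does neither, which is the point.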
The real meat of the write up is search engine optimization. The idea is that relevance can be fooled. A document which may not be relevant to a user’s query will appear if the SEO expert fiddles the system. The hapless user sees information not relevant to the query. The Unbubble search results illustrate this. I ran a query for “darebee” and I received hits to sites which are using the string “darebee” to appear in the results for the company Darebee. Is Darebee happy with this? Who knows? The person trying to contact Darebee may be frustrated, and Darebee loses a customer. Not what I would want if I owned Darebee.
The SEO angle is that a company should not focus on keywords. Well, that’s interesting because Google’s business model pivots on selling traffic when users put a specific keyword in their query. When I run a query, I typically use words. I am not sure if these are “keywords.” When I ran the query for Darebee, that was an entity, and the Unbubble system dutifully spit out links in which that string appeared. Desktop queries have grown from one or two keywords to almost three in the 25 years the Internet has been available to assorted individuals and experts. In fact, clicking on an old school Endeca “facet” is little more than another form of keyword search. The Endeca “facet” displays results to which the Endeca indexing system applied a category term just like the old fashioned ABI/INFORM classification codes from the 1980s.
SEO, it seems, boils down to “good content.” I am not sure what “good content” is. Perhaps this interview with the expert is one example?
What I found intriguing is that search is becoming “more personalized.” I noted this statement:
when it comes to content, the best – the top strategy is how do we create the personalization of a content that we actually deliver. And…and this may change everything. It may change, for example, long articles they used to try and cover everything in 4,000 words to actually being chunked and being…becoming more specific. So you actually get five pieces of content about 250 words each. But they are very very focused. You may change the format. Instead of going for, you know, perhaps text driven which is wordy to read, into a 3 minute video presentation which shows you something very quickly, shows it to you a lot better what to do and convinces a lot more because of the human contact and the interface which is very visual. It may require podcasts, which somebody can absorb while driving.
I long for the days when humans curated and abstracted high value information. Keyword indexing was then supplemented with human-assisted assignment of classification and category codes, entity type codes, and other types of metadata. I wish I could run a query based on my preparation for the research task.
What we have now is a Rutgers professor who obviously did not attend my lecture at Rutgers when I won some ASIS award for my contributions to information science. In that lecture, I pointed out that the flow of electronic information had the same effect as water pouring down a hillside. Erosion of thought, institutions, and details would take place.
We have in this write up an example of the erosion of the basic concepts of precision and recall. Objectivity is marginalized in order to deliver only what the user’s behaviors (overt or monitored) trigger the algorithms to display.
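For readers who have not bumped into these concepts, precision is the share of retrieved documents that are relevant, and recall is the share of relevant documents that were retrieved. A minimal sketch follows. The document labels and the relevance judgments are invented for illustration, and precision_recall is a hypothetical helper name, not part of any particular system.

```python
def precision_recall(retrieved, relevant):
    """Compute (precision, recall) for one query from two collections of ids."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents actually returned
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: the system returned four documents, three documents
# are truly relevant, and two of those three were among the four returned.
P, R = precision_recall(
    retrieved=["a", "b", "c", "d"],
    relevant=["a", "b", "e"],
)
print(f"precision={P:.2f} recall={R:.2f}")
```

When a system shows only what a user's monitored behavior triggers, the "relevant" set is quietly redefined for each person, and the two numbers stop measuring anything a searcher can compare across systems.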
Nifty stuff. In today’s world Boolean can be Bayesian I suppose. To find Darebee, I dare say that I have to use
Stephen E Arnold, February 24, 2017