Search Engines: Bias, Filters, and Selective Indexing

I read “It’s Not Just a Social Media Problem: How Search Engines Spread Misinformation.” The write up begins with a Venn diagram. My hunch is that quite a few people interested in search engines will struggle with the visual. Then there is the concept that typing in a search term returns results that are like loaded dice in a Manhattan craps game in Union Square.

The reasons, according to the write up, that search engines fall off the rails are:

  • Relevance feedback or the Google-borrowed CLEVER method from IBM Almaden’s patent
  • Fake stories which are picked up, indexed, and displayed as value infused

The write up points out that people cannot differentiate between accurate, useful, or “factual” results and crazy information.

Okay, here’s my partial list of why Web search engines return flawed results:

  1. Stop words. Control the stop words and you control the info people can find.
  2. Stored queries. Type what you want but get the results already bundled and ready to display.
  3. Selective spidering. The idea is that any index is a partial representation of the possible content. Instruct spiders to skip Web sites with information about peanut butter, and, bingo, no peanut butter information.
  4. Spidering depth. Is the bad stuff deep in a Web site? Just limit the crawl to fewer links.
  5. Spider within a span. Is a marginal Web site linking to sites with info you want killed? Don’t follow links off a domain.
  6. Delete the past. Who looks at historical info? A better question, “What advertiser will pay to appear on old content?” Kill the backfile. Web indexes are not archives no matter what thumbtypers believe.
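
Several of these levers are easy to picture in code. What follows is a minimal sketch, not a description of how any real search engine works: the tiny in-memory WEB, the function names crawl and build_index, and every parameter are hypothetical and invented for illustration. It shows how selective spidering, a depth limit, a stay-on-domain rule, and a stop word list each shrink what a user can ever find.

```python
from collections import deque
from urllib.parse import urlparse

# Hypothetical in-memory "Web": URL -> (page text, outbound links).
WEB = {
    "https://a.example/": ("welcome to the food site",
                           ["https://a.example/pb", "https://a.example/l1"]),
    "https://a.example/pb": ("peanut butter recipes", []),
    "https://a.example/l1": ("level one", ["https://a.example/l2"]),
    "https://a.example/l2": ("deep page about jam", ["https://b.example/"]),
    "https://b.example/": ("offsite report about jam", []),
}


def crawl(seed, max_depth=None, same_domain_only=False, blocked_terms=()):
    """Breadth-first crawl; returns the set of URLs that reach the index."""
    seed_host = urlparse(seed).netloc
    seen = {seed}
    queue = deque([(seed, 0)])
    indexed = set()
    while queue:
        url, depth = queue.popleft()
        text, links = WEB[url]
        # Selective spidering: drop any page that mentions a blocked term.
        if any(term in text for term in blocked_terms):
            continue
        indexed.add(url)
        # Spidering depth: do not follow links once the limit is reached.
        if max_depth is not None and depth >= max_depth:
            continue
        for link in links:
            if link in seen or link not in WEB:
                continue
            # Spider within a span: ignore links that leave the seed's domain.
            if same_domain_only and urlparse(link).netloc != seed_host:
                continue
            seen.add(link)
            queue.append((link, depth + 1))
    return indexed


def build_index(urls, stop_words=()):
    """Toy inverted index: word -> set of URLs. Stop words are never indexed."""
    index = {}
    for url in urls:
        for word in WEB[url][0].split():
            if word in stop_words:
                continue
            index.setdefault(word, set()).add(url)
    return index


if __name__ == "__main__":
    seed = "https://a.example/"
    print(sorted(crawl(seed)))
    print(sorted(crawl(seed, blocked_terms=("peanut butter",))))
    print(sorted(crawl(seed, max_depth=1)))
    print(sorted(crawl(seed, same_domain_only=True)))
    print("jam" in build_index(crawl(seed), stop_words={"jam"}))
```

In this toy setup the user's query never changes. Only the crawl and index settings do, and each setting quietly removes pages or terms before any ranking takes place.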

There are other methods available as well; for example, objectionable info can be placed in near line storage so that results from questionable sources display slowly enough to cause the curious user to click away.

To sum up, some discussions of Web search are not complete or accurate.

Stephen E Arnold, March 15, 2021


DarkCyber for June 9, 2020, Is Now Available: AI and Music Composition

The DarkCyber for June 9, 2020, presents a critical look at music generated by artificial intelligence. The focus is the award-winning song in the Eurovision AI 2020 competition. The interview discusses the characteristics of AI-generated music, its impact on music directors, how professional musicians deal with machine-created music, and the implications of non-human music. The program is a criticism of the state of the art for smart software. Instead of focusing on often over-hyped start ups and large companies making increasingly exaggerated claims, the Australian song and the two musicians make clear that AI is a work in progress. You can view the video at https://vimeo.com/427227666.

Kenny Toth, June 9, 2020

