Search and Retrieval: A Sub Sub Assembly

What’s happening with search and retrieval? Google’s results irritate some; others are happy with Google’s shaping of information. Web competitors exist; for example, Kagi.com and Neeva.com. Both are subscription services. Others provide search results “for free”; examples include Swisscows.com and Yandex.com. You can find metasearch systems (minimal original spidering, just recycling results from other services like Bing.com); for instance, StartPage.com (formerly Ixquick.com) and DuckDuckGo.com. Then there are open source search options. The flagship or flagships are Solr and Lucene. Proprietary systems exist too. These include the ageing X1.com and the even age-ier Coveo system. Remnants of long-gone systems are kicking around too; to wit, BRS and Fulcrum from OpenText; Fast Search, now a Microsoft property; and Endeca, owned by Oracle. But let’s look at search as it appears to a younger person today.


A decayed foundation created via smart software on the Mage.space system. A flawed search and retrieval system can make the structure built on the foundation crumble like Southwest Airlines’ reservation system.

First, the primary means of access is via a mobile device. Surprisingly, the source of information for many is video content delivered by the China-linked TikTok or the advertising remora YouTube.com. In some parts of the world, the go-to information system is Telegram, developed by Russian brothers. This is a centralized service, not a New Wave Web 3 confection. One can use the service and obtain information via a query or a group. If one is “special,” an invitation to a private group allows access to individuals providing information about open source intelligence methods or the Russian special operation, including allegedly accurate video snips of real-life war or disinformation.

The challenge is that search is everywhere. Yet in the real world, finding certain types of information is extremely difficult. Obtaining that information may be impossible without informed contacts, programming expertise, or money to pay what would have been called “special librarian research professionals” in the 1980s. (Today, it seems, everyone is a search expert.)

Here are examples of the types of information which are difficult if not impossible to obtain:

  • The ownership of a domain
  • The ownership of a Tor-accessible domain
  • The date at which a content object was created, the date the content object was indexed, and the date or dates referenced in the content object
  • Certain government documents; for example, unsealed court documents, US government contracts for third-party enforcement services, authorship information for a specific Congressional bill draft, etc.
  • A copy of a presentation made by a corporate executive at a public conference

I can provide other examples, but I wanted to highlight the flaws in today’s findability.
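The third item in the list above is easier to see in code. What follows is a minimal Python sketch, not the method of any system named in this post. The `ContentObject` record, the `index_document` helper, and the decision to recognize only ISO-style dates (YYYY-MM-DD) are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

# Assumption: only ISO-style dates (YYYY-MM-DD) are recognized in the text.
ISO_DATE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")


@dataclass
class ContentObject:
    """Hypothetical index record that keeps three kinds of dates apart."""
    text: str
    created: Optional[date] = None   # when the content was produced, if known
    indexed: Optional[date] = None   # when the search system processed it
    referenced: List[date] = field(default_factory=list)  # dates named in the text


def extract_referenced_dates(text: str) -> List[date]:
    """Return the valid ISO dates mentioned in the text, in order of appearance."""
    found = []
    for year, month, day in ISO_DATE.findall(text):
        try:
            found.append(date(int(year), int(month), int(day)))
        except ValueError:
            # Skip strings that look like dates but are not (e.g. 2023-02-30).
            continue
    return found


def index_document(text: str,
                   created: Optional[date] = None,
                   indexed_on: Optional[date] = None) -> ContentObject:
    """Build a record; the creation date stays None unless the caller supplies it."""
    return ContentObject(
        text=text,
        created=created,
        indexed=indexed_on or date.today(),
        referenced=extract_referenced_dates(text),
    )
```

Calling `index_document("Filed 2022-03-29.", indexed_on=date(2023, 3, 22))` yields a record whose creation date is unknown, whose index date is March 22, 2023, and whose one referenced date is March 29, 2022. A system that stores a single timestamp leaves the searcher guessing which of the three it means.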



DarkCyber, March 29, 2022: An Interview with Chris Westphal, DataWalk

Chris Westphal is the Chief Analytics Officer of DataWalk, a firm providing an investigative and analysis tool to commercial and government organizations. The 12-minute interview covers DataWalk’s unique capabilities, its data and information resources, and the firm’s workflow functionality. The video is available on YouTube.

Stephen E Arnold, March 29, 2022

Latest News

TikTok: What Does the Software Do?

A day or two ago, information reached me in rural Kentucky about Google’s Project Zero cyber team. I think the main idea is that Google’s own mobiles, Samsung’s,...

March 22, 2023

Google and Its High School Management: An HR Example

I read “Google Won’t Honor Medical Leave During Its Layoffs, Outraging Employees.” Interesting explanation of some of Google’s management methods. These...

March 22, 2023

Stanford: Llama Hallucinating at the Dollar Store

Editor’s Note: This essay is the work of a real, and still alive, dinobaby. No smart software involved with the exception of the addled llama. What happens when...

March 21, 2023

Negative News Gets Attention: Who Knew? Err. Everyone in TV News

I love academic studies. I have a friend who worked in television news in New York before he was lured to the Courier Journal’s video operation. I asked him how...

March 21, 2023

Are the Image Rights Trolls Unhappy?

Imagine the money. Art aggregators like Getty Images, Alamy, and others suck up images from old books, open source repositories, and probably from kindergarteners....

March 21, 2023

20 Years Ago: Primus Knowledge Solutions

Note: Written by a real-live dinobaby. No smart software involved. I am not criticizing Primus Knowledge Solutions (acquired by ATG in 2004 and then Oracle purchased...

March 20, 2023

OSINT Analysts Alert: Biases Distilled to a One Page Cheat Sheet

“Toward Parsimony in Bias Research: A Proposed Common Framework of Belief-Consistent Information Processing for a Set of Biases” is an academic write up. Usually...

March 20, 2023

Rights Issues: How Can Money Be Extracted from Content?

I don’t have a dog in this fight. I gave up on “real” publishers when the outfits with which I was working in Sweden and the UK went to the big printing press...

March 20, 2023

Amazon Sells What Sells: Magazines and Newspapers Apparently Do Not Sell Well

I read “Amazon Will Discontinue Newspaper and Magazine Subscriptions in September.” The write up reports that Amazon is “abandoning the Kindle for Periodicals...

March 17, 2023

Real News Professional Employs Ad Hominem Method with Flair

I love “real news” output by Silicon Valley-savvy professionals. I read a good article called “Elon Musk Is An Angry Man Who turns on Everybody, Says Longtime...

March 17, 2023
