Why Search Is Hard and Quick-and-Dirty, Good-Enough Methods Are Train Wrecks
December 15, 2021
I recommend the article “The Business of Extracting Knowledge from Academic Publications” to anyone interested in search and smart software. I am not going to summarize it, nor am I going to discuss why modern search systems are racing toward a collision with useful information retrieval. There was one omission from the essay, and I want to highlight it. I am not critical of this write-up. I want to make clear that there is another dimension to scientific, technical, and medical (STM) publishing that is often overlooked. I learned this when we created the Pharmaceutical News Index decades ago.
Here’s the omission:
Wizards in technical fields work overtime to obfuscate some of their systems, methods, insights, and findings. The reason is that wizards want to remain wizards and have an ace up their sleeve if one is required to win a poker game for tenure, against an overachieving graduate assistant, or against some legal eagle involved in a patent dispute. Other reasons for withholding, distorting, and shaping information are related to insecurity. Yep, wizards are wizards in order to have a way to build a defense against those who don’t know what they don’t know and think that what they know defines knowledge.
When it comes to search and retrieval, keywords are okay but not perfect. Index terms (what GenXers call tags) are helpful. But the substance of STM content does not yield insights, inventions, or any of the other “knowledge gems” that those pitching smart software believe will spill forth in a results list or a visualization.
What does the information in the article imply for smart software? My answer is, “Misleading or incorrect answers to certain types of inquiries.”
Don’t believe me? That’s okay. Just wait. STM content is “easier” to index than general business writing, which is much easier to tag than the excrescences on TikTok, Twitch, or (heaven help me) Twitter.
Stephen E Arnold, December 15, 2021