The Collision of Search Thinkers and the Wide World of Finding

January 4, 2022

To get some insight into the vibrations set off when search thinkers run into market behaviors, you will want to scan the Twitter thread about the need to create an alternative to Google. The focus is medical information. The idea is to return results for a health query without “clickbait sites riddled with crappy ads.” The criticism of the Google was not ignored. No less a luminary than Danny Sullivan replied with Google’s “we are always looking to keep improving our results.”

Digital Don Quixotes saddled up and asserted in this Tweet stream that Google can be beaten. The fix is to create a niche search engine tailored to provide results where Google is just thrilled to present “spam.” Assorted Tweeters added comments.

What do these two Tweeter threads suggest to me?

First, there are niche search engines (what I call vertical search services) that deliver on point results. These are probably not ones most people think about because users of free or ad-supported systems do not know much about finding high value information. Also, I know from my decades in the commercial database business that most “online experts” don’t want to pay for access to commercial online services. Academics get “free” access to content pools like Lexis Nexis, and the “old” Dialog type files because institutions pay the license fees. To the academic user, high value information is “free.” It is not.

Second, a number of Web centric search engines provide reasonably useful results. Examples range from iSeek.com to the Metager system. The mechanism for locating specific information is to frame a query, manually or automatically pass the query to numerous search engines, de-duplicate the result sets, and examine the links. Industrious searchers may enlist tools like Maltego or other open source software to identify potentially helpful items to examine initially. Who wants to do this? I suggest that fewer than three percent of online users pursue this approach. People want to have the mobile phone light up when a pizza joint is nearby or the Tesla’s electric gauge is creeping into the “hello, I need a flat bed truck, please” zone.
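The pass-the-query-around-and-de-duplicate routine described above can be sketched in a few lines. This is a minimal illustration, not how iSeek or Metager actually work: the engine names and result URLs are made-up sample data, the functions `normalize_url` and `merge_results` are hypothetical, and the rule for deciding that two links are "the same" (ignore scheme, a leading "www.", a trailing slash, and any fragment) is an assumption.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url):
    """Build a comparison key: drop scheme, leading 'www.', trailing slash, fragment."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/")
    return urlunsplit(("", host, path, parts.query, ""))


def merge_results(result_sets):
    """Merge per-engine URL lists into one de-duplicated list.

    result_sets maps an engine name to its list of result URLs.
    Links returned by more engines are listed first.
    """
    merged = {}
    for engine, urls in result_sets.items():
        for url in urls:
            key = normalize_url(url)
            # Keep the first URL seen for each key and record who returned it.
            entry = merged.setdefault(key, {"url": url, "engines": []})
            if engine not in entry["engines"]:
                entry["engines"].append(engine)
    # sorted() is stable, so ties keep their first-seen order.
    return sorted(merged.values(), key=lambda e: -len(e["engines"]))


# Hypothetical result sets standing in for real engine responses.
sample = {
    "engine_a": ["https://www.example.org/health/", "https://example.com/a"],
    "engine_b": ["http://example.org/health", "https://example.net/b"],
}
merged = merge_results(sample)
for item in merged:
    print(len(item["engines"]), item["url"])
```

Even this toy version shows why so few people bother: someone still has to fetch the result sets, decide what counts as a duplicate, and read the links.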

Third, Google has operated without meaningful regulation, oversight, or competition for decades. The vaunted ad-revenue engine was not a Google invention. Google took advantage of a particular point in time when searching the Web was gaining traction and useful competitors like AltaVista, Exalead, and Fast Search’s AllTheWeb service were distracted. Google sucked up some AltaVista folks; Exalead was decidedly French; and Fast Search chased the enterprise. Other actions transpired, but the result was that the Google used free to get traffic and traffic made the Yahoo, Overture, GoTo revenue model work like a champ. Remember this was decades ago, not yesterday.

Here’s what I think is going on:

  1. Pundits don’t know or care much about Okeano, Swisscows or other “free” online search systems. How about searching for those Instagram snaps with Picuki?
  2. Niche search engines are thriving; for example, some of the Israeli specialized software and services firms provide quite helpful access to Facebook content. Who knows? Not too many pundits on the Tweeter and certainly not Google’s PR experts.
  3. Google is not a search engine. Google is a global content system, a fact I explored in my Google: The Digital Gutenberg, originally a long white paper for a government customer who found my view of the world interesting. BearStearns published a report in 2007 which featured my diagram of the Google “octopus” which identified the digital fabric that the company was weaving. Now Google owns the sheep, the dyes, the weaving machines, and the concept of digital fabrics. The overall quality of the Google outputs is “good enough,” and, believe me, it is tough to knock off a global outfit which satisfies the big hump in the standard distribution with something “better.” Whatever “better” means.

Net net: Search is a very, very fuzzy word. At one end of the spectrum are those who are searching well because they can locate an Uber-type service. At the other end of the spectrum are those who deal in extremely rarified content disciplines and have quite good services available; for example, Daylight chemical informatics.

In the middle? A long-standing, persistent and fundamental disconnect between search and what is actually going on in the datasphere.

Pizza? Google’s got that nailed. Need information to fabricate calandria (nuclear terminology)? Google can’t help too much because who searches for calandria, buys ads related to calandria, or knows anything about calandria?

Stephen E Arnold, January 4, 2022

Why Search Is Hard and Quick and Dirty Good Enough Methods Are Train Wrecks

December 15, 2021

I recommend to anyone interested in search and smart software the article “The Business of Extracting Knowledge from Academic Publications.” I am not going to summarize it, nor am I going to discuss why modern search systems are racing toward a collision with useful information retrieval. There was one omission from the essay, and I want to highlight it. I am not critical of this write up. I want to make clear that there is another dimension to scientific, technical, and medical publishing that is often overlooked. I learned this when we created the Pharmaceutical News Index decades ago.

Here’s the omission:

Wizards in technical fields work overtime to obfuscate some of their systems, methods, insights, and findings. The reason is that wizards want to remain wizards and have an ace up their sleeve if one is required to win a poker game for tenure, to best an over achieving graduate assistant, or to fend off some legal eagle involved in a patent dispute. Other reasons for withholding, distorting, and shaping information are related to insecurity. Yep, wizards are wizards in order to have a way to build a defense against those who don’t know what they don’t know and think that what they know defines knowledge.

When it comes to search and retrieval, key words are okay but not perfect. Index terms (what GenXers call tags) are helpful. But the substance of STM content does not yield insights, inventions, or any of the other “knowledge gems” that those pitching smart software believe will spill forth in a results list or a visualization.

What does the information in the article imply for smart software? My answer is, “Misleading or incorrect answers to certain types of inquiries.”

Don’t believe me? That’s okay. Just wait. STM content is “easier” to index than general business writing which is much easier to tag than the excrescences on TikTok, Twitch, or (heaven help me), Twitter.

Stephen E Arnold, December 15, 2021

The Coveo IPO: Making Some Headway

December 9, 2021

A number of Canadian tech companies have recently gone public on the Toronto Stock Exchange only to be met with muted responses. One was enterprise search firm Coveo, which went public in November in order to position itself globally, attract talent, and fund future acquisitions. CEO Louis Têtu appears unconcerned about the apparent indifference to his and other companies’ fledgling stock, The Globe and Mail reports in its piece, “Coveo CEO Dismisses Soft Trading Start on TSX as Quebec Software Company Closes $215-Million IPO.” Writer Sean Silcoff tells us:

“Coveo received more than $1-billion in orders for its IPO… . The stock hit $18 on its first day of trading last Thursday, but has since retreated, briefly trading below the issue price Tuesday. That makes it the fourth new tech issue this autumn – following D2L Corp., Q4 Inc. and E Automotive Inc. – to trade below its issue price. Coveo stock closed Wednesday at $15.30, up 1.7 per cent. Mr. Têtu dismissed Coveo’s ho-hum start as a public company, noting the share price of New York Stock Exchange-listed rival Elastic NV had dropped by 15 per cent over the previous four sessions. ‘There is a set of market dynamics we don’t control; the tide raises and lowers all boats,’ he said. ‘I think the jury is going to be out until the first earnings call [as a public company] and the subsequent earnings call. I think anybody who understands the stock market and IPOs … wouldn’t draw conclusions’ from the stock’s early performance. Coveo became the 20th Canadian tech IPO on the TSX to raise $50-million or more since July, 2020. By contrast, there were 12 such IPOs in the 11 years ended December, 2019.”

I suppose that is a good point—progress is progress, even if it is not at light speed. The write-up [paywalled] includes a few more details about Coveo’s growth and profits. Since its founding in 2005, the company has acquired two AI-powered e-commerce firms: Tooso in 2019 and Qubit in 2021. It sounds like Coveo may have some more companies already in its sights.

The good news is that the stock on December 8, 2021, was trending up. Search and retrieval is a tough business. Just ask the former CEOs of Autonomy and Fast Search & Transfer or take a look at the dust up between Amazon and Elastic. Worth monitoring. Maybe take a stake?

Cynthia Murrell, December 10, 2021

What Company Is the Leader in Search Powered by Artificial Intelligence? One Answer May Surprise You. It Did Me.

November 30, 2021

Give up? The answer is Lucidworks, “the leader in AI-powered search.” You can get the full story from Unite.ai and the article “Will Hayes, CEO of Lucidworks – Interview Series.” What’s “AI”? I don’t know, and the answer is not provided in @IAmWillHayes’ comments. What’s “search”? I don’t know because no specific definition is provided. (Search is a blanket word, covering everything from the open source Lucene in policeware solutions to whiz-bang, patented real time methods for time series data from Trendalyze. And we must not forget the generous offerings of “search” for eDiscovery, product supplier data, chemical structures, streaming video files, code libraries, and mysterious content like the interesting information in encrypted Signal and Telegram interactions.) Search at Lucidworks is different, it seems.

I noted this statement:

Lucidworks takes mission-critical business problems and solves them with search.

I assume that Lucidworks is disconnected from Dassault Systèmes’ search based applications approach. There is a 2011 book titled “Search Based Applications: At the Confluence of Search and Database Technologies.” The author is Dr. Gregory Grefenstette with assistance from Laura Wilber. The Lucidworks assertion struck me as one more example of marketing hoo hah disconnected from what came before. At least, the Dassault technology was original, not a recycling of open source software.

Here’s another statement offered as an original insight:

Lucidworks offers products and applications for commerce, customer service, and the workplace that use AI and machine learning to solve search. Fusion, our flagship product, uses AI extensively through every stage of enriching data—during ingest and at query time, for understanding user intent, and personalizing results that match that intent.

I want to point out that the Paris-based firm Polyspot used almost the exact same language (both French and English) to describe the company’s approach to information access. Here’s what Bloomberg says about the now repositioned company:

PolySpot SAS develops and publishes enterprise software. The Company’s products offer search and information access solutions designed to improve business and ensure that companies can access the data they need, regardless of their structure, format or origin. PolySpot markets its products internationally.

Did Yogi Berra or Yogi Bear say: “It’s déjà vu all over again.” I go with the cartoon bear. The aphorism applies to Lucidworks in my opinion.

Lucidworks also does chatbots, fits into the connected experience cloud (CXC), and compounds “value.” Okay. The company, according to @IAmWillHayes, is “leader in next-generation search solutions and we have an exciting roadmap of cloud products coming in the near future.”

I wonder what outfits like Algolia, Coveo, Sphinx Search, and even the heroic X1 think about this assertion. What will Google’s revolving door search experts make of Lucidworks’ bold assertion? What about the crafty laborers in AWS search vineyards who watch the competitors gun for the Bezos bulldozer? What about the innovators working on the somewhat frightening IBM search solution? Maybe Microsoft will just pull a “Fast Search” and buy Lucidworks to beef up its incredible array of finding systems?

My hunch is that Lucidworks has to deal with its backers who want their money back plus some upside. Mix in the harsh market realities of many options, some free or low cost, and others bundled with purpose built solutions like Voyager Labs’ software and what do you get?

I am not sure about your answer. My answer is, “Recycling marketing lingo, ideas, and assertions which are decades old?” Will AI, machine learning, and CXC pull a rabbit from the search magician’s hat?

Maybe. But the investors who have injected more than $200 million into the company may want more than a magic show. And what is “search” and “AI” anyway? Solr with a new outfit from Amazon?

Stephen E Arnold, November 30, 2021

Ask Jeeves Has a Younger Cousin, Ask Jarvis

November 25, 2021

Ask Jeeves.com was a “smart” online search engine. The name lives on in Ask.com. Who remembers? No one. No matter. The younger cousin is now available. Ask Jarvis is “an AI code assistant developed by Assistiv.ai.” The idea is that a hard working developer handling a full time job via Zoom and working on numerous side gigs needs help. Just ask Jarvis when you need a programming tip or a chunk of a manpage. You can find the Web page at https://askjarvis.io. Is it the rule based wonder of the original smart Ask Jeeves.com? Nope, this is an artificial intelligence / machine learning 2021 search system with natural language “powered by OpenAI codex, a descendant of GPT-3.” Years ago this would have been labeled a vertical search engine. Today? I am not sure.

Stephen E Arnold, November 25, 2021

Battle of the Experts? Snowden Versus Sullivan, Wowza

November 19, 2021

This is a hoot: “Edward Snowden Dunks on Search Gurus in Hilarious Twitter Clapback.” Mr. Snowden is an individual who signed a secrecy agreement and elected to ignore it. Mr. Sullivan is a search engine optimization journalist, who is now laboring in the vineyards of Google.

The write up makes clear that Mr. Snowden finds the Google Web search experience problematic. (I wanted to write lousy, but I wish to maintain some level of polite discourse.)

Mr. Sullivan points out that Mr. Snowden was talking about “site search.” For those not privy to Google Dorks, a site search requires the name of a site like doe.gov preceded by the Google operator site: and followed by the search terms. At least, that’s the theory.
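For readers who have not typed a site search, here is a minimal sketch of how such a query is put together. The helper name `site_search_url` is hypothetical, the example uses the doe.gov site mentioned above with "calandria" as an arbitrary search term, and whether Google returns anything useful for it is a separate matter.

```python
from urllib.parse import urlencode


def site_search_url(site, terms):
    """Return a Google search URL that limits results to one site.

    The query text is the operator 'site:' glued to the site name,
    a space, then the search terms.
    """
    query = f"site:{site} {terms}"
    # urlencode percent-encodes the colon and turns the space into '+'.
    return "https://www.google.com/search?" + urlencode({"q": query})


url = site_search_url("doe.gov", "calandria")
print(url)
```

The same `site:doe.gov calandria` string can be typed directly into the search box; the function only shows what ends up in the URL.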

The write up concludes with a reference to search engine optimization or SEO. That’s Mr. Sullivan’s core competency. Mr. Snowden’s response is not in the article, or it could be snagged in the services monitored by the Federal Service for Supervision of Communications, Information Technology and Mass Media (Roskomnadzor) in everyone’s favorite satellite destroying country.

Quite a battle. The Snowden Sullivan slugfest. No, I think this is emblematic of what has happened to those who ignore secrecy agreements and individuals who have worked hard to make relevance secondary to Google’s pay to play business processes.

Stephen E Arnold, November 19, 2021

You: Just Bake in Search

November 17, 2021

Google has a new rival, a search engine built with developers in mind: You.com. The platform, now in beta, uses AI to summarize information while supplying links. It also promises never to track queries, sell user data, or push targeted advertising. A couple of test searches reveal results neatly tailored to the subject. My first two searches produced Wikipedia articles at the top, followed by general Web results, then topic-specific selections (News, Music, Shopping, etc.), a customized “quick facts” section, and more. When I typed in “pecan pie,” it was smart enough to lead with recipes.

Though the page itself does not emphasize the creator’s focus on developers, he discusses it on the Y Combinator post, “You.com, Private Search Engine that Summarizes the Web—Built for Devs.” He announces:

“My name is Richard Socher, and I’m the founder of you.com, the world’s first open search engine platform that summarizes the web for you. We launched our public beta today, and are excited to share it with you. If you’re a developer, we have several ‘search-apps’ such as StackOverflow (with code snippets), W3Schools, MDN, Copilot-like Code Completion, json checkers, and more. All of them geared to help you code faster. Let us know if you have other app ideas for how to make your coding life better. … We wanted to create a search engine that delivers relevant content, not ads or SEO’d pages, and do it in a whole new interface that puts you in control through personalized preferences.”

We learn more from an article at Venture Beat, “AI-Driven Search Engine You.com Takes on Google with $20M.” Writer Kyle Wiggers reveals that substantial funding is led by Salesforce CEO Marc Benioff. His publication asked Socher about his inspiration for the platform:

“As the economy moves online, it’s You.com’s assertion that the internet is becoming more centralized and controlled by a few powerful, ill-meaning tech corporations. … ‘I had the original idea [for You.com] eight and a half years ago,’ Socher told VentureBeat via email. ‘Today, there’s too much information, and no one has time to read it, process it, or know what to trust. [A] single gatekeeper controls the vast majority of the search market, dictating what you see: too many advertisements and a flood of search-engine-optimized pages … On top of that, 65% of search queries end without a click on another site, which means traffic stays within the Google ecosystem.’”

That is a good point. See the Venture Beat article for details on how Socher uses AI to underpin You’s search, the site’s approaches to customization and privacy, and a comparison to its rivals.

Cynthia Murrell, November 17, 2021

Elastic Adds Optimyze for Best Cloud Optimization

November 4, 2021

Elastic specializes in enterprise and cloud search solutions, but the company has also branched out by assisting systems in gaining big data insights. Help Net Security details Elastic’s newest move in this area: “Elastic Acquires Optimyze To Deliver Visibility Into Cloud Native Environments.” Optimyze provides a simpler way for users to gain insights from their entire IT ecosystem, eliminates blind spots with Prodfiler, and generates continuous system profiling with low performance overhead.

Elastic also recently acquired Cmd and build.security. Combined with these other acquisitions, Optimyze will enable Elastic users to monitor and protect data from the unified Elastic Search Platform:

“Optimyze provides frictionless continuous profiling, while the Elastic Search Platform delivers analytics and machine learning capabilities with the ability to correlate and contextualize profiling data with metrics, logs, and traces. The ability to unify the three pillars of observability—metrics, logs and traces—with emerging continuous profiling capabilities delivers actionable insights to customers, leading to improvements in service quality and performance while reducing MTTD (mean-time-to-detect) and MTTR (mean-time-to-resolution).”

Elastic takes the idea of search to a different level. Instead of only concentrating on finding user generated data, Elastic observes, secures, tracks, and locates all kinds of data related to a system’s performance. Does this change the definition of enterprise and cloud search altogether?

Whitney Grace, November 4, 2021

Microsoft Search: Still Trying after All These Years

November 2, 2021

That was “FAST,” wasn’t it? You lived through LiveSearch, right? Jellyfish? Powerset? Outlook Search in its assorted flavors like Life Savers? I could go on, but I am quite certain no one cares.

Nevertheless, Bing’s new feature may possibly prompt some workers to switch to the search-engine underdog. TechRadar Pro reports the development in its brief write-up, “One of Microsoft’s Most-Hated Products Might Actually Be Getting a Useful Upgrade.” Writer Mike Moore reveals:

“The tech giant is boosting one of its less-celebrated products to give enterprise users an easier way to search online. The update means that enterprise users will now get their historical searches as suggestions in the autosuggest pane on Bing and Microsoft Search in Bing, according to the official Microsoft 365 roadmap entry. … The new update should mean that enterprise users looking to quickly find files that they’ve searched for or opened before will no longer need to manually trawl through endless files and folders in search of the elusive location. The update is still currently in development, but Microsoft will doubtless be keen to get it out soon and help boost Bing engagement. The feature is set to be available to Microsoft Search users across the globe via the company’s general availability route, meaning web, desktop and mobile users will all be able to utilize it upon release.”

Moore notes Microsoft’s tenacity in continuing to support Bing despite Google’s astounding market share lead. He wonders whether the company may have lost some enthusiasm recently, though, when it was revealed that the most searched-for term on Bing is “Google.” A tad embarrassing, perhaps. Does Microsoft suppose its file-finding feature will turn the tide? Unlikely, but some of our readers may find the tool useful, nonetheless.

What’s next for Microsoft search? Perhaps broader and deeper indexing of US government Web sites for a starter?

Cynthia Murrell, November 2, 2021

Google Launches Even More Personalized Search Upgrade

October 21, 2021

Google is already the most used search engine on the planet and delivers fairly accurate results. Like many companies, Google continues to push innovation and The National News shares the latest search upgrade in: “Google To Introduce Search 1,000 Times More Powerful Than Current Engine.” Google’s new search technology leverages AI that combines search criteria for more personalized and accurate results.

Google revealed its latest search achievement at the Search On ’21 event, where executives discussed how they plan to use their AI research to stop the spread of misinformation and make information on the Web more useful. Google also wants to regain shopping traffic from Amazon, Alibaba, Lazada, and other commerce Web sites. The new search technology aims to improve the shopping search experience:

“Google teased the MUM technology during its annual I/O summit last May. It uses its so-called T5 – Text-To-Text Transfer Transformer – framework and is said to be 1,000 times more powerful than the Bert (Bidirectional Encoder Representations from Transformers) technology the company currently uses.

The revamped search technology, using the company’s image-recognition tool Google Lens, will combine data from text, images and even videos, which would then provide more accurate and tailor-made results. Lens has been updated with new AI-powered language features that will narrow searches further. ‘For example, when you search for ‘cropped jackets’, we’ll show you a visual feed of jackets in various colors and styles, alongside other helpful information like local shops, style guides and videos,’ Bill Ready, president of commerce, payments and next billion users at Google, said.”

Google will also include a “wildfire layer” on its Maps to keep track of forest fires in real time. To combat misinformation, search results will include an “About This Result” option that cites the result’s sources and what other users think of it.

Google designed a picture search engine for shopping and is actually citing sources for search results? Yes, please!

Whitney Grace, October 20, 2021
