Consensus: A Gen AI Search Fed on Research, not the Wild Wild Web

September 3, 2024

How does one make an AI search tool that is actually reliable? Maybe start by supplying it with only peer-reviewed papers instead of the whole Internet. Fast Company sings the praises of Consensus in, “Google Who? This New Service Actually Gets AI Search Right.” Writer JR Raphael begins by describing why most AI-powered search engines, including Google, are terrible:

“The problem with most generative AI search services, at the simplest possible level, is that they have no idea what they’re even telling you. By their very nature, the systems that power services like ChatGPT and Gemini simply look at patterns in language without understanding the actual context. And since they include all sorts of random internet rubbish within their source materials, you never know if or how much you can actually trust the info they give you.”

Yep, that pretty much sums it up. So, like us, Raphael was skeptical when he learned of yet another attempt to bring generative AI to search. Once he tried the easy-to-use Consensus, however, he was convinced. He writes:

“In the blink of an eye, Consensus will consult over 200 million scientific research papers and then serve up an ocean of answers for you—with clear context, citations, and even a simple ‘consensus meter’ to show you how much the results vary (because here in the real world, not everything has a simple black-and-white answer!). You can dig deeper into any individual result, too, with helpful features like summarized overviews as well as on-the-fly analyses of each cited study’s quality. Some questions will inevitably result in answers that are more complex than others, but the service does a decent job of trying to simplify as much as possible and put its info into plain English. Consensus provides helpful context on the reliability of every report it mentions.”

See the post for more on using the web-based app, including a few screenshots. Raphael notes that, if one does not have a specific question in mind, the site has long lists of its top answers for curious users to explore. The basic service is free to search with no query cap, but creators hope to entice us with an $8.99/ month premium plan. Of course, this service is not going to help with every type of search. But if the subject is worthy of academic research, Consensus should have the (correct) answers.

Cynthia Murrell, September 3, 2024

Written by Stephen E. Arnold · Filed Under Business strategy, News, Search | Comments Off on Consensus: A Gen AI Search Fed on Research, not the Wild Wild Web

Elastic N.V. Faces a New Search Challenge

September 2, 2024

This essay is the work of a dumb dinobaby. No smart software required.

Elastic N.V. and Shay Banon are what I call search survivors. Gone are Autonomy (mostly), Delphis, Exalead, Fast Search & Transfer (mostly), Vivisimo, and dozens upon dozens of companies who sought to put an organization’s information at an employee’s fingertips. The marketing lingo of these and other now-defunct enterprise search vendors is surprisingly timely. Once can copy and paste chunks of Autonomy’s white papers into the OpenAI ChatGPT search is coming articles and few would notice that the assertions and even the word choice was more than 40 years old.

Elastic N.V. survived. It rose from a failed search system called Compass. Elastic N.V. recycled the Lucene libraries, released the open source Elasticsearch, and did an IPO. Some people made a lot of money. The question is, “Will that continue?”

I noted the Silicon Angle article “Elastic Shares Plunge 25% on Lower Revenue Projections Amid Slower Customer Commitments.” That write up says:

In its earnings release, Chief Executive Officer Ash Kulkarni started positively, noting that the results in the quarter we solid and outperformed previous guidance, but then comes the catch and the reason why Elastic stock is down so heavily after hours. “We had a slower start to the year with the volume of customer commitments impacted by segmentation changes that we made at the beginning of the year, which are taking longer than expected to settle,” Kulkarni wrote. “We have been taking steps to address this, but it will impact our revenue this year.” With that warning, Elastic said that it expects fiscal second-quarter adjusted earnings per share of 37 to 39 cents on revenue of $353 million to $355 million. The earnings per share forecast was ahead of the 34 cents expected by analysts, but revenue fell short of an expected $360.8 million. It was a similar story for Elastic’s full-year outlook, with the company forecasting earnings per share of $1.52 to $1.56 on revenue of $1.436 billion to $1.444 billion. The earnings per share outlook was ahead of an expected $1.42, but like the second quarter outlook, revenue fell short, as analysts had expected $1.478 billion.

Elastic N.V. makes money via service and for-fee extras. I want to point out that the $300 million or so revenue numbers are good. Elastic B.V. has figured out a business model that has not required [a] fiddling the books, [b] finding a buyer as customers complain about problems with the search software, [c] the sources of financing rage about cash burn and lousy revenue, [d] government investigators are poking around for tax and other financial irregularities, [e] the cost of running the software is beyond the reach of the licensee, or [f] the system simply does not search or retrieve what the user wanted or expected.

Elastic B.V. and its management team may have a challenge to overcome. Thanks, OpenAI, the MSFT Copilot thing crashed today.

So what’s the fix?

A partial answer appears in the Elastic B.V. blog post titled “Elasticsearch Is Open Source, Again.” The company states:

The tl;dr is that we will be adding AGPL as another license option next to ELv2 and SSPL in the coming weeks. We never stopped believing and behaving like an open source community after we changed the license. But being able to use the term Open Source, by using AGPL, an OSI approved license, removes any questions, or fud, people might have.

Without slogging through the confusion between what Elastic B.V. sells, the open source version of Elasticsearch, the dust up with Amazon over its really original approach to search inspired by Elasticsearch, Lucid Imagination’s innovation, and the creaking edifice of A9, Elastic B.V. has released Elasticsearch under an additional open source license. I think that means one can use the software and not pay Elastic B.V. until additional services are needed. In my experience, most enterprise search systems regardless of how they are explained need the “owner” of the system to lend a hand. Contrary to the belief that smart software can do enterprise search right now, there are some hurdles to get over.

Will “going open source again” work?

Let me offer several observations based on my experience with enterprise search and retrieval which reaches back to the days of punch cards and systems which used wooden rods to “pull” cards with a wanted tag (index term):

When an enterprise search system loses revenue momentum, the fix is to acquire companies in an adjacent search space and use that revenue to bolster the sales prospects for upsells.
The company with the downturn gilds the lily and seeks a buyer. One example was the sale of Exalead to Dassault Systèmes which calculated it was more economical to buy a vendor than to keep paying its then current supplier which I think was Autonomy, but I am not sure. Fast Search & Transfer pulled of this type of “exit” as some of the company’s activities were under scrutiny.
The search vendor can pivot from doing “search” and morph into a business intelligence system. (By the way, that did not work for Grok.)
The company disappears. One example is Entopia. Poof. Gone.

I hope Elastic B.V. thrives. I hope the “new” open source play works. Search — whether enterprise or Web variety — is far from a solved problem. People believe they have the answer. Others believe them and license the “new” solution. The reality is that finding information is a difficult challenge. Let’s hope the “downturn” and “negativism” goes away.

Stephen E Arnold, September 2, 2024

Written by Stephen E. Arnold · Filed Under Business strategy, Financial, News, Search | Comments Off on Elastic N.V. Faces a New Search Challenge

A Familiar Cycle: The Frustration of Almost Solving the Search Problem

August 16, 2024

This essay is the work of a dumb dinobaby. No smart software required.

Search and retrieval is a difficult problem. The solutions have ranged from scrolls with labels to punched cards and rods to bags of words. Each innovation or advance sparked new ideas. Boolean gave way to natural language. Natural language evolved into semi-smart systems. Now we are in the era of what seems to be smart software. Like the punch card systems, users became aware of the value of consistent, accurate indexing. Today one expects a system to “know” what the user wants. Instead of knowing index terms, one learns to be a prompt engineer.

Search and retrieval is not “solved” using large language models. LLMs are a step forward on a long and difficult path. The potential financial cost of thinking that the methods are a sure-fire money machine is high. Thanks, MSFT Copilot. How was DEFCON?

I read “LLM Progress Is Slowing — What Will It Mean for AI?.” The write up makes clear that some of the excitement of smart software which can makes sense of natural language queries (prompts) has lost some of its shine. This type of insight is one that probably existed when a Babylonian tablet maker groused about not having an easy way to stack up clay tablets for the money guy. Search and retrieval is essential for productive work. A system which makes that process less of a hassle is welcomed. After a period of time one learns that the approach is not quite where the user wants it to be. Researchers and innovators hear the complaint and turn their attention to improving search and retrieval … again.

The write up states:

The leap from GPT-3 to GPT-3.5 was huge, propelling OpenAI into the public consciousness. The jump up to GPT-4 was also impressive, a giant step forward in power and capacity. Then came GPT-4 Turbo, which added some speed, then GPT-4 Vision, which really just unlocked GPT-4’s existing image recognition capabilities. And just a few weeks back, we saw the release of GPT-4o, which offered enhanced multi-modality but relatively little in terms of additional power. Other LLMs, like Claude 3 from Anthropic and Gemini Ultra from Google, have followed a similar trend and now seem to be converging around similar speed and power benchmarks to GPT-4. We aren’t yet in plateau territory — but do seem to be entering into a slowdown. The pattern that is emerging: Less progress in power and range with each generation.

This is an echo of the complaints I heard about Dr. Salton’s SMART search system.

The “fix” according to the write up may be to follow one of these remediation paths:

More specialization
New user interfaces
Open source large language models
More and better data
New large language model architectures.

These are ideas bolted to the large language model approach to search and retrieval. I think each has upsides and downsides. These deserve thoughtful discussion. However, the evolution of search-and-retrieval has been an evolutionary process. Those chaos and order thinkers at the Santa Fe Institute suggest that certain “things” self organize and emerge. The idea has relevance to what happens with each “new” approach to search and retrieval.

The cited write up concludes with this statement:

One possible pattern that could emerge for LLMs: That they increasingly compete at the feature and ease-of-use levels. Over time, we could see some level of commoditization set in, similar to what we’ve seen elsewhere in the technology world. Think of, say, databases and cloud service providers. While there are substantial differences between the various options in the market, and some developers will have clear preferences, most would consider them broadly interchangeable. There is no clear and absolute “winner” in terms of which is the most powerful and capable.

I think the idea about competition is mostly correct. However, what my impression of search and retrieval as a technology thread is that progress is being made. I find it encouraging that more users are interacting with systems. Unfortunately search and retrieval is not solved by generating a paragraph a high school student can turn into a history teacher as an original report.

Effective search and retrieval is not just a prompt box. Effective information access remains a blend of extraordinarily trivial activities. For instance, a conversation may suggest a new way to locate relevant information. Reading an article or a longer document may trigger an unanticipated connection between ant colonies and another task-related process. The act of looking at different sources may lead to a fact previously unknown which leads in turn to another knowledge insight. Software alone cannot replicate these mental triggers.

LLMs like stacked clay tablets provide challenges and utility. However, search and retrieval remains a work in progress. LLMs, like semantic ad matching, or using one’s search history as a context clue, are helpful. But opportunities for innovation exist. My view is that the grousing about LLM limitations is little more than a recognition that converting a human concept or information need to an “answer” is a work in progress. The difference is that today billions of dollars have been pumped into smart software in the hope that information retrieval is solved.

Sorry, it is not. Therefore, the stakes of realizing that the golden goose may not lay enough eggs to pay off the cost of the goose itself. Twenty years ago search and retrieval was not a sector consuming billions of dollars in the span of a couple of years. That’s what is making people nervous about LLMs. Watching Delphi or Entopia fail was expensive, but the scale of the financial loss and the emotional cost of LLM failure is a different kettle of fish.

Oh, and those five “fixes” in the bullet points from the write up. None will solve the problem of search and retrieval.

Stephen E Arnold, August 16, 2024

Written by Stephen E. Arnold · Filed Under AI, News, Search | Comments Off on A Familiar Cycle: The Frustration of Almost Solving the Search Problem

Publication Founded by a Googler Cheers for Google AI Search

June 5, 2024

This essay is the work of a dinobaby. Unlike some folks, no smart software improved my native ineptness.

To understand the “rah rah” portion of this article, you need to know the backstory behind Search Engine Land, a news site about search and other technology. It was founded by Danny Sullivan, who pushed the SEO bandwagon. He did this because he was angling for a job at Google, he succeeded, and now he’s the point person for SEO.

Another press release touting the popularity of Google search dropped: “Google SEO Says AI Overviews Are Increasing Search Usage.” The author Danny Goodwin remains skeptical about Google’s spiked popularity due to AI and despite the bias of Search Engine Land’s founder.

During the QI 2024 Alphabet earnings call, Google/Alphabet CEO Sundar Pichai said that the search engine’s generative AI has been used for billions of queries and there are plans to develop the feature further. Pichai said positive things about AI, including that it increased user engagement, could answer more complex questions, and how there will be opportunities for monetization.

Goodwin wrote:

“All signs continue to indicate that Google is continuing its slow evolution toward a Search Generative Experience. I’m skeptical about user satisfaction increasing, considering what an unimpressive product AI overviews and SGE continues to be. But I’m not the average Google user – and this was an earnings call, where Pichai has mastered the art of using a lot of words to say a whole lot of nothing.”

AI is the next evolution of search and Google is heading the parade, but the technology still has tons of bugs. Who founded the publication? A Googler. Of course there is no interaction between the online ad outfit and an SEO mouthpiece. Un-uh. No way.

Whitney Grace, June 5, 2024

Written by Stephen E. Arnold · Filed Under Google, Marketing, News, Search, SEO | Comments Off on Publication Founded by a Googler Cheers for Google AI Search

Dexa: A New Podcast Search Engine

May 21, 2024

This essay is the work of a dinobaby. Unlike some folks, no smart software improved my native ineptness.

Google, Bing, and DuckDuckGo (a small percentage) dominate US search. Spotify, Apple Podcasts, and other platforms host and aggregate podcast shows. The problem is neither the twain shall meet when people are searching for video or audio content. Riley Tomasek was inspired by the problem and developed the Deva app:

“Dexa is an innovative brand that brings the power of AI to your favorite podcasts. With Dexa’s AI-powered podcast assistants, you can now explore, search, and ask questions related to the knowledge shared by trusted creators. Whether you’re curious about sleep supplements, programming languages, growing an audience, or achieving financial freedom, Dexa has you covered. Dexa unlocks the wisdom of experts like Andrew Huberman, Lex Fridman, Rhonda Patrick, Shane Parrish, and many more.

With Dexa, you can explore the world of podcasts and tap into the knowledge of trusted creators in a whole new way.”

Alex Huberman of Huberman Labs picked up the app and helped it go viral.

From there the Deva team built an intuitive, complex AI-powered search engine that indexes, analyzes, and transcribes podcasts. Since Deva launched nine months ago it has 50,000 users, answered almost one million, and partnered with famous podcasters. A recent update included a chat-based interface, more search and discover options, and ability watch referenced clips in a conversation.

Deva has raised $6 million in seed money and an exclusive partnership with Huberman Lab.

Deva is still a work in progress but it responds like ChatGPT but with a focus of conveying information and searching for content. It’s an intuitive platform that cites its sources directly in the search. It’s probably an interface that will be adopted by other search engines in the future.

Whitney Grace, May 21, 2024

Written by Stephen E. Arnold · Filed Under News, Rich media, Search | Comments Off on Dexa: A New Podcast Search Engine

Ho Hum: The Search Sky Is Falling

May 15, 2024

This essay is the work of a dinobaby. Unlike some folks, no smart software improved my native ineptness.

“Google’s Broken Link to the Web” is interesting for two reasons: [a] The sky is falling — again and [b] search has been broken for a long time and suddenly I should worry.

The write up states:

When it comes to the company’s core search engine, however, the image of progress looks far muddier. Like its much-smaller rivals, Google’s idea for the future of search is to deliver ever more answers within its walled garden, collapsing projects that would once have required a host of visits to individual web pages into a single answer delivered within Google itself.

Nope. The walled garden has been in the game plan for a long, long time. People who lusted for Google mouse pads were not sufficiently clued in to notice. Google wants to be the digital Hotel California. Smarter software is just one more component available to the system which controls information flows globally. How many people in Denmark rely on Google search whether it is good, bad, or indifferent? The answer is, “99 percent.” What about people who let Google Gmail pass along their messages? How about 67 percent in the US. YouTube is video in many countries even with the rise of TikTok, the Google is hanging in there. Maps? Ditto. Calendars? Ditto. Each of these ubiquitous services are “search.” They have been for years. Any click can be monetized one way or another.

Who will pay attention to this message? Regulators? Users of search on an iPhone? How about commuters and Waze? Thanks, MSFT Copilot. Good enough. Working on those security issues today?

Now the sky is falling? Give me a break. The write up adds:

where the company once limited itself to gathering low-hanging fruit along the lines of “what time is the super bowl,” on Tuesday executives showcased generative AI tools that will someday plan an entire anniversary dinner, or cross-country-move, or trip abroad. A quarter-century into its existence, a company that once proudly served as an entry point to a web that it nourished with traffic and advertising revenue has begun to abstract that all away into an input for its large language models. This new approach is captured elegantly in a slogan that appeared several times during Tuesday’s keynote: let Google do the Googling for you.

Of course, if Google does it, those “search” abstractions can be monetized.

How about this statement?

But to everyone who depended even a little bit on web search to have their business discovered, or their blog post read, or their journalism funded, the arrival of AI search bodes ill for the future. Google will now do the Googling for you, and everyone who benefited from humans doing the Googling will very soon need to come up with a Plan B.

Okay, what’s the plan B? Kagi? Yandex? Something magical from one of the AI start ups?

People have been trying to out search Google for a quarter century. And what has been the result? Google’s technology has been baked into the findability fruit cakes.

If one wants to be found, buy Google advertising. The alternative is what exactly? Crazy SEO baloney? Hire a 15 year old and pray that person can become an influencer? Put ads on Tubi?

The sky is not falling. The clouds rolled in and obfuscated people’s ability to see how weaponized information has seized control of multiple channels of information. I don’t see a change in weather any time soon. If one wants to run around saying the sky is falling, be careful. One might run into a wall or trip over a fire plug.

Stephen E Arnold, May 15, 2024

Written by Stephen E. Arnold · Filed Under AI, Google, News, Search | Comments Off on Ho Hum: The Search Sky Is Falling

Will Google Behave Like Telegram?

May 10, 2024

This essay is the work of a dinobaby. Unlike some folks, no smart software improved my native ineptness.

I posted a short item on LinkedIn about Telegram’s blocking of Ukraine’s information piped into Russia via Telegram. I pointed out that Pavel Durov, the founder of VK and Telegram, told Tucker Carlson that he was into “free speech.” A few weeks after the interview, Telegram blocked the data from Ukraine for Russia’s Telegram users. One reason given, as I recall, was that Apple was unhappy. Telegram rolled over and complied with a request that seems to benefit Russia more than Apple. But that’s just my opinion. The incident, which one of my team verified with a Ukrainian interacting with senior professionals in Ukraine, the block. Not surprisingly, Ukraine’s use of Telegram is under advisement. I think that means, “Find another method of sending encrypted messages and use that.” Compromised communications can translate to “Rest in Peace” in real time.

A Hong Kong rock band plays a cover of the popular hit Glory to Hong Kong. The bats in the sky are similar to those consumed in Shanghai during a bat festival. Thanks, MSFT Copilot. What are you working on today? Security or AI?

I read “Hong Kong Puts Google in Hot Seat With Ban on Protest Song.” That news story states:

The Court of Appeal on Wednesday approved the government’s application for an injunction order to prevent anyone from playing Glory to Hong Kong with seditious intent. While the city has a new security law to punish that crime, the judgment shifted responsibility onto the platforms, adding a new danger that just hosting the track could expose companies to legal risks. In granting the injunction, judges said prosecuting individual offenders wasn’t enough to tackle the “acute criminal problems.”

What’s Google got to do with it that toe tapper Glory to Hong Kong?

The write up says:

The injunction “places Google, media platforms and other social media companies in a difficult position: Essentially pitting values such as free speech in direct conflict with legal obligations,” said Ryan Neelam, program director at the Lowy Institute and former Australian diplomat to Hong Kong and Macau. “It will further the broader chilling effect if foreign tech majors do comply.”

The question is, “Roll over as Telegram allegedly has, or fight Hong Kong and by extension everyone’s favorite streaming video influencer, China?” What will Google do? Scrub Glory to Hong Kong, number one with a bullet on someone’s hit parade I assume.

My guess is that Google will go to court, appeal, and then take appropriate action to preserve whatever revenue is at stake. I do know The Sundar & Prabhakar Comedy Show will not use Glory to Hong Kong as its theme for its 2024 review.

Stephen E Arnold, May 10, 2024

Written by Stephen E. Arnold · Filed Under Business strategy, Google, Government, Legal matters, News, Search | Comments Off on Will Google Behave Like Telegram?

Google Search Is Broken

May 10, 2024

ChatGPT and other generative AI engines have screwed up search engines, including the all-powerful Google. The Blaze article, “Why Google Search Is Broken” explains why Internet search is broke, and the causes. The Internet is full of information and the best way to get noticed in search results is using SEO. A black hat technique (it will probably be considered old school in the near future) to manipulate search results is to litter a post with keywords aka “keyword stuffing.”

ChatGPT users realized that it’s a fantastic tool for SEO, because they tell the AI algorithm to draft a post with a specific keyword and it generates a decent one. Google’s search algorithm then reads that post and pushes it to the top of search results. ChatGPT was designed to read and learn language the same way as Google: skin the Internet, scoop up information from Web sites, and then use it to teach the algorithm. This threatens Google’s search profit margins and Alphabet Inc. doesn’t like that:

“By and large, people don’t want to read AI-generated content, no matter how accurate it is. But the trouble for Google is that it can’t reliably detect and filter AI-generated content. I’ve used several AI detection apps, and they are 50% accurate at best. Google’s brain trust can probably do a much better job, but even then, it’s computationally expensive, and even the mighty Google can’t analyze every single page on the web, so the company must find workarounds.

This past fall, Google rolled out its Helpful Content Update, in which Google started to strongly emphasize sites based on user-generated content in search results, such as forums. The site that received the most notable boost in search rankings was Reddit. Meanwhile, many independent bloggers saw their traffic crash, whether or not they used AI.”

Google wants to save money by offloading AI detection/monitoring to forum moderators that usually aren’t paid. Unfortunately SEO experts figured out Google’s new trick and are now spamming user-content driven Websites. Google recently signed a deal with Reddit to acquire its user data to train its AI project, Gemini.

Google hates AI generated SEO and people who game its search algorithms. Google doesn’t have the resources to detect all the SEO experts, but went they are found Google extracts vengeance with deindexing and making better tools. Google released a new update to its spam policies to remove low-quality, unoriginal content made to abuse its search algorithm. The overall goal is to remove AI-generated sites from search results.

If you read between the lines, Google doesn’t want to lose more revenue and is calling out bad actors.

Whitney Grace, May 10, 2024

Written by Stephen E. Arnold · Filed Under Google, News, Search | Comments Off on Google Search Is Broken

Torrent Search Platform Tribler Works to Boost Decentralization with AI

May 7, 2024

Can AI be the key to a decentralized Internet? The group behind the BitTorrent-based search engine Tribler believe it can. TorrentFreak reports, “Researchers Showcase Decentralized AI-Powered Torrent Search Engine.” Even as the online world has mostly narrowed into commercially controlled platforms, researchers at the Netherlands’ Delft University of Technology have worked to decentralize and anonymize search. Their goal has always been to empower John Q. Public over governments and corporations. Now, the team has demonstrated the potential of AI to significantly boost those efforts. Writer Ernesto Van der Sal tells us:

“Tribler has just released a new paper and a proof of concept which they see as a turning point for decentralized AI implementations; one that has a direct BitTorrent link. The scientific paper proposes a new framework titled ‘De-DSI’, which stands for Decentralised Differentiable Search Index. Without going into technical details, this essentially combines decentralized large language models (LLMs), which can be stored by peers, with decentralized search. This means that people can use decentralized AI-powered search to find content in a pool of information that’s stored across peers. For example, one can ask ‘find a magnet link for the Pirate Bay documentary,’ which should return a magnet link for TPB-AFK, without mentioning it by name. This entire process relies on information shared by users. There are no central servers involved at all, making it impossible for outsiders to control.”

Van der Sal emphasizes De-DSI is still in its early stages—the demo was created with a limited dataset and starter AI capabilities. The write-up briefly summarizes the approach:

“In essence, De-DSI operates by sharing the workload of training large language models on lists of document identifiers. Every peer in the network specializes in a subset of data, which other peers in the network can retrieve to come up with the best search result.”

The team hopes to incorporate this tech into an experimental version of Tribler by the end of this year. Stay tuned.

Cynthia Murrell, May 7, 2024

Written by Stephen E. Arnold · Filed Under News, Search | Comments Off on Torrent Search Platform Tribler Works to Boost Decentralization with AI

Kagi Search Beat Down

April 17, 2024

This essay is the work of a dumb dinobaby. No smart software required.

People surprise me. It is difficult to craft a search engine. Sure, a recent compsci graduate will tell you, “Piece of cake.” It is not. Even with oodles of open source technology, easily gettable content, and a few valiant individuals who actually want relevant results — search and retrieval are tough to get right. The secret to good search, in my opinion, is to define a domain, preferably a technical field, identify the relevant content, obtain rights, if necessary, and then do the indexing and the other “stuff.”

In my experience, it is a good idea to have either a friend with deep pockets, a US government grant (hello, NSF, said Google decades ago), or a credit card with a hefty credit line. Failing these generally acceptable solutions, one can venture into the land of other people’s money. When that runs out or just does not work, one can become a pay-to-play outfit. We know what that business model delivers. But for a tiny percentage of online users, a subscription service makes perfect sense. The only problem is that selling subscriptions is expensive, and there is the problem of churn. Lose a customer and spend quite a bit of money replacing that individual. Lose big customers spend oodles and oodles of money replacing that big spender.

I read “Do Not Use Kagi.” This, in turn, directed me to “Why I Lost Faith in Kagi.” Okay, what’s up with the Kagi booing? The “Lost Faith” article runs about 4,000 words. The key passage for me is:

Between the absolute blasé attitude towards privacy, the 100% dedication to AI being the future of search, and the completely misguided use of the company’s limited funds, I honestly can’t see Kagi as something I could ever recommend to people.

I looked at Kagi when it first became available, and I wrote a short email to the “Vlad” persona. I am not sure if I followed up. I was curious about how the blend of artificial intelligence and metasearch was going to deal with such issues as:

Deduplication of results
Latency when a complex query in a metasearch system has to wait for a module to do it thing
How the business model was going to work: Expensive subscription, venture funding, collateral sales of the interface to law enforcement, advertising, etc..
Controlling the cost of the pings, pipes, and power for the plumbing
Spam control.

I know from experience that those dabbling in the search game ignore some of my routine questions. The reasons range from “we are smarter than you” to “our approach just handles these issues.”

Thanks, MSFT Copilot. Recognize anyone in the image you created?

I still struggle with the business model of non-ad supported search and retrieval systems. Subscriptions work. Well, they worked out of the gate for ChatGPT, but how many smart search systems do I want to join? Answer: Zero.

Metasearch systems are simply sucker fish on the shark bodies of a Web search operator. Bing is in the metasearch game because it is a fraction of the Googzilla operation. It is doing what it can to boost its user base. Just look at the wonky Edge ads and the rumored miniscule gain the additional of smart search has delivered to Bing traffic. Poor Yandex is relocating and finds itself in a different world from the cheerful environment of Russia.

Web content indexing is expensive, difficult, and tricky.

But why pick on Kagi? Beats me. Why not write about dogpile.com, ask.com, the duck thing, or startpage.com (formerly ixquick.com)? Each embodies a certain subsonic vibe, right?

Maybe it is the AI flavor of Kagi? Maybe it is the amateur hour approach taken with some functions? Maybe it is just a disconnect between an informed user and an entrepreneurial outfit running a mile a minute with a sign that says, “Subscribe”?

I don’t know, but it is interesting when Web search is essentially a massive disappointment that some bright GenX’er has not figured out a solution.

Stephen E Arnold, April 17, 2024

Written by Stephen E. Arnold · Filed Under AI, News, Search | Comments Off on Kagi Search Beat Down

« Previous Page — Next Page »

Search the site
Subscribe to Beyond Search
Feature archive
News archive

Stephen E. Arnold monitors search, content processing, text mining and related topics from his high-tech nerve center in rural Kentucky. He tries to winnow the goose feathers from the giblets. He works with colleagues worldwide to make this Web log useful to those who want to go "beyond search". Contact him at sa [at] arnoldit.com. His Web site with additional information about search is arnoldit.com.

Categories
- 3D-Printing
- Acquisition
- Advertising
- Aggregation
- AI
- Alexa
- algorithms
- Amazon
- Amazonia
- Analytics
- Appliance
- Applications
- Audio
- Augmented Reality
- Big data
- Bing
- Bitcoin
- Bitext
- Book review
- Business intelligence
- Business process
- Business strategy
- Censorship
- Cloud computing
- Company Profile
- Conferences
- Connectors
- Consulting
- Consumer
- Content processing
- Copyright
- Corporate Concerns
- Cost
- Crawl
- Crowdfunding
- cryptocurrency
- Customer support
- Cyber OSINT
- cybercrime
- cybersecurity
- Dark Web
- DarkCyber
- Data
- Data mining
- Database
- Deepfakes
- Digital Assistant
- Digital Library
- E2EE
- ECommerce
- EDiscovery
- Editorial opinion
- Education
- Emoticons
- Enterprise
- Enterprise search
- Entity extraction
- Ethics
- Facebook
- Faceted search
- Factualities
- Feature
- Federated search
- Financial
- Fogint
- Google
- Governance
- Government
- Hackers
- healthcare
- IBM Watson
- Image search
- Indexing
- Infrastructure
- Innovation
- Integration
- intelware
- Interface
- Internet
- Interview
- Investment
- law enforcement
- Legal matters
- Library automation
- Management
- Marketing
- Mathematics
- Metadata
- Microsoft
- Mobile
- Natural language processing
- News
- NGIA
- Online (general)
- Open Access
- Open source
- OSINT
- Osint Radar
- Overflight
- Palantir
- Patents
- Personnel
- Podcast
- Policeware
- Portals
- Predictive coding
- Privacy
- Profile
- Publishing
- Quotation
- Real time search
- Reference tool
- Rich media
- Robot Writer
- Search
- Search enabled applications
- search engine
- Search quality
- Security
- Semantic
- Sentiment analysis
- SEO
- SharePoint
- Short Honks
- Smart Technology
- Social
- Social Media
- software
- Statistics
- Taxonomy
- Technology
- Text analytics
- Text processing
- Tools
- Tor
- Training
- Translation
- Twitter
- Uncategorized
- Unstructured Data
- User experience
- User Interface
- Vertical search
- Video
- visualization
- Voice search
- Voice technology
- Web 3
- Web Services
- Webinar
- Windows
- Work flow
- XML
- Yahoo

Beyond Search

Consensus: A Gen AI Search Fed on Research, not the Wild Wild Web

Elastic N.V. Faces a New Search Challenge

A Familiar Cycle: The Frustration of Almost Solving the Search Problem

Publication Founded by a Googler Cheers for Google AI Search

Dexa: A New Podcast Search Engine

Ho Hum: The Search Sky Is Falling

Will Google Behave Like Telegram?

Google Search Is Broken

Torrent Search Platform Tribler Works to Boost Decentralization with AI

Kagi Search Beat Down

Search the site

Categories

Archives

Recent Posts

Meta

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Share this:

Search the site

Categories

Archives

Recent Posts

Meta