Kagi Rolls Out a Small Web Initiative

October 5, 2023

Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

Recall the early expectations for the Web: It would be a powerful conduit for instant connection and knowledge-sharing around the world. Despite promises to the contrary, that rosy vision has long since given way to commercial interests’ paid content, targeted ads, bots, and data harvesting. Launched in 2018, Kagi offers a way to circumvent those factors with its ad-free, data-protecting search engine (for a small fee, naturally). Now the company is promoting what it calls the Kagi Small Web initiative. We learn from the blog post:

“Since inception, we’ve been featuring content from the small web through our proprietary Teclis and TinyGem search indexes. This inclusion of high-quality, lesser-known parts of the web is part of what sets Kagi’s search results apart and gives them a unique flavor. Today we’re taking this a step further by integrating Kagi Small Web results into the index.”

See the write-up for examples. Besides these insertions into search results, one can also access these harder-to-find sources at the new Kagi Small Web website. This project displays a different random, recent Web page with each click of the “Next Post” button. Readers are also encouraged to check out their experimental Small YouTube, which we are told features content by YouTube creators with fewer than 4,000 subscribers. (Although as of this writing, the Small YouTube link supplied redirects right back to the source blog post. Hmm.)

The write-up concludes with these thoughts on Kagi’s philosophy:

“The driving question behind this initiative was simple yet profound: the web is made of millions of humans, so where are they? Why do they get overshadowed in traditional search engines, and how can we remedy this? This project required a certain leap of faith as the content we crawl may contain anything, and we are putting our reputation on the line vouching for it. But we also recognize that the ‘small web’ is the lifeblood of the internet, and the web we are fighting for. Those who contribute to it have already taken their own leaps of faith, often taking time and effort to create, without the assurance of an audience. Our goal is to change that narrative. Together with the global community of people who envision a different web, we’re committed to revitalizing a digital space abundant in creativity, self-expression, and meaningful content – a more humane web for all.”

Does this suggest that Google Programmable Search Engine is a weak sister?

Cynthia Murrell, October 5, 2023

This Dinobaby Likes Advanced Search, Boolean Operators, and Precision. Most Do Not

August 28, 2023

Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

I am not sure of the chronological age of the author of “7 Reasons to Replace Advanced Search with Filters So Users Can Easily Find What They Need.” From my point of view, the author has a mental age of someone much younger than I. The article identifies a number of reasons why “advanced search” functions are lousy. As a dinobaby, I want to be crystal clear: A user should have an interface which allows that user to locate the information required to respond in a useful way to a query.

The expert online searcher says with glee, “I love it when free online search services make finding information easy. Best of all is Amazon. It suggests so many things I absolutely need.” Hey, MidJourney, thanks for the image without suggesting that Mother MJ okay my word choice. “Whoever said, ‘Nothing worthwhile comes easy’ is pretty stupid,” shouts our sliding board slider.

Advanced search in my dinobaby mental space means Boolean operators like AND, OR, and NOT, among others. Advanced search requires other meaningful “tags” specifically designed to minimize the ambiguity of words; for example, terminal can mean transportation or terminal can mean computing device. English is notable because it has numerous words which make sense only when a context is provided. Thus, a Field Code can instruct the retrieval system to discard the computing device context and retrieve the transportation context.
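To make the Boolean-plus-field-code idea concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption: the three-document collection, the `subject:` prefix standing in for a field code, and the helper names are hypothetical and do not come from any particular retrieval system.

```python
from collections import defaultdict

# Hypothetical mini-collection. The "subject" value plays the role of a field code.
DOCS = {
    1: {"text": "new airport terminal opens for travelers", "subject": "transportation"},
    2: {"text": "terminal emulator settings for unix", "subject": "computing"},
    3: {"text": "bus terminal schedule and airport shuttle", "subject": "transportation"},
}


def build_index(docs):
    """Map each word, and each subject:value pair, to the set of matching doc ids."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for word in doc["text"].split():
            index[word].add(doc_id)
        index["subject:" + doc["subject"]].add(doc_id)
    return index


INDEX = build_index(DOCS)
ALL_IDS = set(DOCS)


def term(t):
    """Return the ids of documents containing a word or a field-code tag."""
    return INDEX.get(t, set())


def boolean_and(a, b):
    return a & b


def boolean_or(a, b):
    return a | b


def boolean_not(a):
    return ALL_IDS - a


# terminal AND subject:transportation (keep only the transportation sense)
transport = boolean_and(term("terminal"), term("subject:transportation"))

# terminal AND NOT subject:transportation (discard the transportation sense)
not_transport = boolean_and(term("terminal"), boolean_not(term("subject:transportation")))

# airport OR unix
either = boolean_or(term("airport"), term("unix"))
```

In this toy collection the word "terminal" alone matches all three documents; adding the field code narrows the result to the sense the searcher wants, which is the precision the paragraph above describes.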

The write up makes clear that for today’s users training wheels are important. Are these “aids,” like icons, images, and bundles of results under a category, dark patterns or assistance for a user? I can only imagine the push back I would receive if I were in a meeting with today’s “user experience” designers. Sorry, kids. I am a dinobaby.

I really want to work through seven reasons advanced search sucks. But I won’t. The number of people who know how to use key word search is tiny. One number I heard when I was a consultant to a certain big search engine is less than three percent of the Web search users. The good news for those who buy into the arguments in the cited article is that dinobabies will die.

Is it a lack of education? Is it laziness? Is it what most of today’s users understand?

I don’t know. I don’t care. A failure to understand how to obtain the specific information one requires is part of the long slow slide down a descent gradient. Enjoy the non-advanced search.

Stephen E Arnold, August 28, 2023

Academic Research Resources: Smart Software Edition

August 8, 2023

Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

One of my research team called my attention to “The Best AI Tools to Power Your Academic Research.” The article identifies five AI-infused tools; specifically:

  • ChatPDF
  • Consensus
  • Elicit.org
  • Research Rabbit
  • Scite.ai

Each of the tools is described briefly. The “academic research” phrase is misleading. These tools can provide useful information related to inventors and experts (real or alleged), specific technical methods, and helpful background or context for certain social, political, and intellectual issues.

If you have access to an LLM question-and-answer system, experiment with article summaries, lists of information, and names of people associated with a particular activity. Give a ChatGPT system a whirl too.

Stephen E Arnold, August 8, 2023

AI-Search Tool Talpa Burrows Into Library Catalogues

July 19, 2023

Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

For a few years now, libraries have been able to augment their online catalogue with enrichment services from Syndetics Unbound, which adds details and imagery to each entry. Now the company is incorporating new AI capabilities, we learn from its write-up, “Introducing Talpa Search.” Talpa is still experimental and is temporarily available to libraries already using Syndetics Unbound.

A book lover in action. Thanks MidJourney. You made me more appealing than I was in 1951 when I got kicked out of the library for reading books for adults, not stuff about Freddy the Pig.

Participating libraries will get a year of the service for free. We cannot know just how much they will be saving, though, since the pricing remains a mystery. Writer Tim Spalding describes how Talpa works:

“First, Talpa queries large language models (from Claude AI and ChatGPT) for books and other media. Critically, every item is checked against true and authoritative bibliographic data, solving the problem of invented answers (called ‘hallucinations’) that such models can fall into. Second, Talpa uses the natural-language abilities of large language models to parse and understand queries, which are then answered using traditional library data. Thus a search for ‘novels about World War II in France’ is broken down into subjects and tags and answered with results from the library’s collection. Our authoritative book data comes from Syndetics Unbound, Bowker and LibraryThing. Surprisingly, Talpa’s ability to find books by their cover design isn’t powered by AI at all, but by the effort of thousands of book lovers who have played LibraryThing’s CoverGuess cover-tagging game since 2010!”
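The check-against-the-catalog step in that description can be sketched in a few lines of Python. This is not Talpa's implementation, which the write-up does not show. The catalog contents, record ids, and function names below are hypothetical, and the hard-coded suggestion list stands in for whatever a language model might return.

```python
# Hypothetical authoritative catalog keyed by normalized (title, author).
CATALOG = {
    ("suite francaise", "irene nemirovsky"): "rec-001",
    ("all the light we cannot see", "anthony doerr"): "rec-002",
}


def normalize(text):
    """Lowercase and collapse whitespace so trivial differences do not block a match."""
    return " ".join(text.lower().split())


def verify_candidates(candidates, catalog):
    """Keep only the model-suggested items that exist in the authoritative catalog."""
    verified = []
    for title, author in candidates:
        record_id = catalog.get((normalize(title), normalize(author)))
        if record_id is not None:
            verified.append((title, author, record_id))
    return verified


# Stand-in for model output. The last entry is deliberately invented
# to represent a hallucinated title.
llm_suggestions = [
    ("Suite Francaise", "Irene Nemirovsky"),
    ("All the Light  We Cannot See", "Anthony Doerr"),
    ("The Baker of Lyon", "A. N. Author"),
]

results = verify_candidates(llm_suggestions, CATALOG)
```

The design point is that the model only proposes candidates; anything without a matching catalog record is dropped before the user sees it. A production system would presumably need fuzzier matching than the exact lookup used here.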

Interesting. If you don’t happen to be part of a library using Syndetics, you can try Talpa out at one of the three libraries linked to in the post. The tool sports a cute mole mascot and, to add a bit of personality, supplies mole facts beneath the search bar. As with many AI tools, the functionality has plenty of room to grow. For example, my search for “weaving velvet” did return a few loom-centered books scattered through the results but more prominently suggested works of fiction or philosophy that simply contained “velvet” in the title. (Including, adorably, several versions of “The Velveteen Rabbit.”) The write-up does not share when the tool will be available more widely, but we hope it will be more refined when it is. Is it AI? Isn’t everything?

Cynthia Murrell, July 19, 2023

Amazon Is Winning the Product Search Derby… for Now

July 12, 2023

Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

Google cannot be happy about these numbers. We learn from a piece at Search Engine Land that now “50% of Product Searches Start on Amazon.” That is even worse for the competition than previously predicted. In fact, Google’s share of this market has slipped to less than a third at 31.5%. What’s Google’s solution to this click loss? Higher ad pricing? Or maybe an even higher ad-to-real content ratio?

The search racers are struggling to win traffic related to products. What has Amazon accomplished? Has Google’s vehicle lost power? What about Microsoft, a company whose engine is Bing-ing?

We also learn just 14% of respondents start their searches at retail or brand websites, while social media and review sites each capture a measly 2%. But that could change as Generation Z continues to age into independent shoppers. That group is the most likely to launch searches from social media. They are also most inclined to check online reviews. Reviews with photos are especially influential. Writer Danny Goodwin cites a recent Pew survey as he writes:

“Reviews and ratings can make or break a sale more than any other factor, including product price, free shipping, free returns and exchanges, and more. Overall, 77% of respondents said they specifically seek out websites with reviews – and this number was even higher for Gen Z (87%) and millennials (81%). Ratings without accompanying reviews are considered untrustworthy by 56% of survey respondents. Where people read reviews and ratings:

  • Amazon: 94%
  • Retail websites (e.g., Target, Wal-Mart): 91%
  • Search engines: 70%
  • Brand websites (the brand that manufactures the product): 68%
  • Independent review sites: 40%

User-generated photos and videos gain value. Sixty percent of consumers looked at user-generated images or videos when learning about new products.

  • 77% of respondents said they trust customer photos and videos.
  • 53% said user-generated photos and videos from previous customers impacted their decision whether to purchase a product.”

So there you have it—if you have a product to market online, best encourage reviews. With pics, or it didn’t happen. Videos are a significant marketing factor. What happens if Zuck’s Threads pushes into product search, effectively linking text promotions with Instagram? And the Google? Let’s ask Bard?

Cynthia Murrell, July 12, 2023

Scinapse Is A Free Academic-Centric Database

July 11, 2023

Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

Quality, academic-worthy databases are difficult to locate outside of libraries and schools. Google Scholar attempted to qualify as an alternative to paywalled databases, but it returns repetitive and inaccurate results. Thanks to AI algorithms, free databases such as Scinapse have improved.

Scinapse is designed by Pluto, and it is advertised as the “researcher’s favorite search engine.” Scinapse delivers accurate and updated research materials in each search. Many free databases pull their results from old citations and fail to include recent publications. Pluto promises Scinapse delivers high-performing results due to its original algorithm optimized for research.

The algorithm returns research materials based on when they were published, how many times they were cited, and how impactful a paper was in notable journals. Scinapse consistently delivers results that are better than Google Scholar. Each search item includes a complete citation for quick reference. The customized filters offer the typical ways to narrow or broaden results, including journal, field of study, conference, author, publication year, and more.

People can also create an account to organize their research in reading lists, share with other scholars, or export as a citation list. Perhaps the most innovative feature is the paper recommendations where Scinapse sends paper citations that align with research. Scinapse aggregates over 48,000 journals. There are users in 196 countries and 1,130 reputable affiliations. Scinapse’s data sources include Microsoft Research, PubMed, Semantic Scholar, and Springer Nature.

Whitney Grace, July 11, 2023

In the Midst of Info Chaos, a Path Identified and Explained

July 10, 2023

Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

The Threads – Twitter spat in the midst of BlueSky and Mastodon marks a modest change in having one place to go for current information. How does one maintain awareness with high school taunts flying, Mastodon explaining how easy it is to use, and BlueSky doing its deep gaze thing?

One answer and a quite good one at that appears in “RSS for Post-Twitter News and Web Monitoring.” The author knows quite a bit about finding information, and she also has the wisdom to address me as “dinobaby.” I know a GenZ when I get an email that begins, “Hey, there.” Trust me. That salutation does not work as the author expects.

In the cited article, you will get useful information about newsfeeds, screenshots, and practical advice. Here’s an example of what’s in the excellent how-to:

If you want to check a site for RSS feeds and you think it might be a WordPress site, just add /feed/ to the end of the domain name. You might get a 404 error, but you also might get a page full of information!
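For readers who want to automate that tip, here is a small Python sketch using only the standard library. The function names are hypothetical, and the check assumes a WordPress-style site that serves its feed at /feed/ on the domain root, which, as the tip itself says, may not hold.

```python
import urllib.request
from urllib.parse import urlsplit, urlunsplit


def wordpress_feed_url(site):
    """Return the site's domain with /feed/ appended, dropping any path or query."""
    if "://" not in site:
        site = "https://" + site  # assume HTTPS when no scheme is given
    parts = urlsplit(site)
    return urlunsplit((parts.scheme, parts.netloc, "/feed/", "", ""))


def feed_exists(site, timeout=10):
    """Try the /feed/ URL; a 404 or any network problem counts as 'no feed found'."""
    url = wordpress_feed_url(site)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, ValueError):
        # urllib's HTTP and URL errors, and timeouts, are all OSError subclasses.
        return False
```

Calling `feed_exists("example.com")` makes a live network request, so the result depends on the site. A `True` only means the URL answered; it does not confirm that the page is a valid RSS feed.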

There are more tips. Just navigate to Research Buzz, and learn.

This dinobaby awards one swish of its tail to Tara Calishain. Swish.

Stephen E Arnold, July 10, 2023

Neeva: Is This Google Killer on the Run?

May 18, 2023

Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

Sometimes I think it is 2007 doing the déjà vu dance. I read “Report: Snowflake Is in Advanced Talks to Acquire Search Startup Neeva.” Founded by Xooglers, Neeva was positioned to revolutionize search and generate subscription revenue. Along the highway to the pot of gold, Neeva would deliver on point results. How did that pay for search model work out?

According to the article:

Snowflake Inc., the cloud-based data warehouse provider, is reportedly in advanced talks to acquire a search startup called Neeva Inc. that was founded by former Google LLC advertising executive Sridhar Ramaswamy.

Like every other content processing company I bump into, Neeva was doing smart software. Combine the relevance angle with generative AI and what do you get? A start up that is going to be acquired by a firm with some interesting ideas about how to use search and retrieval to make life better.

Are there other search outfits with a similar business model? Sure, Kagi comes to mind. I used to keep track of start ups which had technology that would provide relevant results to users and a big payday to the investors. Do these names ring a bell?

Cluuz
Deepset
Glean
Kyndi
Siderian
Umiboza

If the Snowflake Neeva deal comes to fruition, will it follow the trajectory of IBM Vivisimo? Vivisimo disappeared as an entity and morphed into a big data component. No problem. But Vivisimo was a metasearch and on-the-fly tagging system. Will the tie up be similar to the Microsoft acquisition of Fast Search & Transfer? Fast still lives, but I don’t know too many Softies who know about the backstory. Then there is the HP Autonomy deal. The acquisition is still playing out in the legal eagle sauna.

Few care about the nuances of search and retrieval. Those seemingly irrelevant details can have interesting consequences. Some are okay like the Dassault Exalead deal. Others? Less okay.

Stephen E Arnold, May 18, 2023

Am I a Moron Because I Use You.com?

May 10, 2023

Note: This essay is the work of a real and still-alive dinobaby. No smart software involved, just a dumb humanoid.

“Only Morons Use ChatGPT As a Substitute for Google” is a declarative statement. Three words strike me as important in the title of the Lifehacker (an online publication) article.

First, “morons.” A moron according to TheFreedictionary.com citation is: A city in Eastern Argentina although it has the accented ó. On to the next definition which is “A person who is considered foolish or stupid.” I think this is closer to the mark. I am not comfortable invoking the third definition because it aims a denotative punch at a person having a mental age of from seven to 12. I am 78, so let’s go with “foolish or stupid.” I am in that set.

Second, “ChatGPT.” I think the moniker can apply specifically to the for-fee service of OpenAI. It is possible that “ChatGPT” stands for an entire class of generative software. I tried to make a list of a who’s who in generative software and abandoned the task. Quite a few companies are in the game either directly like the aforementioned OpenAI or a bandwagon of companies joyfully tallied by ProductWatch.com and a few LinkedIn contributors. I think the idea is that ChatGPT outputs content which is either derivative (a characteristic of a machine eating other people’s words and images) or hallucinatory (a feature of software which can go off the rails and output like a digital Lewis Carroll galumphing around a park in which young females frolic).

Third, “Google.” My hunch is that the author is an expert online searcher who, like many open source intelligence professionals, relies on the advertising-supported Google search for objective, on-point answers. Oh, my, that’s quite a reliable source of information. I want to point out that Google focuses on revenue-generation from advertising. Accuracy of results often has little connection to the user’s query. My interpretation of the word “Google” is that Google is good, probably better than “ChatGPT” in providing answers designed to meet the needs of users who may not read above the 9th grade level, struggle with derivatives, and cannot name the capital of Tasmania. (It is Hobart, by the way.)

I am on the fence with the word “only.” I am not comfortable with categorical affirmatives. Given the context of the article and the fact that Google is the Web search engine of choice (conscious or manipulated) for 90 percent of people in North America and Western Europe, I can understand why the field of view is narrow. An expert with regard to Google knows more and more about less and less.

Why is ChatGPT presented as the yin to Google’s yang? The write up says:

If you can’t fact-check the bot, the answers it gives are useless.

Okay, so people fact-check Google outputs. What does Google present with regard to Larry Page and his alleged interactions with Mr. Epstein, the late Mr. Epstein I should say? Run the query Google Page Epstein, and this is what the system presented to me on May 6, 2023:

[Image: screenshot of the Google results for the query]

Are these stories accurate? Is CNBC the most relevant story about a lawsuit involving JPMorgan Chase and Jeffrey Epstein? What about the New York Post article “Virgin Islands Struggles to Subpoena Google’s Larry Page in Jeffrey Epstein Lawsuit against JPMorgan”? Or the tech insider outfit Gizmodo, which offers “Google’s Larry Page Could Be Served in Jeffrey Epstein Case”? There are citations to the British newspaper Telegraph and one from Law & Crime. There is a Yahoo story and a Daily Beast article.

Now what does You.com offer me when I query Google Page Epstein?

There have been recent news articles reporting that the US Virgin Islands has issued subpoenas to Google co-founder Larry Page in connection with a lawsuit against JPMorgan Chase over the Jeffrey Epstein sex trafficking case 1. The lawsuit alleges that JPMorgan Chase facilitated Epstein’s trafficking scheme by continuously offering financial support to Epstein despite knowing about the illegal activities conducted at Little St. James, a private island owned by Epstein. The subpoenas seek documents and communications related to JPMorgan’s banking relationship with Epstein. Sergey Brin, another Google co-founder, has also reportedly been subpoenaed. Despite the subpoenas, it appears that the US Virgin Islands is struggling to serve Page with the legal documents. A recent ruling allows Page to be served with the subpoenas via his attorney in California. cnbc1

I would suggest that the Google citations provide a list, with no indication of which source is more or less highly regarded for accuracy. Google wants me to click on one or more of the links, ingest the content of each article, and then synthesize the items of information which strike me as on the money. You.com on the other hand provides me with the bare bones of the alleged involvement with a person who like Lewis Carroll may have had an interest in hanging out around a park on a sunny Saturday afternoon. Catching some rays and perhaps coming up with new ideas are interpretations of such an action by a lawyer hired to explain the late and much lamented Mr. Epstein.

So which is it? The harvesting of buckwheat the old-fashioned way or the pellet of information spat out in a second or two?

I think the idea is that morons are going to go the ChatGPT-like route. Wizards and authors of online “real” news articles want to swing that sickle and relive the thrill of the workers in Vincent van Gogh’s “The Harvest.”

The article says:

you can’t tell whether an AI-generated fact is true or not by the way the text looks; it’s designed to look plausible and correct. You have to fact-check it.

Does one need to fact-check what Google spits out? What about the people who follow Google Maps’s instructions and drive off a cliff? What about the links in Google Scholar to papers with non-reproducible results?

Here’s the conclusion to the write up:

So if you want to use ChatGPT to get ideas or brainstorm places to look for more information, fine. But don’t expect it to base its answers on reality. Even for something as innocuous as recommending books based on your favorites, it’s likely to make up books that don’t even exist.

I like that “don’t even exist.” Google Bard would never do that. Google management would never fire a smart software executive who points out that Google’s smart software is biased. Google would never provide search results that explain how to steal copyright protected software. Well, maybe just one time like this:

[Image: screenshot of the search results]

Oh, no. Wonky software would never ever do that but for Google’s results via YouTube for the query “Magix Vegas crack.” Now who is a moron? Perhaps an apologist for Google?

Stephen E Arnold, May 10, 2023

Divorcing the Google: Legal Eagles Experience a Frisson of Anticipation

April 24, 2023

No smart software has been used to create this dinobaby’s blog post.

I have poked around looking for a version or copy of the contract Samsung signed with Google for the firms’ mobile phone tie up. Based on what I have heard at conferences and read on the Internet (of course, I believe everything I read on the Internet, don’t you?), it appears that there are several major deals.

The first is the use of and access to the mindlessly fragmented Android mobile phone software. Samsung can do some innovating, but the Google is into providing “great experiences.” Why would a mobile phone maker like Samsung allow a user to manage contacts and block mobile calls without implementing a modern day hunt for gold near Placer?

The second is the “suggestion” — mind you, the suggestion is nothing more than a gentle nudge — to keep that largely-malware-free Google Play Store front and center.

The third is the default search engine. Buy a Samsung get Google Search.

Now you know why the legal eagles are shivering when they think of litigation to redo the Google – Samsung deal. For those who think about the misinformation zipping around about Microsoft Bing displacing Google Search, my thought would be to ask yourself, “Who gains by pumping out this type of disinformation?” One answer is big Chinese mobile phone manufacturers. This is Art of War stuff, and I won’t dwell on this. What about Microsoft? Maybe, but I like to think happy thoughts about Microsoft. I say, “No one at Microsoft would engage in disinformation intended to make life difficult for the online advertising king.” Another possibility is Silicon Valley type journalists who pick up rumors, amplify them, and then comment that Samsung is kicking the tires of Bing with ChatGPT. Suddenly a “real” news outfit emits the Samsung rumor. Exciting for the legal eagles.

The write up “Samsung Can’t Dump Google for Bing As the Default Search Engine on Its Phones” does a good job of explaining the contours of a Google – Samsung tie up.

Several observations:

First, the alleged Samsung search replacement provides a glimpse of how certain information can move from whispers at conferences to headlines.

Second, I would not bet against lawyers. With enough money, contracts can be nullified, transformed, or left alone. The only option which disappoints attorneys is the one that lets sleeping dogs lie.

Third, the growing upswell of anti-Google sentiment is noticeable. That may be a far larger problem for Googzilla than rumors about Samsung. Perceptions can be quite real, and they translate into impacts. I am tempted to quote William James, but I won’t.

Net net: If Samsung wants to swizzle a deal with an entity other than the Google, the lawyers may vibrate with such frequency that a feather or two may fall off.

Stephen E Arnold, April 24, 2023
