Phi Beta Iota Interviews Stephen E Arnold about Shaped Web Search Results

August 29, 2018

Robert David Steele, publisher of the Phi Beta Iota blog, interviewed Stephen E Arnold about allegations related to Google search results. The interview reveals that some Web search systems make it possible to modify search results to return specific information. The example Stephen gives comes from the FirstGov.gov US government search system powered in the early 2000s by Fast Search & Transfer.

Steele highlighted this statement from the interview:

“There is not enough money available to start over at Google. After two decades of fixing, tweaking, and enhancing, Google search is sort of chugging along. I think it is complex and swathed like a digital mummy in layers of code.”

You can read the full text of the interview titled “Robert Steele: An Interview with Stephen E. Arnold on Google and Google Search — How the Digital Mummy Might Manipulate Search.”

The three monographs Stephen wrote about Google are no longer in print. However, he does have fair copies (prepublication drafts) of the manuscripts. If you are interested in these reports, write to benkent2020 at yahoo dot com.

Kenny Toth, August 29, 2018

Finding Information Is Difficult: How about Books to Read?

August 29, 2018

For a long time, search has been dominated by the big names in the business, and any claim that a newcomer might threaten Google or Bing has usually been laughable. However, niche engines are beginning to fill a void that the big dogs can’t. We discovered more from a recent Make Use Of story, “The 11 Best Sites for Finding What Books to Read Next.”

The most interesting was about Gnooks, which said:

“Gnooks is probably the simplest of these sites to use. You can enter up to three author’s names, and Gnooks will recommend another author you might like.”

We noted:

“The interface is clean and distraction-free, but if you want to find out more about the recommended authors, you’ll have to take your search elsewhere.”

It’s a weird reversal of how the Internet originally felt. Back then everything was pigeonholed just like this, and maybe we had something right. Aside from books, there are also niche engines for travel and, our personal favorite, for seeing which movies are streaming on which sites. This is a welcome service. Niche finding sites remain useful and underscore the limitations of the search superstore approach.

Patrick Roland, August 29, 2018

Internet Search Engines that Reach Past Bing or Google Search

August 27, 2018

An article at Kimallo shares a roster of their ten “Most Valuable Deep Web Search Engines.” Billed as a list of search engines that plumb depths not found in a Google or Bing search, this collection is indeed that. One could wish the Dark Web and the Deep Web were not conflated in the piece’s introduction, but anyone who is fuzzy on the difference can click here for clarification. The list is an assortment of search engines that tap into the Deep and/or Dark Web to different degrees in different ways. Only one, “not Evil,” uses Tor, about which we’re told:

“Unlike other Tor search engines, not Evil is not for profit. The cost to run not Evil is a contribution to what one hopes is a growing shield against the tyranny of an intolerant majority. Not Evil is another search engine in the Tor network. According to its functionality and quality it is highly competitive with the competitors. There is no advertising and tracking. Due to thoughtful and continuously updated algorithms of search it is easy to find the necessary goods, content or information. Using not Evil, you can save a lot of time and keep total anonymity. The user interface is highly intuitive. It should be noted that previously this project was widely known as TorSearch.”

The other nine entries include people-prying tools pipl and mylife; metasearch engines Yippy, Fazzle, and privacy-centric DuckDuckGo; SurfWax, which seeks to turn search into a “visual process”; StartPage, another platform emphasizing privacy; the Wayback Machine, an archive of open web pages; and Google Scholar, which can be configured to access the NCSU Libraries’ databases and journal subscriptions. I’ll add that Beyond Search pointed out Ichidan last autumn, a search engine designed to look up sites hosted through the Tor network. Though one should not rely on the Kimallo article to distinguish between these general Web classifications, anyone who would like to go beyond the reach of Bing or Google may want to explore these options.

One question: Do metasearch systems go “beyond” Google? Some here at Beyond Search believe metasearch engines are recyclers, not indexes which point to content not included in primary spidering and indexing systems.

Cynthia Murrell, August 27, 2018

Web Search with Privacy: SearX

August 24, 2018

For far too long we have been living in the Wild West of search: there are too few rules and personal data has been far too fluid. While we wait for the Googles of the world to change their policies (fat chance!) the time has come to find alternatives for those of us who care about keeping our privacy a top priority. We learned more about this revolution from a Make Use Of story, “Avoid Google and Bing: 7 Alternative Search Engines That Value Privacy.”

According to the story:

“Functionally, SearX is a metasearch engine, meaning it aggregates data from a number of other search engines then provides you with the best mix available. Results from several of the other search engines on this list—including DuckDuckGo, Qwant, and StartPage—are available. You can customize the engines that SearX uses to find results in the Preferences menu.”
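
For readers who want a feel for what “aggregates data from a number of other search engines” can mean in practice, here is a minimal sketch in Python. It is not SearX’s actual code. The engine names, the sample URLs, and the reciprocal-rank scoring are all assumptions made for illustration, and no real search engine is queried.

```python
from collections import defaultdict

def merge_results(results_by_engine):
    """Merge ranked URL lists from several engines into one ranked list.

    The scoring is an assumption for illustration: each engine adds
    1 / rank to a URL's score, so URLs that several engines rank
    highly rise to the top of the merged list.
    """
    scores = defaultdict(float)
    for urls in results_by_engine.values():
        for rank, url in enumerate(urls, start=1):
            scores[url] += 1.0 / rank
    # Highest score first; ties are broken alphabetically by URL.
    return sorted(scores, key=lambda url: (-scores[url], url))

# Hypothetical result lists standing in for real engine responses.
sample = {
    "engine_a": ["https://example.org/a", "https://example.org/b"],
    "engine_b": ["https://example.org/b", "https://example.org/c"],
}
merged = merge_results(sample)
print(merged)
```

The page that both hypothetical engines return ends up first, which is the “best mix” idea in miniature. Note that the merged list contains nothing the underlying engines did not already return.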

Is a new search engine the answer? Probably not. In another time, we might have argued that the world has room for more search engines, but with the rise of voice search and the amount of money needed to research this type of thing, the odds of a new search engine displacing Google or the like are very slim. There are other privacy-centric Web search systems; for example, Unbubble.

The question becomes, “Are these systems private, or are the data available to authorities with the proper documentation?” Marketing is different from privacy for some people.

Patrick Roland, August 24, 2018

DuckDuckGo and Its View of Google

August 16, 2018

A post at the Search Engine Journal reproduces a series of tweets—“DuckDuckGo Blasts Google for Anti-Competitive Search Behavior,” they report. Writer Matt Southern introduces the captured tweets, noting that DuckDuckGo seems to have been prompted by the record $5 billion fine recently levied on Google by the EU for antitrust violations. Here’s what DuckDuckGo had to say about specific ways Googley practices have affected them:

“We welcome the EU cracking down on Google’s anti-competitive search behavior. We have felt its effects first hand for many years and has led directly to us having less market share on Android vs iOS and in general mobile vs desktop.”

We noted:

“Up until just last year, it was impossible to add DuckDuckGo to Chrome on Android, and it is still impossible on Chrome on iOS. We are also not included in the default list of search options like we are in Safari, even though we are among the top search engines in many countries.”

And this statement was interesting:

“The Google search widget is featured prominently on most Android builds and is impossible to change the search provider. For a long time it was also impossible to even remove this widget without installing a launcher that effectively changed the whole way the OS works. Their anti-competitive search behavior isn’t limited to Android. Every time we update our Chrome browser extension, all of our users are faced with an official-looking dialogue asking them if they’d like to revert their search settings and disable the entire extension.”

Google owns the domain Duck.com, which redirects to the Google home page and may confuse some DuckDuckGo users. Southern notes the privacy-centric search engine continues to dog Google on Twitter; for example, they recently called it a “myth” that users cannot be tracked when using (Google-owned) Chrome in Incognito mode and linked to a post that details why their process is far more effective at protecting user privacy. I suggest the curious navigate to that resource for the technical details.

BeyondSearch believes that DuckDuckGo is a metasearch system with some unique content. Depending on one’s point of view, there may be significant differences between DuckDuckGo and primary Web indexing systems like Exalead, Qwant, or Yandex. Running the same query on different systems is often a useful way to get a sense of what is in an index and what is not.
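
A rough way to run that comparison is to collect the top results for one query from two systems and measure how much they overlap. The sketch below is a minimal illustration in Python; the URLs are made up, and the Jaccard measure is simply one reasonable choice, not a method any of these search vendors use.

```python
def jaccard(a, b):
    """Share of distinct URLs that two result lists have in common (0.0 to 1.0)."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)

def unique_to(a, b):
    """URLs in list a that never appear in list b, kept in their original order."""
    seen = set(b)
    return [url for url in a if url not in seen]

# Hypothetical top results for the same query on two systems.
system_one = ["https://example.com/1", "https://example.com/2", "https://example.com/3"]
system_two = ["https://example.com/2", "https://example.com/3", "https://example.com/4"]

overlap = jaccard(system_one, system_two)
only_one = unique_to(system_one, system_two)
print(overlap, only_one)
```

A high overlap hints that two systems draw on much the same index. The URLs unique to one system are the ones worth a closer look.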

Cynthia Murrell, August 14, 2018

Code Search Capability Offers New Options

August 13, 2018

The days of sifting through code like a prospector panning for a sparkly gold nugget are over. Innovative technologies and groundbreaking partnerships are making vast amounts of source code just as searchable as any word combo in Google. One such pairing recently came across our desk in a blog post from Elastic, “Welcome Insight.io to the Elastic Team.”

According to the report:

“Code search capability also aligns with our vision for solutions-based offerings: by using and combining components of the Elastic Stack in a very precise way, we can deliver focused and intuitive experiences that solve specific pain points, with little to no overhead for the user. This enables delightful user experiences right out-of-the-box, with the initial hurdles and optimizations already taken care of.”

These two will make for a powerful partnership thanks to code search, but they are far from the only game in town. In fact, some familiar names are popping up in this realm, including Bing, which has been dying for an angle to beat out Google for years. Jumping into code search early might just be that niche, which would be a shocking turnabout for the red-headed stepchild of search. Worth a watch.

Patrick Roland, August 13, 2018

Google and Really, Like Cool Expert Search

August 10, 2018

To say I was surprised by Google’s celebrity search comes close to the truth. I am not sure. I think I will ask a celebrity whether I was surprised or just anchored in the past. Don’t know about the really, like cool approach to getting information online? Navigate to “Google’s New Celebrity Video App Is Basically AMA for Search.” I learned:

…The search giant [that would be China bound Google, of course] released a new app called Cameos, which lets celebs record vertical full-screen video answers to commonly searched-for questions about them.

Public figures include athletes, pop stars, and (I assume) technical superstars like Messrs. Brin and Page.

The celebrities can choose what questions to answer, record those answers, and make them available to a person who asks a question about the global Gan-Gross-Prasad conjecture. Tough luck if a movie star does not know the answer. I mean like who cares? Google can have Wei Zhang record an answer for the users of this new service.

From my point of view, I would like to enter a Boolean query with date limiters and get a results list with the “Date last indexed” displayed. I would like to have access to urls for PDFs. I would like, in short, to have a search system which returned sort of relevant results.

I assume I can ask Taylor Swift type people to help me out here. Celebrity is expertise it seems.
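
For the record, the Boolean query with date limiters described above is not exotic. Here is a minimal sketch in Python over a made-up three-document index. The function name `boolean_search`, its parameters, and the sample records are all hypothetical, and a real system would use an inverted index rather than a loop.

```python
from datetime import date

# Hypothetical mini index: each record carries the date it was last indexed.
DOCS = [
    {"url": "https://example.com/a.pdf", "text": "enterprise search pricing", "indexed": date(2018, 8, 1)},
    {"url": "https://example.com/b", "text": "enterprise search history", "indexed": date(2017, 3, 5)},
    {"url": "https://example.com/c", "text": "web search pricing", "indexed": date(2018, 7, 20)},
]

def boolean_search(docs, all_of=(), none_of=(), after=None, before=None):
    """AND every term in all_of, NOT every term in none_of, then apply date limits."""
    hits = []
    for doc in docs:
        words = set(doc["text"].lower().split())
        if not all(term in words for term in all_of):
            continue
        if any(term in words for term in none_of):
            continue
        if after is not None and doc["indexed"] < after:
            continue
        if before is not None and doc["indexed"] > before:
            continue
        hits.append(doc)
    # Most recently indexed documents first.
    return sorted(hits, key=lambda d: d["indexed"], reverse=True)

results = boolean_search(DOCS, all_of=["search", "pricing"], none_of=["web"], after=date(2018, 1, 1))
lines = [f'{d["url"]}  (Date last indexed: {d["indexed"].isoformat()})' for d in results]
print("\n".join(lines))
```

The query reads as search AND pricing NOT web, limited to documents indexed in 2018 or later, and each result line shows the date the document was last indexed.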

Stephen E Arnold, August 10, 2018

Elastic Teams With Startup Insight.io for Semantic Search

August 10, 2018

We’ve learned that a search company we’ve been following with some interest, Elastic, is pairing with a Palo Alto-based startup to develop and integrate semantic search tools. Computer Weekly shares some details in, “Elastic Puts ‘Semantic Code Search’ Into Stack With Insight.io.” Writer Adrian Bridgwater tells us:

“Known for its Elasticsearch and Elastic Stack products, Elastic insists that Insight.io’s technology is ‘highly complementary’ to other Elastic use cases and solutions—indeed, Insight.io is built on the Elastic Stack. Insight.io provides an interface to search and navigate the source code that is said to ‘go beyond’ simple free text search. Current programming language support includes C/C++, Java, Scala, Ruby, Python, and PHP. This ‘beyond text search’ function gives developers the ability to search for code pertaining to specific application functionality and dependencies. Essentially it provides IDE-like code intelligence features such as cross-reference, class hierarchy and semantic understanding. The impact of such functionality should stretch beyond exploratory question-and-answer utility, for example, enabling more efficient onboarding for new team members and reducing duplication of work for existing teams as they scale.”
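
To make the “beyond text search” distinction concrete, here is a minimal sketch using Python’s standard `ast` module. It is not Insight.io’s technology. The sample source and the helper names are assumptions chosen to show how a cross-reference lookup differs from a plain text match.

```python
import ast

# Hypothetical source file to search.
SOURCE = '''
def load(path):
    return path

def main():
    note = "call load later"
    return load("data.txt")
'''

def find_definitions(tree, name):
    """Line numbers where a function called `name` is defined."""
    return [node.lineno for node in ast.walk(tree)
            if isinstance(node, ast.FunctionDef) and node.name == name]

def find_calls(tree, name):
    """Line numbers where `name` is called as a plain function."""
    return [node.lineno for node in ast.walk(tree)
            if isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == name]

tree = ast.parse(SOURCE)
definitions = find_definitions(tree, "load")
calls = find_calls(tree, "load")

# Plain text search for comparison: any line containing the string "load".
text_hits = [number for number, line in enumerate(SOURCE.splitlines(), start=1)
             if "load" in line]
print(definitions, calls, text_hits)
```

The text search also flags the string literal that merely mentions the word, while the syntax-aware lookup separates the definition from the real call. That separation is the kind of cross-reference an IDE provides.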

According to Elastic’s CEO, integration of the technology will be familiar to anyone who observed how they did it with past acquisitions, like Opbeat and Prelert. We’re also assured that all of Insight.io’s workers are being welcomed into Elastic’s development fold. Bridgwater notes that, with the startup’s Beijing-based engineering team, Elastic now has its first “formal” dev team located in China. Founded in 2012, Elastic is now based in Mountain View, California.

Cynthia Murrell, August 10, 2018

Qwant Search Now Integrated into Vivaldi Browser

July 27, 2018

We notice that French search system Qwant has been working to expand its reach. Earlier this year, the French delegation to China suggested that country consider implementing Qwant. We observed at the time that this was an interesting direction for the privacy-centered platform. Now, we learn the search engine has made its way into a rising browser from the post, “Vivaldi Update Integrates Qwant Search Engine” at gHacks. Writer Martin Brinkmann reports:

“Qwant promises that it ‘does not collect data about its users when they search,’ and that it does not use ‘any cookie nor any tracking device’ to track the browsing habits of users or create tracking profiles. The search engine does not put searchers into filter bubbles either as users from the same region will get the same set of results when they search for the same terms. You can select Qwant with a click on the small down arrow icon next to the search symbol in the search bar, or by opening the Search preferences vivaldi://settings/search/. There you can make Qwant the default search engine if you want and enable use as a private search engine. Last but not least, you may also use the nickname q to run searches on Qwant from Vivaldi’s address bar. Just type q searchterm to do so.”

Billed as a browser for “power users,” Vivaldi tends to put out a new release about four times a year. Its version 1.15 was released in April, and the inclusion of Qwant takes place in the most recent of three updates pushed out since then. See the write-up for a list of the other improvements. Vivaldi still captures just a small portion of internet search traffic around the world, but is clearly working to grow those numbers. Founded in 2014, Vivaldi is based in Oslo, Norway. The Paris-based Qwant was founded in 2011 and launched its search engine in 2013. Qwant incorporates some of the spirit of the Pertimm search and retrieval system. Pertimm was, shall we say, quirky.

Cynthia Murrell, July 27, 2018

Insight into Google Image Search

July 22, 2018

I read “This Is What Happens When You Google the Word ‘Idiot.’” The insight pivots on a query sent to Google Images for the word “idiot.” The results presented images of the US president. The same query fed to Bing generates a set of results without the image of Donald Trump. Here’s the explanation about the “why” of these results:

Google states that “[image search] analyzes the text on the page adjacent to the image, the image caption and dozens of other factors to determine the image content.” Added to that, Google uses sophisticated algorithms to remove duplicate images and ensure that the best quality images are presented first in your results. What this means is that whoever writes an article determines (mostly, there are other factors too) whether an image appears in Google Image Search results or not. This partly depends on the keywords they use adjacent or in the caption of the image, not necessarily the “content” of the image. Also, Google indexes the images on a website the same way it indexes web pages, by crawling across the Internet periodically. A quick investigation of the pages in the search results for the word “idiot” proves this to be true. In each of the links where Donald Trump’s image appears, the word “idiot” appears as a keyword and in most cases close to his image or sometimes in the caption.

Seems simple enough. Word plus image equals relevance.
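
The “word plus image” idea can be sketched in a few lines of Python. This is an illustration only, not Google’s algorithm. The weights, the field names, and the sample pages are assumptions, and the “dozens of other factors” mentioned in the explanation are ignored.

```python
import re

def tokenize(text):
    """Lowercase a string and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())

def score_image(query, caption, nearby_text, caption_weight=2.0):
    """Count query-term matches near an image; caption hits get an assumed extra weight."""
    terms = set(tokenize(query))
    caption_hits = sum(1 for token in tokenize(caption) if token in terms)
    nearby_hits = sum(1 for token in tokenize(nearby_text) if token in terms)
    return caption_weight * caption_hits + nearby_hits

# Hypothetical images with the text a crawler found around them.
images = [
    {"file": "a.jpg", "caption": "A red bicycle", "nearby": "Our bicycle review covers ten models."},
    {"file": "b.jpg", "caption": "Mountain view", "nearby": "We rode a bicycle to the summit."},
    {"file": "c.jpg", "caption": "City skyline", "nearby": "Downtown at night."},
]

ranked = sorted(
    images,
    key=lambda image: score_image("bicycle", image["caption"], image["nearby"]),
    reverse=True,
)
ranked_files = [image["file"] for image in ranked]
print(ranked_files)
```

Nothing in the sketch looks at the pixels. Whoever writes the caption and the surrounding text decides how the image scores, which is the point the explanation makes.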

Stephen E Arnold, July 21, 2018
