DuckDuckGo: Nibbling at the Little Toe of Googzilla

January 25, 2017

I like DuckDuckGo. I fire queries at the system and see if there are items I have missed after I have checked out Qwant.com, Unbubble.eu, Gibiru.com, Ixquick (now StartPage.com), Exalead Search, Yandex, and, oh, I almost forgot, the Google.

I read “DuckDuckGo Hits Milestone 14 Million Searches in a Single Day.” I learned:

DuckDuckGo revealed it has hit a milestone of 14 million searches in a single day. In addition, the search engine is celebrating a combined total of 10 billion searches performed, with 4 billion searches conducted in December 2016 alone. For a niche search engine that many people don’t know exists, that’s some notable year-over-year growth. Around this same time last year, DuckDuckGo was serving 8–9 million searches per day on average.

Just to keep DuckDuckGo’s achievement in perspective, Internet Live Stats says that Googzilla handles 3.5 billion searches per day. Our research suggests that there is room for Web search systems like DuckDuckGo to grow. About half of those with Internet access don’t run queries. Hey, that Facebook thing is a big deal. Also, there are some folks who are looking to expand their search horizons.

So, on a per day basis 14 million searches for DuckDuckGo and 3.5 billion searches for the GOOG.
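
For those who like to see the arithmetic, here is a minimal Python sketch that uses only the two per-day figures quoted above. The variable and function names are my own, and setting a single-day record against a daily average is a rough comparison at best.

```python
# Per-day figures quoted in the post above.
DDG_PER_DAY = 14_000_000          # DuckDuckGo's single-day record
GOOGLE_PER_DAY = 3_500_000_000    # Internet Live Stats estimate for Google


def share_percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# How many Google searches happen for each DuckDuckGo search.
ratio = GOOGLE_PER_DAY / DDG_PER_DAY

print(
    f"DuckDuckGo's record day is about "
    f"{share_percent(DDG_PER_DAY, GOOGLE_PER_DAY):.1f}% of Google's daily volume "
    f"(1 search for every {ratio:.0f})"
)
```

On those numbers, DuckDuckGo's best day works out to roughly four tenths of one percent of Google's daily traffic.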

Stephen E Arnold, January 25, 2017

Bing Gets Nostalgic

January 25, 2017

In my entire life, I have never seen so many people who were happy to welcome in a New Year. 2016 will be remembered for violence, political uproar, and other stuff that people wish to forget. Despite the negative associations with 2016, other stuff did happen, and looking back might offer a bit of nostalgia for the news and search trends of the past year. On MSFT runs down a list of what happened on Bing in 2016 in “Check Out The Top Search Trends On Bing This Past Year.”

Rather than focusing on a list of just top searches, Bing’s top 2016 searches are divided into categories: video games, Olympians, viral moments, tech trends, and feel good stories. More top searches are located over at Bing’s page. However, in the top viral trends it is nice to see that cat videos have gone down in popularity:

Ryder Cup heckler

Villanova’s piccolo girl

Powerball

Aston Martin winner

Who’s the mom?

Evgenia Medvedeva

Harambe the gorilla

#DaysoftheWeek

Cats of the Internet

Pokemon Go

On a personal level, I am surprised that Harambe the gorilla outranked Pokemon Go. Some of these trends I do not even remember making the Internet circuit, and I was on YouTube and Reddit for all of 2016. I have been around enough years to recognize that things come and go. 2016 might have come off as a bad year for many, but in reality it was just another year. It also did not forecast doomsday. That was back in 2000, folks. Get with the times!

Whitney Grace, January 25, 2017

Searchy Automates Your Search Parameters

January 25, 2017

The article on FileForum Beta News titled Searchy for Windows 0.5.1 promises users the ability to gain more control over their search parameters and prevent wasted time on redundant searches.  By using search scopes, categories, and search templates, Searchy claims to simplify and organize search. The service targets users who tend to search for similar items all day, and makes it easier for those users to find what they need without all that extra typing. The article goes into more detail,

Your daily routine consists of lots repetitive searches? With Searchy you can automate that. Just write a template for similar search queries and stop typing the same things over and over… Search using Google’s and Bing’s web, image, video and news search engines. Often performing searches on same websites? Spending much time on advanced search filters in Google or Bing? Searchy will simplify that too. Just add scopes for the websites and search filters, and use them like a boss.

Searchy was developed by freelance developer Alex Kaul, who found that entering the same phrase over and over in Google was annoying. By automating the search phrase, Searchy enables users to skip a step. It may be a small step, but as we all know, a small task when completed one hundred times a day becomes a very large and tiresome one.

Chelsea Kerwin, January 25, 2017

HonkinNews for January 24, 2017, Now Available

January 24, 2017

Another week and another search and content processing news round up is live. This week we cover the Dark Web delivery system known as the Royal Mail. Why are some Beltway Bandits developing a sudden craving for antacids? The transition from President Obama to President Trump may be a contributing factor. Some other government news caught our attention too; specifically, the slimming down of Darpa’s open source software catalog and the CIA Crest search for more than 10 million previously classified CIA documents. We also highlight IBM’s call for rules to make sure that artificial intelligence does not run amok. We are not sure if Big Blue is cracking the old buggy whip at speeding Teslas or if IBM has a grand plan to keep smart software on a short leash. Dear old Yahoot (sorry, I meant Yahoo or Yabba Dabba Hoot) figures in an anecdote about effective management. Yahoo USA is not able to convince Yahoo Japan that selling ivory is a bad thing. That item made it “tusk” in time for this week’s show. You can view the program at this link.

Kenny Toth, January 24, 2017

A Moist Anti Palantir Protest: No Bagels? No Donuts?

January 24, 2017

On January 18, 2017 three score protesters showed up in front of the Shire. (That’s what Palantir Technologies uses as a handle for its Palo Alto headquarters.) According to “Tech Employees Protest in Front of Palantir HQ over Fears It Will Build Trump’s Muslim Registry,” it was raining. Oh, there were an estimated 50 people experiencing the Great California Drought. I learned:

The organizers behind this particular protest hail from tech, Stanford, and Palo Alto and say they are concerned Palantir and its co-founder and Trump advisor Peter Thiel stand to profit off the makings of a citizen database that could be used against those identifying as followers of Islam in the United States.

I also read “Palantir Tried To Placate Protesters With Free Philz Coffee.” The “real” news service informed me:

The crowd, assembled in waterlogged windbreakers and sopping down coats, included employees from Facebook and other tech companies, along with labor activists, and students from nearby Stanford University. The hour-long protest was staged to pressure Palantir into more accountability and transparency around the databases it has built.

One hour. Wow. What a statement.

The “real” news report said:

The company [Palantir] was also hospitable to protesters, putting out a table of free Philz coffee with a little Palantir logo.

Here in Harrod’s Creek, protestors usually get a snort of moonshine. At least that has some impact. I am not sure if the folks trying to find a parking place were amused with soggy advocates of something which may not happen. But what if there is a registry and Palantir was not involved?

Will the valiant protesters identify the government contractor, assemble in front of that outfit’s building, and make their voices heard? As long as the target of the protest is near at hand and the weather cooperates. That’s a maybe then.

Stephen E Arnold, January 24, 2017

Google Allegedly Skews Results Listings to Help Itself: Surprise?

January 24, 2017

I read a Wall Street Journal article recycling research from a search engine optimization outfit called SEMrush. The idea is that Google looks at a user’s query and puts links to its products where its search users will see them and click on them. Doesn’t Diego Simeone’s kid play professional soccer? I suppose the successful coach of Atlético Madrid has skewed other coaches’ interest in his gifted, advantaged progeny.

My Wall Street Journal online account doesn’t work. No joy. I take the dead tree version of Mr. Murdoch’s flagshipish newspaper, but it is a hassle to provide links to online content which will not appear. Therefore, I have watched the “revelations” about Google’s fiddled results as it flashed around the interwebs.

I noted a version of the story with the crafty title “Google Buys Ad Space above Search Results to Promote Its Own Products – Giving It an Advantage over Its Online Competitors.” The write up provides a clear explanation of Google’s alleged misdeeds. I can hear the shouts from some, “Bench Simeone’s kid.”

The write up asserts:

According to the Journal, whenever someone enters a search term in Google related to pieces of hardware, ads for the relevant items sold by either Google or a sister company would appear in the most prominent spot on the page 91 percent of the time. In 43 percent of instances, the top two ads were for Google-linked products.

The write up embraces the meat of the SEMrush research revelation:

Google’s practice of favoring its own product ads on relevant search results has raised questions over whether it is violating anti-trust laws.

My view is that Google displays information in search results in ways that serve the company’s own goals. Remember. These are my observations based on my research for my three Google monographs and the columns I wrote about Google for Information Today for two or three years. Links to the articles are on my LinkedIn page and on the Information Today Web site.

Observation 1. Google makes decisions like any other Sillycon Valley company; that is, product managers or their ilk cajole engineers to make changes which generate revenue. Some senior executives are unaware or partially aware of these manual and algorithmic tweaks. In large, chaotic outfits, only a handful of people may know what’s been twiddled. Most in the outfit don’t care what their colleagues are doing. The consequences can range from nuking traffic to a Web site to giving pride of place to a Google “fave”.

Observation 2. Google needs clicks itself. What is the company going to do to push some of its own products? Why plug the iPhone when one is selling the Pixel? The Pixel marketers who manage to get some space on results screens can’t manufacture enough phones to meet demand. Some of the Pixel’s flaws go unfixed because the craziness of the Pixel allows miscommunication, missteps, and misunderstandings to flourish in the Google greenhouse. Producing clicks is tough even for Google because only a tiny fraction of Web and mobile search users click on ads and pay attention to their messages. My view is that silver tongued Googlers talk their products to the top. Google engineers just want marketing to leave them alone, as I perceive the work environment. The “pride of place” phenomenon has spread like mould in wallboard. Those who should be paying attention are involved with Loon balloons and wrangling for slots in President Trump’s administration. The business processes themselves allow the present results policies to flourish and become the de facto way to do business.

Observation 3. Google needs to pump up revenue. I know that most of the Wall Street wizards think I deserve to live in a backwater in rural Kentucky. But the reality is that the shift from desktop search traffic to mobile search traffic has started a fire in the USS Google’s lower deck. Alphabet, the parent company, has to find replacement revenues before the search revenue starts to flat line and maybe drift down. Thus, the need for money feeds the indifference to the business processes which allow Googlers to promote the company’s products in order to pump up sales. Whether Google transfers cash to buy ads or whether an engineer does what a sleek Google MBA wants makes no difference to me. The result shaping has been a characteristic of “relevance” since 2005 or 2006. Precision and recall have been killed in the battle for revenue.

I don’t like or dislike the Alphabet Google thing. I have paid some attention to the company since I met Larry Page at a search engine conference. I was, as I recall, one of the people who said on our panel that truncation was going to be in Google’s future. Mr. Page laughed at me and said, “Never.”

Guess what?

Google implemented truncation when it started to get serious about clustering and other not so CLEVER methods.

Once advertising enters search processes, objectivity, precision, and recall are doomed. After 16 years, I find it amusing that experts are just now discovering that Google search is not the same as running a query on the old fashioned Westlaw system.

There are other little surprises in the Alphabet Google system too. I documented many of these for my clients between 2002 and 2010 when I grew tired of hearing people say, “I am an expert in search.”

Yeah, right. That’s why it has taken 16 years for the nature of Google search results to catch users’ attention.

Shaped results? Ad placement fast dancing?

Big surprises, right?

Stephen E Arnold, January 24, 2017

You Too, Can Learn Linear Algebra

January 24, 2017

Algebra was invented in Persia more than one thousand years ago. It is one of the fundamental branches of mathematics and its theories are applied to many industries. Algebra ranges from solving for x to complex formulas that leave one scratching their head. If you are interested in learning linear algebra, then you should visit Sheldon Axler’s Web site. Axler, who has an apparent love for his pet cat, is a professor of mathematics at San Francisco State University.

On his Web site, Axler lists the various mathematics books he has written and contributed to. It is an impressive bibliography and his newest book is titled Linear Algebra Abridged. He describes the book as:

Linear Algebra Abridged is generated from Linear Algebra Done Right (third edition) by excluding all proofs, examples, and exercises, along with most comments. Learning linear algebra without proofs, examples, and exercises is probably impossible. Thus this abridged version should not substitute for the full book. However, this abridged version may be useful to students seeking to review the statements of the main results of linear algebra.

Algebra can be difficult, and as Axler wrote above, learning linear algebra without proofs, examples, and exercises is probably impossible. However, if you have a grounded understanding of algebra and are simply looking to brush up or study linear principles without spending a sizable chunk on the textbook, then this is a great asset. The book is free to download from Axler’s Web site, along with information on how to access the regular textbook.

Whitney Grace, January 24, 2017

Hacks to Make Your Google Dependence Even More Rewarding

January 24, 2017

The article on MakeUseOf titled This Cool Website Will Teach You Hundreds of Google Search Tips refers to SearchyApp, a collection of tricks, tips, and shortcuts to navigate Google search more easily. The lengthy list is divided into sections to be less daunting to readers. The article explains,

What makes this site so cool is that the tips are divided into sections, so it’s easy to find what you want. Here are the categories: Facts (e.g. find the elevation of a place, get customer service number,…) Math (e.g. solve a circle, use a calculator, etc.), Operators (search within number range, exclude a keyword from results, find related websites, etc.), Utilities (metronome, stopwatch, tip calculator, etc.), Easter Eggs (42, listen to animal sounds, once in a blue moon, etc.).

The Easter Eggs may be old news, but if you haven’t looked into them before they are a great indicator of Google’s idea of a hoot. But the Utilities section is chock full of useful little tools from dice roller to distance calculator to converting units to translating languages. Also useful are the Operators, or codes and shortcuts to tell Google what you want, sometimes functioning as search restrictions or advanced search settings. The Operators are worth checking out for those of us who forgot what our librarians taught us about online search.
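
To make the Operators idea concrete, here is a small Python sketch that assembles query strings using three operators named in the quoted list: a minus sign to exclude a keyword, two dots for a number range, and `related:` for similar sites. The helper names are hypothetical, and the sketch only builds the text of a query. Whether a given operator is still honored is up to the search engine, which can change or retire operators at will.

```python
def build_query(terms, exclude=(), number_range=None):
    """Assemble a search string from keywords plus optional operators."""
    parts = list(terms)
    # A leading minus sign asks the engine to drop results containing the word.
    parts += [f"-{word}" for word in exclude]
    if number_range is not None:
        low, high = number_range
        # Two dots between numbers restrict results to that numeric range.
        parts.append(f"{low}..{high}")
    return " ".join(parts)


def related_query(domain):
    """Build a query asking for sites similar to the given domain."""
    return f"related:{domain}"


print(build_query(["laptop"], exclude=["refurbished"], number_range=(300, 500)))
print(related_query("example.com"))
```

The resulting strings can be pasted into the search box as is.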

Chelsea Kerwin, January 24, 2017

Twitter: Selling and Banning Its Way to Its Future

January 23, 2017

Twitter is making news again. The company sold some tools to the Google. Google, wisely Beyond Search thinks, has not yet built up the gumption to buy the whole Twitter enchilada. And Twitter continues to annoy some professionals who use Twitter data to figure out the who, what, and why of certain illegal activities.

“Twitter Bans Award-Winning London, Ont., Company for Helping Police Track Protesters” explains:

A London, Ont., data mining company has been banned from Twitter and is being reviewed by Facebook for selling surveillance software to North American police services to monitor people at Black Lives Matter events and other public protests.

The company in question is Media Sonar, one of a number of firms which developed tools to make sense of messages and metadata generated by the folks who send information via Twitter “tweets”. (You can watch a video explaining some of the firm’s methods at this link.) Another example of a social media analysis outfit is Geofeedia which has been given a bloody nose by spasmodic Silicon Valley wizards.

The write up reports:

Media Sonar did not return calls to CBC News but its website states that it works to help clients analyze the sentiment of social media posts and can use location-based data to monitor threats.

Beyond Search believes that some high flying Silicon Valley companies develop systems and do not think about how these systems will be used. Then when the high flying Silicon Valley executives realize that their whizzy new creation has some interesting applications, the Twitter-type outfits take action. The approach is fascinating to watch.

On one hand, Twitter is struggling to develop its user base and get some sizzle back. On the other hand, the company is selling off grandma’s furniture and turning off revenue from licensees of the Twitter content stream.

Interesting stuff. Chaos monkeys in real life? Seems like it.

Stephen E Arnold, January 23, 2017

Indexing: The Big Wheel Keeps on Turning

January 23, 2017

Yep, indexing is back. The cacaphone “ontology” is the next big thing yet again. Folks, an ontology is a form of metadata. There are key words, categories, and classifications. Whipping these puppies into shape has been the thankless task of specialists for hundreds if not thousands of years. “What Is an Ontology and Why Do I Want One?” tries to make indexing more alluring. When an enterprise search system delivers results which are off the user’s information need or just plain wrong, it is time for indexing. The problem is that machine based indexing requires some well informed humans to keep the system on point. Consider Palantir Gotham. Content finds its way into the system when a human performs certain tasks. Some of these tasks are riding herd on the indexing of the content object. IBM Analyst’s Notebook and many other next generation information access systems work hand in glove with expensive humans. Why? Smart software is still only sort of smart.

The write up dances around the need for spending money on indexing. The write up prefers to confuse a person who just wants to locate the answer to a business related question without pointing, clicking, and doing high school research paper dog work. I noted this passage:

Think of an ontology as another way to classify content (like a taxonomy) that allows you to identify what the content is about and how it relates to other types of content.

Okay, but enterprise search generally falls short of the mark for 55 to 70 percent of a search system’s users. This is a downer. What makes enterprise search better? An ontology. But without the cost and time metrics, the yap about better indexing ends up with “smart content” companies looking confused when their licenses are not renewed.

What I found amusing about the write up is the claim that use of an ontology improves search engine optimization. How about some hard data? Generalities are presented instead of numbers one can examine and attempt to verify.

SEO means getting found when a user runs a query. That does not work too well for general purpose Web search systems like Google. SEO is struggling to deal with declining traffic to many Web sites and the problem mobile search presents.

But in an organization, SEO is not what the user wants. The user needs the purchase order for a client and easy access to related data. Will an ontology deliver an actionable output? To be fair, different types of metadata are needed. An ontology is one such type, but there are others. Some of these can be extracted without too high an error rate when the content is processed; for example, telephone numbers. Other types of data require different processes which can require knitting together different systems.
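
As a rough illustration of the telephone number case, here is a minimal Python sketch using the standard `re` module. It assumes North American style numbers such as 502-555-0100 or (502) 555-0100; the pattern, the function name, and the sample text are my own inventions, not anything from the write up. International formats, extensions, and sloppy typing would need more work, which is where the error rate creeps in.

```python
import re

# Assumption: North American style numbers only, e.g. 502-555-0100 or (502) 555-0100.
# The lookarounds keep the pattern from matching inside longer digit runs.
PHONE_RE = re.compile(
    r"(?<!\d)(?:\(\d{3}\)\s*|\d{3}[-.\s])\d{3}[-.\s]\d{4}(?!\d)"
)


def extract_phone_numbers(text):
    """Return every phone-like string found in text, in order of appearance."""
    return PHONE_RE.findall(text)


sample = "Call (502) 555-0100 or 502-555-0199 about PO 1234567."
print(extract_phone_numbers(sample))
```

Note that the purchase order number in the sample is left alone because it has no separators. That is the easy case. Names, products, and relationships among them are where the expensive humans come in.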

To build a bubble gum card, one needs to parse a range of data, including images and content from a range of sources. In most organizations, silos of data persist and will continue to persist. Money is tight. Few commercial enterprises can afford to do the computationally intensive content processing under the watchful eye and informed mind of an indexing professional.

Cacaphones like “ontology” exacerbate the confusion about indexing and delivering useful outputs to users who don’t know a Boolean operator from a SQL expression.

Indexing is a useful term. Why not use it?

Stephen E Arnold, January 23, 2017
