Build an Alternative Google: How To Wanted
April 6, 2018
Hacker News presented an interesting question, “How would you build an internet scale web crawler?” We have been talking with companies which have developed Internet search systems that are not available for free Web search. Those conversations have produced some fascinating information. Some of the data will be included in my upcoming lecture for a government agency and then in my two presentations at the June 2018 Telestrategies ISS Conference in Prague.
What was interesting about this question was that few people responded. That is interesting because my team’s research for my new presentations on deanonymizing encrypted chat and deanonymizing digital currency transactions pivots on comprehensive Internet indexing. In fact, more companies are indexing Internet content than at any time in the last 10 years.
The second issue the post triggered was a realization that only a handful of people jumped on the topic. This low response to the question in itself is interesting. With more activity in indexing, why aren’t more people helping out JustinGarrison? That’s a question worth thinking about.
Third, one of the responses to the Hacker News question was a pointer to the YaCy.net open source project. We once included this technology in our Internet Research for Law Enforcement training program. My recollection of the system is fuzzy, so I will get one of my team to take a look.
The final thought the Hacker News story triggered was, “Have people just accepted Bing, Google, Qwant, and a handful of metasearch systems as too dominant to challenge?” My view is that an opportunity exists to create a public facing Internet search and retrieval system. The reason? Outstanding alternatives to Bing, Google, and Qwant are available for those who qualify as customers and who are willing to pay the license fees.
My hunch is that just as enterprise search has coalesced around the open source Lucene/Solr technologies, free Web search has become “game over” because the ad supported model has won.
The problem, of course, is that a person looking for information usually does not realize that free Web search results are not comprehensive, timely, or objective.
I hope individuals like JustinGarrison get the information needed to seize an opportunity in Internet search.
Stephen E Arnold, April 6, 2018
SEO Tips for Featured Snippets
March 26, 2018
We like Google’s Featured Snippets feature, at least when the information it serves up is relevant to the query. That is the tool that places text from, and links to, a site that (ideally) answers the user’s question at the top of search results. Naturally, Search Engine Optimization pros want their clients’ sites to grace these answer boxes as often as possible. That is the idea behind VolumeNine’s blog post, “Featured Snippets in Search: An Overview.” Writer Megan Duffy sees Featured Snippets as an opportunity for those already well-positioned in the search rankings. She explains,
There’s no debate that holding the primary spot on a search engine results page helps drive a ton of traffic. But it takes a long, disciplined approach to climb to the top of an organic search result. The featured snippet provides a bit of a shortcut. The featured snippet is an opportunity for any page ranked in the top ten of results to jump straight to the top with less effort compared to building a page’s search rank from, for example, from eighth to first. Having a featured snippet effectively puts you at search result zero and allows your business to earn traffic as the top search result.
Duffy goes on to make recommendations for maximizing one’s chances of being picked for that Snippet spot. To her credit, she emphasizes that good content is key; we like to see that it is still a consideration.
Cynthia Murrell, March 26, 2018
Million Short: A Metasearch Option
March 22, 2018
An interview at Forbes delves into the story behind Million Short, an alternative to Google for Internet Search. As concerns grow about online privacy, information accuracy, and filter bubbles, options that grant the user more control appeal to many. Contributor Julian Mitchell interviews Million Short founder and CEO Sanjay Arora in his piece, “This Search Engine Startup Helps You Find What Google Is Missing.” Mitchell informs us:
Founded in 2012, Million Short is an innovative search engine that takes a new and focused approach to organizing, accessing, and discovering data on the internet. The Toronto-based company aims to provide greater choices to users seeking information by magnifying the public’s access to data online. Cutting through the clutter of popular searches, most-viewed sites and sponsored suggestions, Million Short allows users to remove up to the top one million sites from the search set. Removing ‘an entire slice of the web’, the company hopes to balance the playing field for sites that may be new, suffer from poor SEO, have competitive keywords, or operate a small marketing budget. Million Short Founder and CEO Sanjay Arora shares the vision behind his company, overthrowing Google’s search engine monopoly, and his insight into the future of finding information online.
The subsequent interview gets into details, like Arora’s original motivation for creating Million Short: search is too important to be dominated by just a few companies, he insists. The pair explores both the advantages and the challenges the company has seen, and takes a look at the future. See the article for more.
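The core idea in the quoted description, removing the most popular sites from the search set, can be illustrated with a small filter. This is only a sketch under stated assumptions: the article does not say how Million Short ranks sites or applies the cut, so the ranking list, the result URLs, and the function names below are hypothetical.

```python
from urllib.parse import urlparse


def hostname(url: str) -> str:
    """Return the lowercase host of a URL, without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def remove_top_sites(results, ranked_sites, cutoff):
    """Drop results whose host is among the `cutoff` most popular sites."""
    blocked = set(ranked_sites[:cutoff])
    return [url for url in results if hostname(url) not in blocked]


# Hypothetical popularity ranking, most popular first (an assumption).
ranked_sites = ["bigportal.example", "megastore.example", "newsgiant.example"]

# Hypothetical result list for a single query (an assumption).
results = [
    "https://www.bigportal.example/widgets",
    "https://smallshop.example/widgets",
    "https://megastore.example/buy/widgets",
    "https://hobbyblog.example/my-widget-notes",
]

# Remove the top two sites; only the smaller sites remain.
filtered = remove_top_sites(results, ranked_sites, cutoff=2)
print(filtered)
```

With a cutoff of zero the list comes back unchanged, which mirrors the user-selectable slice the article describes.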
Cynthia Murrell, March 22, 2018
New SEO Predictions May Just Be Spot On
March 7, 2018
What will 2018 bring us? If the past twelve months were any indication, we have no idea what will hit next. However, that doesn’t stop the experts from trying to cash in on their Nostradamus abilities. Some of their predictions actually sound pretty plausible, like those in the Search Engine Journal article, “47 Experts on the Top SEO Trends For 2018.”
There are some real longshots on the list, but also some really insightful thoughts like:
In 2018 there will be an even bigger focus on machine learning and “SEO from data.” Of course, the amplification side of things will continue to integrate increasingly with genuine public relations exercises rather than shallow-relationship link building, which will become increasingly easy to detect by search engines.
Something which was troubling about 2017, and as we head into 2018, is the new wave of organizations merely bolting on SEO as a service without any real appreciation of structuring site architectures and content for both humans and search engine understanding. While social media is absolutely essential as a means of reaching influencers and disrupting a conversation to gain traction, grow trust and positive sentiment, those who do not take the time to learn about how information is extracted for search too may be disappointed.
We especially agree that the importance of SEO will grow in the new year. Innovative organizations are finding amazing new ways to manipulate the data and we don’t expect that to stop. It’ll be interesting to see where we stand twelve months from now.
Patrick Roland, March 7, 2018
Multi-purpose Search Tool Is Like Magic
March 2, 2018
The Internet of things has evolved from an entertaining gimmick for instantly accessing information into an indispensable tool for daily life. Search engines like Google and DuckDuckGo make searching the Internet simple, but in closed systems like databases and storage silos, searching is still complicated. Usually, individual systems have their own out-of-the-box search engines, but their accuracy is so-so. Cloud computing complicates search even more. Instead of searching just one system, cloud computing requires search software that can handle multiple systems at once. The search technology is out there, but can it really perform as well as Google or even DuckDuckGo?
The Code Project wrote about a new, multi-faceted search tool in the post, “Multidatabase Text Search Tool.” Searching text in all files across many systems is one of the most complicated procedures for a search engine, especially if you want accuracy and curated results. That is what DBTextFinder was developed for:
DBTextFinder is a simple tool that helps you to perform a precise search in all the stored procedures, functions, triggers, packages and views code, or a selected subset of them, using regular expressions. Additionally, you can search for a given text in all the text fields of a selected set of tables, using regular expressions too. The application provides connections to MySQL, SQL Server and Oracle servers, and supports remote connections via WCF services. You can easily extend the list of available DBMS writing your own connectors without having to change the application code.
DBTextFinder appears to have it all. It is programmable, gets along well with other computer languages, and was designed to be user-friendly. What more could you ask for?
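The two kinds of search the quote describes, regular expressions over stored code and over text fields, can be sketched in a few lines. This is not DBTextFinder’s code or API: as an assumption for illustration, it uses Python’s built-in sqlite3 module as a stand-in for MySQL, SQL Server, or Oracle, and the table, view, and trigger names are hypothetical.

```python
import re
import sqlite3


def search_schema_code(conn, pattern, kinds=("view", "trigger")):
    """Return (type, name) for stored definitions whose SQL matches the regex."""
    regex = re.compile(pattern, re.IGNORECASE)
    placeholders = ",".join("?" for _ in kinds)
    rows = conn.execute(
        "SELECT type, name, sql FROM sqlite_master "
        f"WHERE type IN ({placeholders}) AND sql IS NOT NULL ORDER BY name",
        tuple(kinds),
    )
    return [(kind, name) for kind, name, sql in rows if regex.search(sql)]


def search_text_column(conn, table, column, pattern):
    """Return rowids where the given text column matches the regex."""
    regex = re.compile(pattern, re.IGNORECASE)
    # Identifiers cannot be bound as parameters; this sketch trusts them.
    rows = conn.execute(f'SELECT rowid, "{column}" FROM "{table}"')
    return [rowid for rowid, text in rows if text is not None and regex.search(text)]


# Hypothetical in-memory database used only for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (name TEXT, notes TEXT);
    CREATE TABLE audit (msg TEXT);
    CREATE VIEW vip_customers AS
        SELECT name FROM customers WHERE notes LIKE '%vip%';
    CREATE TRIGGER log_insert AFTER INSERT ON customers
        BEGIN INSERT INTO audit (msg) VALUES ('new customer'); END;
    INSERT INTO customers VALUES ('Ada', 'VIP since 2016');
    INSERT INTO customers VALUES ('Bob', 'prefers email');
""")

# Which stored definitions mention the "notes" column?
code_hits = search_schema_code(conn, r"\bnotes\b")
# Which customer rows carry a "VIP since <year>" note?
row_hits = search_text_column(conn, "customers", "notes", r"vip\s+since\s+\d{4}")
print(code_hits, row_hits)
```

A real multi-database tool would swap the catalog query for each server’s own metadata views; that connector layer is what the quoted description says users can extend.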
Whitney Grace, March 2, 2018
No Google Makes People Go Crazy
February 26, 2018
Beyond being the top search engine in the western world, Google has wormed its way into our daily lives with more than one service. Google offers email, free Web storage, office suite software (word processing, presentations, spreadsheets), blogging software, YouTube, online ad services, and many more. If we did not have Google, many of us would experience withdrawal symptoms. So what would you do without Google? TechCrunch posted the article, “That Time I Got Locked Out Of My Google Account For A Month” and author Ron Miller explained how it impacted his life.
Miller, like most of us, forgot his Google password and jumped through the hoops to recover it. After wading through the red tape, he was denied access to his account and was simply locked out. The biggest problem was that he did not have any recourse. As a technology journalist, Miller had Google contacts, but without that access, he did not know what he would have done. Miller’s Google contact tried to get support for his case, but for two weeks he was given the runaround. Finally, the PR contact came through and, using an alternate email address, Miller regained access to his sweet, sweet Google data.
Miller learned that there was little he could have done without his PR contact, and that others locked out of their accounts are SOL. What is a Google user supposed to do?
The only thing I can suggest, and which I think I will do in the future, is to use a password manager and don’t leave it to chance. One day you could click “Forgot Password” and that could be the last time you access your Google account. Your digital life could be hanging by that thin thread called your password, and if you can’t remember it at some point, it is like you don’t exist and you are cut off.
Hey, Google, please make retrieving a password easier!
Whitney Grace, February 26, 2018
Google Retains Opt-Out Option
February 15, 2018
While the desire by most organizations to land at the top of relevant Internet search results was strong enough to spawn the entire SEO profession, some entities are not so eager for traffic. Now we learn Google will continue to let sites opt out of its search results, even though the legal requirement to do so has expired. Ubergizmo reports, “Google Will Let Websites Opt Out of Surfacing in Search Results.” Writer Adnan Farooqui writes:
Google settled an antitrust investigation by the FTC back in 2012 by promising to change its behavior in several areas. The commitments it made included removing AdWords restrictions that made it harder for advertisers to run multi-platform campaigns and giving websites the option to opt out of being displayed in search results and having their content crawled. Both commitments that Google made to the FTC back in 2012 have expired as of December 27th, 201[7]. It’s under no obligation to continue honoring them but Google has said in a letter to the FTC that it will honor them. ‘We believe that these policies provide additional flexibility for developers and websites, and we will continue them as policies after the commitments expire,’ Google confirmed in the letter.
So, fear not— if you’d prefer your site not be found by drive-by Google traffic, the search engine will continue to have your back.
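For readers curious what opting out of crawling looks like in practice, the general mechanism is the Robots Exclusion Protocol, a robots.txt file at the site root. The sketch below is an illustration, not the specific mechanism covered by the FTC commitments, which the quoted article does not spell out. It uses Python’s standard urllib.robotparser module; the robots.txt content and the example.com URLs are made up.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out Googlebot entirely and keep
# every other crawler out of /private/ only.
robots_txt = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# Ask whether a given crawler may fetch a given URL under these rules.
googlebot_allowed = parser.can_fetch("Googlebot", "https://example.com/index.html")
other_allowed = parser.can_fetch("OtherBot", "https://example.com/index.html")
other_private = parser.can_fetch("OtherBot", "https://example.com/private/report.html")

print(googlebot_allowed, other_allowed, other_private)
```

Compliance with robots.txt is voluntary on the crawler’s side, so this shows what a site requests rather than what any crawler is forced to do.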
Cynthia Murrell, February 15, 2018
Google Has Its Own Browser History
February 2, 2018
Have you ever wanted to look at your past Google searches, but did not want to go through your browser history? Google has a new feature that will allow users to see their recent searches. Search Engine Land reports that “Google Home Page Search Box Now Shows You Recent Searches By Default” and it is a super option. Super annoying, that is. Whenever you use Google search, a default dropdown appears before you even enter text into the search box.
Even former Google search executive Matt Cutts said this “new feature” is super annoying. He tried to opt out of it, but could not find the opt-out option. Search Engine Land sent Google an email to see what the scoop was. They discovered that even Google found the automatic browser history box annoying. Here is Google’s official response:
Google has confirmed with Search Engine Land that this is not the behavior they want and it was likely a bug. “We launched the ability to see past searches by clicking the search box earlier this year. However, past searches should not be appearing immediately on page load, so we are working to fix this issue,” a Google spokesperson told Search Engine Land.
All right, Google! You admitted a mistake and provided a solution. Now can you do something about the fake news stories that are plaguing Google News?
Whitney Grace, February 2, 2018
Some Think Google Is No Longer the King of Search
January 18, 2018
Google is much more than a search engine; it’s a verb. As with Xerox and Kleenex before it, that says something about the company’s standing in its business. However, some are claiming it’s time for alternatives (in search, not in copy-making or nose blowing). This is according to a recent Eyerys story, “Searching Beyond Google: When The Internet is Too Big for a Single Search Engine.”
According to the story:
[T]he information you need might be hidden from the tools you use. Either because the webmasters wanted that to happen by blocking search engines’ access, or inaccessible by search engine because they are behind paywalls or login forms, or lies inside the deep web.
To access them, you need more specific tools other than search engines, and look at the right place, with the right privilege.
If and only if you still can’t find the information you’re looking for, it’s either not available on the internet, or doesn’t exist in the first place.
Or, they could be hidden inside database, encrypted, lies deeper and accessible to only using certain IPs, classified methods or privilege. In this case, it’s not publicly available though it is there. You need to be a hacker to get yourself into that, and that is certainly illegal by any means.
While the story has its heart in the right place, recommending alternative engines, like DuckDuckGo, and giving tips on using social media for search, it’s not really too believable. For one, humans are creatures of habit and they are stuck on the single search engine method. This is wishful thinking, and actually makes sense in places, but we can’t see it happening.
Patrick Roland, January 18, 2018
Qwant Goes to China
January 17, 2018
The roots of Qwant stretch back to Pertimm, an interesting search system which pre-dated today’s Qwant. Information in my files about Qwant reminded me that Qwant is a metasearch system which incorporates its own crawling of French sources. The key feature of Qwant is that it does not retain data about users’ queries. It is important to keep in mind that legal intercepts can capture Internet data and may be able to map user actions to particular Web sites or topics.
According to the article “Not Just a Horse: Macron Also Brings Privacy-Based Browser on Trip to China,” the French delegation’s visit to Chinese officials is, in part, designed to promote the use of Qwant.
I noted this statement in the article. One of the founders of Qwant allegedly stated:
Yes, we need a lot of data but we don’t need to know that it’s you or me. The whole idea of Qwant is to make AI and IoT without the data of the users. In our case, based on the fact that we are a privacy-based search engine, we don’t need people’s data. So maybe we’ll have some technology that we can use more easily in China than some of our competitors.
My perception is that China is quite interested in who searches what, particularly within the Middle Kingdom. Qwant will follow “local regulations.”
My recollection is that Google has not achieved in China the same level of dominance that it has in Europe, home of Qwant.
Since the demise of Quaero and Muscat, Yandex has become one of the European alternatives to Google. The Exalead Web search system is still online, but it does not attract much attention. I find it useful because Google results are thin when I search for older content. You can locate the Exalead search system at this link. Dassault Systèmes uses Exalead for its product component search, and I am surprised that the company does not push the Web search capability more aggressively.
If you have not tried Qwant, you can try it at www.qwant.com. Compare the results with the Exalead system and the Russian Yandex system.
In my tests, I find it necessary to use multiple search systems, including the low profile iseek.com and Searx.me systems. It is more difficult than ever to locate certain types of information in general purpose Web search systems. This applies to metasearch systems like Ixquick (now Startpage.com), Unbubble, Izito, and other systems which try to offer researchers an alternative to Google.
Google works well for pizza. Looking for other types of information? Qwant and other low profile systems have to be used. The process of locating something as basic as the address of a company in Madrid can require quite vigorous hoop jumping.
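Running the same query across several systems by hand amounts to a small metasearch job: collect each system’s ranked list, interleave them, and drop duplicates. The sketch below shows one simple way to do that. It is an assumption-laden illustration: the engines, URLs, and the round-robin merge are hypothetical and do not describe how Qwant, Startpage.com, or any other named system combines results.

```python
from itertools import zip_longest
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to host plus path so near-identical links compare equal."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def merge_results(*ranked_lists):
    """Interleave ranked lists round-robin, keeping the first copy of each URL."""
    seen = set()
    merged = []
    for tier in zip_longest(*ranked_lists):
        for url in tier:
            if url is None:  # this list ran out of results
                continue
            key = normalize(url)
            if key not in seen:
                seen.add(key)
                merged.append(url)
    return merged


# Hypothetical ranked results from three systems for one company lookup.
engine_a = ["https://www.acme.example/contact/", "https://dir.example/acme"]
engine_b = ["http://acme.example/contact", "https://registry.example/acme-sa"]
engine_c = ["https://registry.example/acme-sa", "https://news.example/acme-opens-office"]

merged = merge_results(engine_a, engine_b, engine_c)
print(merged)
```

Round-robin interleaving treats every system’s top hit as equally credible; weighting by source or by how many systems agree would be a natural refinement.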
But China? Interesting.
Stephen E Arnold, January 17, 2018