
Behind the Google Search Algorithm

June 16, 2016

Trying to reveal the secrets behind Google’s search algorithm is almost harder than breaking into Fort Knox.  Google keeps its 200 ranking factors a secret. What we do know is that keywords do not play the same role that they used to and that social media plays some undisclosed role.  Search Engine Journal shares “Google Released The Top 3 ranking Factors,” an article that offers a little information to help SEO.

Google Search Quality Senior Strategist Andrey Lipattsev shared that the three factors are links, content, and RankBrain, in no particular order.  RankBrain is an artificial intelligence system that relies on machine learning to help Google process search results and push the more relevant results to the top of the list.  SEO experts are trying to figure out how this will affect their jobs, but the article shares that:

“We’ve known for a long time that content and links matter, though the importance of links has come into question in recent years. For most SEOs, this should not change anything about their day-to-day strategies. It does give us another piece of the ranking factor puzzle and provides content marketers with more ammo to defend their practice and push for growth.”

In reality, there is not much difference, except that few will be able to explain how artificial intelligence ranks particular sites. Nifty play, Google.


Whitney Grace, June 15, 2016
Sponsored by, publisher of the CyberOSINT monograph


SLI Systems Hopeful as Losses Narrow and Revenue Grows

June 14, 2016

The article titled SLI Systems Narrows First-Half Loss on Scoop reports revenue growth and plans to mitigate losses. SLI Systems is a New Zealand-based software as a service (SaaS) business that provides cloud-based search resources to online retailers. Founded in 2001, SLI Systems has already weathered a good many storms, such as the dot-com crash that threatened to stall the core technology (developed at GlobalBrain). According to a statement from the company, last year’s loss of $502K was an improvement from the loss of $4.1M in 2014. The article states,

“SLI shares have dropped 18 percent in the past 12 months, to trade recently at 76 cents, about half the level of the 2013 initial public offering price of $1.50. The software developer missed its sales forecast for the second half of the 2015 year but is optimistic new chief executive Chris Brennan and Martin Onofrio as chief revenue officer, both Silicon Valley veterans, can drive growth in revenue and earnings.”

The SLI in SLI Systems stands for Search, Learn and (appropriately) Improve. The company hopes to achieve sustainable growth without raising additional capital by continuing to focus on innovation and customer retention rates, which slipped from 90% to 87% recently. Major clients include Lenovo, David Jones, Harvey Norman, and Paul Smith.



Chelsea Kerwin, June 14, 2016


Websites Found to Be Blocking Tor Traffic

June 8, 2016

Discrimination or wise precaution? Perhaps both? MakeUseOf tells us, “This Is Why Tor Users Are Being Blocked by Major Websites.” A recent study (PDF) by the University of Cambridge; University of California, Berkeley; University College London; and International Computer Science Institute, Berkeley confirms that many sites are actively blocking users who approach through a known Tor exit node. Writer Philip Bates explains:

“Users are finding that they’re faced with a substandard service from some websites, CAPTCHAs and other such nuisances from others, and in further cases, are denied access completely. The researchers argue that this: ‘Degraded service [results in Tor users] effectively being relegated to the role of second-class citizens on the Internet.’ Two good examples of prejudice hosting and content delivery firms are CloudFlare and Akamai — the latter of which either blocks Tor users or, in the case of, infinitely redirects. CloudFlare, meanwhile, presents CAPTCHA to prove the user isn’t a malicious bot. It identifies large amounts of traffic from an exit node, then assigns a score to an IP address that determines whether the server has a good or bad reputation. This means that innocent users are treated the same way as those with negative intentions, just because they happen to use the same exit node.”

The article goes on to discuss legitimate reasons users might want the privacy Tor provides, as well as reasons companies feel they must protect their Websites from anonymous users. Bates notes that there is not much one can do about such measures. He does point to Tor’s own Don’t Block Me project, which is working to convince sites to stop blocking people just for using Tor. It is also developing a list of best practices that concerned sites can follow instead. One site, GameFAQs, has reportedly lifted its block, and CloudFlare may be considering a similar move. Will the momentum build, or must those who protect their online privacy resign themselves to being treated with suspicion?


Cynthia Murrell, June 8, 2016


Google Voice Search: An Optimistic View

June 7, 2016

I read “Google Voice Search Records and Keeps Conversations People Have Around their Phones but You Can Delete the Files.” I like the “you can delete the files” part. How does one know what has or has not been deleted in this era of real time cloud goodness?

I assume that the information in the write up is accurate.

The write up states:

The feature works as a way of letting people search with their voice, and storing those recordings presumably lets Google improve its language recognition tools as well as the results that it gives to people.

If you want to “delete” these recordings, the write up asserts:

It’s found by heading to Google’s history page and looking at the long list of recordings. The company has a specific audio page and another for activity on the web, which will show you everywhere Google has a record of you being on the internet.

Optimism is good. One presidential hopeful believed certain emails had been deleted. I am not sure that the FSB agrees. It seems that the Independent’s “real journalist” was not aware of “Your Data Is Forever.”

Stephen E Arnold, June 7, 2016

Alibaba Eyes Twiggle, Not the US Ecommerce Search Players

June 7, 2016

I read “Alibaba, the Chinese and Global Powerhouse, Is Investing in Twiggle, a Stealthy Israeli Ecommerce Search Start Up.” I love the stealthy part. Twiggle cannot be too stealthy because Alibaba found out about the system.

I learned in the write up:

Twiggle was funded back in 2013, and its founders were formerly leaders at Google, namely Dr. Amir Konigsberg, previously one of the members of Google’s operations in the emerging markets and former managing director of, and Dr. Adi Avidor, a former engineering tech lead at Google. Combined, the two have authored more than 35 U.S. patents and bring a wealth of experience in digital innovation in the fields of search, artificial intelligence and ecommerce.

The system delivers search and discovery. Crunchbase points out that the company has ingested about $20 million since it opened its doors in 2014.

I assume that the heartbeats of EasyAsk and SLI Systems sped up when word of Alibaba’s search proclivities diffused. With this alleged tie up, I assume the heart rates of the principals at EasyAsk and SLI Systems, assorted open source firms, and probably the ever ready IBM Watson have slowed.

Stephen E Arnold, June 7, 2016

Yippy for Vivisimo

June 6, 2016

I read “Yippy Buys MC+A, a Veteran Google Search Appliance Partner.” Yippy, if memory serves, is a variant of Vivisimo. In the good old days, before Vivisimo sold to IBM and suddenly became a Big Data company, Yippy did search and retrieval. I assume the old document limits were lifted. I also assume that the wild and crazy config file editing has been streamlined. I also assume that Yippy is confident it can zoom past Maxxcat and Thunderstone, two outfits also in the search appliance business. Buying a Google reseller provides some insight and maybe leads into which companies embraced the GSA solution. When the GSA was first demonstrated to me, I noted the locked down relevance system. There were other interesting “enhancements” the Googlers included to eliminate the complexity of enterprise search. I recall working on the training materials for a DC reseller of the GSA. Customization was like a Google interview question.

The Fortune write up is one of those reinventions of enterprise search which I enjoy. I circled this comment:

Google Search Appliance was a great idea for companies that deploy a welter of different applications, so important data can be scattered about in different file systems and repositories. It also gave Google a toehold in corporate server rooms, which is why some wondered why Google would cut the product at a time when it’s trying to sell more cloud services to these very companies.

I was unaware that Alphabet Google was in the search business. I thought it was an online advertising outfit. Who at Google wanted to work on the wonky GSA products? For years Google relied on resellers and outfits like Dell to make the overpriced gizmos.

Long live Vivisimo. I mean Yippy. If you cannot pin down an integrator, why not buy one?

Stephen E Arnold, June 6, 2016

BA Insight Asserts Government Transformation for Information Access

June 6, 2016

This is a bold, bold assertion. In my limited experience with the US government’s entities, I can attest that certain systems and methods do not change. That obviously does not match the marketing message from BA Insight, a search and retrieval vendor with a Microsoft focus and competence in all sorts of interesting buzzwords.

Navigate to “BA Insight and MD Tech Solutions Join Forces to Transform the Way Government Agencies Find and Work with Content.” I read:

BA Insight today announced a partnership with MD Tech Solutions, a technology services company located in Fredericksburg, Virginia specializing in Custom SharePoint Development, Administration, Design and Architecture.  As a reseller of BA Insight’s software portfolio, MD Tech can now provide its customers with a solution that quickly connects SharePoint users to the essential knowledge they need to be productive, while providing an internet-like search experience that users will love.

Love in the government entities with which I had experience when I labored in the vineyards in Congress, the executive branch, and some other outfits was, in my experience, in short supply. There were numerous search and retrieval systems. There were legacy systems which did magic things to information. Anyone remember the baked in search system with SharePoint? Maybe Fast Search? What about the components in Oracle? Palantir? IBM i2?

I find the notion of transforming the government interesting. Even more fascinating is the notion of users loving a search system. Love in the government. Hmmm.

Stephen E Arnold, June 6, 2016

Everyone Rejoice! We Now Have Emoji Search

June 1, 2016

It was only a matter of time after image search actually became a viable and useful tool that someone would develop a GIF search.  Someone thought it would be a keen idea to also design an emoji search and now, ladies and gentlemen, we have it!  Tech Viral reports that “Now You Can Search Images On Google Using Emoji.”

Using the Google search engine is a very easy process: type in a few keywords or a question, click search, and then delve into the search results.  The Internet, though, is a place where people develop content and apps just for “the heck of it”.  Google decided to design an emoji search option, probably for that very reason.  Users can type in an emoji instead of words to conduct an Internet search.

The new emoji search is based on the same recognition skills as the Google image search, but the biggest question is how many emojis will Google support with the new function?

“Google has taken searching algorithm to the next level, as it is now allowing users to search using any emoji icon. Google stated ‘An emoji is worth a thousand words’. This feature may be highly appreciated by lazy Google users, as they now they don’t need to type a complete line instead you just need to use an emoji for searching images.”

It really sounds like a search for lazy people, and do not be surprised to get a variety of results that have no relation to the emoji or your intended information need.  An emoji might be worth a thousand words, but that is a lot of words with various interpretations.


Whitney Grace, June 1, 2016

An Alternative to Facebook

May 27, 2016

VK is the name of the “old” VKontakte social networking service. My estimates peg the traffic to the site at about 30 percent of Facebook. The user count is in the 300 million range and growing. The user base is concentrated in Russia, but the service is attracting users from other countries. Online translation tools make it easy for a non Russian speaker to use the service.

Earlier this year, I read “German Neo Nazis Flocking to Putin’s Facebook Knock off VKontakte.” The write up seems a bit one sided, but social network sites allegedly linked to Mr. Putin suggest that a bit of additional research and investigation is warranted.

You can sign up and explore at I provide some basics for appropriate prophylactic measures in my Dark Web lectures. One thought is okay here: Be prudent.

You will need a account to access an interesting facial recognition service called FindFace. The site looks like this:


Like Google’s “search by photo”, the FindFace service delivers close matches to the face you upload to the service. The Guardian published “Face Recognition App Taking Russia by Storm May Bring End to Public Anonymity.” The digital wannabe stated:

FindFace compares photos to profile pictures on social network Vkontakte and works out identities with 70% reliability.

I mention and FindFace because I was asked if there were an alternative to Facebook. The answer is, “” However, the use of the service for certain types of groups and certain purposes is less easy than it was in the past. Some folks can use the apps and features instead of fooling around with Dark Web services.

Stephen E Arnold, May 27, 2016

The Reinterpretation of Google History

May 27, 2016

I read “Why Google Beat Yahoo in the War for the Internet.” The information in the article touches upon some important points; for example, Google focused on a more homogeneous infrastructure. The history of Google, however, includes some tactical moves which the article ignores.

My enthusiasm for recycling the information about Google’s first five years has shriveled since I published the third volume of my Google trilogy. I want to point out several factoids which, no doubt, will interest few today. The article makes much of what is described as a “fresh start.” I do not agree with the “fresh” part.

First, Google is a descendant of Alta Vista, Jon Kleinberg’s Clever, and the information access research conducted at Stanford University and other universities with an interest in this technical field. As a result, the infrastructure benefits from Digital Equipment’s investment in its Alta Vista system. Much of that “knowledge” migrated to Google as Messrs. Brin and Page hired notable professionals away from the chaos of Alta Vista under Hewlett Packard’s management. Jeff Dean, Simon Tong, and others are responsible for much of the infrastructure for Google. The Alta Vista system was anchored in the DEC technology. The memory management benefits were obtained at a cost. Google embraced commodity hardware and a big chunk of the Alta Vista thinking. Fresh? Well, sort of.

Second, Google’s scrutiny of Yahoo had a couple of payoffs. Yahoo was a crazy quilt of warring tribes. Each tribe had its own technology idols. Google interpreted this as expensive and focused on reducing the costs by standardizing on systems and methods to a greater degree than Yahoo did. Over time, Yahoo became more sluggish due to its different fiefdoms. Google was comparatively stronger due to its less chaotic approaches. Don’t get me wrong. Google in its first five years was a wild and crazy outfit. Yahoo was wilder and crazier. As part of Google’s learning from Yahoo, Google recognized the value of selling ads the Yahoo way. Yahoo was unhappy with Google’s borrowing of its approach. Google settled a legal spat with about $1 billion in payments to the Yahooligans. But the majority of Google’s revenue comes from that me too play.

Third, Google, like Yahoo, is not sure what it will be from year to year. The difference is that Google has crafted a relatively consistent flow of advertising revenue from its early and somewhat crude pre-Oingo days. Google integrated acquired technologies more effectively than Yahoo typically did. The ability to integrate provided Google an important edge.

There are other touchpoints in Google’s early days. From my point of view, Google is from its inception a beneficiary of good luck because the competitors in Web search were distracted in an effort to become portals. Google, as I see the company, is less of an innovator and more of an emulator. Google has yet to demonstrate that renaming the company, reorganizing the units, and funding projects like cheating death will yield the next big thing.

Google, for me, was a one off, an anomaly.

Stephen E Arnold, May 27, 2016
