Websites Found to Be Blocking Tor Traffic

June 8, 2016

Discrimination or wise precaution? Perhaps both? MakeUseOf tells us, “This Is Why Tor Users Are Being Blocked by Major Websites.” A recent study (PDF) by the University of Cambridge; University of California, Berkeley; University College London; and International Computer Science Institute, Berkeley confirms that many sites are actively blocking users who approach through a known Tor exit node. Writer Philip Bates explains:

“Users are finding that they’re faced with a substandard service from some websites, CAPTCHAs and other such nuisances from others, and in further cases, are denied access completely. The researchers argue that this: ‘Degraded service [results in Tor users] effectively being relegated to the role of second-class citizens on the Internet.’ Two good examples of prejudice hosting and content delivery firms are CloudFlare and Akamai — the latter of which either blocks Tor users or, in the case of, infinitely redirects. CloudFlare, meanwhile, presents CAPTCHA to prove the user isn’t a malicious bot. It identifies large amounts of traffic from an exit node, then assigns a score to an IP address that determines whether the server has a good or bad reputation. This means that innocent users are treated the same way as those with negative intentions, just because they happen to use the same exit node.”
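
The per-address scoring described in the quote can be illustrated with a minimal sketch. This is not CloudFlare's actual method: the threshold, the function names, and the sample addresses (taken from documentation-only IP ranges) are all assumptions made for illustration. The point is that the score attaches to the exit node's IP address, so every user behind that address inherits it.

```python
from collections import Counter

# Hypothetical cutoff for illustration; not a real CloudFlare value.
CHALLENGE_THRESHOLD = 0.5


def reputation_scores(requests):
    """Return {ip: fraction of that IP's requests flagged as abusive}."""
    total = Counter()
    flagged = Counter()
    for ip, is_abusive in requests:
        total[ip] += 1
        if is_abusive:
            flagged[ip] += 1
    return {ip: flagged[ip] / total[ip] for ip in total}


def needs_captcha(ip, scores):
    """Challenge any request whose source IP has a poor score."""
    return scores.get(ip, 0.0) >= CHALLENGE_THRESHOLD


# Invented sample traffic: (source IP, was the request abusive?)
requests = [
    ("198.51.100.7", True),   # bot traffic through a shared exit node
    ("198.51.100.7", True),
    ("198.51.100.7", False),  # an innocent user on the same exit node
    ("203.0.113.9", False),   # an ordinary visitor on a different address
]
scores = reputation_scores(requests)
```

In this sketch the innocent user on 198.51.100.7 gets the CAPTCHA too, because the decision is made per address rather than per person.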

The article goes on to discuss legitimate reasons users might want the privacy Tor provides, as well as reasons companies feel they must protect their websites from anonymous users. Bates notes that there is not much one can do about such measures. He does point to Tor’s own Don’t Block Me project, which is working to convince sites to stop blocking people just for using Tor. It is also developing a list of best practices that concerned sites can follow instead. One site, GameFAQs, has reportedly lifted its block, and CloudFlare may be considering a similar move. Will the momentum build, or must those who protect their online privacy resign themselves to being treated with suspicion?


Cynthia Murrell, June 8, 2016

Sponsored by, publisher of the CyberOSINT monograph

Google Voice Search: An Optimistic View

June 7, 2016

I read “Google Voice Search Records and Keeps Conversations People Have Around their Phones but You Can Delete the Files.” I like the “you can delete the files.” How does one know what has or has not been deleted in this era of real time cloud goodness?

I assume that the information in the write up is accurate.

The write up states:

The feature works as a way of letting people search with their voice, and storing those recordings presumably lets Google improve its language recognition tools as well as the results that it gives to people.

If you want to “delete” these recordings, the write up asserts:

It’s found by heading to Google’s history page and looking at the long list of recordings. The company has a specific audio page and another for activity on the web, which will show you everywhere Google has a record of you being on the internet.

Optimism is good. One presidential hopeful believed certain emails had been deleted. I am not sure that the FSB agrees. It seems that the Independent’s “real journalist” was not aware of “Your Data Is Forever.”

Stephen E Arnold, June 7, 2016

Alibaba Eyes Twiggle, Not the US Ecommerce Search Players

June 7, 2016

I read “Alibaba, the Chinese and Global Powerhouse, Is Investing in Twiggle, a Stealthy Israeli Ecommerce Search Start Up.” I love the stealthy part. Twiggle cannot be too stealthy because Alibaba found out about the system.

I learned in the write up:

Twiggle was funded back in 2013, and its founders were formerly leaders at Google, namely Dr. Amir Konigsberg, previously one of the members of Google’s operations in the emerging markets and former managing director of, and Dr. Adi Avidor, a former engineering tech lead at Google. Combined, the two have authored more than 35 U.S. patents and bring a wealth of experience in digital innovation in the fields of search, artificial intelligence and ecommerce.

The system delivers search and discovery. Crunchbase points out that the company has ingested about $20 million since it opened its doors in 2014.

I assume that the heartbeats of EasyAsk and SLI Systems sped up when word of Alibaba’s search proclivities diffused. With this alleged tie up, I assume the heart rates of the principals at EasyAsk and SLI Systems, assorted open source firms, and probably the ever ready IBM Watson have slowed.

Stephen E Arnold, June 7, 2016

Yippy for Vivisimo

June 6, 2016

I read “Yippy Buys MC+A, a Veteran Google Search Appliance Partner.” Yippy, if memory serves, is a variant of Vivisimo. In the good old days, before Vivisimo sold to IBM and suddenly became a Big Data company, Yippy did search and retrieval. I assume the old document limits were lifted. I also assume that the wild and crazy config file editing has been streamlined. I also assume that Yippy is confident it can zoom past Maxxcat and Thunderstone, two outfits also in the search appliance business. Buying a Google reseller provides some insight and maybe leads into which companies embraced the GSA solution. When the GSA was first demonstrated to me, I noted the locked down relevance system. There were other interesting “enhancements” the Googlers included to eliminate the complexity of enterprise search. I recall working on the training materials for a DC reseller of the GSA. Customization was like a Google interview question.

The Fortune write up is one of those reinventions of enterprise search which I enjoy. I circled this comment:

Google Search Appliance was a great idea for companies that deploy a welter of different applications, so important data can be scattered about in different file systems and repositories. It also gave Google a toehold in corporate server rooms, which is why some wondered why Google would cut the product at a time when it’s trying to sell more cloud services to these very companies.

I was unaware that Alphabet Google was in the search business. I thought it was an online advertising outfit. Who at Google wanted to work on the wonky GSA products? For years Google relied on resellers and outfits like Dell to make the overpriced gizmos.

Long live Vivisimo. I mean Yippy. If you cannot pin down an integrator, why not buy one?

Stephen E Arnold, June 6, 2016

BA Insight Asserts Government Transformation for Information Access

June 6, 2016

This is a bold, bold assertion. In my limited experience with the US government’s entities, I can attest that certain systems and methods do not change. That obviously does not match the marketing message from BA Insight, a search and retrieval vendor with a Microsoft focus and competence in all sorts of interesting buzzwords.

Navigate to “BA Insight and MD Tech Solutions Join Forces to Transform the Way Government Agencies Find and Work with Content.” I read:

BA Insight today announced a partnership with MD Tech Solutions, a technology services company located in Fredericksburg, Virginia specializing in Custom SharePoint Development, Administration, Design and Architecture.  As a reseller of BA Insight’s software portfolio, MD Tech can now provide its customers with a solution that quickly connects SharePoint users to the essential knowledge they need to be productive, while providing an internet-like search experience that users will love.

Love in the government entities with which I had experience when I labored in the vineyards in Congress, the executive branch, and some other outfits was, in my experience, in short supply. There were numerous search and retrieval systems. There were legacy systems which did magic things to information. Anyone remember the baked in search system with SharePoint? Maybe Fast Search? What about the components in Oracle? Palantir? IBM i2?

I find the notion of transforming the government interesting. Even more fascinating is the notion of users loving a search system. Love in the government. Hmmm.

Stephen E Arnold, June 6, 2016

Everyone Rejoice! We Now Have Emoji Search

June 1, 2016

It was only a matter of time after image search actually became a viable and useful tool that someone would develop a GIF search.  Someone thought it would be a keen idea to also design an emoji search and now, ladies and gentlemen, we have it!  Tech Viral reports that “Now You Can Search Images On Google Using Emoji.”

Using the Google search engine is a very easy process: type in a few keywords or a question, click search, and then delve into the search results.  The Internet, though, is a place where people develop content and apps just for “the heck of it.”  Google decided to design an emoji search option, probably for that very reason.  Users can type in an emoji instead of words to conduct an Internet search.

The new emoji search is based on the same recognition skills as the Google image search, but the biggest question is how many emojis will Google support with the new function?

“Google has taken searching algorithm to the next level, as it is now allowing users to search using any emoji icon. Google stated ‘An emoji is worth a thousand words’. This feature may be highly appreciated by lazy Google users, as they now they don’t need to type a complete line instead you just need to use an emoji for searching images.”

It really sounds like a search for lazy people, and do not be surprised to get a variety of results that do not have any relation to the emoji or your intended information need.  An emoji might be worth a thousand words, but that is a lot of words with various interpretations.


Whitney Grace, June 1, 2016

An Alternative to Facebook

May 27, 2016

VK is the name of the “old” VKontakte social networking service. My estimates peg the traffic to the site at about 30 percent of Facebook. The user count is in the 300 million range and growing. The user base is concentrated in Russia, but the service is attracting users from other countries. Online translation tools make it easy for a non Russian speaker to use the service.

Earlier this year, I read “German Neo Nazis Flocking to Putin’s Facebook Knock off VKontakte.” The write up seems a bit one sided, but social network sites allegedly linked to Mr. Putin suggest that a bit of additional research and investigation is warranted.

You can sign up and explore at I provide some basics for appropriate prophylactic measures in my Dark Web lectures. One thought is okay here: Be prudent.

You will need a account to access an interesting facial recognition service called FindFace. The site looks like this:


Like Google’s “search by photo”, the FindFace service delivers close matches to the face you upload to the service. The Guardian published “Face Recognition App Taking Russia by Storm May Bring End to Public Anonymity.” The digital wannabe stated:

FindFace compares photos to profile pictures on social network Vkontakte and works out identities with 70% reliability.

I mention and FindFace because I was asked if there were an alternative to Facebook. The answer is, “” However, the use of the service for certain types of groups and certain purposes is less easy than it was in the past. Some folks can use the apps and features instead of fooling around with Dark Web services.

Stephen E Arnold, May 27, 2016

The Reinterpretation of Google History

May 27, 2016

I read “Why Google Beat Yahoo in the War for the Internet.” The information in the article touches upon some important points; for example, Google focused on a more homogeneous infrastructure. The history of Google, however, includes some tactical moves which the article ignores.

My enthusiasm for recycling the information about Google’s first five years has shriveled since I published the third volume of my Google trilogy. I want to point out several factoids which, no doubt, will interest few today. The article makes much of what is described as a “fresh start.” I do not agree with the “fresh” part.

First, Google is a descendant of Alta Vista, Jon Kleinberg’s Clever, and the information access research conducted at Stanford University and other universities with an interest in this technical field. As a result, the infrastructure benefits from Digital Equipment’s investment in its Alta Vista system. Much of that “knowledge” migrated to Google as Messrs. Brin and Page hired notable professionals away from the chaos of Alta Vista under Hewlett Packard’s management. Jeff Dean, Simon Tong, and others are responsible for much of the infrastructure for Google. The Alta Vista system was anchored in the DEC technology. The memory management benefits were obtained at a cost. Google embraced commodity hardware and a big chunk of the Alta Vista thinking. Fresh? Well, sort of.

Second, Google’s scrutiny of Yahoo had a couple of payoffs. Yahoo was a crazy quilt of warring tribes. Each tribe had its own technology idols. Google interpreted this as expensive and focused on reducing the costs by standardizing on systems and methods to a greater degree than Yahoo did. Over time, Yahoo became more sluggish due to its different fiefdoms. Google was comparatively stronger due to its less chaotic approaches. Don’t get me wrong. Google in its first five years was a wild and crazy outfit. Yahoo was wilder and crazier. As part of Google’s learning from Yahoo, Google recognized the value of selling ads the Yahoo way. Yahoo was unhappy with Google’s borrowing of its approach. Google settled a legal spat with about $1 billion in payments to the Yahooligans. But the majority of Google’s revenue comes from that me too play.

Third, Google, like Yahoo, is not sure what it will be from year to year. The difference is that Google has crafted a relatively consistent flow of advertising revenue from its early and somewhat crude pre-Oingo days. Google integrated acquired technologies more effectively than Yahoo typically did. The ability to integrate provided Google an important edge.

There are other touchpoints in Google’s early days. From my point of view, Google is from its inception a beneficiary of good luck because the competitors in Web search were distracted in an effort to become portals. Google, as I see the company, is less of an innovator and more of an emulator. Google has yet to demonstrate that renaming the company, reorganizing the units, and funding projects like cheating death will yield the next big thing.

Google, for me, was a one off, an anomaly.

Stephen E Arnold, May 27, 2016

Inc Magazine Explains Search. Really.

May 26, 2016

I read “How the World of Search Looks Like. Really.” [sic]

Now that is fine syntax. Perhaps the savvy Inc editor is confused about the Strunk & White comments about the use of “what”? Really.

The write up is even more orthogonal than the headline’s word choice.

An expert in search, who works at Gravity Media, has focused his attention on information access. Now information access is a nebulous concept. Search is a bit less difficult to define if you are, like me, pushing 72 years in age. Log on to an online system. Enter a keyword. Review the matches the system generates via brute force look up. See, easy, really.
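
For readers who want the old-school version spelled out, here is a minimal sketch of the brute force look up just described: take a keyword, scan every document, return the ones that contain it. The function name and the three sample documents are invented for illustration; no real online system is being reproduced.

```python
def brute_force_search(keyword, documents):
    """Scan every document and return the ids of those containing the keyword.

    No index, no ranking: a plain case-insensitive substring test per document.
    """
    needle = keyword.lower()
    return [doc_id for doc_id, text in documents.items() if needle in text.lower()]


# Invented sample collection for illustration only.
documents = {
    "doc1": "Enterprise search appliances and relevance tuning",
    "doc2": "Emoji image queries on Google",
    "doc3": "Search and retrieval in government agencies",
}

matches = brute_force_search("search", documents)
```

Every query touches every document, which is why the approach is easy to define and expensive to run at scale.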

I learned in the write up that my Abe Lincoln learnings are hopelessly out of whack.

I noted this passage:

While young Snapchatters who grew up in the midst of the evolving Web may prefer to Google search, the later-adopting Baby Boomers may very well be using Yahoo search.

Okay. Snapchatters. How does one “find” information via Snapchat?

I noted this statement:

Globally, quite a few other competitors are making good old Google sweat a bit. Tell me, are you feeling lucky? (My poor attempt at a Google joke…) Internationally, people are Yandexing, Baiduing, Yahooing, and the list goes on and on.

Ha. Ha. Really.

Then a statement which blindsided me. People in different countries search for information in ways different from those used in the US:

As the globe continues to shrink in the wake of the World Wide Web, these cultural nuances are something international brands should consider when trying to capture global audiences. Up until now there has been little attention paid to this increasing trend of the “other” search networks.

Right. Little attention. I assume those ads on Baidu for products not from China are outliers?

I circled:

Chinese-Americans’ searches will likely use a combination of Chinese and English search terms depending on what their level of comfort is with translation. This same fact is the reason a first generation Chinese millennial living in the U.S. would choose to utilize both Baidu and Google, depending on what they are searching.

My approach is to search for Chinese information in Chinese. I don’t read or speak Chinese, but I have team members who do. If one of these people sends me a link to a document in a language other than English, I use various online translation systems to get the main idea. Then I pick up the phone and talk with the native speaker about the information.

I completed the article with a big blue exclamation point:

In conclusion, the truth is that there is very little data on the Internet related to global search trends and user preferences. If the Internet has taught us one thing it is that being more visible on the Web is always to the benefit of the marketer. So if your brand has not been running search campaigns across more networks than just Google, now is the time to start. The insights that can be derived from a test campaign alone can reveal hugely important details related to the search habits of your target audience. Even if the outcome of a Yahoo paid search campaign reaffirms that a strictly Google campaign is the way to reach your brand’s target audience, there is only one way to find out – test it.

It seems, gentle reader, that the article is less about the search thing and more about the marketing services thing. That’s okay. Little wonder that niche search engines are poking their noses into the big, uncertain world. One can now search for gifs at GifMe or Giphy for this reason.

Back to Inc. What the heck is the editorial policy at Inc.? Wonky word choice and an article about search which does not address the topic of what the world of search looks like. Looks like content marketing at best and editorial shortcutting from my vantage point in rural Kentucky. Really.

Stephen E Arnold, May 26, 2016

The Internet Archive: How It Works

May 26, 2016

I have noted that the interface for the Internet Archive is interesting. For me, the system is almost unusable.

I read “The Technology Behind the World’s Worst DVR.” I think the write up does a good job of explaining some of the challenges the system presents to the developers and to the users. I learned that the political ad archive works like this:


Okay, WordPress. I am a bit fuzzy about the other icons, however.

The good news appears in this statement:

Over the coming months we are working to make the system more accurate, and exploring ways to get it so that it can automagically identify newly released political ads without any need for manual entry.

Worth monitoring.

Stephen E Arnold, May 26, 2016
