The Underside of the Internet, Just Slightly Off Base

October 11, 2017

Deutsche Welle ran a story about the Dark Web called “Darknet, The Shady Internet.” I found the approach interesting. Let me mention that I am the author of Dark Web Notebook, a guide for law enforcement and intelligence professionals. (Information about the Notebook is at this link.) I don’t want to work pedantically through the write up, pointing out issues I have with some of the assertions. I do want to highlight the conclusion of the article. DW points out that LE and intel professionals have to use methods which seem to be less than elegant. Here’s the passage I highlighted:

So what can police, federal law enforcement officials, secret police and international crime-fighting networks do to combat the darknet? Some tactics are surprisingly old fashioned. One is to purchase an illegal item from a darknet marketplace and then analyze the package and its contents when it comes in the mail. With enough data, police can hone in on the package’s source. Another tactic is to build rapport with the site’s owner, say a drug dealer, and to request a real-life meeting to exchange the goods.

I would point out that there are a number of companies which offer specialized products and services to assist LE and intel professionals with Dark Web investigations. These range from the Google and In-Q-Tel funded Recorded Future to the less well known Terbium Labs. There are other companies as well, and I profile a number of them in Dark Web Notebook.

I am surprised that DW invested only modest effort in its write up. Dark Web content is a tiny fraction of the data available online. Nevertheless, as censorship increases in some countries and at firms such as Facebook, Google, and Twitter-type companies, the Dark Web will experience some growth despite the hurdles the Dark Web puts in front of users.

I would point out that in the Dark Web Notebook we recount an anecdote involving a German policeman who explored the Dark Web and found himself caught in a digital bear trap. Thus, knowledge of the sophisticated tools available to LE and intel professionals is important. Leaving these out of an article from a respected “news” organization underscores the need for a bit more attention to detail and context.

Stephen E Arnold, October 11, 2017

Search and Privacy: A Quick Update

October 3, 2017

In my files, I had a copy of “Duck Duck Go: Illusion of Privacy.” This document comments on the hurdles a public Web search system must jump over in order to deliver privacy. You can find the write up at this link. If you want to test some privacy-oriented search systems, there are some alternatives. I am not endorsing these outfits; I am passing along some links because within the last couple of years I learned that privacy is part of the marketing for these systems:

[a] Ixquick, which is now Startpage. This is a metasearch engine, which means that the user’s query is passed (in theory, anonymously) to Bing, Google, Yandex, et al.

[b] A European service. (Note that this service asserts “strong privacy.”)

[c] The Gibiru service emphasizes anonymous search. Gibiru provides a link to the Firefox Anonymox plug in. The most recent version of Firefox has been tricky for us, however.

My personal view on search anonymization is that when I research my books about cyberosint, the Dark Web, and eDiscovery for cyber intelligence, I assume that I have a number of individuals thrilled with the sites we uncover, write up, and describe in our lectures and webinars. In short, I avoid trying to be “tricky” because I can explain the thousands of queries we run about many exciting topics. See for a sampler.

Stephen E Arnold, October 3, 2017

Facebook: A Pioneer in Bro-giveness?

October 2, 2017

The write up “Mark Zuckerberg Asks for Forgiveness from ‘Those I Hurt This Year’ in Yom Kippur Message” surprised me. In my brief encounters with Silicon Valley “bros”, I cannot recall too many apologies or apologetic moments. My first thought was, “Short circuit somewhere.”

The Verge article explained to me: [Mark Zuckerberg, founder of Facebook] publicly asked for forgiveness for those I hurt this year.

I thought online companies were like utilities. Who gets excited if a water main breaks and drowns an elderly person’s parakeet? Who laments when a utility pole short circuits a squirrel? Who worries if an algorithm tries to sell me an iPhone when I am an Android-type senior citizen?

I noted this statement:

Zuckerberg acknowledged that Facebook has had a divisive effect on the country, and that he’ll work to do better in the coming year.

I like New Year’s resolutions.

The write up quotes another Silicon Valley source which I sometimes associate with enthusiasm for what’s new and “important”:

Facebook itself needs to do better to improve its efforts in combating the spread of false information and abuse that appears throughout its platform. It and other social media sites have often touted themselves as a neutral platforms for all ideas and beliefs, but underestimate how these ideals can be undermined, which led to tangible impacts in the real world. Zuckerberg may be sincere in his intentions, but the company he founded needs to follow through on them.

Follow through? Okay.

I think of this commitment to do better as the Silicon Valley equivalent of the New Yorker’s breezy, “Let’s have lunch.”

Is bro-giveness a disruptive approach to forgiveness? If it is, click the Like button.

Stephen E Arnold, October 2, 2017

Silicon Valley and the Butterflies

September 24, 2017

I read “The Tide Is Starting to Turn Against the World’s Digital Giants.” The idea is that those butterfly wings in Brazil can whip up Irma in Miami. Maybe? Maybe not? “Real” journalists have been paddling their canoes away from a whirlpool for decades.

Their efforts, like those sucked into the digital maws of the evil “Internet,” have not gone well. The Guardian newspaper, itself whipped by one digital transformation after another, wants the old order restored. The idea is that “real” journalists and other “intermediaries” were gatekeepers. Now the distributed technologies have replaced the “old” gatekeepers with “new” gatekeepers. Oh, the “real” journalists sigh, “We want to be Facebook. We were, you know.”

The write up explains that the information satans are about to get their comeuppance. About time, I think the write up suggests.

I noted these comments in the “real” news write up:

[a] Multimillion fines are just the start for Facebook and Google, as the world comes to realize how political big tech has become [a piñata? Satan’s right hand? the destroyer of “real” news? Sentence fragments invite completion even when they appear in the Guardian newspaper]

[b] What’s more interesting are various straws in the wind that show how digital behemoths are losing their shine. Many of these relate to Brexit and the election of Donald Trump, and to the dawning of a realization that Google and Facebook in particular may have played some role in these political earthquakes.

[c] What we’ve come to understand over the last two years is that, to coin a slogan, the technical is political.

I find it interesting that the intellectual touchstone for the write up is not the history book but Buzzfeed. Yep, Buzzfeed, a love child of the Guardian in spirit perhaps?

Ah, “real” journalism. Bashing successful companies is effective with a digital information service as one’s inspiration.

Stephen E Arnold, September 24, 2017

Security: Whom Does One Trust?

September 19, 2017

I read “The Market Can’t – and Won’t – Deal with IT Security, It Must Be Regulated, Argues Bruce Schneier.” The write up is about online, which is of interest to me. I found the summary of the remarks of Bruce Schneier, a security expert, interesting.

The main point is that government must regulate security. I highlighted this passage:

The market can’t fix this. Markets work because buyers choose between sellers, and sellers compete for buyers. In case you didn’t notice, you’re not Equifax’s customer. You’re its product.

Several questions occurred to me:

  1. Which government? Maybe the United Nations?
  2. What’s the enforcement mechanism? Is after-the-fact “punishment” feasible?
  3. What’s the end point of security regulation?

Here in rural Kentucky security boils down to keeping an eye on the two brothers who live in a broken down trailer next to the crazy people who have a collection of wild animals. The wild animals are less threatening than these fine examples of Appalachian oak.

In the larger world which includes a number of nation states which are difficult to influence, how are the regulations to be enforced? What if one of these frisky nation states is behind the headline making security breaches?

Answers to these questions are likely to be cause for discussion. Talk is easy. Remediation may be a bit more difficult. Perhaps the barn has burned and the horses already converted to glue and dog food?

Fixes are hard. Talk, well, just talk.

Stephen E Arnold, September 19, 2017

Bing and Google: The News Battle

September 15, 2017

I read “Bing Battles Google News with Its Own Make-Over.” I noted the alliteration: Bing battle. I immediately thought, “Google Gropes.” Both of these companies are trying to reinvent the newspaper using zeros and ones, not dead trees. Let’s look at some of the points I highlighted:

I noted this statement about everyone’s most lovable online ad vendor:

Google redesigned their desktop Google News website. Their [sic] new UI has a clean and uncluttered look.

Microsoft responded. I circled this statement:

Microsoft recently updated their Bing News experience that will help users in finding the most up to date and well-rounded information.

Note that the pivot of both sentences is a subjective assertion: “clean and uncluttered” for the GOOG, and “most up to date and well-rounded” for Bing.

Some facts would be useful. I am not sure what “clean” or “uncluttered” means. My recollection is that Einstein’s desk, like most “dead tree” newspapers, was organized in an eclectic manner. Facts supporting these assertions might be difficult to conjure.

The “most up to date” statement should be easy to back up. What’s the latency of the system? The superlative “most” means that Bing is the top dog in news. Hmmm. I don’t buy this.

My point is that the write up provides a useful idea: Neither Bing nor Google has figured out how to present “news” to each system’s online users. The implicit idea is that “dead tree” methods are of little use. Inspiration comes from each system’s response to what the other system does.

Cold War methods applied to online “news”? That’s what the write up signals to me.

Let’s step back.

Online users have different reasons for wanting news. Some folks chase sports, which as I recall was the most read section of the “dead tree” newspaper company at which I once worked. Other people have quite different reasons for scanning the news; for example, there are some who read the obituaries, others seek cartoons, and others want the latest on the real housewives.

Bing and Google have to figure out how to meet these diverse needs because the “dead tree” crowd has fallen in the forest.

The write up tells me one thing: Neither Google nor Microsoft has any idea about reinventing what “dead tree” newspapers used to do.

Now what? Shape the news to fit what each company’s filters “decide” is “real news”?

Stephen E Arnold, September 15, 2017

Why Not Have One or Two Smart Software Platforms? Great Idea!

September 8, 2017

I have been writing about online for decades. In my Eagleton Lecture in the 1980s (I forget the year I received an ASIS Award and had to give a talk as part of the deal), I pointed out that online concentrates. It is not just economy of scale; a single online service operates like a magnet. Once critical magnetism is achieved, other services clump around or to that central gizmo. There are fancy economic explanations; for example, economies of scale, convenience, utility, and so on.

Now couple the concentration with one of the properties of digital information flows and we have some exciting things to think about. In that Eagleton Lecture, I used the example of the telegraph to illustrate how even inefficient forms of information movement can rework social landscapes.

The point is that concentration and flows are powerful forces. When data flows through an institution, that institution comes apart. When online power is concentrated, the nature of “facts” changes. Even the unconscious decisions of a widely used online service alter how individuals perceive “facts” and set their priorities. (Are you checking your email now, gentle reader?)

When I read “Facebook and Microsoft collaborate to simplify conversions from PyTorch to Caffe2,” I thought about that decades old Eagleton Lecture of mine. The write up describes what seems to be a ho-hum integration play. Here’s a passage I highlighted:

The collaborative work Facebook and Microsoft are announcing helps folks easily convert models built in PyTorch into Caffe2 models. By reducing the barriers to moving between these two frameworks, the two companies can actually improve the diffusion of research and help speed up the entire commercialization process.

Just a tiny step, right?

A few observations:

  • Clumping is now evident between Google and Walmart
  • Amazon and Microsoft are partnering to make digital gizmos play well together
  • Fusion (both data and services) is the go-to idea for next generation information access and analysis services.

The questions which seem interesting this morning are:

  1. Why do we need to have multiple artificial intelligence platforms? Won’t one be more efficient and “better”?
  2. How will users understand reality when smart software seamlessly operates across what appear to be separate functions?
  3. What will regulators do to control clumping which will command the lion’s share of revenue, resources, and influence?

Yep, just a minor step with this PyTorch and Caffe2 deal.

Online can be exciting and transformative too.

Stephen E Arnold, September 8, 2017

Factoids about Toutiao: Smart News Filtering Service

August 28, 2017

The filtering service Toutiao is operated by Bytedance. The company attracted attention because it is generating money (allegedly) and has lots of users or “daily average users” in the 120 million range. (If you are acronym minded, the daily average user count is a DAU. Holy Dau!)

Forget Google’s “translate this page” for Toutiao; the service is blind to the Toutiao content. A work around is to cut and paste snippets into a translation service or get someone who reads Chinese to explain what’s on Toutiao’s pages.

Other items of interest include the following. (Oh, the hyperlinks point to the source of the factoid.)

  • $900 million in revenue (allegedly). Wall Street Journal, August 28, 2017, with a pay wall for your delectation
  • Funding of $3 billion (Crunchbase)
  • Valuation of $20 billion or more (Reuters)
  • Toutiao means headlines (Wikipedia)
  • What it does, from Wikipedia:

Toutiao uses algorithms to select different quality content for individual users. It has created algorithmic models that understand information (text, images, videos, comments, etc.) in depth, and developed large-scale machine learning systems for personalized recommendation that surfaces content users have not necessarily signaled preference for yet. Using Natural Language Processing and Computer Vision technologies in A.I, Toutiao extracts hundreds of entities and keywords as features from each piece of content. When a user first open the app, Toutiao makes a preliminary recommendation based on the operation system of his mobile device, his location and other factors. With users’ interactions with the app, Toutiao fine-tunes its models and make better recommendations.

  • Founded by Zhang Yiming, age 34, in 2012 (Reuters)

Technode’s “Why Is Toutiao, a News App, Setting Off Alarm Bells for China’s Giants?” suggests that Toutiao may be the next big Chinese online success. The reason is that the service aggregates “news” from disparate content sources; for example, text, video, images, and data.

Toutiao may be the next big thing in algorithmic, mobile centric information access solutions. The company generates revenues from online ads. The company’s secret sauce includes smart software plus some extra ingredients:

  • Social functions
  • Search
  • Video
  • User generated “original” content
  • Global plans.

Net net: Worth watching.

Stephen E Arnold, August 28, 2017

Web Search Training Wheels: A Play for Precision

August 10, 2017

I read “How to Instantly Boost the Accuracy of Search Results on Google and Bing.” I love the word “instantly”, particularly when coupled to “accuracy.” The write up describes an overlay called Advangle, which helps a person create a search with more than 2.6 words. Interesting neologism, Advangle.

These services are what I call “training wheels.” The idea is that a person looking for information fills in a form, which helps the person create a query more sophisticated than “pizza.” Many systems in the last 50 years have tried these types of interfaces. In fact, one can find them in the whiz bang interfaces available to cyber OSINT software users. I won’t drag the old Dow Jones interface into this post, nor will I provide screenshots of Palantir Gotham interfaces. (Hey, you probably know about these already.)

The write up, however, does not explore the concept in too much detail. I noted this statement:

The Advangle interface makes it easier to string together targeted searches with the right syntax, and in half the time it would take to type it all out by hand.

Saving time, not precision or recall, is the unique selling proposition.

It is useful to keep in mind that formal search operators are still available to users of Bing, Google, Yandex, and a number of other systems. The problem is that as Web search has massified, only a tiny fraction of the users of ad supported Web search systems bothers with formal operators like filetype: or other oddities.
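For readers who have not typed an operator in a while, here is a minimal sketch in Python of the kind of query string a form-style overlay might assemble. This is an illustration only, not Advangle’s code: the function build_query and its parameter names are hypothetical, and operator support (site:, filetype:, a leading minus sign for exclusion) differs from one search system to another.

```python
def build_query(terms, phrase=None, site=None, filetype=None, exclude=()):
    """Assemble a search string from form-style fields (hypothetical helper)."""
    parts = list(terms)                      # plain keywords, in the order given
    if phrase:
        parts.append(f'"{phrase}"')          # exact phrase goes in double quotes
    if site:
        parts.append(f"site:{site}")         # restrict results to one domain
    if filetype:
        parts.append(f"filetype:{filetype}") # restrict results to one file type
    parts.extend(f"-{word}" for word in exclude)  # leading minus excludes a word
    return " ".join(parts)


query = build_query(
    ["dark", "web"],
    phrase="law enforcement",
    filetype="pdf",
    exclude=["game"],
)
print(query)
```

The string that comes out can be pasted into a search box by hand. The overlay saves keystrokes; it does not decide which terms, sources, or file types are the right ones.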

The real problems with search are far deeper than an interface overlay. Let me highlight several which I find consistently troublesome:

  1. Finding a way to impart the skills of a well executed reference interview conducted by an expert in online search and retrieval. (Marydee Ojala, Ruth Patel, Anne Mintz, Ulla de Stricker, and Barbara Quint are individuals who can help a PhD formulate a statement of what information and data are needed, convert that desire into appropriate queries of appropriate databases, and deliver a filtered list of results.) Software, no matter how nifty the interface, at this time cannot replicate this expertise.
  2. Individuals who need information are more crippled than their counterparts from 30 years ago. Online systems have worked hard to let popularity and past user behavior provide a context for a query like “cyrus.” If you think you will get the pop star before a long dead historical figure, you are more sophisticated than the eager consumers of pop up ads on a Pixel phone.
  3. Databases are governed by editorial policies. In the good old days of 1975, creators of databases figured out what and how to index. Today most users believe that Google has “all” the world’s information. Nothing could be more wrong headed. Indexes, particularly free ones, include what creates traffic. If the content gets a little too frisky, censorship, filtering, and smart / predictive software step in and deliver “better” information.

I suggest you give the Advangle service a try. You may find that it is better than a room stuffed with Quints and Ojalas and others of this ilk.

My approach is simple: Know what one wants. Formulate a suitable query. Pass the query across the sources/databases likely to have indexed the information. Review the results. Think about the information gaps. Repeat the process.

Pretty crazy today, right?

Who has time to figure out what companies are in the cyber OSINT business or what Dark Web sites continue to offer contraband in the wake of AlphaBay and Hansa?

Research via digital resources, unlike checking Facebook, is a bit of a mental workout.

On the other hand, why not let the ad supported search engines deliver exactly what they think you need? Better yet, let these outfits provide that information before you know you need it.

A system that actually delivered precise, on point, timely, and authoritative results would be great. It would be nice to be able to live forever and travel to the stars.

Reality is a tad different. UX is not yet a replacement for knowing how to research in a way that moves beyond finding Game of Thrones.

Stephen E Arnold, August 10, 2017

India Jumps on the Filtering Bandwagon

August 9, 2017

We noted “Internet Archive Contacted Indian Govt Regarding the Block but Got No Response.” The main point is that a repository (incomplete as its collection of Web pages may be) seems to be unavailable in India. Perhaps the Indian government has found a way to search for information in the service. We have noted that searching for rich media, including the collection of 78 rpm records, is a tough slog. It is tough to find information even when it is online. When services are filtered, locating facts, semi-facts, and outright hoohaw becomes impossible. We think the actions could impair the outstanding customer support services provided by the world’s second largest nation. Efficient delivery of information centric services, however, is likely to improve in Mumbai. China, Indonesia, Russia, Turkey, and now India may be taking steps to put the data doggies in the kennel.

Stephen E Arnold, August 9, 2017
