
Digital Censorship: The Dartmouth Solution

June 21, 2016

I love the Ivy League. Yale, Princeton, Dartmouth, et al. I read “New Tool to Take Down Terrorism Images Online Spurs Debate on What Constitutes Extremist Content.” The source is the Bezos owned newspaper, which may or may not make a difference when applying the filters.

Here’s the method I circled in Hawthorne A red:

It [censoring mechanism] works by creating a distinct digital signature or “hash” for each image, video or audio track. The idea is to create a database of hashed content that Internet firms can use in automated fashion to vet images uploaded to their platforms. If there’s a match, the company can determine whether it violates its terms of service and should be taken down.
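The hash-and-match idea in that passage can be sketched in a few lines. This is a minimal illustration, not the Dartmouth tool: the `HashDatabase` class, the labels, and the sample bytes are all hypothetical, and SHA-256 is swapped in as a stand-in signature. A cryptographic hash only matches byte-for-byte identical files; a production system would presumably need a more robust signature that survives re-encoding or cropping.

```python
import hashlib


def signature(data: bytes) -> str:
    """Return a hex digest used as the content's "hash" (exact-match only)."""
    return hashlib.sha256(data).hexdigest()


class HashDatabase:
    """Hypothetical store of signatures for previously flagged content."""

    def __init__(self):
        self._known = {}  # digest -> label assigned when the item was flagged

    def add(self, data: bytes, label: str) -> None:
        self._known[signature(data)] = label

    def check(self, data: bytes):
        """Return the label if the upload matches a flagged item, else None."""
        return self._known.get(signature(data))


db = HashDatabase()
db.add(b"bytes of a flagged video", "flagged-item-001")

upload = b"bytes of a flagged video"
match = db.check(upload)              # identical bytes, so this matches
altered = db.check(upload + b"\x00")  # one extra byte, so no match
```

A match only tells the company that the upload equals something already in the database. Deciding whether it violates the terms of service remains a human or policy call, which is the part the write up says people are debating.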

The write up brings up the challenge of defining what should and should not be filtered. If you are interested in automated filtering, check out the write up and the Dartmouth wizard who has created what he thinks may be a “game changer.”

What could go wrong with an Ivy League created system? If one cannot find information online, it does not exist. Is non existence a net good?

Stephen E Arnold, June 21, 2016

Digital Currencies: More Excitement

June 21, 2016

An “attacker” explains the legal perception he has. You can read this argument at this link. I do not have a horse in this race. In my recent lecture at a security conference in Myrtle Beach, SC, I pointed out that digital currencies work reasonably well for what I call small scale transactions. Putting one’s life savings into a digital currency is a step some bad actors are reluctant to take. Traditional non digital money laundering and tax evasion methods will slowly yield to Fancy Dan types of “money.” But if you are adventurous, have a go.

Stephen E Arnold, June 21, 2016

Murdoch Wall Street Journal Factiva: Known Unknowns

June 10, 2016

That Donald Rumsfeld statement about known knowns, known unknowns, etc. is back. The Wall Street Journal ran an ad for Factiva. You remember Factiva. It is the Dow Jones Information Service repositioned and renamed a number of times over the last 15 or 20 years.

If you are into for fee search, you will know about Factiva and its kissing cousins: LexisNexis (bring your legal client’s purchase order), CSA ProQuest Dialog (bring your library acquisition budget), and Ebsco (bring your credit card). For fee information services serve the professional searcher market. Most people, including Gen X and Millennial researchers, are happy with Google. Objective results every time.

The for-fee services are still around. Public library and university fund raising programs help pay for access. Some queries returning zero useful results can cost $100 or more. Hey, you didn’t know, right?

If you navigate to the June 2, 2016, Wall Street Journal, page A7 in my dead tree edition ran a full page ad for Factiva. The ad highlights a couple of pie charts. Here they are in a tough to read gray and blue motif. Users of commercial database services have really sharp eyes and don’t need high contrast text, right?

The first pie chart shows your life consumed with research. Notice how little time one has to eat lunch. Note what a tiny portion of one’s day is available for email, Facebook, talking with colleagues, making sales calls, printing, the youth soccer telephone tree.


Now look at the second chart.


Look at the many different tasks one can undertake in a single work day. One can, of course, “take lunch.” I eat lunch, but that’s because here in rural Kentucky, we “eat” a meal. We make decisions. Apparently in Factiva land one takes a meal and probably takes decisions.

Other tasks one can pursue when one has Factiva are:

  • Collaborate across departments
  • Advise colleagues
  • Stay on top of the news (Hey, it is part of that real journalism outfit owned by Mr. Murdoch. No bugging telephones, please.)
  • Create a company newsletter. (I assume this word is “blog”, a Snapchat, or a tweet, but I could be off base.)
  • Build powerful infographics. (Hmmm. I thought art types created infographics based on the data generated by a business intelligence system.)
  • Research. Yes via Factiva.

Now I know that I am really out of the flow. The diagram showing the difference between Baby Boomers and Millennials created by ace research analyst Mary Meeker reminded me of the gulf between my demographic and the zippy millennials.


Slide 51 from the Meeker, State of the Internet report.

The main point for me is that I possess zero of the attributes of millennials. I don’t earn to spend. I am retired. I conserve to pay for the old age home which I believe millennials call “opportunities for bingo.”

But the best part of the Factiva ad is the copy. I know words. Those nifty pie charts were the cat’s pajamas, weren’t they?

Here’s the guts of the message:

Spend your day working, not searching. Factiva’s reputable sources, flexible search and powerful insights provide access to thousands of quality, licensed, news and information sources in 28 languages. Know unknowns. [Emphasis added]

If Ms. Meeker is correct in her research, which draws on supporting information from Hillhouse Capital, dozens of what appear to be primary sources, and many hours of searching commercial online and Web resources, messaging apps are where the future is. Oh, there are videos too, but the takeaway is that traditional methods of getting digital information are in the same spot newspapers were yesterday.

The ad warrants several questions:

  • Why does it have to be so darned big? Maybe small ads in the Wall Street Journal are ignored?
  • How many of the Wall Street Journal’s readers are information specialists trained in the use of commercial online services? Judging from the Special Library Association’s challenges, I would suggest that the ad would have made sense to the corporate information specialist working in 1986, not 2016.
  • What’s with the wonky pie charts? When I worked at a commercial database company, I don’t recall meeting any online users who spent the bulk of every day online. There were reference interviews (remember them, millennials?), culling the outputs from dot matrix printers, and planning search strategies before going online and whacking away.

Mr. Rumsfeld’s statement about knowns and unknowns emerged from his brush with the murky world of government related information. If he were to use Factiva today, would he have modified this famous statement:

There are known knowns. These are things we know that we know. There are known unknowns. That is to say, there are things that we know we don’t know. But there are also unknown unknowns. There are things we don’t know we don’t know.

Perhaps Factiva, like IBM Watson, is an information search system that is easier to describe than to turn into a lean, mean, money making machine? I would suggest that the answer for decades has been an unknown unknown.

Stephen E Arnold, June 10, 2016

eBay and Facebook: Different Spins in Online Sales

April 14, 2016

I noted two seemingly unrelated items about two different companies. Here are the two items:

  1. Russian Diplomat: ISIS Making $200 Million Selling Stolen Artifacts on eBay
  2. Weapons for Sale on Facebook in Libya

In our work on the “Dark Web Notebook,” we have examined a number of sites which purport to offer contraband or prohibited products. These sites have been accessible using special software.

What is interesting is that the difference between the Dark Web and the “regular” Web seems to be blurring.

If these two stories are accurate, questions about governance by the owners of the Web sites may be raised. Since we began working on this new study of online content, we have noted that the boundary separating the Web which billions use from the Web tailored to a smaller set of online users is growing more difficult to discern.

In itself, the boundary’s change is interesting.

Stephen E Arnold, April 14, 2016

US Control of Internet Over

March 20, 2016

Short honk: I read “Quietly, Symbolically, US Control of the Internet Was Just Ended.” The write up explains that at a meeting in Morocco, people who run the “Internet’s naming and numbering system” have a plan

to end direct US government oversight control of administering the internet and commit permanently to a slightly mysterious model of global “multi-stakeholderism”.

What’s multi stakeholderism? I noted the reference to Snowden but multi stakeholderism?

Stephen E Arnold, March 20, 2016

Anonymous Hacks Turkish Cops

February 17, 2016

No Dark Web needed.

Anonymous has struck again, this time hacking the Turkish General Directorate of Security (EGM) in its crusade against corruption. The International Business Times reports, “Anonymous: Hacker Unleashes 17.8 GB Trove of Data from a Turkish National Police Server.” It is believed that the hacker responsible is ROR[RG], who was also deemed responsible for last year’s Adult Friend Finder breach. The MySQL-friendly files are now available for download at TheCthulhu website, which seems to be making a habit of posting hacked police data.

Why has Anonymous targeted Turkey? Reporter Jason Murdock writes:

“Anonymous has an established history with carrying out cyberattacks against Turkey. In 2015 the group, which is made up of a loose collection of hackers and hacktivists from across the globe, officially ‘declared war’ on the country. In a video statement, the collective accused Turkish President Recep Tayyip Erdoğan’s government of supporting the Islamic State (Isis), also known as Daesh. ‘Turkey is supporting Daesh by buying oil from them, and hospitalising their fighters,’ said a masked spokesperson at the time. ‘We won’t accept that Erdogan, the leader of Turkey, will help Isis any longer. If you don’t stop supporting Isis, we will continue attacking your internet […] stop this insanity now Turkey. Your fate is in your own hands.’”

We wonder how Turkey will respond to this breach, and what nuggets of troublesome information will be revealed. We are also curious to see what Anonymous does next; stay tuned.

Cynthia Murrell, February 16, 2016


Stealing Data on the Dark Web Just Became Easier

February 1, 2016

“Underground Black Market: Thriving Trade in Stolen Data, Malware, and Attack Services” assumes that the reader knows the basics of the Dark Web. Let’s take a step back.

Before we talk about stealing data on the Dark Web we must first define what we mean by the Dark Web. Most internet users never go beyond the surface web, that part of the Web that consists of static Web sites such as Google, Facebook, and YouTube. What makes the Dark Web so interesting is that it is not entirely dark.

In fact, many Dark Web sites and their content are visible to the public. What are not visible are the server addresses, and that keeps most people from seeing who is running the sites.

In the article, Candid Wueest talks about a new paradigm for stealing and moving stolen data on the Dark Web. I noted this passage about crimeware-as-a-service:

Attackers can easily rent the entire infrastructure needed to run a botnet or any other online scams. This makes cybercrime easily accessible for budding criminals who do not have the technical skills to run an attack campaign on their own. A drive-by download web toolkit, which includes updates and 24/7 support, can be rented for between $100 and $700 per week.

That means that it is becoming increasingly easy for criminals to find, access, and sell data. Now you know. Now, anyone, including your local bad actor or your 11 year old, can access and steal data.

Here’s a troubling factoid from “The Tangled World of Stolen Data,” which we assume is spot on: It takes about 205 days for a company to detect a data breach, more than enough time for a cybercriminal to sell the data and get it distributed on the Dark Web.

So what can law enforcement agencies do? New advances in Dark Web access, such as I2P, are making it more difficult for these agencies to identify and react to data crimes. What this means is that security companies and law enforcement agencies will need to be creative. The FBI ran an offensive image site to get a grip on alleged wrong doers.

Perhaps the Dark Web is not as dark as many assume.

Martin A. Matisoff, MSc, February 1, 2016

A Road Map to the Dark Suburb of i2p Content

February 1, 2016

According to the I2P Web site, the Invisible Internet Project (I2P) is an

anonymous overlay network … that is intended to protect communication from dragnet surveillance and monitoring by third parties such as ISPs and … is used by many people who care about their privacy: activists, oppressed people, journalists and whistleblowers.

Users who wanted information about I2P and I2P services had two options: search the web and create their own guide over time, or visit the I2P website, which provides a useful index to I2P.


A richer I2P resource is one you may want to explore. A fascinating Baedeker for the Dark Web is available on a pastesite, which is an anonymous publishing service.

The Guide to I2P and I2P Services Version 1 provides a Cliff’s Notes to sources of products, services, and information about weapons, controlled substances, and stolen Uber accounts. There are descriptions of the best ways for users to configure their computers so they can access .i2p sites and what they need to do once connected to these hidden services.

The guide offers a plethora of links to some of the most requested I2P sites, including image boards such as Anch, a site for and by anarchists; file sharing sites such as Document Heaven; financial sites such as VEscudero’s Service, Darknet Products, and social sites such as id3nt and Visibility. Investigators may understand Facebook and Twitter, but the Dark Web is, for many, a digital Rubik’s cube.

In addition, the guide offers tutorials and other material, including links to sites for users who speak different languages such as Russian, German, and Spanish.

The Guide to I2P and I2P Services not only provides numerous links to I2P sites, but it addresses concerns about the dangers of relaying encrypted traffic and Java vulnerabilities. Furthermore, it tells you how to connect to I2P IRC servers that are not part of IRC2p. The guide can help you map the dark net maze.

How can investigators, analysts, and intelligence professionals get a working understanding of i2p? Easy. Contact benkent2020 at and inquire about our on site or online webinars about the Dark Web.

Martin A. Matisoff, MSc, February 1, 2016

AOL: A New Do for the Year of the Monkey

January 18, 2016

I read “AOL’s Identity Crisis: The Company May Ditch the AOL Brand.” I remember the flood of discs. I remember the hidden files thoughtfully placed on my hard drive when I installed America Online. I remember the Xoogler who bought his own local publishing outfit to reinvigorate AOL and, of course, his own local America Online. I remember Verizon buying AOL because, well, it could.

According to the write up:

one of AOL’s biggest priorities for the new year is figuring out its brand and investing in it, even if that means saying goodbye to the name “AOL” in favor of launching something completely new.

I learned that

Mark Ritson, a leading brand expert and marketing consultant, tells Business Insider that he also thinks the messy corporate brand definitely needs a clean up. AOL is tricky, he says, because it has very strong brand awareness, but that its image “has an unpalatable mix of being seen to be out of date and a business failure.”

No matter what name Verizon chooses, AOL will always evoke fond memories for me. The dial up modem, the chat groups, the fantastical email services.

So many memories. What ever happened to that Xoogler’s local news idea? Ah, it did not work out.

I look forward to the Yahooligans following AOL’s trajectory. Two new branding opportunities for the marketing consultants.

It is the year of the monkey too. Love those creatures.

Stephen E Arnold, January 18, 2016

Boolean Search: Will George Boole Rotate in His Grave?

January 12, 2016

George Boole is, for most math wonks, the father of Boolean logic. This is a nifty way to talk about sets and what they contain. One can perform algebra and differential equations whilst pondering George and his method for thinking about fruits when he went shopping.

In the good old days of search, there was one way to search. One used AND, OR, NOT, and maybe a handful of other logic operators to retrieve information from structured indexes and content. Most folks with a library science degree or a friendly math major can explain Boolean reasonably well. Here’s an example which might even work on CSA ProQuest (née Lockheed Dialog) even today:

CC=77? AND scam?

The systems, when fed the right query, would reply with pretty good precision and recall. Precision meant that what came back was actually useful: the share of retrieved items that were on point. Recall meant that what should be included was in the result set: the share of on point items that were retrieved.
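For readers who never sat at a Dialog terminal, here is a rough sketch of what a query like the one above does and how precision and recall are scored. Everything in it is an assumption for illustration: the four records, their category codes, and the set of records judged relevant are invented, and `?` is treated as a truncation (prefix) wildcard. It is not the code of any real online service.

```python
# Hypothetical records: id -> (category code, text)
RECORDS = {
    1: ("7701", "investment scam targets retirees"),
    2: ("7702", "new scams reported by regulators"),
    3: ("7701", "ponzi scheme uncovered at brokerage"),
    4: ("1200", "email scam warning"),
}


def code_prefix(prefix):
    """Ids whose category code starts with the prefix (CC=77?)."""
    return {rid for rid, (code, _) in RECORDS.items() if code.startswith(prefix)}


def term_prefix(prefix):
    """Ids whose text has a word starting with the prefix (scam?)."""
    return {
        rid
        for rid, (_, text) in RECORDS.items()
        if any(word.startswith(prefix) for word in text.split())
    }


def precision_recall(retrieved, relevant):
    """Precision: share of retrieved that is relevant. Recall: share of relevant retrieved."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# CC=77? AND scam?  -> Boolean AND is a set intersection
result = code_prefix("77") & term_prefix("scam")

# Assume a human judged records 1, 2, and 3 to be on point.
relevant = {1, 2, 3}
precision, recall = precision_recall(result, relevant)
```

With these made up records the query retrieves records 1 and 2, so everything returned is on point, but record 3 is missed because it never uses a word beginning with “scam.” Swapping `&` for `|` gives OR, and set difference (`-`) gives NOT.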

I thought about Boole, fruit, and logic when I read “The Best Boolean and Semantic Search Tool.” Was I going to read about SDC’s ORBIT, ESA Quest, or (heaven help me) the original Lexis system?


I learned about LinkedIn. Not one word about Palantir’s injecting Boolean logic squarely in the middle of its advanced data management processes. Nope.

LinkedIn. I thought that LinkedIn used open source Lucene, but maybe the company has invested in Exorbyte, Funnelback, or some other information access system.

The write up stated:

If you use any source of human capital data to find and recruit people (e.g., your ATS/CRM, resume databases, LinkedIn, Google, Facebook, Github, etc.) and you really want to understand how to best approach your talent sourcing efforts, I recommend watching this video when you have the time.

Okay, human resource functions. LinkedIn, right.

But there is zero content in the write up. I was pointed to a video called “Become a LinkedIn Search Ninja: Advanced Boolean Search” on YouTube.

Here’s what I learned before I killed the one hour video:

  1. The speaker is in charge of personnel and responsible for Big Data activities related to human resources
  2. Search is important to LinkedIn users
  3. Profiles of people are important
  4. Use OR. (I found this suggestion amazing.)
  5. Use iterative, probabilistic, and natural language search, among others. (Yep, that will make sense to personnel professionals.)

Okay. I hit the stop button. Not only will George be rotating, I may have nightmares.

Please, let librarians explicitly trained in online search and retrieval explain methods for obtaining on point results. Failing a friendly librarian, ask someone who has designed a next generation system which provides “helpers” to allow the user to search and get useful outputs.

Entity queries are important. LinkedIn can provide some useful information. The tools to obtain that high value information are a bit more sophisticated than the recommendations in this video.

Stephen E Arnold, January 12, 2016
