
Poor IBM i2: 15 Year Old Company Makes Headlines in Fraud Detection and Big Blue Is Not Mentioned

August 3, 2015

Before IBM purchased i2 Ltd from an investment outfit, I did some work for Mike Hunter, one of the founders of i2 Ltd. i2 is not a household name. The fault lies not with i2’s technology; the fault lies at the feet of IBM.

A bit of history. Back in the 1990s, Hunter was working on an advanced degree in physics at Cambridge University. His undergraduate degree was from Manchester University. At about the same time, Michael Lynch, founder of Autonomy and DarkTrace, was a graduate of Cambridge and an early proponent of guided machine learning implemented in the Digital Reasoning Engine or DRE, an influential invention from Lynch’s pre Autonomy student research. Interesting product name: Digital Reasoning Engine. Lynch’s work was influential and triggered some me too approaches in the world of information access and content processing. Examples can be found in the original Fast Search & Transfer enterprise systems and in Recommind’s probabilistic approach, among others.

By 2001, i2 had placed its content processing and analytics systems in most of the NATO alliance countries. There were enough i2 Analyst Workbenches in Washington, DC to cause the Cambridge-based i2 to open an office in Arlington, Virginia.

In the mid 1990s, i2 delivered tools which allowed an analyst to identify people of interest, display relationships among these individuals, and drill down into underlying data to examine surveillance footage or look at text from documents (public and privileged).
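For readers who think in code, here is a minimal sketch of the link analysis idea: people as nodes, relationships as edges, and each edge pointing back to the documents that support it. This is an illustration only, not i2's implementation. The names, records, and function names (`build_graph`, `associates`, `evidence`) are invented for the example.

```python
from collections import defaultdict

# Hypothetical records: (person_a, person_b, source_document). All values are invented.
LINKS = [
    ("Alice", "Bob", "phone_log_17"),
    ("Bob", "Carol", "wire_transfer_4"),
    ("Alice", "Carol", "memo_221"),
    ("Dave", "Bob", "phone_log_17"),
]

def build_graph(links):
    """Map each person to their neighbours and the documents backing each link."""
    graph = defaultdict(lambda: defaultdict(set))
    for a, b, source in links:
        graph[a][b].add(source)
        graph[b][a].add(source)
    return graph

def associates(graph, person):
    """Return the people directly linked to `person`, sorted by name."""
    return sorted(graph[person])

def evidence(graph, a, b):
    """Drill down: list the source documents that connect a and b."""
    return sorted(graph[a][b])

graph = build_graph(LINKS)
print(associates(graph, "Bob"))         # ['Alice', 'Carol', 'Dave']
print(evidence(graph, "Alice", "Bob"))  # ['phone_log_17']
```

The point of keeping the source documents on each edge is the drill down: an analyst can go from a line on a chart back to the underlying record.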

IBM has i2 technology, and it also owns the Cybertap technology. The combination allows IBM to deploy for financial institutions a remarkable range of field proven, powerful tools. These tools are mature.

Due to the marketing expertise of IBM, a number of firms looked at what Hunter “invented” and concluded that there were whizzier ways to deliver certain functions. Palantir, for example, focused on Hollywood style visualization, Digital Reasoning emphasized entity extraction, and Haystax stressed insider threat functions. Today there are more than two dozen companies involved in what I call the Hunter-i2 market space.

Some of these have pushed in important new directions. Three examples of important innovators are: Diffeo, Recorded Future, and Terbium Labs. There are others which I can name, but I will not. You will have to wait until my new Dark Web study becomes available. (If you want to reserve a copy, send an email to benkent2020 at yahoo dot com. The book will run about 250 pages and cost about $100 when available as a PDF.)

The reason I mention i2 is because a recent Wall Street Journal article called “Spy Tools Come to Wall Street” (print edition for August 3, 2015) and “Spy Software Gets a Second Life on Wall Street” did not. That’s not a surprise because the Murdoch property defines “news” in an interesting way.

The write up profiles a company called Digital Reasoning, which was founded in 2000 by a clever lad from the University of Virginia. I am confident of the academic excellence of the university because my son graduated from this fine institution too.

Digital Reasoning is one of the firms engaged in cognitive computing. I am not sure what this means, but I know IBM is pushing the concept for its fascinating Watson technology, which can create recipes and cure cancer. I am not sure about generating a profit, but that’s another issue associated with the cognitive computing “revolution.”

I learned:

In pitching prospective clients, Digital Reasoning often shows a demonstration of how its system responded when it was fed 500,000 emails related to the Enron scandal made available by the Federal Energy Regulatory Commission. After being “taught” some key concepts about compliance, the Synthesys program identified dozens of suspicious emails in which participants were using language that suggested attempts to conceal or destroy information.

Interesting. I would suggest that the Digital Reasoning approach is 15 years old; that is, only marginally newer than the i2 system. Digital Reasoning lacks the functionality of Cybertap. Furthermore, companies like Diffeo, Recorded Future, and Terbium incorporate sophisticated predictive methods which operate in an environment of real time information flows. The idea is that looking at an archive is interesting and useful to an attorney or investigator looking backwards. However, the focus for many financial firms is on what is happening “now.”

The Wall Street Journal story reminds me of the third party descriptions of Autonomy’s mid 1990s technology. Those who fail to understand the quantity of content preparation and manual, subject matter expert effort required to obtain high value outputs are watching smoke, not investigating the fire.

For organizations looking for next generation technology which is and has been working for several years, one must push beyond the Palantir valuation and look to the value of innovative systems and methods.

For a starter, check out Diffeo, Recorded Future, and Terbium Labs. Please, push IBM to exert some effort to explain the i2-Cybertap capabilities. I tip my hat to the PR firm which may have synthesized some information for a story that is likely to make the investors’ hearts race this fine day.

Stephen E Arnold, August 3, 2015

Endeca: Facets of Novelty

August 1, 2015

I am no specialist in the arcane art of legal eagle spotting. I did notice some references to a dust up between an outfit called Speedtrack and licensees of Endeca’s ageing search technology.

The Speedtrack outfit seems to have rights to an invention called “Method for Accessing Computer Files and Data, Using Linked Categories Assigned to Each Data File Record on Entry of the Data File Record.” This is explained brilliantly in US5544360, filed in February 1995.

Here’s a diagram showing how the user can click on categories to locate information. No typing required.
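In code terms, the click-to-narrow idea can be sketched roughly as follows. This is a simplified illustration of category based navigation in general, not the method claimed in either patent. The record names, categories, and the functions `narrow` and `remaining_categories` are assumptions made up for the example.

```python
# Hypothetical file records, each tagged with categories when it is entered.
RECORDS = {
    "q2_report.doc": {"finance", "2015", "report"},
    "q2_budget.xls": {"finance", "2015", "spreadsheet"},
    "hr_policy.doc": {"hr", "policy", "report"},
}

def narrow(records, selected):
    """Return the names of records carrying every selected category."""
    return sorted(name for name, cats in records.items() if selected <= cats)

def remaining_categories(records, selected):
    """Categories the user could click next: on matching records, not yet chosen."""
    cats = set()
    for name in narrow(records, selected):
        cats |= records[name]
    return sorted(cats - selected)

clicked = {"finance"}
print(narrow(RECORDS, clicked))                # ['q2_budget.xls', 'q2_report.doc']
print(remaining_categories(RECORDS, clicked))  # ['2015', 'report', 'spreadsheet']
```

Each click adds a category to `clicked`, the matching set shrinks, and only categories that still lead somewhere are offered. No typing required.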


Compare this to Endeca’s invention, “Hierarchical Data Driven Navigation System and Method for Information Retrieval.” This is US7062483, filed in 2001. You may also find US7035864 and US7325201 interesting as well.


“Federal Circuit Reaffirms Kessler Doctrine As A Patent Infringement Defense For Customers” explains that the Speedtrack infringement case pivots on the Kessler doctrine. Here’s the explanation from the article:

First, unlike res judicata, which is a defense that is personal to the parties in a prior litigation, the Kessler Doctrine “attaches to the [accused] product itself” and precludes a patentee from reasserting the same patent against the same (or “essentially the same”) product in a subsequent action.

Then I noted:

Second, the Federal Circuit ruled that the Kessler doctrine may be raised by customers as well as the product manufacturer or supplier.

What I found fascinating was this infringement related statement attributed to the presiding legal eagle:

Third, the Federal Circuit held that the Kessler doctrine applied to Speedtrack’s claim even though the Endeca software allegedly infringed only when combined with the customer’s own computer hardware.

I recall that Endeca’s faceted navigation burst upon the scene in the late 1990s. Who knew that Jerzy Lewak (co founder of Speedtrack), Slawek Grzechnik, and Jon Matousek seemed to be trying to figure out a way around the problem of keyword search before Endeca?

I wonder if Oracle were surprised too. I have a hunch Speedtrack was.

Stephen E Arnold, August 1, 2015

Palantir Sucks in More Dinero

July 24, 2015

I am all for keeping the companies involved with law enforcement and intelligence entities out of the public eye. The hoo hah about Hacking Team is a grim reminder of what happened to Gamma Group and FinFisher when information about their services and products hit the “real” journalists’ radar.

I want to point you to “Confirmed. Palantir Raise a Huge $450 Million Investment.” The write up points out:

This [more cash investments] confirms a report last month that the company was raising up to $500 million at a valuation of $20 billion – making it the third most valuable “startup” on the Valley scene. (If you can call a 16-year-old company that reportedly generates millions in revenue a “startup.”)

Palantir is a unicorn wearing an invisibility saddle, tack, and saddle blanket. That’s okay with me. My observation is that Palantir has technology which is intended to prevent untoward acts. Are these untoward acts being prevented? I will let you answer that question.

I have no comment on whether the Palantir technology works. Even court documents related to Palantir’s dust up with i2 Group Ltd (a former client of mine) are not public. Why would i2, the pioneer in Palantir’s software segment, get involved with legal eagles?

Perhaps someone will have an answer some day. For now, I will ignore the partially invisible unicorn. The company has plenty of stakeholders who are trying to figure out Palantir so my efforts are redundant.

Stephen E Arnold, July 24, 2015

Real Journalists and Presstitution

July 24, 2015

I read and enjoyed an article for one word: “presstitute.” You can see the word in context in “Are Media Companies One Native Ad Away from Becoming Presstitutes.” Perhaps the word “native” is not clear? Inclusions, inserts, or paid advertorials will make the meaning of native clear.

The idea is that “real” journalists, before the eye opening days of yellow journalism, were objective. Messrs. Pulitzer and Hearst were like Mark Zuckerberg and Larry Page more than a century ago.

Flash forward to the present and the “real” journalists are struggling to make their well honed business model work in a world of iPhones and Instagram.

Read the original essay. You get some dancing around the May pole, but the article is significant because of the word “presstitute” in my opinion. That’s a business model with legs. No comment about whether the legs are comely, hirsute, appropriate, or inappropriate from me, however.

Stephen E Arnold, July 24, 2015

Italian Firm Delivers Semantic API to Wall Street

July 22, 2015

Short honk: There are quite a few high technology firms chasing the deep pockets on Wall Street and in the City. Some, like Digital Reasoning, have teamed with larger players to capture customers. Others, like Connotate, have relied on their stakeholders to open doors. Many companies attended financial technology showcases to demonstrate the power of their intelligent systems; for example, Digital Shadows. Some companies like Terbium Labs show up and demonstrate how their advanced technology reduces risk and improves financial performance.

Expert System is approaching the market with what it calls the “first semantic API”. The idea is that money folks can create cognitive computing systems. You can read about the system at this link.

Expert System is betting that this is true. The news release quotes Luca Scagliarini, CEO, as saying:

Intelligent solutions for strategic information management are absolutely critical in today’s big data world, and no where is this more critical than in the financial services industry where inaccurate or incomplete data can lead to fatal decisions. With Cogito API Finance, we are filling a big gap and tremendous need for customized knowledge management solutions in the financial industry.

Expert System is a publicly traded company (EXSY:MI) so the payoff from this cognitive push should be evident in the firm’s next financial report.


Today shares are trading at 2.12, up 0.02 or 0.76 percent. BAE Systems, a company with its NetReveal / Detica technologies which are in use in a number of financial applications, is trading at 29.35. There is market headroom available.

Stephen E Arnold, July 22, 2015

Hadoop Rounds Up Open Source Goodies

July 17, 2015

Summer time is here and what better way to celebrate the warm weather and fun in the sun than with some fantastic open source tools.  Okay, so you probably will not take your computer to the beach, but if you have a vacation planned one of these tools might help you complete your work faster so you can get closer to that umbrella and cocktail.  Datamation has a great listicle focused on “Hadoop And Big Data: 60 Top Open Source Tools.”

Hadoop is one of the most widely adopted open source tools for big data solutions. The Hadoop market is expected to be worth $1 billion by 2020, and IBM has dedicated 3,500 employees to developing Apache Spark, part of the Hadoop ecosystem.

As open source is a huge part of the Hadoop landscape, Datamation’s list provides invaluable information on tools that could mean the difference between a successful project and failed one.  Also they could save some extra cash on the IT budget.

“This area has seen a lot of activity recently, with the launch of many new projects. Many of the most noteworthy projects are managed by the Apache Foundation and are closely related to Hadoop.”

Datamation has maintained this list for a while and updates it from time to time as the industry changes. The list isn’t ranked from best to worst; rather, the tools are grouped into categories and a short description explains what each tool does. The categories include: Hadoop-related tools, big data analysis platforms and tools, databases and data warehouses, business intelligence, data mining, big data search, programming languages, query engines, and in-memory technology. There is a tool for nearly every sort of problem that could come up in a Hadoop environment, so the listicle is definitely worth a glance.

Whitney Grace, July 17, 2015
Sponsored by, publisher of the CyberOSINT monograph


Short Honk: Saudi Supercomputer

July 14, 2015

In order to crunch text and do large scale computations, a fast computer is a useful tool. Engineering & Technology Magazine reported in “Saudi Machine Makes It on to World’s Top Ten Supercomputer List”:

The Shaheen II is the first supercomputer based in the Middle East to enter the world’s top ten list, debuting at number seven. The Saudi supercomputer is based at King Abdullah University of Science and Technology and is the seventh most powerful computer on the planet, according to the Top 500 organization that monitors high-performance machines. China’s Tianhe-2 kept its position as the most powerful supercomputer in the world in the latest rankings.

If you are monitoring the supercomputer sector, this announcement, if accurate, is important in my opinion. There are implications for content processing, social graph generation, and other interesting applications.

Stephen E Arnold, July 14, 2015

Need a 1.3 Gb Corpus with a Million Text Objects?

July 12, 2015

Short honk: If you have a search and content processing system, you might want to navigate to this link. You can access the Hacker News data dump. My thought would be for the Watson team to process this information and then put up a demo of the Watson system using the Hacker News content. Any other search and content processing vendors game? Interesting content and a beefy enough corpus to provide interesting results.

Stephen E Arnold, July 12, 2015

Dealing with Company and Product Identity: Terbium Labs Nails It

July 11, 2015

Navigate to and read about the company.


Nifty name. Very nifty name indeed. Now, a bit of branding commentary.

I used to work at Halliburton Nuclear. Ah, the good old days of nuclear engineers poking fun at civil engineers and mathematicians not understanding any joke made by the computer engineers.

The problem of naming companies in high technology disciplines is a very big one. Before Halliburton gobbled up the Nuclear Utility Services outfit, the company with more than 400 nuclear engineers on staff struggled with its name. Nuclear Utility Services was abbreviated to NUS. A pretty sharp copywriter named Richard Harrington of the dearly loved Ketchum, McLeod and Gove ad agency came up with this catchy line:

After the EPA, call NUS.

The important point is that Mr. Harrington, a whiz person, wanted people to read each letter: E-P-A, not “eepa,” and N-U-S, not “noose.” In Japanese, the sound “nus” has a negative meaning usually applied to pressurized body odor emissions. Not good.

Search and content processing vendors struggle with names. I have written about outfits which have fumbled the branding ball. Examples include Thunderstone, which has been usurped by a gaming company; Brainware, which has been snagged and used for interesting videos; and Smartlogic, whose name has been appropriated by a smaller outfit doing marketing/design stuff. There are names which are impossible to find; for example, i2, AMI, and ChaCha to name a few among many.

I want to call attention to a quite useful product naming which I learned about recently. Navigate to Consider the word Terbium. Look for the word “Matchlight.”

I find Terbium a darned good word because terbium is an element, which my old (and I mean old) chemistry professor pronounced “ter-beem.” The element has a number of useful applications. Think solid state devices and as a magic ingredient in some rocket fuels and (okay, okay) some explosives.

But as good as “terbium” is for a company I absolutely delight in this product name:


Now what’s Matchlight and why should anyone care? My hunch is that the technology, which allows a next generation approach to content identification and other functions, works to

  • light a match in the wilderness
  • illuminate a dark space
  • start a camp fire so I can cook a goose

You can and should learn more about Terbium Labs and its technology. The names will help you remember.

Important company; important technology. Great name Matchlight. (Hear that search and content processing vendors with dud names?)

Stephen E Arnold, July 11, 2015

SAS Text Miner Promises Unstructured Insight

July 10, 2015

Big data tools help organizations analyze more than their old, legacy data. While legacy data does help an organization study how its processes have changed, the data is old and does not reflect the immediate, real time trends. SAS offers a product that bridges old data with the new as well as unstructured and structured data.

The SAS Text Miner is built from Teragram technology. It features document theme discovery, a function that finds relations between document collections; automatic Boolean rule generation; high performance text mining that quickly evaluates large document collections; term profiling and trending, which evaluates term relevance in a collection and how the terms are used; multiple language support; visual interrogation of results; easy text import; flexible entity options; and a user friendly interface.
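To make “term profiling and trending” a little more concrete, here is a minimal sketch that counts how many documents mention a term in each time period. It is a generic illustration and an assumption on my part, not how SAS Text Miner works internally. The sample notes, dates, and the function name `term_trend` are invented.

```python
import re
from collections import Counter

# Hypothetical dated notes; the text and dates are invented for illustration.
DOCS = [
    ("2015-06", "Customer reports billing error on invoice"),
    ("2015-06", "Billing question resolved by phone"),
    ("2015-07", "Outage reported, customer requests refund"),
    ("2015-07", "Refund issued after outage"),
]

def term_trend(docs, term):
    """Count, per period, how many documents mention `term` (given in lowercase)."""
    counts = Counter()
    for period, text in docs:
        words = set(re.findall(r"[a-z]+", text.lower()))
        if term in words:
            counts[period] += 1
    return dict(sorted(counts.items()))

print(term_trend(DOCS, "billing"))  # {'2015-06': 2}
print(term_trend(DOCS, "refund"))   # {'2015-07': 2}
```

A term that fades in one period while another rises is the kind of shift a trending view is meant to surface. A production system would add stemming, phrase handling, and relevance weighting.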

The SAS Text Miner is specifically programmed to discover data relationships, automate activities, and determine keywords and phrases. The software uses predictive models to analyze data and discover new insights:

“Predictive models use situational knowledge to describe future scenarios. Yet important circumstances and events described in comment fields, notes, reports, inquiries, web commentaries, etc., aren’t captured in structured fields that can be analyzed easily. Now you can add insights gleaned from text-based sources to your predictive models for more powerful predictions.”

Text mining software reveals insights between old and new data, making it one of the basic components of big data.

Whitney Grace, July 10, 2015

Sponsored by, publisher of the CyberOSINT monograph
