Information Theory: An Eclectic Approach

June 8, 2015

I love information theory. If you want to get some insight into selected themes in this discipline, check out “Journey into Information Theory.” The course or “program” falls into two sections: Ancient Information Theory and Modern Information Theory. What is interesting is that the Ancient category zips right along from written language to Morse code. Where the subject becomes troublesome for me is the Modern section. The program moves from “symbol rate” to Markov chains. To my eye, there are some omissions. But it appears that the course or program is free.

Stephen E Arnold, June 8, 2015

Is Collaboration the Key to Big Data Progress?

May 22, 2015

The article titled “Big Data Must Haves: Capacity, Compute, Collaboration” on GCN offers insights into the best areas of focus for big data researchers. The Internet2 Global Summit is in D.C. this year with many exciting panelists who support the emphasis on collaboration in particular. The article mentions the work being presented by several people, including Clemson professor Alex Feltus:

“…his research team is leveraging the Internet2 infrastructure, including its Advanced Layer 2 Service high-speed connections and perfSONAR network monitoring, to substantially accelerate genomic big data transfers and transform researcher collaboration…Arizona State University, which recently got 100 gigabit/sec connections to Internet2, has developed the Next Generation Cyber Capability, or NGCC, to respond to big data challenges.  The NGCC integrates big data platforms and traditional supercomputing technologies with software-defined networking, high-speed interconnects and visualization for medical research.”

Arizona’s NGCC provides the essence of the article’s claims, stressing capacity with Internet2, several types of computing, and of course collaboration between everyone at work on the system. Feltus commented on the importance of cooperation in Arizona State’s work, suggesting that personal relationships outweigh individual successes. He claims his own teamwork with network and storage researchers helped him find new potential avenues of innovation that might not have occurred to him without thoughtful collaboration.

Chelsea Kerwin, May 22, 2015

Stephen E Arnold, Publisher of CyberOSINT at

Automating Innovation: Another Beyond Search Moment

May 7, 2015

What future Bill Gates or Larry Page can resist the notion of automating innovation? The notion that invention is 99 percent perspiration is definitely the equivalent of making horseshoes with a fire, hammer, and a strong arm.

I read Inno 360’s article “5 Things We Do to Automate Innovation.” Note that this is not automating information analysis, which garnered Banjo $100 million in funding. Inno 360 deals with innovation, the Tesla type of battery insight.

The write up identifies five steps:

  1. Semantic search
  2. Visualizations
  3. Outreach
  4. Collaborative evaluation and vetting management
  5. Content management.

I find this list of items quite interesting. Each item is a basket containing numerous and often widely divergent technologies, definitions, and features.

If you are intrigued by a company asserting that it can do automatically that which a handful of humans achieves, navigate to the Inno 360 write up. I cannot define semantic search and content management. I don’t know what collaborative evaluation and vetting management means either, particularly in terms of automation. I have a good handle on visualizations. These are pictures, and I know that many cyber OSINT systems can generate visual outputs.

Automating innovation is a new angle, and I think it may be quite a remarkable achievement if it works. Innovation in information access is needed. Humans have not made much progress in the last 50 years when it comes to information access. The search box seems to be the go-to solution.

Stephen E. Arnold, May 7, 2015

Listen Up, Search Experts, Innovation Ended in 1870

May 2, 2015

I just completed a video in which I said, “Keyword search has not changed for 50 years.” I assume that one or two ahistorical 20 somethings will tell me that I need to get back in the rest home where I belong.

I read “Fundamental Innovation Peaked in 1870 and Why That’s a Good Thing,” which is a write up designed to attract clicks and generate furious online discussion and maybe a comic book. I think that 1870 was 145 years ago, but I could be wrong. Ahistorical allows many interesting things to occur.

The point of the write up is to bolster the assertion that “society has only become less innovative through the years.” I buy that, at least for the period of time I have allocated to write this pre Kentucky Derby blog post on a sparkling spring day with the temperature pegged at 72 degrees Fahrenheit or 22.22 degrees Celsius for those who are particular about conversions accurate to two decimal places. Celsius was defined in the mid 18th century, a fact bolstering the argument of the article under my microscope. Note that the microscope was invented in the late 16th century by two Dutch guys who charged a lot of money for corrective eye wear. I wonder if these clever souls thought about bolting a wireless computer and miniature video screen to their spectacles.
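For those who really are particular about conversions, the two-decimal figure above is easy to check. What follows is only a minimal sketch; the function name and variable are my own illustrative choices, not anything from the article under discussion.

```python
def fahrenheit_to_celsius(deg_f: float) -> float:
    """Convert Fahrenheit to Celsius using C = (F - 32) * 5 / 9."""
    return (deg_f - 32) * 5 / 9


# 72 degrees Fahrenheit is the temperature quoted in the post.
derby_day_f = 72
print(f"{fahrenheit_to_celsius(derby_day_f):.2f}")  # prints 22.22
```

The exact value is 200/9, a repeating decimal, so 22.22 is a rounding and not the last word on the matter.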

The write up reports that the math crazed lads at the Santa Fe Institute “discovered” innovation has flat lined. Well, someone needs to come up with a better Celsius. Maybe we can do what content processing vendors do and rename “Celsius” to “centigrade” and claim a breakthrough innovation?

Here’s the big idea:

“A new invention consists of technologies, either new or already in use, brought together in a way not previously seen,” the Santa Fe researchers, led by complex systems theorist Hyejin Youn, write. “The historical record on this process is extensive. For recent examples consider the incandescent light bulb, which involves the use of electricity, a heated filament, an inert gas and a glass bulb; the laser, which presupposes the ability to construct highly reflective optical cavities, creates light intensification mediums of sufficient purity and supplies light of specific wavelengths; or the polymerase chain reaction, which requires the abilities to finely control thermal cycling (which involves the use of computers) and isolate short DNA fragments (which in turn applies techniques from chemical engineering).”

Let’s assume the SFI folks are spot on. I would suggest that information access is an ideal example of a lack of innovation. Handwritten notes pinned to manuscripts did not work very well, but the “idea” was there: The notes told the lucky library user something about the scroll in the slot or the pile in some cases. That’s metadata.

Flash forward to the news releases I received last week about breakthrough content classification, metadata extraction, and predictive tagging.

SFI is correct. These notions are quite old. The point overlooked in the write up about the researchers’ insight is that pinned notes did not work very well. I recall learning that monks groused about careless users not pinning them back on the content object when finished.

Take heart. Most automated, super indexing systems are only about 80 percent accurate. That’s close enough for horse shoes and more evidence that digital information access methods are not going to get a real innovator very excited.

Taking a bit of this and a bit of that is what’s needed. Even with this cut and paste approach to invention, the enterprise search sector leaves me with the nostalgic feeling I experience after I leave a museum exhibit.

But what about the Google, IBM, and Microsoft patents? I assume SFI wizards would testify in the role of expert witnesses that the inventions were not original. I wonder how many patent attorneys are reworking their résumés in order to seek an alternative source of revenue?

Lots? Well, maybe not.


Defense Contractor Makes Leap Investment Into Cybersecurity  

April 30, 2015

The expression goes “you should look before you leap,” meaning you should make plans and wise choices before you barrel headfirst into what might be a brick wall.  Some might say Raytheon could be heading that way with their recent investment, but The Wall Street Journal says they could be making a wise choice in the article, “Raytheon To Plow $1.7 Billion Into New Cyber Venture.”

Raytheon recently purchased Websense Inc., a cybersecurity company with over 21,000 clients.  Websense will form the basis of a new cyber joint venture and it is projected to make $500 million in sales for 2015.  Over the next few years, Raytheon predicts the revenue will surge:

“Raytheon, which is based in Waltham, Mass., predicted the joint venture would deliver high-single-digit revenue growth next year and mid-double-digit growth in 2017, and would be profitable from day one. Raytheon will have an 80% stake in the new cyber venture, with Vista Partners LLC holding 20%.”

While Raytheon is a respected name in the defense contracting field, its biggest clients have been the US military and intelligence agencies.  The article mentions how it might be difficult for Raytheon’s sales team and employees to switch to working with non-governmental clients.  Raytheon, however, is positioned to use Websense’s experience with commercial clients and its own dealings within the security industry to be successful.

Raytheon definitely looked before it leapt into this joint venture.  Where Raytheon has shortcomings, Websense will be able to compensate and vice versa.

Whitney Grace, April 30, 2015
Sponsored by, publisher of the CyberOSINT monograph

OpenText Innovation Tour

March 4, 2015

The post on the OpenText CEO Blog titled “Innovation Tour 2015 Kicks Off” announces the 15 city tour from a company acquiring technology, not developing it. The company seems unperturbed by this disparity, and touts its excitement to “simplify, transform and accelerate” operations in 2020. The tour will visit four continents and aims to reach out to the company’s partners and customers along the way. The blog post written by CEO Mark Barrenechea says,

“Some of the exciting innovations I plan to share on the tour include the very topical cloud and analytics. Since the cloud offers huge benefits to customers, we’ve enhanced our cloud offerings with the addition of subscription pricing and we’ve launched OpenText CORE—an on-demand SaaS solution for cloud-based document sharing and collaboration. We’ve also invested in a predictive analytics platform for all our EIM solutions…Analytics is a powerful addition to our portfolio.”

The tour promises exhibitions, keynote speakers and roundtable discussions. The only question for interested parties may be, can they overhype this tour? Apparently not, with this year’s focus being the Digital First-World and the revolutionary changes that OpenText suspects will take place this decade. It seems that if you miss your chance to participate in the innovation tour, you will never catch up to the companies that do.

Chelsea Kerwin, March 04, 2015

Sponsored by, developer of Augmentext

San Francisco, Patents, and Theft: Search Innovation?

February 5, 2015

Short honk: I read “A Bizarre Statistical Fact about Patents in San Francisco.” Some statistics professors might wrinkle their brows at the apparent correlation. Nevertheless, here’s the quote I noted:

What I’m suggesting is that this giant spike in patent rates is reflecting the combination of innovation and theft. Consider that many patents are used by the wealthier classes as a way to bilk people out of money. There’s the obvious case where patent trolls buy up overbroad patents— often in software — and threaten people with lawsuits until they pay to license a dubious patent from the troll. But patents also allow big companies to block small businesses from innovating, by charging astronomical prices to license really basic ideas or software functions. Especially in Silicon Valley, patents are often a game played by wealthy businesses, to the detriment of small-time entrepreneurs and teams of inventors.

I wonder if there is an impact on search innovation?

Stephen E Arnold, February 5, 2015

Google X Labs Just Buys a Better Idea

January 23, 2015

Short honk: X Labs was to be part of the Google innovation push. The idea was “moon shots.” Well, Google came up with balloons, self driving cars, and a linguistic innovation, Glassholes.

Now Google, and by extension X Labs, gets a better idea. I read “Google Wants Life on Mars in $1bn SpaceX Investment.” Fueled by ad revenue, Google is into satellites and rocket ships.

The article said:

Google is racing to spread internet access as it looks for new ways to boost its user base and sell more digital advertising. By teaming up with SpaceX, Google would be seeking to gain an edge over rivals such as Facebook, which is working on projects to deliver Internet service to underserved regions by building drones, satellites and lasers. WorldVu Satellites, backed by Qualcomm and Virgin Group, has begun a similar effort. “Google needs to find additional sources of revenue,” said Greg Sterling, vice-president of strategy and insights for Local Search Association, whose members operate in the location-based advertising market. “If they can expand into new markets, obviously they can expand their revenue and keep investors happy.”

The issue for me is innovation. After 2006, Google began to flag in the innovation department. Amazon did the cloud thing. Facebook did the relationship thing. AirBnB and Uber did the sharing thing.

Google continued to keep its infrastructure chugging along in order to sell ads.

Don’t get me wrong. Google is a giant outfit. I find it interesting that Google X Labs came up with balloons and Elon Musk came up with a better idea. To get that innovation, Google has to write a check, not rely on its 50,000 plus really smart folks.

Interesting. Loon balloons trumped by satellites and rockets. What’s that line about soaring like an eagle?

Stephen E Arnold, January 21, 2015

Satellites or Balloons?

January 19, 2015

I enjoy conversations about how innovative Google is. Prior to 2006, I was on the Google is the leader bandwagon. Now it may be time to shift from the GOOG to the Musk.

I read “Elon Musk touts launch of ‘SpaceX Seattle’.” Tucked in the encomium to Mr. Musk was this passage:

As guests drank beer and wine and sipped Champagne from glasses etched with the SpaceX logo, Musk outlined an audacious plan to build a constellation of some 4,000 geosynchronous satellites, a network in space that could deliver high-speed Internet access anywhere on Earth. Those satellites are to be designed by software and aerospace engineers in SpaceX’s new engineering office in Redmond.

Now satellite communications is not new. Anyone remember Equatorial Communications? History lesson aside, Google wants to deliver the Internet via balloons which rhymes with loons.

According to, the first hot air balloon was launched in 1783. Satellites came along when I was in grade school.

It seems that Google is not thinking on the Musk scale. Wasn’t X Labs supposed to do moon shots? Mr. Musk now does transportation and digital thingies.

In terms of search, I question whether Google’s innovations in search, as described by “How Google Tries To Figure Out Our Hidden Needs,” improve relevance or tweak the PT Barnum formula for revenue.

Stephen E Arnold, January 19, 2015

Did You Know Oracle and WCC Go Beyond Search?

January 10, 2015

I love the phrase “beyond search.” Microsoft uses it, working overtime to become the go-to resource for next generation search. I learned that Oracle also finds the phrase ideal for describing the lash up of traditional database technology, the decades old Endeca technology, and the Dutch matching system from WCC Group.

You can read about this beyond search tie up in “Beyond Search in Policing: How Oracle Redefines Real time Policing and Investigation—Complementary Capabilities of Oracle’s Endeca Information Discovery and WCC’s ELISE.”

The white paper explains in 15 pages how intelligence led policing works. I am okay with the assertions, but I wonder if Endeca’s computationally intensive approach is suitable for certain content processing tasks. The meshing of matching with Endeca’s outputs results in an “integrated policing platform.”

The Oracle marketing piece explains ELISE in terms of “Intelligent Fusion.” Fusion is quite important in next generation information access. The diagram explaining ELISE is interesting:


Endeca’s indexing makes use of the MDex storage engine, which works quite well for certain types of applications; for example, bounded content and point-and-click access. Oracle shows this in terms of Endeca’s geographic output as a mash up:


For me, the most interesting part of the marketing piece was this diagram. It shows how two “search” systems integrate to meet the needs of modern police work:


It seems that WCC’s technology, also used for matching candidates with jobs, looks for matches and then Endeca adds an interface component once the Endeca system has worked through its computational processes.

For Oracle, ELISE and Endeca provide two legs of Oracle’s integrated police case management system.

Next generation information access systems move “beyond search” by integrating automated collection, analytics, and reporting functions. In my new monograph for law enforcement and intelligence professionals, I profile 21 vendors who provide NGIA. Oracle may go “beyond search,” but the company has not yet penetrated NGIA, next generation information access. More streamlined methods are required to cope with the type of data flows available to law enforcement and intelligence professionals.

For more information about NGIA, navigate to

Stephen E Arnold, January 10, 2015
