PathAR Update

February 24, 2014

One of the goslings dug up additional information on the PathAR company. The firm caught my attention with its assertion that it could identify one specific meaningful content object in a large corpus.

The company’s Web site is


According to an SEC Form D, the executives of the company are:

  • Patrick D. Butler
  • Andrew Woglom, chief financial officer
  • Anthony (Tony) Marshall
  • Mark Jacobson.

The address for the company is listed on Form D as:

110 S. Sierra Madre Street
Colorado Springs, CO 80903.

The company is seeking a software development manager and a senior architect/developer.

The CrunchBase profile states that the company provides “leading edge analysis capabilities.”

The firm has received $500,000 in funding.

The company appears to be throwing its hat in the ring with IBM, Palantir, and Recorded Future. With Palantir still pursuing a $9 billion valuation, the smart analytics sector continues to attract innovators and entrepreneurs. The question is, “Are there enough customers to make the dozens of analytics firms profitable?”

Stephen E Arnold, February 24, 2014

Free Knowledgebase Builder for Mind Mapping

February 24, 2014

Mind maps can be a valuable tool for the visual among us, and you can easily build your own virtual version with Knowledgebase Builder 2.6 from InfoRapid, based in Waiblingen, Germany. The best part—it’s free for personal use. As with most such business models, the company hopes you’ll try the freeware version and decide you can’t live without the tool in your workplace. The Professional Edition, which lets multiple users work together on the same knowledge base, goes for 99 euros (about $135 as of this writing). The price for the version with all the bells and whistles, the Enterprise Version, varies by company size, but starts at 1,000 euros (about $1,360 as I type) for a small business.

The description tells us:

“InfoRapid KnowledgeBase Builder allows you to easily create complex Mind Maps with millions of interconnected items. One single Mind Map can hold your entire knowledge, all your thoughts and ideas in a clear way. The data is stored securely in a local database file. While traditional Mind Maps don’t offer cross connections, InfoRapid KnowledgeBase Builder can connect any item with each other and label the connection lines. The program contains an archive for documents, images and web pages that may be imported and attached to any chart item or connection line.”

The six-minute video on the website demonstrates the Builder’s functionality, using as its example text about the software itself. The connection lines they mention above, which shift to adjust to new input, are reason enough to switch from pen-and-paper or MSPaint mapping techniques. Another key feature: You can link to documents or web pages from within the map, simplifying follow-through (a weak point for many of us.) The Highlighter Analysis is pretty nifty, too. Anyone curious about this tool should check out the site—the (personal use) price can’t be beat.

Cynthia Murrell, February 24, 2014

Sponsored by, developer of Augmentext

PathAR: Bold Claims

February 23, 2014

I came across a quite remarkable marketing assertion. The company using the wording is PathAR LLC, based in the midwest. Here’s what the company says:

Today 1 of the 3.8 Billion users of social media WILL impact your organization! Do you know who that 1 user is? How do we do it?
We built the world’s most advanced commercially available end-to-end solution for creating actionable intelligence from big data! Our proprietary intelligence engine powers Dunami, our web-based software platform. Dunami combines breakthrough advances in network analysis with advanced analytical techniques derived from long standing intelligence practices. Dunami’s broad capabilities are being used to Find, Understand, and Predict the behaviors of thought leaders and organizers on any topic, including identifying extremists, criminals, and others who are inciting potential violence around the globe!

When I read the statements, I wonder how predictive methods can pinpoint a single datum as the pivotal item of information.

Dunami, as a product/service name, poses some findability challenges. The name is in use for an exercise studio and a visual novel, and it carries a religious connotation.

The company has filed for a trademark. See The company has a modest LinkedIn presence. See

Is this another outfit chasing after IBM i2, Recorded Future, and the dozens of vendors listed on the Carahsoft Web site?

Stephen E Arnold, February 23, 2014

Frequentists Versus Bayesians: Is HP Amused?

February 19, 2014

I read a long report and then a handful of spin-off reports about HP and Autonomy, mid-February 2014 version. The Financial Times’s story is a for-fee job. You can get a feel for the information in “HP Executives Knew of Autonomy’s Hardware Sales Losses: Report.” There are clever discussions of this allegedly “new information” in a number of blogs. What is interesting is an allegedly accurate chunk of information in “HP Explores Settlement of Autonomy Shareholder Lawsuit.” My head is spinning. HP buys something. Changes the person on watch when the deal was worked out. HP gets a new boss and makes changes to its board of directors. HP then blames everyone except itself for buying Autonomy for a lot of money. HP then whips up the regulators, agitates accounting firms, and pokes Michael Lynch with a cattle prod.

As this activity was in the microwave, it appears that HP knew how the hardware/software deals were handled. If the reports are accurate, Dell hardware was more desirable than HP’s hardware.

But there is a more interesting twist. I refer you, gentle reader, to “A Fervent Defense of Frequentist Statistics.” Autonomy’s “black box” consists of Bayesian methods and what I call MCMC, or Markov chain Monte Carlo, techniques. The idea is that once some judgment calls are made, the Intelligent Data Operating Layer or IDOL can chug away without human involvement. When properly resourced and trained, the Autonomy system works for certain types of content processing and information retrieval applications. You can read more about IDOL in our for-fee analysis of IDOL. This document reviews several important patents germane to the Autonomy system. You can purchase a copy of this analysis at

In “A Fervent Defense,” an old battle line is reactivated. The “frequentists” are not exactly thrilled with the rise of Bayesian methods. Autonomy emerged from Cambridge University when some of the Bayesian methods were revealed as crucial to World War II activities. Frequentists point out that there are some myths about Bayesian methods. The write up is not for MBAs, failed Web masters, and unemployed middle school teachers. For example, the myths allegedly dispelled in the article are:

  • “Bayesian methods are optimal.
  • Bayesian methods are optimal except for computational considerations.
  • We can deal with computational constraints simply by making approximations to Bayes.
  • The prior isn’t a big deal because Bayesians can always share likelihood ratios.
  • Frequentist methods need to assume their model is correct, or that the data are i.i.d.
  • Frequentist methods can only deal with simple models, and make arbitrary cutoffs in model complexity (aka: “I’m Bayesian because I want to do Solomonoff induction”).
  • Frequentist methods hide their assumptions while Bayesian methods make assumptions explicit.
  • Frequentist methods are fragile, Bayesian methods are robust.
  • Frequentist methods are responsible for bad science.
  • Frequentist methods are unprincipled/hacky.
  • Frequentist methods have no promising approach to computationally bounded inference.”

The key point is that HP is going to learn, already has learned, or learned and just forgotten that Bayesian methods are not suitable for every single information processing application. In fact, using a Bayesian method when a frequentist method is more appropriate can produce unsatisfactory results for a discriminating data scientist. The use of frequentist methods when a Bayesian method is more appropriate can yield equally dissatisfying outputs.
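A toy illustration of the mismatch may help. The sketch below is not drawn from IDOL or from the article; the function names, the sample of 10 documents, and the prior values are all hypothetical. It estimates a simple proportion two ways: a frequentist maximum-likelihood estimate and a Bayesian posterior mean under a Beta prior. With little data, a confident prior dominates the answer.

```python
def frequentist_estimate(hits, trials):
    """Maximum-likelihood estimate of a proportion: hits / trials."""
    return hits / trials


def bayesian_estimate(hits, trials, prior_a, prior_b):
    """Posterior mean of a proportion under a Beta(prior_a, prior_b) prior."""
    return (hits + prior_a) / (trials + prior_a + prior_b)


# Hypothetical sample: 3 relevant documents among 10 reviewed.
hits, trials = 3, 10

mle = frequentist_estimate(hits, trials)
flat_prior = bayesian_estimate(hits, trials, 1, 1)      # uniform prior
strong_prior = bayesian_estimate(hits, trials, 40, 10)  # prior belief near 80 percent

print(f"frequentist:            {mle:.3f}")
print(f"Bayesian, flat prior:   {flat_prior:.3f}")
print(f"Bayesian, strong prior: {strong_prior:.3f}")
```

If the strong prior is wrong, the Bayesian answer is far from what the sample shows. If the prior is right and the sample is a fluke, the frequentist answer is the one that misleads. Neither method is a universal fix it kit.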

The point is that if one buys a system built on one method and then applies it inappropriately, the knowledgeable user is going to be angry. It is possible that some disappointed users will take legal action, demand a license refund, or just hit the conference circuit and explain why such and such a system was a failure.

Will HP put the three ring circus of buying Autonomy to rest and then find itself mired in the jaws of a Bayesian versus frequentist dispute? My hunch is, “Yep.”

Could HP have convinced itself that Autonomy was a universal fix it kit for information processing problems? If the answer is, “Yes,” then HP is going to have to come to grips with licensees who are going to point out that the solution did not cure the problem.

In short, HP faces more excitement. The company will not be “idle” any time soon. HP may not be amused, but I am. Search is indeed a bit more difficult than some would have customers believe.

Stephen E Arnold, February 19, 2014

Marvel Introduced by Elasticsearch to Monitor and Manage Data Extraction

February 17, 2014

The article titled Elasticsearch Debuts Marvel To Deploy And Monitor Its Open Source Search And Data Analytics Technology on TechCrunch provides insight into Marvel, which the article calls a “deployment management and monitoring solution.” Elasticsearch is a technology for extracting information from structured and unstructured data, and its users include such big names as Netflix, Verizon and Facebook among others. The article explains how Marvel will work to manage Elasticsearch:

“Enter Marvel, Elasticsearch’s first commercial offering, that makes it easy to run search, monitor performance, get visual views in real time and take action to fix things and improve performance. Marvel allows Elasticsearch system operators, who manage the technology at companies like Foursquare, see their Elasticsearch deployments in action, initiate instant checkup, and access historical data in context. Potential systems issues can be spotted and resolved before they become problems, and troubleshooting is faster. Pricing starts at $500 per five nodes.”

Elasticsearch reported revenue growth of more than 400% in 2013, and Marvel should only add to its popularity. Already a user-friendly and lightweight technology, Elasticsearch is targeting developers who want real-time visibility into their data. Marvel may be great news for Elasticsearch and its users, but it is certainly bad news for competitor Lucid Imagination.

Chelsea Kerwin, February 17, 2014


Advice on Making the Most of Limited Data

February 12, 2014

The article How To Do Predictive Analytics with Limited Data from Datameer on Slideshare suggests that Limited Data may replace Big Data in importance. The idea of “semi-supervised learning” is presented as a way to handle the difficulties of making predictions from limited data, such as expense, manageability, and simply missing key data. The overview states,

“As it turns out, recent research on machine learning techniques has found a way to deal effectively with such situations with a technique called semi-supervised learning. These techniques are often able to leverage the vast amount of related, but unlabeled data to generate accurate models. In this talk, we will give an overview of the most common techniques including co-training regularization. We first explain the principles and underlying assumptions of semi-supervised learning and then show how to implement such methods with Hadoop.”

The presentation summarizes possible approaches to semi-supervised learning and the assumptions it is possible to make about unlabeled data (these include the clustering, low-density, and manifold assumptions). It also covers the concepts of Label Propagation and Nearest Neighbor Join. However, as inviting as it is to forget Big Data and switch to predictive analytics with Limited Data, the suggestion may sound too much like Bayes-Laplace.
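For readers who have not seen label propagation, here is a minimal sketch of the general idea. It is not Datameer’s Hadoop implementation; the function name, the six-node graph, and the two seed labels are hypothetical. Labeled nodes keep their labels, and each unlabeled node repeatedly takes the average score of its neighbors, so labels spread along the connections in the data.

```python
def propagate_labels(edges, seeds, n_nodes, n_iter=50):
    """Spread seed labels over an undirected graph.

    edges: list of (i, j) pairs; seeds: {node: +1.0 or -1.0}.
    Returns one score per node; the sign is the predicted class.
    """
    neighbors = [[] for _ in range(n_nodes)]
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)

    scores = [seeds.get(node, 0.0) for node in range(n_nodes)]
    for _ in range(n_iter):
        updated = []
        for node in range(n_nodes):
            if node in seeds:
                updated.append(seeds[node])  # labeled nodes stay clamped
            elif neighbors[node]:
                total = sum(scores[m] for m in neighbors[node])
                updated.append(total / len(neighbors[node]))
            else:
                updated.append(0.0)  # isolated node: no evidence either way
        scores = updated
    return scores


# Hypothetical toy graph: clusters 0-1-2 and 3-4-5 joined by a single link 2-3.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
seeds = {0: 1.0, 5: -1.0}  # only two of the six nodes are labeled

scores = propagate_labels(edges, seeds, n_nodes=6)
labels = [1 if s > 0 else -1 for s in scores]
print(labels)
```

With only two labeled nodes, each cluster ends up with the label of its seed. That outcome rests on the clustering assumption: points that are densely connected tend to share a label. When that assumption fails, so does the method.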

Chelsea Kerwin, February 12, 2014


Attivio and Quant5 Partner to Meet Challenges of Data Analytics

February 11, 2014

The article on PRNewswire titled Attivio and Quant5 Partner to Bring Fast and Reliable Predictive Customer Analytics to the Cloud explains the partnership between the two analytics innovators. Aimed at producing information from data without the hassle of a team of data scientists, the partnership promises to effectively create insights that companies will be able to act on. The partnership responds to the growing frustration some companies face with gleaning useful information from huge amounts of data. The article explains,

“Attivio built its business around the core principle that integrating big data and big content should not require expensive mainframe legacy systems, handcuffing service agreements, years of integration and expensive data scientists. Attivio enterprise customers experience business-changing efficiency, sales and competitive results within 90 days. Similarly, Quant5 arose from the understanding that businesses need simple, elegant solutions to address difficult and complex marketing challenges. Quant5 customers experience increased revenues, reduced customer churn and an affordable and fast path to predictive analytics.”

The possibility of indirect sales following in the footsteps of Autonomy and Endeca does seem to be a part of the 2014 tactics. The Attivio-Quant5, Inc. solutions are offered in five major areas of concern: Lead & Opportunity Scoring, Customer Segmentation, Targeted Offers, Product Usage and Product Relationships.

Chelsea Kerwin, February 11, 2014


Government Buys into Text Analytics

February 7, 2014

What do you make of this headline from All Analytics: “Text And The City: Municipalities Discover Text Analytics”? Businesses have been using text mining software for a while and understand the insights it can deliver to business decisions. The same goes for law firms that must wade through piles of litigation. Are governments really only catching on to text mining software now?

The article reports on several examples where municipal governments have employed text mining and analytics. Law enforcement agencies are using it to identify key concepts to deliver quick information to officials. The 311 systems, known as the source of local information and immediate contact with services, are another candidate for text analytics, which can organize and process the information faster and more consistently.

There are many ways text analytics can be helpful to local governments:

“Identifying root causes is a unique value proposition for text analytics in government. It’s one thing to know something happened — a crime, a missed garbage collection, a school expulsion — and another to understand where the problem started. Conventional data often lacks clues about causes, but text reveals a lot.”

The bigger question is will local governments spend the money on these systems? Perhaps, but analytic software is expensive and governments are pressured to find low-cost solutions. Expertise and money are in short supply on this issue.

Whitney Grace, February 07, 2014


Quote to Note: Big Data Skill and Value Linked

February 6, 2014

Tucked in “The Morning Ledger: Companies Seek Help Putting Big Data to Work” was a quote attributed to an executive at SAP, a vendor of enterprise software. The quote:

David Ginsberg, chief data scientist at SAP, said communication skills are critically important in the field, and that a key player on his big-data team is a “guy who can translate Ph.D. to English. Those are the hardest people to find.”

I have been working through patent documents from some interesting companies involved in Big Data. The math is somewhat repetitive, but the combination of numerical ingredients makes the “invention,” it seems.

One common thread runs through the information I have reviewed in preparation for my lectures in Dubai in early March 2014. Fancy software needs humans to:

  • Verify the transforms are within acceptable limits
  • Configure thresholds
  • Specify outputs often using old fashioned methods like SQL and Boolean
  • Figure out what the outputs “mean”.

With search and content processing vendors asserting that their systems make it easy for end users to tap the power of Big Data, I have some doubts. With most “analysts” working in Excel, a leap to the types of systems disclosed in open source patent documents will be at the outer edge of end users’ current skills.

Big Data requires much of skilled humans. When there are too few human Big Data experts, Big Data may not deliver much, if any, value to those looking for a silver bullet for their business.

Stephen E Arnold, February 6, 2014

The Future of Business Intelligence

January 26, 2014

The article titled Business Intelligence Usage Evolving Subtly on Smart Data Collective makes clear that business intelligence and analytics continue to develop. The article assumes that the 2013 trend in cloud computing popularity will continue into 2014.

Looking further ahead, the article states:

“There could soon be a whole new BI paradigm, in which many affordable analysis processes are created at once, rather than devoting the whole budget to one effort. Enterprise Apps Today explained that this is another natural role for the cloud, with good projects surviving and poor options falling by the wayside, all without the effort or funding that would be necessary to accomplish the same on-site.”

The article cites a MarketsandMarkets survey that concluded that BI would be found useful in many sectors. More specifically, “the source indicated that the technology will grow at a rate of 8.3 percent through 2018.” That would mean a value of $20.8 billion in 2018, up from the current worth of $13.9 billion. However, others are less optimistic, believing the slow evolution of business intelligence may be too snail-like, since business intelligence is currently meeting sales resistance in France, as we reported in the article Business Intelligence: Free Pressure for Fee Solutions. Perhaps subtle is not enough?
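The growth figures can be checked with a few lines of arithmetic. The sketch below assumes the 8.3 percent is a compound annual rate and that “current” means 2013, which gives five years of compounding to 2018; neither assumption is stated in the write up, and the function names are hypothetical.

```python
def project_value(start_value, annual_rate, years):
    """Compound a starting value forward at a fixed annual growth rate."""
    return start_value * (1 + annual_rate) ** years


def implied_rate(start_value, end_value, years):
    """Annual growth rate that carries start_value to end_value in the given years."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the write up; the five-year span (2013 to 2018) is an assumption.
start_billion, end_billion, rate, years = 13.9, 20.8, 0.083, 5

projected = project_value(start_billion, rate, years)
needed = implied_rate(start_billion, end_billion, years)

print(f"projected 2018 value: ${projected:.1f} billion")
print(f"rate implied by $20.8 billion: {needed:.1%}")
```

Under those assumptions the quoted numbers roughly agree: compounding $13.9 billion at 8.3 percent for five years lands at about $20.7 billion, within rounding distance of the $20.8 billion cited.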

Chelsea Kerwin, January 26, 2014

