Imagine the Internet without Search Engines

April 17, 2014

Centrifuge Systems proposes an interesting idea in “Big Data Discovery Without Link Analysis Is Like The Web Without Google.” The short article asks readers to imagine using the Internet without a search engine. How would we locate information? It would be similar to the librarian’s favorite description of the Internet: all the contents of a library spilled on the floor. The article goes on to explain that big data without link analysis works the same way as the Internet without a search engine.

What is link analysis?

“You can view link analysis as a data discovery technique that reveals the structure and content of information by representing it as a set of interconnected objects. When combined with a visual representation, an investigator can quickly gain an understanding of the strength of relationships and the frequency of contacts and immediately discover new associations. Link analysis offers an intuitive alternative to the traditional relational database formats and BI tools without deep technical expertise.”

It is a convincing analogy. To increase a potential client’s interest, Centrifuge Systems offers a Data Discovery Challenge, in which the client is given a free solution. In other words, it’s a free estimate for services. Big data is full of analytics, but does anyone other than Centrifuge Systems offer rich link analysis?
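
The link analysis idea quoted above can be sketched in a few lines of Python. This is a minimal illustration, not Centrifuge Systems’ product or method: the contact records, the names, and the helper functions (build_links, neighbors, shared_contacts) are all hypothetical. It treats records as interconnected objects, counts the frequency of contacts, and surfaces an association that is not directly recorded.

```python
from collections import Counter, defaultdict

# Hypothetical contact records: each pair is one observed contact
# (a call, an email, a shared transaction) between two entities.
records = [
    ("alice", "bob"),
    ("bob", "alice"),
    ("alice", "carol"),
    ("bob", "carol"),
    ("alice", "bob"),
    ("dave", "carol"),
]


def build_links(pairs):
    """Count how often each unordered pair of entities is in contact."""
    links = Counter()
    for a, b in pairs:
        if a != b:
            links[frozenset((a, b))] += 1
    return links


def neighbors(links):
    """Map each entity to the set of entities it is directly linked to."""
    adj = defaultdict(set)
    for pair in links:
        a, b = tuple(pair)
        adj[a].add(b)
        adj[b].add(a)
    return adj


def shared_contacts(adj, a, b):
    """Entities linked to both a and b: a hint at an indirect association."""
    return (adj[a] & adj[b]) - {a, b}


links = build_links(records)
adj = neighbors(links)

# Frequency of contacts, strongest relationship first.
for pair, count in links.most_common():
    print(sorted(pair), count)

# alice and dave never appear together, yet both are linked to carol.
print(shared_contacts(adj, "alice", "dave"))
```

A real tool would add the visual layer the quote mentions; the point here is only that link strength and new associations fall out of a simple graph representation.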

Whitney Grace, April 17, 2014
Sponsored by, developer of Augmentext

Google Glass and Predictive Analytics

April 13, 2014

I read “I Was Assaulted for Wearing Google Glass.” This is a sad commentary on our times. Will I be assaulted in Harrod’s Creek for driving my Kia Soul and wearing a T shirt that says “Seavey’s Dog Kennel”? After I read the item, I wondered, “Why can’t Google’s predictive analytics be used to display the assault risk level to a Glass wearer?” A color coded scheme could be used based on previous Glass users’ encounters, GPS data, and other inputs available to the Google / Recorded Future systems.

I noted this passage:

Why were people laughing at my misfortune or implying I somehow deserved it?

Beats me.

Stephen E Arnold, April 13, 2014

The Enigma App

April 1, 2014

Information can be an enigma, which is probably why the developers named their new app that. The Enigma Web site opens on a picture of either New York or London with the headline “navigate the world of public data.” It is an intriguing idea, though one would think the job could be accomplished with a search engine or an academic database. Then again, when you think about the process and how time consuming it is, it would be handy to have a search engine that did most of the work for you.

Enigma was built as a solution to this problem. The company says:

“Enigma is amassing the largest collection of public data produced by governments, universities, companies, and organizations. Concentrating all of this data provides new insights into economies, companies, places and individuals.”

Enigma’s services do come with a fee, however. They offer public data search and quick analytics for free with sign-up, but if you want API access and online support you need to upgrade to plans that start at $195/month. The data search must be gold, when you consider that many of these records are available to the public. It is worth exploring to see how the service differs from a basic search engine, but it is hard to sign up. The registration page is finicky.

Whitney Grace, April 01, 2014

Darpa Prods Big Data Experts

March 29, 2014

I read “Darpa Calls for Advanced Big Data Ideas.” If the write up is accurate, Darpa is not on board with the marketing innovations about Big Data, whatever the term means. Darpa wants more. According to the TechRadar story:

According to V3, DARPA director Arati Prabhakar told a briefing on emerging threats with the House Armed Services Committee’s Subcommittee on Intelligence that it is looking to come up with some advanced big data ideas. She said that DARPA is creating a new set of cyber security capabilities that will ensure that networked information is trustworthy.

Addressing “big data” may be easier if those talking about it would define the term and the context in which the phrase is being used. Those who chant “Big Data,” including Darpa, are just empowering the sales people, the self appointed experts, and the failed middle school teachers who write “reports” for mid tier consulting firms.

Stephen E Arnold, March 29, 2014

Addiction Model Measures App’s Addictiveness Over Retention

March 28, 2014

The article on re/ titled “Mixpanel: How Addictive Is Your App?” presents a new analytic report called Addiction. Under a picture of a wrist cuffed to the smartphone it holds, the article cheerfully explains that fifty percent of social app users engage with the service for over five hours a day. Enterprise apps are used more during the business day, and messaging apps show a lesser addiction in their users, supporting the idea that people are now using social media apps for most of their communications. The article explains,

“Addiction adds an extra layer of insight that allows companies to analyze user behavior on an even deeper level. One thing that’s clear is that addiction is inextricably linked to function: If your product is a social app that people don’t use more than once a day, that’s a red flag — and not one you would have previously been able to catch if you relied solely on Retention.”

The article stipulates that the most important feature of Addiction is that it enables companies to visualize how “embedded” their service is in users’ daily schedules. This will allow them to better follow the effect of their smallest adjustments in the app and really see how their customers react. Whether or not this is a dangerous ability is not considered.
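
To make the contrast with Retention concrete, here is a rough sketch of how such a measure could be computed. The article does not spell out Mixpanel’s calculation, so this assumes the metric is the number of distinct hours in a day during which a user touched the app. The event log, the user IDs, and the function names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical event log: (user_id, ISO 8601 timestamp) for each app event.
events = [
    ("u1", "2014-03-27T08:05:00"),
    ("u1", "2014-03-27T08:40:00"),
    ("u1", "2014-03-27T12:15:00"),
    ("u1", "2014-03-27T21:30:00"),
    ("u2", "2014-03-27T09:00:00"),
    ("u2", "2014-03-28T09:10:00"),
]


def active_hours_per_day(log):
    """Return {(user, date): number of distinct clock hours with activity}."""
    hours = defaultdict(set)
    for user, stamp in log:
        t = datetime.fromisoformat(stamp)
        hours[(user, t.date())].add(t.hour)
    return {key: len(seen) for key, seen in hours.items()}


def share_at_least(per_day, threshold):
    """Fraction of user-days with at least `threshold` active hours."""
    if not per_day:
        return 0.0
    hits = sum(1 for n in per_day.values() if n >= threshold)
    return hits / len(per_day)


per_day = active_hours_per_day(events)
print(per_day)
print(share_at_least(per_day, 3))
```

In this made-up log, u2 returns every day and would look healthy in a retention report, yet shows only one active hour per day. That is the kind of red flag the quoted passage describes.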

Chelsea Kerwin, March 28, 2014

The HP View of Watson

March 19, 2014

I suppose IBM will respond with more than recipes at South by Southwest. If you enjoy big companies’ analyses of one another, you will want to gobble up “15 Reasons HP Autonomy IDOL OnDemand Beats IBM Watson.” This is not the recipe for making pals with a $100 billion outfit.

What does IBM Watson have as weaknesses? What does the reinvented (sort of) Autonomy technology have as strengths? I cannot reproduce the 15 items, but I can highlight six of the weaknesses and enjoin you to crack open the slideshow that chops up the IBM Watson PR stunt.

Here are the six weaknesses I found interesting:

  1. Reason 3. IBM Watson is a data scientist heavy platform. IDOL is not. My view is that HP paid $11 billion for Autonomy and now has to deal with the write down, legal actions related to the deal, and tossing out Mike Lynch’s revenue producing formula. Set aside the data scientists and the flip side “too few data scientists” and consider the financial mountain HP has to climb. A data scientist or two might help.
  2. Reason 4. HP has “an ultimate partner story.” I find this fascinating. Autonomy grew via acquisitions and an indirect sales model. Now HP wants to make the partner model generate enough revenue to pay off the Autonomy purchase price, grow HP’s top line faster than traditional lines of business collapse, and make partners really happy. This may be a big job. See IBM weaknesses 9, 11, 12, and 14. There is some overlap, which suggests HP is having difficulty cooking up 15 credible weaknesses of Watson. (I can name some, by the way.)
  3. Reason 6. HP offers a “proven power platform for analytics.” I am not sure about the alliteration nor am I confident in my understanding of analytics and search. IBM Watson doesn’t have much to offer in either of these departments. IDOL, at least the pre HP incarnation, had reasonably robust security capabilities. I wonder how these will be migrated to the HP multi cloud environment. IBM Watson is doing recipes, so it too has its hands full.
  4. Reason 10. HP asserts that it offers a “potential app store.” I understand app store. Apple offers one that works well. Google is in the app store business. Amazon has poked its nose into the marketplace as well. I don’t think either HP or IBM has a credible app store for variants of the two companies’ search technologies. Oh, well, it sounds good. “Potential” is a deal breaker for me.
  5. Reason 13. HP “is focused on ramping up the innovation lifecycle.” I think this means coming up with good ideas faster. I am not sure if a service can spark a client’s innovation. Doesn’t lifecycle include death? Since IBM Watson seems a work in progress, I am not sure HP’s just released reinvention of Autonomy has a significant advantage because it too is “ramping up.”
  6. Reason 15. HP has “fired up” engineers. Okay, maybe. IBM has engineers, but I am not sure if they are fired up. My question is, “Is being fired up a good thing?” I want engineers to deliver solutions that work, are not “ramping up,” and are not marketing driven.

My take on this slide deck is that it is nothing more than a marketing vehicle. I had to click multiple ads for HP products and services to view the 15 reasons. Imagine my disappointment that five of the IBM weaknesses related to partnering programs. Wow, that must be really helpful to a licensee of cloud Autonomy trying to deal with performance issues on an HP data center. HP is definitely countering IBM Watson’s recipe play with old fashioned cheerleading. Rah, rah.

Stephen E Arnold, March 19, 2014

Civic Predictive Analysis Proving Accurate

March 19, 2014

We find the field of predictive analysis fascinating, and now we have more evidence of how important this work can be. Motherboard reports on “The Math that Predicted the Revolutions Sweeping the Globe Right Now.” The key component: high food prices. Writer Brian Merchant explains:

“There’s at least one common thread between the disparate nations, cultures, and people in conflict, one element that has demonstrably proven to make these uprisings more likely: high global food prices.

Just over a year ago, complex systems theorists at the New England Complex Systems Institute warned us that if food prices continued to climb, so too would the likelihood that there would be riots across the globe. Sure enough, we’re seeing them now. The paper’s author, Yaneer Bar-Yam, charted the rise in the FAO food price index—a measure the UN uses to map the cost of food over time—and found that whenever it rose above 210, riots broke out worldwide. It happened in 2008 after the economic collapse, and again in 2011, when a Tunisian street vendor who could no longer feed his family set himself on fire in protest.”

Bar-Yam’s model forewarned about the Arab Spring and the Tunisian self-immolation. Well, not those specific ways unrest would manifest, but that something big and ugly was bound to happen. Similarly, the same model divined that there would be conflicts around the world this year—as we have seen in the Ukraine, Venezuela, Brazil, Thailand, Bosnia, Syria, Spain, France, Sweden…. Last year’s global food prices were the third-highest on record; this is no coincidence. See the article for more on Bar-Yam’s methods as well as specific links between food scarcity and some of the conflicts currently shaking the world.
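
The threshold rule in the quoted passage is simple enough to sketch. The snippet below is our own illustration, not Bar-Yam’s model: the only figure taken from the article is the 210 level, while the index readings, the period labels, and the function names are invented for the example and are not real FAO data.

```python
# Level of the FAO food price index cited in the quoted passage.
THRESHOLD = 210

# Invented readings for illustration only; NOT real FAO values.
sample = [
    ("t1", 180.0),
    ("t2", 205.5),
    ("t3", 212.3),
    ("t4", 220.1),
    ("t5", 198.7),
    ("t6", 215.0),
]


def periods_above(index_series, threshold=THRESHOLD):
    """Labels of every period whose index reading exceeds the threshold."""
    return [label for label, value in index_series if value > threshold]


def first_crossings(index_series, threshold=THRESHOLD):
    """Labels where the index moves from at-or-below to above the threshold."""
    crossings = []
    prev_above = False
    for label, value in index_series:
        above = value > threshold
        if above and not prev_above:
            crossings.append(label)
        prev_above = above
    return crossings


print(periods_above(sample))
print(first_crossings(sample))
```

A flag like this says only that the risk of unrest is elevated; as noted above, it does not say where or how the unrest will show up.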

What can this technology do, besides hand a few of us a big bucket of “I-told-you-so”? Armed with this information, policymakers could take steps to modify the way the global marketplace is run and stop (at least some, possibly most) food shortages before they start. This means powerful people from many countries would have to work together to make major changes on a global scale for the good of humanity. With money involved. Hey, anything’s possible, right?

Cynthia Murrell, March 19, 2014

Kontagent Comes Clean

March 18, 2014

A recent partner audit by Facebook prompted the removal of business intelligence firm Kontagent from the Facebook Mobile Measurement Program (MMP). A post at the Kontagent Kaleidoscope blog from the company’s CEO, “An Update on our Relationship with Facebook, How We Store Data,” addresses the issue head-on. Andy Yang admits his company made a mistake, but assures us that absolutely no data breaches resulted from the misstep. Furthermore, though the company is not currently part of the MMP, it is still working with Facebook in other areas.

Yang details what precipitated his company’s removal from the program. They did violate Facebook’s policy on how long they could store data, but note that the slip-up occurred as they were working to exceed Facebook’s requirements on privacy and security. Still, they say, the mistake was theirs, they are learning from it, and they hope to earn the chance to rejoin the program. See the post for more on their security measures and on what transpired with Facebook. Yang summarizes:

“In short, Kontagent created an encryption policy that we designed to completely protect user privacy while addressing Facebook’s policy in one elegant solution. In hindsight, while our intentions were good, we overthought the solution when a more basic approach would have better met Facebook’s requirements.

“I completely respect the audits that Facebook conducts to ensure their partners are properly compliant. We will address each of the issues noted in Facebook’s audit despite not being a member of the MMP.”

After its launch in 2007, Kontagent cut their data analysis teeth on SaaS analytics for key social game developers. Now, leading brands in a variety of fields depend upon their expertise. Based in San Francisco, Kontagent also maintains offices in Toronto, London, Seoul, and Tokyo.

Cynthia Murrell, March 18, 2014

Full Fidelity Analytics from Karmasphere

March 18, 2014

Karmasphere blogs about what it calls “Full-Fidelity Analytics,” the data equivalent of a distortion-free sound system. Karmasphere founder Martin Hall explains what the analytics-for-Hadoop company means by the repurposed term:

“Ensuring Full-Fidelity Analytics means not compromising the data available to us in Hadoop in order to analyze it. There are three principles of Full-Fidelity Analytics:

1. Use the original data. Don’t pre-process or abstract it so it loses the richness that is Hadoop

2. Keep the data open. Don’t make it proprietary which undermines the benefits of Hadoop open standards

3. Process data on-cluster without replication. Replication and off-cluster processing increases complexity and costs of hardware and managing the environment.

“By adhering to these principals during analytics, the data remains rich and standard empowering deep insights faster for companies in the era of Big Data.”

The post goes on to list several advantages to the unadulterated-data policy; Hall declares that it reduces complexity, lowers the total cost of ownership, and avoids vendor lock-in, to name a few benefits. The write-up also discusses the characteristics of a full-fidelity analytics system. For example, it uses the standard Hadoop metastore, processes analytics on-cluster, and, above all, avoids replication and sampling. See the post for more details about this concept. Founded in 2010, Karmasphere is headquartered in Cupertino, California.

Cynthia Murrell, March 18, 2014

HP Healthcare Analytics Aids in Reducing Waste

March 17, 2014

The article titled “HP Autonomy Unlocks Value of Clinical Data with HP Healthcare Analytics” from Market Watch explores HP’s announcement of a new analytics platform that healthcare providers can use to make sense of clinical data, both structured and unstructured. The new platform was created in a partnership between HP and Stanford Children’s Health and Lucile Packard Children’s Hospital. It is powered by HP Idol. The article states,

“The initial results have already yielded valuable insights, and have the potential to improve quality of care and reduce waste and inefficiency.

Though the core mission of the Information Services Analytics team at Lucile Packard Children’s Hospital Stanford is to enable operational insights from structured clinical and administrative data, innovation projects are also a key strategic initiative of the group… The healthcare industry faces the enormous challenges of reducing cost, increasing operational efficiency and elevating the quality of patient care.”

Costs have gotten out of control, and the collaboration hopes that analytics might be the key. A huge part of the problem is the unstructured data that is overlooked: text in a patient’s records, notes from the doctor, or emails between the doctor and patient. HP Idol’s ability to understand and categorize such information should make early detection and diagnosis more feasible. For more information visit

Chelsea Kerwin, March 17, 2014
