Digital Reasoning: A New Generation of Big Data Tools

December 31, 2011

I read “Tool Detects Patterns Hidden in Vast Data Sets.” The Broad Institute’s online Web site reported that a group of researchers in the US and Israel “have developed a tool that can tackle large data sets in a way that no other software program can.”

What seems exciting to me is that the mathematical procedure, which involves creating a space and grids into which certain discerned patterns are placed, provides a fascinating potential enhancement to companies like ours, Digital Reasoning. Our proprietary methods have performed similar associative analytics in order to reduce the uncertainty associated with processing large flows of data and distilling meaningful relationships from them. Some day computers and associated systems will be able to cope with exabytes of data from the Internet of things. Today, the Broad Institute validates the next-generation numerical methods that its researchers, Digital Reasoning’s engineers, and a handful of other organizations have been exploring.

The technical information about the method, which is called MIC, shorthand for Maximal Information Coefficient, is available to members of the AAAS. To get a copy of the original paper and its mathematical exegesis you will want the full bibliographic information:

“Detecting Novel Associations in Large Data Sets” by David N. Reshef, Yakir A. Reshef, Hilary K. Finucane, Sharon R. Grossman, Gilean McVean, Peter J. Turnbaugh, Eric S. Lander, Michael Mitzenmacher, and Pardis C. Sabeti, Science, 16 December 2011, Volume 334, Number 6062, pages 1518-1524.

The core of the authors’ work is:

Identifying interesting relationships between pairs of variables in large data sets is increasingly important. Here, we present a measure of dependence for two-variable relationships: the maximal information coefficient (MIC). MIC captures a wide range of associations both functional and not, and for functional relationships provides a score that roughly equals the coefficient of determination (R²) of the data relative to the regression function. MIC belongs to a larger class of maximal information-based nonparametric exploration (MINE) statistics for identifying and classifying relationships. We apply MIC and MINE to data sets in global health, gene expression, major-league baseball, and the human gut microbiota and identify known and novel relationships.
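For readers who want a feel for the grid idea, here is a rough Python sketch. It is not the authors' algorithm and not Digital Reasoning's method. It assumes equal-frequency bins only (the published method also searches over where the grid lines fall), and it assumes a grid budget of n^0.6 cells, so it will tend to understate the true MIC. The names approx_mic, _equal_frequency_bins, and _mutual_information are ours, invented for illustration.

```python
import math
from collections import Counter


def _equal_frequency_bins(values, n_bins):
    """Label each value with a bin index by rank; tied values share a bin."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    bins = [0] * n
    prev_idx = None
    for rank, idx in enumerate(order):
        if prev_idx is not None and values[idx] == values[prev_idx]:
            bins[idx] = bins[prev_idx]  # keep tied values together
        else:
            bins[idx] = rank * n_bins // n
        prev_idx = idx
    return bins


def _mutual_information(xb, yb):
    """Mutual information, in bits, between two lists of bin labels."""
    n = len(xb)
    joint = Counter(zip(xb, yb))
    px = Counter(xb)
    py = Counter(yb)
    mi = 0.0
    for (i, j), count in joint.items():
        mi += (count / n) * math.log2(count * n / (px[i] * py[j]))
    return mi


def approx_mic(xs, ys, exponent=0.6):
    """Simplified MIC-style score: best normalized mutual information
    over all nx-by-ny equal-frequency grids with nx * ny <= n ** exponent.
    Assumption: bin edges are fixed by rank, not optimized."""
    n = len(xs)
    max_cells = max(4, int(n ** exponent))
    best = 0.0
    for nx in range(2, max_cells // 2 + 1):
        xb = _equal_frequency_bins(xs, nx)
        for ny in range(2, max_cells // nx + 1):
            yb = _equal_frequency_bins(ys, ny)
            score = _mutual_information(xb, yb) / math.log2(min(nx, ny))
            best = max(best, score)
    return best
```

Calling approx_mic(xs, ys) on two equally long lists of numbers returns a score near 0 when the variables look independent and near 1 when one variable determines the other. A library implementation of the MINE statistics would be the better choice for real work.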

Digital Reasoning’s application of similar mathematical methods underpins our entity-oriented analytics. You can read more about our methods in our description of Synthesys, a platform for performing automated understanding of the meaning of Big Data in real time.

The significance of this paper is that it shines a spotlight on the increasing importance of research into applications of next-generation numerical methods. Public discussion of methods like MIC will serve to accelerate innovation and the diffusion of knowledge. At Digital Reasoning we see this as further evidence of the potential of algorithmic, unaided approaches like ours to achieve true “automated understanding” of all forms of text regardless of volume, velocity or variety. As we shift to IPv6, the “Internet of things” will dramatically increase the flows of real-time data. With automobiles and consumer devices transmitting data continuously or on demand, the digital methods of ten or even five years ago fall short.

Three other consequences of MIC-style innovations will accrue:

First, at Digital Reasoning, we will be able to enhance our existing methods with the new insights, forming partnerships and investing in research to apply demonstrations to real world problems. SilverLake partners’ investment in Digital Reasoning, and the confidence it reflects, has provided us with capital to extend our commercial system quickly and in new directions such as financial services, health care, legal, and other verticals.

Second, we see the MIC method fueling additional research into methods that make Big Data more accessible and useful; that is, consumerizing applications for problems that currently lack solutions. Big Data will eventually be part of a standard information process, not something discussed as “new” and “unusual.”

Third, greater awareness of the contribution of mathematics will, I believe, stimulate young men and women to make mathematics and statistics a career. With more talent entering the workforce, the pace of innovation and integration will accelerate. That’s good for many companies, not just Digital Reasoning.

Kudos to the MIC team. What’s next?

Tim Estes, December 31, 2011

Sponsored by Pandia.com

Preparing for 2012: Search and More

December 31, 2011

Organizations are coming to understand that even if their employees are experts in their field, what they also need are experts in the field of applied mathematics. Gigaom reported on the growing need for applied mathematicians to create big data strategies in the article “Spread the Word: Math Is the New Sexiness in IT.”

The article states:

Analyzing traditional business data held in a data warehouse is one thing, but doing big data and, more specifically, data science is quite another. McKinsey & Co. predicts that by 2018, the United States will have a shortage of 1.5 million managers and analysts with the know-how to use the analysis of big data to make effective decisions, and a shortage of almost 200,000 people with the deep analytical skills necessary for data science.

Big data is molding the future of careers and is evolving faster than most of us can keep up with. If we don’t start making math an appealing career field, the Big Data problem could become too much to handle. You can either be a user of ATMs or you can program ATMs. You pick.

Jasmine Ashton, December 31, 2011

Life Fix: Take More Breaks, Get More Done

December 31, 2011

Get ready for the New Year!

Can you get ahead by taking a nap at your desk every day? That’s what Tony Schwartz advises in Harvard Business Review’s “How to Accomplish More by Doing Less.” Actually, Schwartz doesn’t advise napping per se. It’s about combining focused ninety-minute work sessions with breaks throughout the day.

He illustrates his point by comparing two hypothetical workers. The first, Bill, spends the day straight through at his desk, even eating lunch there. Nick, however, takes a fifteen-minute break after every hour and a half of work, takes forty-five minutes for lunch, and even takes a nap of up to twenty minutes each afternoon. While it may appear to the casual observer that Bill is the better worker, Nick actually gets better results because he isn’t burning himself out. The article insists:

It’s not just the number of hours we sit at a desk in that determines the value we generate. It’s the energy we bring to the hours we work. Human beings are designed to pulse rhythmically between spending and renewing energy. That’s how we operate at our best. Maintaining a steady reservoir of energy — physically, mentally, emotionally and even spiritually — requires refueling it intermittently. Work the way Nick does, and you’ll get more done, in less time, at a higher level of quality, more sustainably.

Schwartz boosts his conclusion with studies of pilots and violinists, as well as with his personal experience. He seems to understand, though, that such patterns aren’t widely tolerated, much less encouraged; he appeals to managers to embrace intermittent renewal for their employees.

Good luck with that in 2012 as you race to fix a search system which has indexed employee medical and employment records.

Cynthia Murrell, December 31, 2011

Useful for 2012

December 31, 2011

This is not search related, but we noted the item and wanted to share it with our two or three readers.

Privacy is a luxury that few can afford in the Internet age. However, technology allows you to use this fact to your advantage. The days of anonymous prank calls are over, thanks to the new reverse phone service WhoIsThisPhone.com.

KillerStartups reported on this free and convenient reverse-phone search service in the article “WhoIsThisPhone.com Reverse-Phone Search.”

WhoIsThisPhone.com allows anyone to search its extensive phone database and find the phone number and geographical location of any caller who may be trying to hide his or her identity.

The description states:

People calling you up in the middle of the night and hanging up without speaking a word, people leaving strange messages into your machine, people who keep on calling and requesting to talk to someone that you have already explained doesn’t live there. WhoIsThisPhone.com is going to assist you in all such scenarios. You’ll get to know who’s behind such calls. And once you know as much, you’ll be able to begin doing what it takes to make them stop.

While the basic site is free, generating reports incurs small additional charges.

Jasmine Ashton, December 31, 2011

In a Web Without Google

December 30, 2011

ExtremeTech has some predictions about “The Post-Google Apocalypse.” Though the Web has been dominated by Google for fifteen years, the article insists that state of affairs is about to change. Writer Sebastian Anthony points to the trend toward targeted and authoritative searches as one factor. More important, though, is the shift to mobile apps. Ease of use, in both installing and running apps, is the driving factor behind this shift, he says. That, and the waning of Windows.

So, what will this post-Google world look like? Anthony predicts a surge in the growth of app stores, each with its own standards and rules. Such diversity will create headaches for developers as well as erode the ease of use that propelled users in this direction in the first place. The write up proposes an answer:

There is a solution, though, and it revolves around open standards — HTML5 and JavaScript, to be exact. The only way that developers — and consumers — will be able to keep up with six or more platforms, is to have a common language. We are already seeing this with many mobile apps where there’s basically just a native wrapper around an HTML5 website — this is just a stopgap solution, though, and can cause more problems than it solves.

Will the world of apps develop such standards, or are we doomed to more hassles than we’re leaving behind on our desktops? Or, perhaps, will Google find a new way to reign supreme?

Cynthia Murrell, December 30, 2011

One Suggestion for a URL Shortening Solution in SharePoint

December 30, 2011

We know that communicating SharePoint links can get pretty painful when you copy and paste a link to a document from a Document Library buried deep in a hierarchical list of sites. URL shortening has eased a lot of this pain across a variety of platforms. Jan Tielens discusses URL shortening and provides a suggested solution in “URL Shortening for SharePoint 2010.” The author describes the solution:

So to make a long SharePoint URL short, you can copy the URL to the clipboard, go to a URL shortener, paste the long link over there and copy the short URL you get in return back to the clipboard. Works perfectly, but there are quite some tedious steps to go through. Already a long time ago, when SharePoint 2007 was still the rage, I posted some code that automates all these steps. Finally I found some time to update the code to SharePoint 2010 and nicely package it in a Sandboxed Solution, so it works both for SharePoint 2010 deployed on premises as in the cloud on Office 365.
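To show what a shortener does behind the scenes, here is a minimal Python sketch. It is not the author's SharePoint code, which we have not reproduced. It assumes an in-memory store and a made-up base address (https://example.invalid/s/), and the names UrlShortener and encode_base62 are hypothetical. A real service would persist the mapping and issue HTTP redirects.

```python
import string

# 62 symbols: digits, then lowercase letters, then uppercase letters.
ALPHABET = string.digits + string.ascii_letters


def encode_base62(number):
    """Turn a non-negative integer into a compact base-62 string."""
    if number == 0:
        return ALPHABET[0]
    digits = []
    while number > 0:
        number, remainder = divmod(number, 62)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


class UrlShortener:
    """In-memory shortener; the base URL is a placeholder, not a real service."""

    def __init__(self, base_url="https://example.invalid/s/"):
        self.base_url = base_url
        self._by_key = {}   # short key -> long URL
        self._by_long = {}  # long URL -> short key

    def shorten(self, long_url):
        """Return a short URL, reusing the key if the URL was seen before."""
        if long_url in self._by_long:
            return self.base_url + self._by_long[long_url]
        key = encode_base62(len(self._by_key) + 1)
        self._by_key[key] = long_url
        self._by_long[long_url] = key
        return self.base_url + key

    def resolve(self, short_url):
        """Look up the long URL behind a short one (KeyError if unknown)."""
        key = short_url[len(self.base_url):]
        return self._by_key[key]
```

Each new long URL gets the next counter value, encoded in base 62, so the keys stay short for a long time: three characters cover more than 238,000 links.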

A handy tip for a pesky problem, no doubt. We’ve seen how short URLs allow for convenient messaging and sharing, as with Twitter or Identi.ca. But URL shortening is a tedious process.

If you prefer to focus your time on tasks of greater importance, check out Fabasoft Mindbreeze. They have text processing components in SharePoint down to an art. You can read “Information Pairing Makes Websites More Intelligent!” to learn about some of the benefits of their information pairing technology: “It smoothly integrates itself into your website so that the user doesn’t even realize that Cloud services are working in the background. Furthermore, InSite always knows what a user is interested in.” Their digital know-how takes convenience to a new level with mobility and maintenance-free capabilities.

Philip West, December 30, 2011

PLM Predictions for 2012

December 30, 2011

It must be the end of the year because all the prognosticators are out and making their lists. Technological innovations are no exception, and the article “Kalypso Issues Top 5 Innovation Predictions for 2012” focuses on the future of product development, PLM technology and innovation.

Kalypso, an innovation consulting firm, expects PLM technologies to look to the sky and find the cloud. The article explains that:

With the recent release of new cloud-based PLM technologies from Dassault Systèmes and Autodesk, companies have a growing list of options for transitioning to cloud-based PLM systems. This, coupled with cloud-based energy savings and the potential to reduce cost of software ownership, will likely draw more companies to these solutions.

They also foresee PLM entering the consumer packaged goods industry. These companies “have traditionally lagged behind other industries in adopting PLM to drive innovation,” but that is about to change.

These innovations are very intriguing and we are excited to see what 2012 brings. So what do we think is going to happen in 2012? Well, don’t be surprised if Inforbix, a collection of integrated applications that help companies find, re-use and share product data, continues to make a big splash. They had a great 2011 and we see even bigger things for them in the upcoming year. We hope everyone has a happy, healthy and prosperous New Year!

Jennifer Wensink, December 30, 2011

Article Marketing Confused with Article Spinning

December 30, 2011

We receive quite a few missives from hot, maybe radioactive, public relations outfits. A good example is AtomicPR, the MarkLogic information output service. I have a tough time figuring out what is editorial opinion, such as the information I generate when a topic like azure chip consultants or LinkedIn enterprise search blather comes up. I also never know when a Forbes or Bloomberg news story is a recycled news release. Marketwatch also baffles me. I am deeply suspicious of any information from Marketwatch which is displayed with copious amounts of green, which is supposed to suggest money to me, I think. I skip the public relations nuclear waste, the company sponsored blogs which provide me with tips to cope with eDiscovery as ZyLAB is doing, and sponsored blogs like our own HighGainBlog.com operation. (Oh, we will announce a new sponsor in January 2012, and we will deliver useful, curated information too.) I find company blogs endlessly amusing. Google operates more than four score blogs each outputting “content.” Now the SEO crowd has figured out “content.” Hooray.

Writing for Search Engine Journal, Suzanne Edwards puts her spin on article marketing in “Eight Good Reasons Why Spinning Articles is Bad for your Website.” The writer, who also writes for Cash for Gold, of all places, makes some sweeping generalizations. There is a wide range between summarizing and pointing the way to helpful links and using “spinning” software. The article describes these abominable applications:

Tons of article spinning software have flooded the Internet. Needless to say, article marketing has become an efficient way of building hundreds if not thousands of backlinks. However, automatic articles spinning with the use of a spinning software is deemed as a black hat SEO technique that can seriously hurt a website’s search rankings and page rank.

While we agree with her point on robo-writers, she paints with too broad a brush. Is reason number nine financial advisors’ ability to discern truth and accuracy? Financial services firms make SEO firms look like the Vatican’s College of Cardinals on a Bible study weekend.

Super. This goose will immediately snap to content ideas from a financial advisor. It’s okay. We trust financial advisors like Bear Stearns and Lehman Brothers, right? We believe everything we read on the Internet even when the content is delivered by predictive methods developed by dear old Google and Microsoft.

Cynthia Murrell, December 30, 2011

Pocket Cloud Explore

December 30, 2011

I love things that make life simpler. Wyse Technology has come up with a new type of integrated search, as TechCrunch reveals in “PocketCloud Explore Lets You Search Your Android, PC & Mac at Once.” The app for Android, aptly named Wyse PocketCloud Explore, lets you search multiple places at once. Wish I could do that when I lose my car keys! According to the write up, with this app:

You can perform universal file searches, then view the files, rename them, move them into folders, share them or download them to your device. The software lets you perform unlimited copying and moving of video, image and audio files between your Windows or Mac computer and your Android device. Meanwhile, other files types can be opened or edited in your preferred Android application (e.g., QuickOffice). You can also choose to email the file via Android’s email client.

As opposed to cloud storage products like Dropbox or Box.net, this app for the self-storage inclined runs only $4.99 with no monthly fee. Writer Sarah Perez notes one important caveat: if you store it yourself, back up your data. Regularly. Seriously, no excuses.

Wyse Technology has become a leader in the burgeoning cloud computing field. The company, headquartered in San Jose, CA, works with businesses, government organizations, and hybrid partnerships worldwide. Last month, Wyse proudly acquired Trellia, a company proficient in cloud-based mobile device management.

Cynthia Murrell, December 30, 2011
