Inetsoft Incorporates Data Source Connector for Hadoop
February 20, 2014
Inetsoft released version 11.5 of Style Intelligence, its business intelligence platform, late in 2013. According to the Passionned Group post “New Release of Inetsoft Includes Hadoop Connector,” the new data source connector will
“enable users to add data from the Apache Hive data warehouse to their dashboards and analysis in the same easy way they connect any other data source.”
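For readers wondering what “connecting” to Hive actually involves, here is a minimal sketch of querying Hive from Python with the third-party PyHive library; the host, database, and table names are invented placeholders, and this is not Inetsoft's connector itself.

```python
# Minimal sketch of querying Apache Hive, assuming the third-party PyHive
# library and a HiveServer2 instance on the default port. Hostname, database,
# and table names below are illustrative placeholders only.
from pyhive import hive

conn = hive.connect(host="hive.example.com", port=10000, database="default")
cursor = conn.cursor()

# Pull a small aggregate that a dashboard widget might display.
cursor.execute("SELECT region, SUM(revenue) FROM sales GROUP BY region LIMIT 10")
for region, revenue in cursor.fetchall():
    print(region, revenue)

cursor.close()
conn.close()
```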
The update also features a complete restyle of the user interface, which is based on a self-service dashboard for data reporting. Style Intelligence is built on an open-standards, service-oriented architecture and delivers data reports from both structured and unstructured sources.
In other words, Inetsoft has seen that Hadoop is the way Big Data is going, and they want to make sure their own product can work with what is fast becoming the industry standard.
Laura Abrahamsen, February 20, 2014
Sponsored by ArnoldIT.com, developer of Augmentext
Thomson Reuters Acquires Entagen, Builds Cortellis Data Fusion Technology
February 19, 2014
The press release on ThomsonReuters.com titled “Thomson Reuters Cortellis Data Fusion Addresses Big Data Challenges by Speeding Access to Critical Pharmaceutical Content” announces the embrace of big data by revenue-hungry Thomson Reuters. The new addition to the suite of drug development technologies will offer users a more intuitive interface through which they can analyze large volumes of data. Chris Bouton, General Manager at Thomson Reuters Life Sciences, is quoted in the article:
“Cortellis Data Fusion gives us the ability to tie together information about entities like diseases and genes and the connections between them. We can do this from Cortellis data, from third party data, from a client’s internal data or from all of these at the same time. Our analytics enable the client to then explore these connections and identify unexpected associations, leading to new discoveries… driving novel insights for our clients is at the very core of our mission…”
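The entity-and-connection exploration Bouton describes is essentially graph navigation. Below is a generic sketch of the idea using the networkx library; the entities, edges, and source labels are invented for illustration and do not represent Cortellis Data Fusion's actual data model.

```python
# Generic illustration of entity-and-connection exploration using networkx.
# The genes, diseases, and sources are invented examples, not Cortellis data.
import networkx as nx

g = nx.Graph()
# Tie together entities like diseases and genes and the connections between them.
g.add_edge("BRCA1", "breast cancer", source="internal")
g.add_edge("BRCA1", "ovarian cancer", source="third party")
g.add_edge("TP53", "breast cancer", source="licensed")

# Explore connections: which genes link to a given disease?
print(sorted(g.neighbors("breast cancer")))           # ['BRCA1', 'TP53']

# Surface a less obvious association: a path between two entities via shared links.
print(nx.shortest_path(g, "ovarian cancer", "TP53"))  # runs through BRCA1 and breast cancer
```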
The changes at Thomson Reuters are the result of the company’s acquisition of Entagen, according to the article. That company is a leader in the field of semantic search and has been working with biotech and pharmaceutical companies offering both development services and navigation software. Cortellis Data Fusion promises greater insights and better control over the data Thomson Reuters holds, while maintaining enterprise information security to keep the data safe.
Chelsea Kerwin, February 19, 2014
Sponsored by ArnoldIT.com, developer of Augmentext
New Managers, Products from Centrifuge Systems
February 19, 2014
The announcement from Centrifuge titled Centrifuge Systems Strengthens Big Data Discovery and Security promotes the release of Centrifuge 2.10. The new features of the link analysis and visualization software include the ability to block access as well as grant access to specific individuals, a more flexible method of login validation, and the ability to “define hidden data sources, data connections and connection parameters.” Stan Dushko, Chief Product Officer at Centrifuge, explains the upgrades and the reasoning behind them:
“With organizations steadily gathering vast amounts of data and much of it proprietary or sensitive in nature, exposing it within visualization tools without proper security controls in place may have unforeseen consequence…Can we really take the chance of providing open access to data we haven’t previously reviewed? Not knowing what’s in the data, is all the more reason to enforce proper security controls especially when the data itself is used to grant access or discover its existence altogether.”
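As a rough illustration of the kind of allow/deny and hidden-source controls the release describes, here is a minimal sketch; the policy structure, users, and source names are hypothetical and are not Centrifuge's API.

```python
# Minimal sketch of per-user allow/deny checks of the kind described in the
# release. Users, source names, and the policy structure are hypothetical.
BLOCKED = {"intern_account"}                      # explicitly blocked users
GRANTS = {
    "fraud_case_graph": {"analyst_a", "analyst_b"},
    "hidden_hr_source": set(),                    # a "hidden" source: no one granted
}

def can_view(user: str, source: str) -> bool:
    """Grant access only if the user is not blocked and is explicitly granted."""
    if user in BLOCKED:
        return False
    return user in GRANTS.get(source, set())

print(can_view("analyst_a", "fraud_case_graph"))  # True
print(can_view("analyst_a", "hidden_hr_source"))  # False
```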
The Big Data business intelligence software provider promises customers peace of mind and total confidence in their technology. They believe their system goes above and beyond the dashboard management of “traditional business intelligence solutions” because its displays can be reorganized in a more interactive way. Speaking of organization, you may notice that finding Centrifuge Systems in Google is an interesting exercise.
Chelsea Kerwin, February 19, 2014
Sponsored by ArnoldIT.com, developer of Augmentext
Funnelback Advocates Big Data Mining
February 9, 2014
It is a new year and, as usual, there are big plans for big data. Instead of looking ahead, however, let’s travel back to the July 4, 2012 Squiz and Funnelback European User Summit. On that day, Ben Pottier gave a talk on “Big Data Mining With Funnelback.” Essentially it is a sales pitch for the company, but it is also a primer on understanding big data and how people use data.
At the beginning of the talk, Pottier mentions a quote from the International Data Corporation:
“The total amount of global data is expected to grow almost 3 zettabytes during 2012.”
That is a lot of ones and zeroes. How much did it grow in 2013, and what is expected for 2014? However much global data has grown, Pottier emphasizes that most of Funnelback’s clients have 75,000 documents, and that as this volume grows, organizations need to address how to manage it. Beyond the basic explanation, Pottier explains that the single biggest issue for big data is finding enterprise content. In the last five minutes, he discusses data mining’s importance and how it can automate work that used to be done manually.
In Pottier’s talk, he explains that search is a vital feature for big data. Ha! Interesting how search is stretched to cover just about any content related function. Maybe instead of big data it should be changed to big search.
Whitney Grace, February 09, 2014
Sponsored by ArnoldIT.com, developer of Augmentext
Quote to Note: Big Data Skill and Value Linked
February 6, 2014
Tucked in “The Morning Ledger: Companies Seek Help Putting Big Data to Work” was a quote attributed to SAP, the enterprise software vendor. The quote:
David Ginsberg, chief data scientist at SAP, said communication skills are critically important in the field, and that a key player on his big-data team is a “guy who can translate Ph.D. to English. Those are the hardest people to find.”
I have been working through patent documents from some interesting companies involved in Big Data. The math is somewhat repetitive, but it seems the combination of numerical ingredients is what makes the “invention.”
One common thread runs through the information I have reviewed in preparation for my lectures in Dubai in early March 2014. Fancy software needs humans to:
- Verify the transforms are within acceptable limits
- Configure thresholds
- Specify outputs, often using old-fashioned methods like SQL and Boolean (see the sketch after this list)
- Figure out what the outputs “mean”.
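A minimal sketch of what the first three tasks can look like in practice appears below; the acceptable range, threshold, table, and field names are invented for illustration.

```python
# Invented sketch of the first three human tasks in the list above: verify a
# transform stays within acceptable limits, apply a configured threshold, and
# specify output with plain SQL plus a Boolean condition.
import sqlite3

ACCEPTABLE_RANGE = (0.0, 1.0)          # human-configured limit on a transform
THRESHOLD = 0.8                        # human-configured cutoff

def verify_transform(score: float) -> bool:
    """Check that a transformed value falls within the acceptable limits."""
    low, high = ACCEPTABLE_RANGE
    return low <= score <= high

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (doc TEXT, score REAL)")
conn.executemany("INSERT INTO scores VALUES (?, ?)",
                 [("a.txt", 0.91), ("b.txt", 0.42), ("c.txt", 1.30)])

# Old-fashioned SQL specifies the output; a human still decides what it "means."
for doc, score in conn.execute("SELECT doc, score FROM scores WHERE score >= ?",
                               (THRESHOLD,)):
    flag = "ok" if verify_transform(score) else "out of range"
    print(doc, score, flag)
```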
With search and content processing vendors asserting that their systems make it easy for end users to tap the power of Big Data, I have some doubts. With most “analysts” working in Excel, a leap to the types of systems disclosed in open source patent documents will be at the outer edge of end users’ current skills.
Big Data requires much of skilled humans. When there are too few human Big Data experts, Big Data may not deliver much, if any, value to those looking for a silver bullet for their business.
Stephen E Arnold, February 6, 2014
Topology and Big Data Making Shapes
February 5, 2014
The article titled “Lawrence Livermore Explores the Shape of Data, Expanding Query-Free Analytics” on GCN delves into the work of Lawrence Livermore National Laboratory in partnership with Ayasdi Inc. Using homegrown technology, the lab tackles big data to analyze various areas of research such as climate change, national security, and biological defense. Recently their work has begun to incorporate topology, the study of shapes.
The article explains the connection:
“The fundamental idea is that topological methods act as a geometric approach to pattern or shape recognition within data,” says a September 2013 article in the journal Science co-authored by Ayasdi CEO Gurjeet Singh. It allows “exploration of the data, without first having to formulate a query or hypothesis.” That is, researchers can find things they did not know they were looking for. For instance, in a database of billions upon billions of phone records scientists could make sense of who was talking to whom.”
Such complicated shapes are almost impossible for people living in 3D to even imagine, but the practical applications seem endless. Stanford University began the research that resulted in topological data analysis (TDA) in the 1970s and received $10 million from NSF and DARPA in 2003. Five years later, Ayasdi was founded to commercialize the TDA software, which is offered as a cloud-based service.
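As a toy illustration of the “shape of data” idea, the sketch below connects points that fall within a chosen distance of one another and counts the connected components, roughly the zeroth-order ingredient of topological data analysis. The points and scale are invented, and this is a generic sketch rather than Ayasdi's software.

```python
# Toy version of query-free shape detection: at one distance scale, connect
# nearby points and count connected components. Points and scale are invented.
import math
from itertools import combinations

points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),   # one tight cluster
          (5.0, 5.0), (5.1, 4.9)]                # a second cluster
SCALE = 1.0

# Union-find over points that lie within SCALE of each other.
parent = list(range(len(points)))

def find(i):
    while parent[i] != i:
        parent[i] = parent[parent[i]]
        i = parent[i]
    return i

for i, j in combinations(range(len(points)), 2):
    if math.dist(points[i], points[j]) <= SCALE:
        parent[find(i)] = find(j)

components = {find(i) for i in range(len(points))}
print(f"{len(components)} components at scale {SCALE}")   # prints: 2 components
```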
Chelsea Kerwin, February 05, 2014
Sponsored by ArnoldIT.com, developer of Augmentext
Big Data and Human Culture
February 1, 2014
A couple of big data pros have written the book on how data analysis can connect us with our culture’s evolution. At least one critic, though, is lukewarm about the project, for which the authors plumbed the depths of Google’s huge digitized book database for patterns in word usage. Nick Romeo at the Daily Beast describes “Why Big Data Doesn’t Live up to the Hype.”
He writes:
“Sometimes it seems the only thing larger than big data is the hype that surrounds it. Within the first 30 pages of Uncharted: Big Data as a Lens on Human Culture, Erez Aiden and Jean-Baptiste Michel manage to compare themselves to Galileo and Darwin and suggest that they, too, are revolutionizing the world. The authors were instrumental in creating the Google Ngram viewer, which allows researchers or anyone else so inclined to explore the changing frequencies of words across time. Likening their creation to a cultural telescope, they proceed to share some of their ostensibly dazzling findings.”
Romeo begins his piece with an account, worth checking out in itself, of similar pre-big-data-era investigations into language. He admits that it can be interesting to observe the patterns that turn up in such explorations, both analog and digital. He even shares a few examples from the book that he found intriguing. For example, writers shifted from treating “the United States” as plural to singular in 1880. However, Romeo maintains that Aiden and Michel are overstating the significance of their finds, which he calls mere trivia. Is he right, or could such efforts provide key insights into the human condition?
Cynthia Murrell, February 01, 2014
Sponsored by ArnoldIT.com, developer of Augmentext
Velocity is the Neglected V
January 27, 2014
Isn’t this three V thing getting a bit long in the tooth? Ah, well, the convention serves as a vehicle for an emphasis on speed in Gigaom‘s piece, “Big Data and the Missing ‘V’.” Writer Alain Vandenborne notes that two of Big Data’s famous three V’s, volume and variety, have received a lot of attention, but insists that the third, velocity, is just as important to an organization’s success.
He writes:
“Velocity is what differentiates big data from traditional business intelligence (BI) practices. It enables real-time decisions and actions, whereas traditional BI generally only covers volume and variety. In a traditional data warehouse, data is typically collected and analyzed at the end of the day, and then made available the following morning, at the start of business. This creates issues, as a lot can happen since the last transaction—continued customer activity, supply chain issues, product faults, etc.—and companies can’t afford to wait for this data to become available.”
He has a point, and backs it up with some examples. One is feedback posted on social media—one negative comment left unaddressed overnight can grow into a firestorm of negative press before start-of-business the next day. Another good illustration involves the capacity for financial institutions to flag a potentially fraudulent charge and stop it mid-transaction.
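To make the velocity point concrete, here is a minimal sketch of checking each charge as it arrives instead of waiting for an end-of-day batch; the fraud rule, window, and transactions are invented for illustration.

```python
# Minimal sketch of the "velocity" idea: score each transaction as it arrives
# so a suspicious charge can be stopped mid-transaction rather than in an
# end-of-day batch. The rule and transactions are invented.
from datetime import datetime, timedelta

recent = []                      # (timestamp, amount) of recent charges on one card
WINDOW = timedelta(minutes=10)
MAX_CHARGES_IN_WINDOW = 3

def check_transaction(ts: datetime, amount: float) -> str:
    """Flag a charge if too many arrive on the same card within the window."""
    recent.append((ts, amount))
    cutoff = ts - WINDOW
    in_window = [t for t, _ in recent if t >= cutoff]
    return "DECLINE" if len(in_window) > MAX_CHARGES_IN_WINDOW else "APPROVE"

now = datetime.now()
for i in range(5):
    decision = check_transaction(now + timedelta(minutes=i), 19.99)
    print(f"charge {i + 1}: {decision}")    # the 4th and 5th charges are declined
```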
Yes, customers have come to expect promptness on a scale like never before, and many businesses that can’t deliver will find themselves falling behind, no matter how much, or how many types of, data they can process (comparatively) slowly. The article suggests companies with a need for speed turn to cloud services like AWS and SAP Hana.
I wonder: Have companies overcome their hesitation to store data on third-party servers? Are we at the point where businesses will have no choice if they want to prosper?
Cynthia Murrell, January 27, 2014
Sponsored by ArnoldIT.com, developer of Augmentext
Video from Vivisimo Examines Big Data Management
January 15, 2014
The nine-minute video “Big Data and Vivisimo: Managing and Extracting Insights From Large and Heterogeneous Data” from IBM on Optimized Target Traffic explains the four V’s of Big Data as they pertain to Vivisimo. Typically only three V’s are named: volume, velocity, and variety. Volume covers such questions as how to get multiple data sets into Hadoop. Velocity is related to asking about the frequency of updates and whether there is a single static repository, while variety refers to the analytics to perform and the applications to build. But the video also offers a fourth V: variability.
Bob Carter of Vivisimo explains:
“Lastly there is the issue of variability, in the sense that I need to deliver the information to different audiences. It could be done in different implementations, let’s say it’s a cross domain environment where I’m putting a Big Data system on a secure network and I have to offer up a subset of that information to a higher level network or to a different domain. How do I smartly share that information with other agencies or other commercial customers on my supply chain?”
Variability, according to Vivisimo’s representative, also encompasses considerations of different security settings and management requirements. Omitted from consideration, however, is the cost of “touching” a single record in an exception file. Serial processing is expensive, just like handling variety in Big Data.
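As a generic illustration of the variability Carter describes, the sketch below offers each audience only the subset of records it is cleared to see; the clearance levels, audiences, and records are hypothetical and not Vivisimo's implementation.

```python
# Generic sketch of sharing a subset of records by audience clearance level.
# Levels, audiences, and records are hypothetical, not Vivisimo's design.
LEVELS = {"public": 0, "internal": 1, "secret": 2}

records = [
    {"id": 1, "classification": "public",   "text": "shipment arrived"},
    {"id": 2, "classification": "internal", "text": "supplier pricing"},
    {"id": 3, "classification": "secret",   "text": "source identity"},
]

def subset_for(audience_clearance: str):
    """Return only the records at or below the audience's clearance level."""
    ceiling = LEVELS[audience_clearance]
    return [r for r in records if LEVELS[r["classification"]] <= ceiling]

print(len(subset_for("public")))    # 1 record for a partner agency
print(len(subset_for("internal")))  # 2 records for a supply-chain customer
print(len(subset_for("secret")))    # all 3 records stay on the secure network
```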
Chelsea Kerwin, January 15, 2014
Sponsored by ArnoldIT.com, developer of Augmentext
A Webinar Adds Value to Data
January 11, 2014
Connotate is offering a webinar called “Big Data: The Portal To New Value Propositions.” The webinar summary explains what most big data people already know: that with all the new data available, there are new ways to cash in. The summary continues that people generate data every day with everything they do on the Internet and that companies have been collecting this information for years. Did you also know that, as well as a physical identity, people also have a virtual identity? This is very basic knowledge. Finally, the summary gets to the point: business value propositions will supply new opportunities, but also lead to possible risks.
After the summary, there is a list of topics that will be covered in the webinar:
· “Review the process of creating big data-based value propositions and illustrate many examples in science and health, finance, publishing and advertising.
· Explore which companies are successful, which are not and why.
· Review the mechanics: How to use unstructured content and combine it with structured data.
· Focus on data extraction, the “curation” process, the organization of value-based schemas and analytics.
· Analyze the ultimate delivery of value propositions that rest on the unique combination of unique data sets responding to a specific need.”
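The third bullet, combining unstructured content with structured data, is concrete enough to sketch: below, a regular expression pulls a product mention out of free text and joins it to a structured price table. The reviews, models, and prices are invented, and this is not Connotate's extraction pipeline.

```python
# Invented sketch of combining unstructured content with structured data:
# extract a product mention from free text, then join it to a price table.
import re

reviews = [
    "Loved the Acme X100, best purchase this year.",
    "The Acme Z900 stopped working after a week.",
]
prices = {"X100": 199.00, "Z900": 349.00}          # structured reference data

pattern = re.compile(r"Acme (\w+)")

for text in reviews:
    match = pattern.search(text)
    if match:
        model = match.group(1)
        # Join the extracted (unstructured) mention to the structured table.
        print(model, prices.get(model, "unknown price"))
```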
Big data has been around long enough that there should be less of a focus on how the data is gathered and more on the importance of value propositions. Value propositions demonstrate how the data can yield results and how those results can be used. Data value debates have been going on for a while, especially on LinkedIn. If Connotate and Outsell know how to turn data into dollars, they should advertise that instead of repeating big data specs.
Whitney Grace, January 11, 2014
Sponsored by ArnoldIT.com, developer of Augmentext