December 18, 2014
A smaller big data sector that specializes in text analysis to generate content and reports is burgeoning with startups. Venture Beat takes a look at how one of these startups, Narrative Science, is gaining attention in the enterprise software market: “Narrative Science Pulls In $10M To Analyze Corporate Data And Turn It Into Text-Based Reports.”
Narrative Science started out with software that created sports stories and basic earnings articles for newspaper filler. It has since grown into helping businesses across industries take their data by the digital horns and leverage it.
Narrative Science recently received $10 million in funding to further develop its software. Stuart Frankel, chief executive, is driven to help all industries save time and resources by better understanding their data.
“ ‘We really want to be a technology provider to those media organizations as opposed to a company that provides media content,’ Frankel said… ‘When humans do that work…it can take weeks. We can really get that down to a matter of seconds.’”
From making content to providing technology? That is quite a leap for Narrative Science. While they appear to have a good product, what exactly do they do?
December 17, 2014
You may want to read “Google Helps to Use Big Data for Global Surveillance—And That’s Good.” I have no big thoughts about this write up. Googlers like sushi, so protecting fish from overzealous fisher people seems logical to me. I would raise a couple of questions to ponder after you have read the article:
What happens when humans are tracked and analyzed in this manner?
Is this function in place as you read this?
I have no answers, but I enjoy learning what other people think. We do not need to discuss the meaning of “good.”
Stephen E Arnold, December 17, 2014
December 15, 2014
Did you know that there was hidden data in big data? Okay, that makes a little sense given that big data software is designed to find the hidden trends and patterns, but RCR Wireless’ “Discovering Big Data Unknowns” article points out that there is even more data left unexplored. Why? Because people are only searching in the known areas. What about the unknown areas?
The article focuses on Katherine Matsumoto of Attensity and how she uses natural language processing to “social listen” in these grey areas. Attensity specializes in natural language processing analytics that make sense of the content surrounding unstructured data: the big data white noise. Attensity views the Internet as the world’s largest consumer focus group, and it helps its clients understand consumers’ habits. The new Attensity Q platform enables users to identify these patterns in real time and detect big data unknowns.
“The company’s platform combines sentiment and trend analysis with geospatial information and information on trend influencers, and said its approach of analyzing the conversations around emerging trends enables it to act as an “early warning” system for market shifts.”
The biggest problem Attensity faces is filtering out spam and understanding the data’s context. Finding the context is the main way social data can be harnessed for companies.
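The spam-filtering and sentiment challenge described above can be illustrated with a toy sketch. This is a minimal, hypothetical lexicon-based approach for illustration only; the word lists and logic are not Attensity’s actual method.

```python
# Hypothetical sketch: filter spam posts, then score the rest with a
# simple sentiment lexicon. Word lists here are illustrative only.
POSITIVE = {"love", "great", "excellent", "happy"}
NEGATIVE = {"hate", "terrible", "broken", "angry"}
SPAM_MARKERS = {"click here", "free money"}

def is_spam(post: str) -> bool:
    """Flag posts containing obvious spam phrases."""
    text = post.lower()
    return any(marker in text for marker in SPAM_MARKERS)

def sentiment(post: str) -> int:
    """Score a post: +1 per positive word, -1 per negative word."""
    words = post.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def listen(posts):
    """Drop spam, then return (post, score) pairs for the rest."""
    return [(p, sentiment(p)) for p in posts if not is_spam(p)]

posts = [
    "I love this phone, great battery",
    "Click here for free money",
    "The update left my tablet broken",
]
results = listen(posts)
```

Real systems replace the lexicon with statistical models and add the contextual disambiguation the article mentions, but the pipeline shape (filter, then score) is the same.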
Scooping out the white noise for the useful information is a hard job. Can the same technology be applied to online ads to filter out the scams from legitimate ones?
December 15, 2014
While this is the season of miracles and magic, those are usually reserved for Hallmark movies and people in need. One could argue, though, that HP was in desperate need after the Autonomy fiasco. Maybe its Christmas wish will come true if the Information Week article “HP Cloud Adds Big Data Options” makes a correct prediction.
HP will release its Haven big data analytics platform through the HP Helion cloud as Haven OnDemand. The writer believes this is HP’s next logical step, given that Autonomy IDOL was released in January as SaaS. The popular Vertica DBMS will also be launched as a cloud service.
“Cloud-based database services have proven to be popular, with Amazon’s fast-growing Redshift service being an obvious point of comparison. Both HP Vertica and Redshift are distributed, columnar databases that are ideally suited to high-scale data-mart and data-warehouse use cases.”
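Why columnar databases suit the data-warehouse use cases the quote mentions can be shown with a toy illustration. This is not Vertica’s or Redshift’s actual internals, just the basic layout idea: an aggregate over one field reads a single contiguous column instead of every field of every row.

```python
# Toy illustration of row vs. columnar layout for a warehouse-style scan.
rows = [
    {"region": "east", "units": 10, "price": 2.5},
    {"region": "west", "units": 4, "price": 3.0},
    {"region": "east", "units": 7, "price": 2.5},
]

# Row store: the aggregate walks whole records.
row_total = sum(r["units"] for r in rows)

# Column store: the same table kept as one list per column.
columns = {
    "region": [r["region"] for r in rows],
    "units": [r["units"] for r in rows],
    "price": [r["price"] for r in rows],
}
col_total = sum(columns["units"])  # touches only the "units" column
```

At warehouse scale, reading only the needed column (plus compressing each column independently) is where the performance advantage comes from.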
HP wants to make a mark in the big data market and help its clients harness the valuable insights hiding in structured and unstructured data. While HP is on its way to becoming a key player in big data software, it still needs improvement to compete. It does not offer Hadoop OnDemand, and it also lacks ETL, analytics, and BI software to run alongside HP Haven OnDemand.
The company is finally moving forward and developing products that will start making up for the money lost in the Autonomy deal. How long will it take, however, to get every penny back?
December 10, 2014
The article on Enterprise Networking Planet titled “Cisco Goes Open-Source for Big Data Analytics” discusses the change for Cisco with some higher-ups in the company. Annie Ballew, Solutions Architect in the Cisco Security Business Group, mentions that OpenSOC is not actually a Security Information and Event Management system; rather, it should be considered “big data technology for security analytics.” OpenSOC is freely available through GitHub. The article states,
“While the OpenSOC project itself is open-source, Cisco is already leveraging the technology in its commercial products. ‘OpenSOC is currently included in our Managed Threat Defense services offering where it is installed, implemented and fully operationalized,’ Ballew said. Cisco launched its Managed Threat Defense service in April. That service manages and monitors logs as well as a customer’s security event lifecycle. Ballew added that OpenSOC is also integrated with various other Cisco security components such as Sourcefire FirePower NGIPS, SourceFire AMP, and ThreatGrid.”
The article also remarks on the importance of Elasticsearch to OpenSOC. The Kibana project provides the dashboard for the open-source Elasticsearch project, and Cisco admits that it works with Elasticsearch, though currently that relationship runs only through Kibana. Cisco has worked with open source before, so perhaps it should be no surprise that the company turns to OpenSOC to meet its security demands when it comes to big data.
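To give a flavor of the Elasticsearch connection, here is a minimal sketch of the kind of request body a Kibana-style dashboard sends to Elasticsearch for security analytics: count events per source address over a recent time window. The index and field names are hypothetical, and no server is contacted here; the sketch only builds and serializes the query.

```python
import json

# Hypothetical security-analytics query: events per source IP, last hour.
query = {
    "query": {
        "range": {"@timestamp": {"gte": "now-1h"}}
    },
    "aggs": {
        "events_per_source": {
            "terms": {"field": "source_ip", "size": 10}
        }
    },
    "size": 0,  # return only aggregation buckets, no raw hits
}

# In practice this body would be POSTed to an index's _search endpoint;
# here we just confirm it serializes to valid JSON.
body = json.dumps(query)
```

Dashboards like Kibana assemble queries of exactly this shape from the filters a user clicks, which is why the Elasticsearch relationship matters to a security-analytics stack.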
Chelsea Kerwin, December 10, 2014
December 1, 2014
“Kapow Enterprise 9.3 introduces new capabilities that give organizations greater flexibility, speed and reach in turning Big Data into business insights. These enhancements extend Kapow Enterprise as the leading data integration platform to access, integrate, deliver and explore data from the widest variety of internal and external sources.”
The new version boasts added flexibility and coverage when acquiring data across disparate sources. It also offers enhanced data distribution and exploration; of particular value to many will be the platform’s visual presentation of data through auto-generated graphs and tables, both of which update themselves as users add and remove filters. Kapow has also improved its Kapplets, the feature that lets users easily publish web apps that combine information into easily-digested interactive presentations. See the post for more information, or contact the company to request a demo.
Priding themselves on their products’ flexibility, integration-and-automation firm Kapow serves businesses of all sizes around the world. Headquartered in Palo Alto, California, Kapow was founded in 2005. The promising company was snapped up by process-applications outfit Kofax in 2013. Kofax is also based in Palo Alto, and was founded back in 1991.
Cynthia Murrell, December 01, 2014
November 22, 2014
If you are one of the Big Data believers, you will find “Clearing Up Muddied Waters in the ‘Data Lakes’” a reminder about the plasticity of concepts and their connotations. The write up addresses a clever phrase used to describe a storage pool into which
You store raw data at its most granular level so that you can perform any ad-hoc aggregation at any time. The classic data warehouse and data mart approaches do not support this.
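The quoted idea can be sketched in a few lines: keep events at their most granular level, and any aggregation can be computed ad hoc later, which a pre-summarized warehouse cannot offer. The field names are illustrative.

```python
from collections import defaultdict

# Raw, granular events -- nothing pre-summarized.
events = [
    {"user": "a", "action": "view", "ts": "2014-12-01"},
    {"user": "a", "action": "buy", "ts": "2014-12-01"},
    {"user": "b", "action": "view", "ts": "2014-12-02"},
]

def aggregate(events, key):
    """Ad-hoc roll-up over any field, possible only because raw detail is kept."""
    counts = defaultdict(int)
    for e in events:
        counts[e[key]] += 1
    return dict(counts)

by_action = aggregate(events, "action")
by_user = aggregate(events, "user")
```

A classic data mart would have committed to one of these roll-ups at load time; the lake defers that choice to query time.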
The write up points out that the original notion of a data lake has been prodded, stretched, and pulled. Not surprisingly, after the verbal chiropractic, data lake is just not its old self.
Who are the perpetrators of this conceptual improvement? A “real” journalist and—no big surprise—several Big Data experts laboring away at a mid tier consulting firm.
So what? The coiner of the phrase points me and other readers to the original write up about data lakes here. Worth revisiting? Are the “real” journalist or the mid tier consultants likely to read the source document? I would guess not.
Stephen E Arnold, November 22, 2014
November 13, 2014
The article on Inside BigData titled “RapidMiner Moves Predictive Analytics, Data Mining and Machine Learning into the Cloud” promotes RapidMiner Cloud, the recently announced tool for business analysts. The technology allows users to leverage over 300 cloud platforms such as Amazon, Twitter and Dropbox at an affordable price ($39/month). The article quotes RapidMiner CEO Ingo Mierswa, who emphasized the “single click” necessary for users to gain important predictive analytics. The article says,
“RapidMiner understands the unique needs of today’s mobile workforce. RapidMiner Cloud includes connectors to cloud-based data sources that can be used on-premises and in the cloud with seamless transitioning between the two. This allows users to literally process Big Data at anytime and in any place, either working in the cloud or picking up where they left off when back in the office. This feature is especially important for mobile staff and consultants in the field.”
RapidMiner Cloud also contains the recently launched Wisdom of the Crowds Operator Recommendations, which culls insights into the analytics process from the millions of models created by members of the RapidMiner community. The article also suggests that RapidMiner is uniquely positioned to integrate with open-source solutions; rather than competing with them, the platform is invested in source-code availability.
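A crowd-sourced operator recommendation of this kind can be sketched as simple co-occurrence counting over community workflows: suggest the operator most often used after the analyst’s current one. This is a hypothetical illustration; RapidMiner’s actual recommendation method is not described in the article, and the operator names below are made up.

```python
from collections import Counter

# Hypothetical workflows shared by a user community.
community_workflows = [
    ["read_csv", "normalize", "k_means"],
    ["read_csv", "normalize", "decision_tree"],
    ["read_csv", "sample", "k_means"],
    ["read_db", "normalize", "k_means"],
]

def recommend(current_op, workflows):
    """Return the operator that most often follows current_op."""
    followers = Counter()
    for wf in workflows:
        for a, b in zip(wf, wf[1:]):
            if a == current_op:
                followers[b] += 1
    return followers.most_common(1)[0][0] if followers else None

suggestion = recommend("normalize", community_workflows)
```

With millions of community models instead of four toy lists, frequency counts like these become a serviceable “wisdom of the crowds” signal.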
Chelsea Kerwin, November 13, 2014
October 28, 2014
I read “Statis Partners Up with HP Autonomy.” Stratis is a services firm with about 60 employees. According to the article:
Hungarian Stratis Vezetői és Informatikai Tanácsadó Kft. entered a partnership with Hewlett Packard Autonomy, thereby becoming part of the Big Data market, Stratis announced…
IDOL and the Dynamic Reasoning Engine can “do” Big Data, but the core system does information retrieval. There are other approaches to Big Data that use more modern technologies.
Some content processing vendors are showing more interest in what I call the Eastern European market. With HP looking to sell some of its China assets, the shift to Europe may be one way of growing revenues.
HP, like IBM, has its hands full. Forget the legal hassles, both companies are trying to get out of the buggy whip business, to reference a famous marketing myopia case.
The problem is that new-fangled businesses run through the old-school IBM- and HP-type business model will produce revenue, but that revenue will be less lucrative than the money made on mainframes and scientific equipment in the good old days.
HP will need to find dozens of Hungarian-type deals to allow Autonomy to pay back its new owner.
Stephen E Arnold, October 28, 2014
October 21, 2014
A happy quack to the reader who alerted us to “What Is Big Data?” The write up consists of 43 definitions provided by luminaries in a variety of fields. If you are in search of enlightenment with regard to Big Data, navigate to the story and dig in.
I found a couple of definitions interesting. Let me highlight Daniel Gillick’s and Hal Varian’s. Both are hooked up with Google, one of the big time big data outfits.
Mr. Gillick says:
Historically, most decisions — political, military, business, and personal — have been made by brains [that] have unpredictable logic and operate on subjective experiential evidence. “Big data” represents a cultural shift in which more and more decisions are made by algorithms with transparent logic, operating on documented immutable evidence. I think “big” refers more to the pervasive nature of this change than to any particular amount of data.
Mr. Varian says:
Big data means data that cannot fit easily into a standard relational database.
There you have it: A cultural shift and anything that won’t fit in a Codd-style data management system. Are the other 41 definitions superfluous?
Stephen E Arnold, October 21, 2014