December 9, 2013
Big data primarily consists of unstructured data that forces knowledge professionals to spend 25% of their time searching for information, say Peter Auditore and George Everitt in their article “The Anatomy Of Big Data,” published by Sand Hill. The pair run down big data’s basic history and identify four pillars that encompass all the data types: big tables, big text, big metadata, and big graphs. They identify Hadoop as the most important big data technology.
Big data companies and projects are anticipated to drive more than $200 billion in IT spending, but the sad news is that only a small number of these companies are currently turning a profit. One of the main reasons is that open source is challenging proprietary companies. The authors also note that users are effectively giving away their data to social media Web sites: users have become more of a product than a client, and the social media giants do not share a user’s personal information with the user who generated it.
Social media is a large part of the big data bubble:
“The majority of organizations today are not harvesting and staging Big Data from these networks but are leveraging a new breed of social media listening tools and social analytics platforms. Many are employing their public relations agencies to execute this new business process. Smarter data-driven organizations are extrapolating social media data sets and performing predictive analytics in real time and in house. There are, however, significant regulatory issues associated with harvesting, staging and hosting social media data. These regulatory issues apply to nearly all data types in regulated industries such as healthcare and financial services in particular.”
Guess what one of the biggest big data startups is? Social media big data analytics.
The article ends by stating that big data helps organizations make better use of their data assets and it will improve decision-making. This we already know, but Auditore and Everitt do provide some thought-provoking insights.
Whitney Grace, December 09, 2013
December 3, 2013
The explosion of big data continues to put pressure on IT departments. GCN examines how government agencies are approaching the challenge in, “As Big Data Grows, Technologies Evolve Into Ecosystems.” Writer Rutrell Yasin frames the issue of deploying big-data platforms and analytics:
“But what is the best way to accomplish this: By cobbling together various ‘point products’ that address all of the big data processes, or by building a ‘big data platform’ that integrates all of the capabilities organizations need to apply deep analytics?”
The article goes on to examine the most prominent solutions competing for institutional big-data dollars. Not surprisingly, IBM’s Eric Sall advocates a comprehensive platform, like his company’s InfoSphere. It looks like many organizations, though, are responding to the lure of open source. Though it is often the cheaper approach, the disparate nature of open-source solutions can pose its own problems. The article looks at efforts from outfits like Red Hat and Cisco that aim to consolidate apps and systems from different sources (both open source and paid). It is worth a look if your organization is at or approaching the big-data-solution crossroads.
The article concludes:
“The bottom line is that organizations need these massively parallel processing systems and other big data tools that can scale out to address the volume, velocity and variety of big data, whether they come from a proprietary vendor’s platform or a platform based on open technologies. It makes life simpler for organizations if their workforce can unlock the value of their data via an ecosystem of integrated tools, industry experts said.”
Indeed, simpler is usually better. Even if saving money is your main goal, do not dismiss paid solutions that help manage open source resources; the savings in time and frustration often more than make up for the added cost.
Cynthia Murrell, December 03, 2013
November 28, 2013
I read “Are Big Data Vendors Forgetting History?” I worked through five observations about Big Data and realized that history is essentially irrelevant to Big Data vendors and to some pundits.
I was encouraged by the opening paragraph; to wit:
With any new hot trend comes a truckload of missteps, bad ideas and outright failures. I should probably create a template for this sort of article, one in which I could pull out a term like “cloud” or “BYOD” and simply plug in “social media” or “Big Data.”
My confusion mounted as I worked through the five “history lessons” Datamation sought to teach me:
- Little failures “portend” sometimes big failures
- Fuzzy terminology can “poison the well”
- Details can sidetrack a project
- Technical details are important
- Big Data matters
Okay, let me address items 3 and 4, the paradox of “details matter” and “details don’t matter.” I am not sure how to resolve these opposites. In my experience, the result, particularly in technology, depends on details. But the details have to fit into some “frame.” A random detail lacks context. Perhaps the lesson is to balance the “vision” with the “execution.” Get one wrong and the other is dragged down. Big Data requires trimming; that is, chopping the data down so that a question can be answered. Once the data set is created and conforms to textbook statistical tests, then a cascade of details takes center stage. Big Data often lacks this organic flow between the two opposites.
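What “trimming” means in practice can be sketched in a few lines. This is a minimal illustration, with invented field names and data, of cutting a record set down to the slice needed to answer one specific question before any detail work begins:

```python
# Hypothetical example: reduce a messy record set to just the rows
# needed to answer one question -- "what were November sales by region?"

def trim(records, year, month):
    """Keep only complete records for the period in question."""
    return [
        r for r in records
        if r.get("year") == year
        and r.get("month") == month
        and r.get("amount") is not None
    ]

def sales_by_region(records):
    """Aggregate the trimmed slice; the detail work starts only after
    the question has framed (and shrunk) the data."""
    totals = {}
    for r in records:
        totals[r["region"]] = totals.get(r["region"], 0) + r["amount"]
    return totals

raw = [
    {"year": 2013, "month": 11, "region": "east", "amount": 120},
    {"year": 2013, "month": 11, "region": "west", "amount": 80},
    {"year": 2013, "month": 10, "region": "east", "amount": 500},   # wrong month
    {"year": 2013, "month": 11, "region": "east", "amount": None},  # incomplete
]

print(sales_by_region(trim(raw, 2013, 11)))  # {'east': 120, 'west': 80}
```

The frame (the question) does the chopping; the statistics and details operate on what survives.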
With regard to item 1, failure on any scale predicting the future, I am not sure what history teaches. Napoleon hoofed it to Moscow and then a German military leader followed in Napoleon’s footsteps. Er, winter. Food. Resupply. History, like the stock market, does not do much to make prediction a dead certain process. Do technology managers learn from the “past”? In my experience, technology managers do what is necessary to keep their job and make money. Excellence is not as high on this list as one would hope. Tomorrow is like today. “Progress” based on reading tea leaves may be difficult to achieve.
I think that fuzzy terminology, item 2, is an emergent function in technology. Making up words and coining buzzwords performs three jobs. First, it creates an air of specialty, a sense of “I know something you need to know.” Second, it allows an in crowd to form so that outsiders have a tough time getting in the club. Third, marketers can hook vague promises of value to a with-it term to close a deal. In the last five years, the technical innovations have been more like refinements than breakthroughs.
Item 5, which suggests that anyone who questions the value of Big Data is taking the easy path forward, is interesting. Big Data, in my view, has been a constant issue. What’s new is the number of companies using the term to describe what have been standard functions. Sure, the aging Hadoop “revolution” eliminates some of the hassles and costs associated with a Codd database. The reality is that most organizations lack the staff, the resources, and the time to convert Big Data into meaningful business activities. (Meaningful means “revenue producing.”)
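For readers who have not peeked under Hadoop’s hood, the programming model it popularized boils down to two small functions plus a grouping step. This is a single-machine sketch of that MapReduce pattern (Hadoop’s contribution is distributing these steps across a cluster, not the logic itself):

```python
# Minimal, local sketch of the MapReduce model: map emits (key, value)
# pairs, a shuffle groups values by key, and reduce folds each group.
from collections import defaultdict

def map_phase(lines):
    # Emit (word, 1) for every word in every line.
    for line in lines:
        for word in line.split():
            yield word.lower(), 1

def shuffle(pairs):
    # Group all values by key -- Hadoop does this across machines.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Fold each group down to a single result per key.
    return {key: sum(values) for key, values in groups.items()}

docs = ["Big Data big data", "data matters"]
counts = reduce_phase(shuffle(map_phase(docs)))
print(counts)  # {'big': 2, 'data': 3, 'matters': 1}
```

Nothing here requires a schema or SQL, which is exactly the appeal over a Codd-style database for loosely structured data; the staffing problem is that someone still has to write, run, and interpret this at cluster scale.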
In short, I find the list interesting, but I don’t think there are many history lessons for me. The write up is more of an apologia for a buzzword that is teaching some people that making sense of available information is dog work, expensive, and often tough to connect to a specific payback.
The reason? Big Data requires trained professionals with expertise in math, statistics, and business processes. Last time I checked, individuals with these capabilities were in short supply. Big Data just gets bigger when there are too few sculptors to chop down the ever growing mountain of bits and bytes.
Stephen E Arnold, November 28, 2013
November 25, 2013
The article titled “What If We Could Feel the Big Data Sugar Rush Faster?” on SmartData Collective compares companies buying Big Data to kids trick-or-treating on Halloween. The article sets up the metaphor in all its gory details, but mainly the point seems to be that no company wants to wait for the goods, just as no kid enjoys being forced to pause before gorging by his or her parents. The article lists the worst of the Big Data “tricks”:
“Having to wait 12-15 months to see value, when it should never take longer than 90 days; Having to rip out and replace all the technology you’ve ever bought because you’re told by a mega-stack vendor that it’s now worthless; Having to segregate, silo and shift data and information in order to achieve your goals, when you shouldn’t have to touch it at all.”
It also enumerates the “treats”: in sum, the company will eventually gain the ability to analyze all of its data and learn the patterns the analysis reveals. The proverb “patience is a virtue” comes to mind, but whether companies will have the patience to wait for the value might determine Big Data’s success.
Chelsea Kerwin, November 25, 2013
November 15, 2013
TechCrunch makes a big deal about this headline: “ClearStory Data Designs An Analytics Platform That Is About The Experience As Much As The Technology.” ClearStory Data is one of the first companies to launch an analytics platform that offers rich visuals and sharing capabilities. The graphics and sharing show up in the user interface, but behind the pretty pictures and social media graces there is something else.
The article states:
“On the back-end, ClearStory has a platform for integrating a company’s internal and external data using an in-memory database technology, said CEO Sharmila Shahani-Mulligan in a phone interview this week. This can be relational or NoSQL data, point-of-sale information or demographic statistics from external sources. Its advantage is in the ability to process multiple types of data on the fly and then combine that with a modern user interface.”
Not a bad new way to use analytics, especially when the idea behind it is that users will be able to manipulate their data like a story rather than a boring data report. Think about it. What would you rather read, a gripping novel or the latest user agreement for iTunes? Imagine turning shopping or Internet browsing into a story. Maybe this could be a new form of writing or even blogging, where social media turns into a giant events catalog of how people shop.
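The “blending” the quote describes can be pictured concretely. Here is a hypothetical, toy-scale sketch (field names and figures invented; ClearStory’s actual engine is an in-memory system handling relational and NoSQL sources) of joining internal point-of-sale records against an external demographic table to produce one enriched set ready for visualization:

```python
# Hypothetical illustration of blending internal and external data
# on a shared key (ZIP code) into one enriched record set.

internal_sales = [                  # in-house point-of-sale records
    {"zip": "40202", "units": 310},
    {"zip": "94107", "units": 125},
]

external_demographics = {           # e.g. licensed census-style data
    "40202": {"median_income": 38000},
    "94107": {"median_income": 104000},
}

def blend(sales, demographics):
    """Join each sales row with any external attributes for its ZIP."""
    enriched = []
    for row in sales:
        extra = demographics.get(row["zip"], {})
        enriched.append({**row, **extra})
    return enriched

for record in blend(internal_sales, external_demographics):
    print(record)
```

The hard part at production scale is doing this merge on the fly across heterogeneous sources, which is the back-end advantage the article credits to ClearStory.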
Whitney Grace, November 15, 2013
November 15, 2013
Hadoop was named after a toy elephant, so it is only appropriate that a company built on it is donating money to save real elephants from poachers. Nature and technology have often been perceived to be at odds with one another, constantly battling for dominance over the planet. Yet technology can serve nature, and analytical data techniques have been used to solve conservation problems, according to the recent Gigaom article, “Buy Datameer’s Hadoop Application, Save An Elephant.”
The article states:
“We’ve written before about applying big data techniques to help solve societal problems, and now we have a case of applying the revenue from big data software sales directly to a cause. In this case Datameer, a startup that applies a spreadsheet interface to Hadoop, is selling a “charity edition” of its product for $49 and donating all the proceeds during the month of November to a conservation charity called Pro Wildlife.”
Some cynics may view this gesture as a marketing ploy to sell a product meant to solve the big data problem. (Actually, it only allows users to download to a single desktop and analyze 10 GB, so it is more like big data for the single data-obsessed user.) On the bright side, you get to help save the largest living land mammal. Who does not like elephants?
Whitney Grace, November 15, 2013
November 14, 2013
Halloween may be over, but pictures of costumes are still being uploaded to the Internet waiting to be judged. What if someone decided to make a big data costume or dress up as a data analyst? It may be hard to visualize, but Attivio wanted to get into the Halloween fun by asking people what costume best suits their big data initiatives. The idea is that people are personifying a project—everything is alive these days. So Attivio ran a poll and the results are in the article, “Trick Or Treat-Enterprise Information Looks More Like Frankenstein Than Superman.”
What is scary about the results is that a lot of people are dealing with projects as slow as a zombie limp, with only 3% saying they have a good return on investment.
What are some other scary results?
- “Only 5 percent say they have no big data issues and are as happy as a witch in a broom factory;
- 40 percent complain that they have too many coffins where data goes to die;
- 55 percent admit that combining data and content sources requires a team of mad scientists.”
Ow! Talk about taking a vampire bite and becoming a slave to data slag. If a data implementation is not going as it should because the plan is way too slow, Attivio suggests a switch to a unified information access (UIA) strategy. They have to throw that last bit of advertising in there. The poll results show a general lack of understanding when it comes to big data.
Whitney Grace, November 14, 2013
November 10, 2013
The article on SmartData Collective titled “Can You Predict Crowd Behavior? Big Data Can” argues that real-world events like protests and violent conflict are already being successfully predicted, not by historians or economists but by data scientists, specifically those at Recorded Future. We have all heard about Nate Silver’s voting predictions, but according to the article, Recorded Future has taken crowd behavior prediction even further:
“Back in January 2010 a small startup company called Recorded Future released a blog post claiming that Yemen would likely have food shortages and flooding that year. Due to the combination, the country was headed for conflict. By September of that year not only had Yemen experienced flooding but was also combating food shortages… By February 2012, the protests had turned violent with protesters killed by gunmen and the Yemen President suffering severe injuries after a bomb was planted in his compound.”
While we are not sure how this is working out in the real world, with actual events, businesses have certainly embraced the idea that they can sell things to people before the people even know they need them. The problem might be how to avoid creeping the customer out like the expectant mother debacle at Target. Meanwhile the issue of privacy rears its head; apparently it is never too early to start predicting bad behavior.
Chelsea Kerwin, November 10, 2013
November 10, 2013
The article “Gartner Inc. Insists Big Data Is Not Hype, Survey Says Otherwise” on Tools Journal explains some of the findings in the recently released survey from the well-established research agency Gartner. Are azure chip consultants ever wrong? The findings about Big Data are somewhat contradictory, with question marks forming after the number 8, the percentage of survey respondents whose organizations had actually deployed some form of technology connected with Big Data. However, the number for investors (and those planning to invest) is up from last year, now at 64 percent.
The article explains,
“Many analysts feel that 2013 for Big Data was turning out to be the year of experimentation and early deployment. The Gartner study showed 70 per cent of those surveyed had moved past the early knowledge gathering and strategy formation phases and into piloting (44 per cent) and deployment (25 per cent)… Lisa Kart, Research Director at Gartner says in a written statement, ‘The hype around big data continues to drive increased investment and attention, but there is real substance behind the hype.’”
Gartner is hedging its bets on Big Data: its numbers suggest that the challenges associated with the technology have also changed from last year. Then the winner was governance issues, but for 2013 it looks like companies are having trouble knowing how to get value from Big Data and firming up their strategies while developing competent skills.
Chelsea Kerwin, November 10, 2013
November 8, 2013
Talend has already made a name for itself in big data software, but the company wants to continue pushing boundaries with the newest release of its integration platform, says Sys-Con in “Talend Reinforces Leadership In Big Data Integration With Support For YARN.” Version 5.4 does something that no other data integration platform running in Hadoop does: it leverages YARN, aka MapReduce 2.0.
“ ‘At the forefront of a big data paradigm shift, Talend invests heavily in building the integration platform of tomorrow, leveraging the benefits of open source for enterprise clients,’ said Fabrice Bonan, CTO and co-founder of Talend. ‘With the advent of YARN, Hadoop is truly becoming a computing platform that goes well beyond its early use cases. With Talend v5.4, we are providing customers with the tools they need to unleash the power of Hadoop to fully leverage their total data and use it as a strategic asset, for any type of value-added project or application.’”
Talend is stepping up to increase the value of big data projects by continuing to offer features that other companies end up copying. It sets the standard for itself, and so far Talend has been successful in topping its prior releases.
Whitney Grace, November 08, 2013