Apache Sparking Big Data

April 3, 2015

Apache Spark is an open source cluster computing framework that rivals MapReduce.  VentureBeat says that people did not pay much attention to Apache Spark when it was first invented at the University of California’s AMPLab in 2011.  The article, “How An Early Bet On Apache Spark Paid Off Big,” reports that big data open source supporters are adopting Apache Spark because of its superior capabilities.

People with big data plans want systems that process real-time information quickly and handle a large volume of it at once.  MapReduce can do this, but it was not designed for it.  It is adequate for batch processing, but it is slow and much too complex to be a viable solution.

“When we saw Spark in action at the AMPLab, it was architecturally everything we hoped it would be: distributed, in-memory data processing speed at scale. We recognized we’d have to fill in holes and make it commercially viable for mainstream analytics use cases that demand fast time-to-insight on hordes of data. By partnering with AMPLab, we dug in, prototyped the solution, and added the second pillar needed for next-generation data analytics, a simple to use front-end application.”

ClearStory Data was built using Apache Spark to access data quickly, deliver key insights, and make the UI very user friendly.  People who use Apache Spark want information from a variety of sources immediately so they can put it to use for profit.  Apache Spark might ignite the fire for the next wave of data analytics for big data.

Whitney Grace, April 3, 2015
Stephen E Arnold, Publisher of CyberOSINT at www.xenky.com
