Hadoop Officially a Big Deal for Big Data

June 26, 2012

Hadoop, our favorite batch-processing data management system, is now more important than ever, InfoWorld reveals in “Hadoop Becomes Critical Cog in the Big Data Machine.” The previous version of Apache’s Hadoop has been adopted by more and more organizations with vast swaths of data to manage. Many users develop their own technologies to complement the Hadoop stack.

Writer Paul Krill details ways NASA, Twitter, Netflix, and Tagged use Hadoop technology, as well as challenges each has faced with the software. Recommended reading for anyone with Hadoop in their lives.

Regarding the upcoming version, the article cites Eric Baldeschwieler, CTO of Hortonworks, a company that has contributed to Hadoop. The write-up tells us:

“Hadoop 2.0 focuses on scale and innovation, with Yarn (next-generation MapReduce) and federation capabilities. Yarn will let users add their own compute models so that they do not have to stick to MapReduce. ‘We’re really looking forward to the community inventing many new ways of using Hadoop,’ Baldeschwieler says. Expected uses include real-time applications and machine-learning algorithms. Scalable, pluggable storage is planned also. Always-on capabilities in Version 2.0 will enable clusters with no downtime. Scalable storage is planned as well.”

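Notice that Yarn is billed as next-generation MapReduce, not simply a new name for it: the layer has been rewritten so that MapReduce becomes one compute model among others that users can plug in. Expect Hadoop 2.0 to be generally available within the year.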

Cynthia Murrell, June 26, 2012

Sponsored by PolySpot
