A SASsy Hadoop Data Connection

January 2, 2015

It has been a while since we posted an article that highlights Hadoop’s capabilities and benefits. The SAS Data Management blog talks about how data sources are increasing and Hadoop can help companies organize and use their data: “The Snap, Crackle, And Pop Of Data Management On Hadoop.”

SAS is a leading provider of data management solutions, including an entire line based on the open source Hadoop software. The company offers several ways to control data, including the FROM, WITH, and IN options. While the names are simple, each one sums up its process in a single word.

SAS FROM allows users to connect to the Hadoop cluster. It connects to Hadoop using a SAS/ACCESS engine, which collects metadata built in Hadoop and makes it available in the data flows. This allows the software to make performance decisions without user intervention.

SAS WITH is more complicated because of its give-and-take function:

“The SAS WITH story provides transformation capabilities not yet available in Hadoop. UPDATE and DELETE are standard SQL transformations used in a variety of data processing programs. Hive does not yet support these functions, but you can utilize PROC IMSTAT (part of the WITH story) to lift a table or partition into memory and perform these functions in parallel. The table or partition could then be reincorporated into the Hive table, alleviating the need to truncate and reload from an RDBMS data source.”
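The lift, modify, and reincorporate pattern described in the quote can be pictured with a rough sketch in plain Python. This is not SAS code and does not use PROC IMSTAT or Hive; the table layout, the function name rewrite_partition, and its parameters are assumptions made up purely for illustration.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Table = Dict[str, List[Row]]  # hypothetical layout: partition key -> rows


def rewrite_partition(
    table: Table,
    partition: str,
    update: Callable[[Row], Row],
    keep: Callable[[Row], bool],
) -> Table:
    """Return a new table in which one partition has been rewritten."""
    # "Lift" the partition: copy its rows so the stored table is untouched.
    in_memory = [dict(row) for row in table[partition]]
    # DELETE is modeled as dropping rows that fail `keep`;
    # UPDATE is modeled as applying `update` to the rows that remain.
    in_memory = [update(row) for row in in_memory if keep(row)]
    # "Reincorporate": swap the rewritten partition back in, leaving
    # every other partition as it was.
    new_table = dict(table)
    new_table[partition] = in_memory
    return new_table
```

Only the affected partition is copied and replaced, which mirrors the point of the quote: there is no need to truncate and reload the whole table from another source.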

SAS IN has the most advanced coding capabilities for data management. It allows users to run a program in which eight functions execute in parallel against Hadoop data tables. They can also use the DS2 language to perform difficult transformations of a table in parallel.

SAS’s three new Hadoop interactions streamline data from multiple sources and provide more insight into industry applications.

Whitney Grace, January 02, 2015
Sponsored by ArnoldIT.com, developer of Augmentext
