Apache Spark fully enabled for YARN, Hadoop: Hortonworks

Hortonworks announced that Apache Spark, a new high-performance analytics engine, is fully enabled for the YARN resource management technology.

This means that Apache Spark can be deployed alongside other engines on the same Hadoop cluster.

"Spark is now natively integrated into Hadoop, so its resources -- CPU, memory, and so on -- can be managed along with the other workloads running on a Hadoop cluster," Hortonworks vice president for corporate strategy, Shaun Connolly said.

"That's important to get right because Spark is memory- and CPU-intensive, and you don't want to have to have siloed clusters dedicated to running those workloads."

Connolly also noted that YARN and Hadoop 2.0 were developed so that several workloads could run at once over the same sets of data.

Apache Spark 1.0.0 was released at the end of last month, promoted as a much faster engine than MapReduce for large-scale data processing.

MapReduce is currently the more popular processing engine for the Hadoop software framework. Apache Spark aims to replace it while also supporting more specialized applications.

Hortonworks also said that it is collaborating with Databricks, a company founded by the developers of Apache Spark. The collaboration is meant to ensure that new applications and tools built on Apache Spark will be compatible with all of its implementations.

"We're working to ensure that Apache Spark and its APIs and applications maintain a level of compatibility, so as we deliver Spark in our Hortonworks Data Platform, any applications will be able to run on ours as well as any other platform that includes the technology," Connolly said.

Hortonworks bills the Hortonworks Data Platform as the only complete and fully open source enterprise Hadoop platform.

The Apache Spark HDP 2.1 Tech Preview Component is now available as a free download and can be installed on the current distribution of HDP 2.0. Hortonworks expects the HDP 2.1 release, which will include Apache Spark, to gain production-use certification within a few months, according to Connolly.

The company said that it will attend the upcoming Spark Summit, which runs from June 30 to July 2 in San Francisco. Hortonworks representatives will discuss how Apache Spark can be applied and how it can be made simpler for enterprises to use.

ⓒ 2024 TECHTIMES.com All rights reserved. Do not reproduce without permission.