Apache Hadoop became popular for its specialization in executing MapReduce programs. However, it was hard to leverage existing Hadoop infrastructure for other processing paradigms such as real-time streaming, graph processing, and message passing. That changed with the introduction of Apache Hadoop YARN in Apache Hadoop 2.0. YARN supports running arbitrary processing paradigms on the same Hadoop cluster. This allows for the development of newer frameworks, as well as more efficient implementations of existing ones, all of which can run on and share the resources of a single multi-tenant YARN cluster. This talk gives a brief introduction to YARN. We will illustrate how to create applications and how to make the best use of YARN. We will show examples of applications that leverage YARN, such as Apache Tez and Apache Samza, and present best practices and guidelines for building applications on top of Apache Hadoop YARN.