The Learning Edge

The Future of BIG Data – Hadoop 2.0 & Yarn

This Hadoop 2.0 and YARN tutorial walks through the pros and cons of Hadoop 2.0 and YARN, covering the fundamentals and advantages of Hadoop 2.X and a comprehensive comparison of Hadoop 1.X and Hadoop 2.X. At the end, you’ll know the future of BIG Data.

Session Agenda:

✓ What is Big Data?
✓ What is Hadoop 2.0? Why is it important?
✓ Understanding Hadoop 2.X features and Hadoop 2.0 Architecture
✓ Understanding YARN (Yet Another Resource Negotiator)
✓ Questions & Answers

What is BIG Data? – Introduction to BIG Data

Big data is the term for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications.

What is Hadoop 2.0? – Introduction to Hadoop 2.0

Hadoop 2.0 decouples resource management from the MapReduce infrastructure, which makes the 2.0 version a generational shift in the architecture of Hadoop. With YARN (Yet Another Resource Negotiator), Hadoop was recast from a system tied to MapReduce into a much stronger platform in which HDFS is the file system and YARN is the operating system. YARN in Hadoop 2.0 allows multiple applications, not only MapReduce jobs, to run on the same platform.

What are Hadoop 2.0 Components?

ResourceManager: The ResourceManager and the NodeManager form the basis for managing applications in a distributed manner. The ResourceManager is responsible for distributing the available resources among the applications.

NodeManager: The NodeManager is the per-machine slave responsible for launching the applications’ containers. It monitors their usage of CPU, memory, disk and network and reports it back to the ResourceManager.

ApplicationMaster: It is a framework-specific entity. On one side it communicates with the ResourceManager and on the other with the NodeManagers. It negotiates resources from the ResourceManager and it works with the NodeManagers to execute and monitor the component tasks.
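The way these three components divide the work can be sketched with a small toy model. This is not the YARN API: the class and method names below (allocate, launch, report, run_tasks), the memory-only resource model and the first-fit placement are all assumptions made for illustration, and real YARN schedulers are far more elaborate.

```python
from dataclasses import dataclass, field


@dataclass
class NodeManager:
    """Toy per-machine agent: launches containers and reports usage."""
    name: str
    total_mb: int
    used_mb: int = 0
    containers: list = field(default_factory=list)

    def free_mb(self) -> int:
        return self.total_mb - self.used_mb

    def launch(self, app_id: str, mb: int) -> str:
        # Start a container for an application on this machine.
        self.used_mb += mb
        container_id = f"{app_id}@{self.name}#{len(self.containers)}"
        self.containers.append(container_id)
        return container_id

    def report(self) -> dict:
        # Usage report sent back to the ResourceManager.
        return {"node": self.name, "used_mb": self.used_mb,
                "free_mb": self.free_mb()}


class ResourceManager:
    """Toy cluster-wide arbiter: hands out available resources."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def allocate(self, mb: int):
        # Hypothetical first-fit policy: pick the first node with room.
        for node in self.nodes:
            if node.free_mb() >= mb:
                return node
        return None

    def cluster_report(self) -> list:
        return [node.report() for node in self.nodes]


class ApplicationMaster:
    """Toy per-application entity: asks the RM, then works with NMs."""

    def __init__(self, app_id: str, rm: ResourceManager):
        self.app_id = app_id
        self.rm = rm

    def run_tasks(self, num_tasks: int, mb_per_task: int) -> list:
        launched = []
        for _ in range(num_tasks):
            node = self.rm.allocate(mb_per_task)
            if node is None:
                break  # cluster is full; a real AM would wait and retry
            launched.append(node.launch(self.app_id, mb_per_task))
        return launched


# Two different applications sharing the same two-node cluster.
rm = ResourceManager([NodeManager("node1", 4096), NodeManager("node2", 4096)])
mr_job = ApplicationMaster("mapreduce-job", rm).run_tasks(3, 1024)
other_app = ApplicationMaster("other-app", rm).run_tasks(6, 1024)
print(len(mr_job), len(other_app))  # 3 5
print(rm.cluster_report())
```

In this sketch the second application asks for six containers but gets only five, because the ResourceManager has nothing left to hand out once both nodes are full.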

Important Hadoop 2.0 Features:

1. Scalability
2. New programming models and services
3. Improved cluster utilization
4. Agility
5. Beyond Java
6. Support for running Hadoop on MS Windows
7. Binary Compatibility for MapReduce applications built on Hadoop 1.0
8. Ample integration with the other Hadoop projects in the Hadoop Ecosystem
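One common explanation for the improved cluster utilization in item 3 is that Hadoop 1.X gave each worker a fixed number of map slots and reduce slots, while YARN hands out generic containers that any task can use. The arithmetic below is a minimal sketch of that idea; the function names, the 6/2 slot split and the assumption that all containers are the same size are hypothetical and chosen only for illustration.

```python
def slots_in_use(map_slots, reduce_slots, pending_maps, pending_reduces):
    """Hadoop 1.X style: a map slot runs only map tasks, a reduce slot only reduce tasks."""
    return min(map_slots, pending_maps) + min(reduce_slots, pending_reduces)


def containers_in_use(total_containers, pending_maps, pending_reduces):
    """YARN style: any free container can run any pending task."""
    return min(total_containers, pending_maps + pending_reduces)


# Hypothetical worker with room for 8 tasks, during a phase with only map work.
hadoop1_busy = slots_in_use(6, 2, pending_maps=20, pending_reduces=0)
yarn_busy = containers_in_use(8, pending_maps=20, pending_reduces=0)
print(hadoop1_busy, yarn_busy)  # 6 8
```

Under these assumptions the two reduce slots sit idle in the slot-based model while map tasks are still waiting, whereas the container-based model keeps all eight busy.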

What is YARN? – Introduction to YARN

YARN is one of the key features in the second-generation Hadoop 2 version of the Apache Software Foundation’s open source distributed processing framework. Originally described by Apache as a redesigned resource manager, YARN is now characterized as a large-scale, distributed operating system for big data applications.

Skillspeed is a live e-learning company focusing on high-technology courses. We provide live instructor-led training in BIG Data & Hadoop featuring 24/7 Lifetime Support, 100% Placement Assistance & Real-time Projects.
