Hadoop Cluster Availability


Overview/Description
When examining Hadoop availability, it is important not to focus solely on the NameNode. The tendency is understandable, since the NameNode is the single point of failure for HDFS and many components in the ecosystem rely on HDFS, but Hadoop availability is a broader issue. This course examines availability and failure recovery for the NameNode, DataNode, HDFS, and YARN. This learning path can be used as part of the preparation for the Cloudera Certified Administrator for Apache Hadoop (CCA-500) exam.

Target Audience
Developers interested in expanding their knowledge of Hadoop from the operations perspective

Prerequisites
None

Expected Duration (hours)
2.8

Lesson Objectives

Hadoop Cluster Availability

  • start the course
  • describe how Hadoop leverages fault tolerance
  • recall the most common causes for NameNode failure
  • recall the uses for the Checkpoint node
  • test the availability for the NameNode
  • describe the operation of the NameNode during a recovery
  • swap to a new NameNode
  • recall the most common causes for DataNode failure
  • test the availability for the DataNode
  • describe the operation of the DataNode during a recovery
  • set up the DataNode for replication
  • identify and recover from a missing data block scenario
  • describe the functions of Hadoop high availability
  • edit the Hadoop configuration files for high availability
  • set up a high availability solution for NameNode
  • recall the requirements for enabling an automated failover for the NameNode
  • create an automated failover for the NameNode
  • recall the most common causes for YARN task failure
  • describe the functions of YARN containers
  • test YARN container reliability
  • recall the most common causes of YARN job failure
  • test application reliability
  • describe the system view of the Resource Manager configurations set for high availability
  • set up high availability for the Resource Manager
  • move the Resource Manager HA to alternate master servers

Course Number
df_ahop_a04_it_enus

Expertise Level
Intermediate