Capacity Management for Hadoop Clusters



Overview/Description
Apache Hadoop is an open-source software framework for storage and large-scale processing of data sets on clusters of commodity hardware. This course focuses on capacity management for Hadoop clusters. You will be introduced to the concepts of resource management through scheduling, learn how to use the Fair Scheduler, and learn how to plan for scaling. This learning path can be used as part of the preparation for the Cloudera Certified Administrator for Apache Hadoop (CCA-500) exam.

Target Audience
Administrators looking to add to their knowledge of capacity management for Hadoop clusters

Prerequisites
None

Expected Duration (hours)
2.2

Lesson Objectives

Capacity Management for Hadoop Clusters

  • start the course
  • compare availability versus performance
  • describe different strategies for resource capacity management
  • describe how schedulers manage resources
  • set quotas for the HDFS file system
  • recall how to set the maximum and minimum memory allocations per container
  • describe how the fair scheduling method gives all applications, on average, an equal share of resources over time
  • describe the primary algorithm and the configuration files for the Fair Scheduler
  • describe the default behavior of the Fair Scheduler methods
  • monitor the behavior of Fair Share
  • describe the policy for single resource fairness
  • describe how resources are distributed over the total capacity
  • identify different configuration options for single resource fairness
  • configure single resource fairness
  • describe the minimum share function of the Fair Scheduler
  • configure minimum share on the Fair Scheduler
  • describe the preemption functions of the Fair Scheduler
  • configure preemption for the Fair Scheduler
  • describe dominant resource fairness
  • write service levels for performance
  • use the Fair Scheduler with multiple users

Course Number
df_ahop_a08_it_enus

Expertise Level
Expert