Installation of Hadoop


Overview/Description
Apache Hadoop is an open-source framework for distributed storage and processing of large data sets on commodity hardware. Hadoop enables businesses to quickly gain insight from massive amounts of structured and unstructured data. In this course, you'll follow step-by-step instructions for installing Hadoop in pseudo-distributed mode and learn to troubleshoot installation errors. You'll also learn where the log files are located and more about the Hadoop architecture. This learning path can be used as part of the preparation for the Cloudera Certified Administrator for Apache Hadoop (CCA-500) exam.

Target Audience
Technical personnel with a background in Linux, SQL, and programming who intend to join a Hadoop engineering team in roles such as Hadoop developer, data architect, or data engineer, or in roles related to technical project management, cluster operations, or data analysis.

Prerequisites
None

Expected Duration (hours)
2.5

Lesson Objectives

Installation of Hadoop

  • start the course
  • recall the minimum system requirements for installation
  • configure the start-up shell and yum repositories
  • install the Java Development Kit
  • set up SSH for Hadoop
  • recall why version 2.0 was significant
  • describe the three different installation modes
  • download and install Apache Hadoop
  • configure Hadoop environment variables
  • configure Hadoop HDFS
  • start and stop Hadoop HDFS
  • configure Hadoop YARN and MapReduce
  • start and stop Hadoop YARN
  • validate the installation and configuration
  • recall the structure of the HDFS command
  • recall the importance of the output directory
  • run WordCount
  • recall the ports of the NameNode and ResourceManager web UIs
  • use the NameNode and ResourceManager web UIs
  • describe the best practices for changing configuration files
  • recall some of the most common errors and how to fix them
  • access Hadoop logs and troubleshoot Hadoop installation errors
  • install and configure Hadoop and its associated components

Course Number
df_ahec_a02_it_enus

Expertise Level
Intermediate