Apache Hadoop


Overview/Description
Target Audience
Prerequisites
Expected Duration
Lesson Objectives
Course Number
Expertise Level



Overview/Description
Apache Hadoop is a framework for distributed storage and distributed processing of very large data sets on computer clusters built from commodity hardware. This course introduces the basic concepts of Apache Hadoop, cloud computing, Big Data, and the development tools applied.

Target Audience
This course is designed for developers, managers, database developers, and anyone interested in learning the basics of Hadoop, or cloud computing in general.

Prerequisites
None

Expected Duration (hours)
2.0

Lesson Objectives

Apache Hadoop

  • start the course
  • describe the basics of Hadoop
  • identify the major users of Hadoop, their end-user applications, and the results
  • identify the characteristics of Big Data
  • compare and contrast the traditional data sources and Big Data sources
  • describe the clustering and distributed computing concepts of Hadoop
  • specify the use of low-cost commodity servers in Big Data and their configuration as nodes in small and large-scale Hadoop installations
  • describe Hadoop installation requirements
  • troubleshoot Hadoop installation issues
  • configure Hadoop installation
  • identify the features of third party Hadoop distributions
  • describe the creation and evolution of Hadoop and its related projects
  • describe the use of YARN in Hadoop cluster management
  • describe the components and functions of Hadoop
  • compare and contrast the different types of Hadoop data
  • describe the four different types of NoSQL cloud databases
  • describe the basics of the Hadoop Distributed File System
  • describe HDFS and basic HDFS navigation operations
  • perform file operations such as add and delete within HDFS
  • describe the basic principles of MapReduce and general mapping issues
  • specify the use of Pig and Hive in Hadoop MapReduce jobs
  • describe the use of MapReduce, MapReduce lifecycle, job client, job tracker, task tracker, map tasks, and reduce tasks
  • describe how Hadoop MapReduce handles and processes data, and the vocabulary of the MapReduce dataflow process
  • describe the process of mapping and reducing
  • describe the basic principles and uses of Hadoop
Course Number
df_ahmr_a01_it_enus

Expertise Level
Beginner
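The mapping and reducing principles named in the objectives can be illustrated with a minimal sketch. This is a hypothetical, pure-Python example (no Hadoop cluster required) showing the map, shuffle, and reduce phases of a word-count job; the function names `map_phase`, `shuffle`, and `reduce_phase` are illustrative, not part of any Hadoop API.

```python
# Minimal sketch of the MapReduce word-count pattern in plain Python.
# Function names are illustrative only, not Hadoop API calls.
from collections import defaultdict

def map_phase(line):
    # Map task: emit a (word, 1) pair for every word in the input line.
    return [(word, 1) for word in line.split()]

def shuffle(pairs):
    # Shuffle step: group values by key, as the framework does
    # between map tasks and reduce tasks.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    # Reduce task: sum the counts emitted for one word.
    return key, sum(values)

lines = ["big data big clusters", "big data"]
mapped = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)  # {'big': 3, 'data': 2, 'clusters': 1}
```

In a real Hadoop job the map and reduce tasks run in parallel across cluster nodes and the shuffle moves data over the network, but the dataflow vocabulary is the same.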