Apache Hadoop
Overview/Description
Apache Hadoop is an open-source software framework for distributed storage and distributed processing of very large data sets on computer clusters built from commodity hardware. This course introduces the basic concepts of cloud computing with Apache Hadoop, Big Data, and the associated development tools.
Target Audience
This path is designed for developers, managers, database developers, and anyone interested in learning the basics of Hadoop, or cloud computing in general.
Prerequisites
None
Expected Duration (hours)
2.0
Lesson Objectives
Apache Hadoop
start the course
describe the basics of Hadoop
identify the major users of Hadoop, the end-user application, and the result
identify the characteristics of Big Data
compare and contrast the traditional data sources and Big Data sources
describe the clustering and distributed computing concepts of Hadoop
specify low-cost commodity servers for Big Data and their configuration as nodes in small- and large-scale Hadoop installations
describe Hadoop installation requirements
troubleshoot Hadoop installation issues
configure Hadoop installation
identify the features of third party Hadoop distributions
describe the creation and evolution of Hadoop and its related projects
describe the use of YARN in Hadoop cluster management
describe the components and functions of Hadoop
compare and contrast the different types of Hadoop data
describe the four different types of NoSQL cloud databases
describe the basics of the Hadoop Distributed File System
describe HDFS and basic HDFS navigation operations
perform file operations such as add and delete within HDFS
describe the basic principles of MapReduce and general mapping issues
specify the use of Pig and Hive in Hadoop MapReduce jobs
describe the use of MapReduce, including the MapReduce lifecycle, job client, JobTracker, TaskTracker, map tasks, and reduce tasks
describe how Hadoop MapReduce handles and processes data, and the vocabulary of the MapReduce dataflow process
describe the process of mapping and reducing
describe the basic principles and uses of Hadoop
Course Number
df_ahmr_a01_it_enus
Expertise Level
Beginner