Data Repository with HDFS and HBase





Overview/Description
Hadoop is an open-source Java framework for processing and querying vast amounts of data on large clusters of commodity hardware. It relies on an active community of contributors from all over the world for its success. In this course, you'll explore the server architecture for Hadoop and learn about the functions and configuration of the daemons that make up the Hadoop Distributed File System (HDFS). You'll also learn about the command line interface and common HDFS administration issues facing all end users. Finally, you'll explore the theory of HBase as another data repository built alongside or on top of HDFS, and basic HBase commands. This learning path can be used as part of the preparation for the Cloudera Certified Administrator for Apache Hadoop (CCA-500) exam.

Target Audience
Technical personnel with a background in Linux, SQL, and programming who intend to join a Hadoop engineering team in roles such as Hadoop developer, data architect, or data engineer, or in roles related to technical project management, cluster operations, or data analysis.

Prerequisites
None

Expected Duration (hours)
2.1

Lesson Objectives

Data Repository with HDFS and HBase

  • start the course
  • configure the replication of data blocks
  • configure the default file system scheme and authority
  • describe the functions of the NameNode
  • recall how the NameNode operates
  • recall how the DataNode maintains data integrity
  • describe the purpose of the CheckPoint Node
  • describe the role of the Backup Node
  • recall the syntax of the file system shell commands
  • use shell commands to manage files
  • use shell commands to provide information about the file system
  • perform common administration functions
  • configure parameters for NameNode and DataNode
  • troubleshoot HDFS errors
  • describe key attributes of NoSQL databases
  • describe the roles of HBase and ZooKeeper
  • install and configure ZooKeeper
  • install and configure HBase
  • use the HBase command line to create tables and insert data
  • manage tables and view the web interface
  • create and change HBase data
  • provide a basic understanding of how the Hadoop Distributed File System functions

Course Number
df_ahec_a03_it_enus

Expertise Level
Intermediate