Apache HBase Fundamentals: Advanced API, Administration, and MapReduce


Overview/Description
Administering Apache HBase is a core skill for anyone who runs or develops against a cluster. HBase can be managed through the Java client API and can be integrated with MapReduce for additional tasks, such as bulk loading data. This course shows how to implement filters that limit the results returned from a scan operation. It also demonstrates how to administer the HBase cluster and instance and how to perform backup and restore operations. Using MapReduce with HBase is also covered.

Target Audience
Administrators and developers who need experience using HBase

Prerequisites
None

Expected Duration (hours)
2.0

Lesson Objectives

Apache HBase Fundamentals: Advanced API, Administration, and MapReduce

  • start the course
  • use utility filters that extend the FilterBase class to filter scan results
  • use comparison filters to limit the scan results using a comparison operator and a comparator instance
  • use custom filters to extend or change the behavior of an existing filter to achieve a more fine-grained control over the scan results
  • use the HBaseAdmin API to check the status of the master server, connection instance, and the configuration used by the instance
  • view a list of all the user-space tables in HBase and the instance for each table
  • disable and delete tables from HBase
  • complete a major compaction using the HBase shell
  • merge regions in the same table using the Merge utility
  • stop and decommission a RegionServer
  • perform a rolling restart on the entire cluster
  • add a new node to HBase
  • view metrics to monitor HBase
  • take a snapshot
  • use a snapshot to clone a table and move it to another cluster
  • export and restore a snapshot to another cluster
  • perform a full shutdown backup of HBase
  • perform a backup of HBase on a live cluster
  • restore HBase
  • use the TableOutputFormat class to set up a table as the output of a MapReduce process, using HBase as the data sink
  • set up a table as the input to a MapReduce process, using HBase as the data source
  • use MapReduce to bulk load data directly into the HBase file system, bypassing the HBase API
  • use the getSplits method of the TableInputFormatBase class to create custom splitters when using an HBase table as a data source
  • access other HBase tables from within a MapReduce job by creating a Table instance in the setup method of Mapper
  • perform HBase cluster and node maintenance

Course Number
df_hbas_a03_it_enus

Expertise Level
Intermediate