Apache HBase Fundamentals: Advanced API, Administration, and MapReduce
Overview/Description
Administering Apache HBase is a fundamental skill for anyone working with it. HBase can be managed through the Java client API and can be integrated with MapReduce to perform additional tasks that help obtain maximum performance. This course shows how to implement filters that limit the results returned from a scan operation. It also demonstrates how to administer an HBase cluster and instance and how to perform backup and restore operations, and it covers using MapReduce with HBase.
Target Audience
Administrators and developers who need experience using HBase
Prerequisites
None
Expected Duration (hours)
2.0
Lesson Objectives
start the course
use utility filters that extend the FilterBase class to filter scan results
use comparison filters to limit the scan results using comparison operators and a comparator instance
use custom filters to extend or change the behavior of an existing filter to achieve a more fine-grained control over the scan results
use the HBaseAdmin API to check the status of the master server, connection instance, and the configuration used by the instance
view a list of all the user space tables in HBase and the instance for the table
disable and delete tables from HBase
complete a major compaction using the HBase shell
merge regions in the same table using the Merge utility
stop and decommission a RegionServer
perform a rolling restart on the entire cluster
add a new node to HBase
view metrics to monitor HBase
take a snapshot
use a snapshot to clone a table and move it to another cluster
export and restore a snapshot to another cluster
perform a full shutdown backup of HBase
perform a backup of HBase on a live cluster
restore HBase
use the TableOutputFormat class to set up a table as an output to the MapReduce process using HBase as the data sink
set up a table as an input to a MapReduce process using HBase as the data source
use MapReduce to bulk load data directly into the HBase file system, bypassing the HBase API
use the getSplits method of the TableInputFormatBase class to create custom splitters when using an HBase table as a data source
access other HBase tables from within a MapReduce job by creating a Table instance in the setup method of Mapper
perform HBase cluster and node maintenance
Course Number: df_hbas_a03_it_enus
Expertise Level
Intermediate