Data Factory with Oozie and Hue


Overview/Description
The Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than relying on hardware to deliver high availability, the library itself is designed to detect and handle failures at the application layer, thereby delivering a highly available service on top of a cluster of computers, each of which may be prone to failure. This course explains Oozie, a workflow tool used to manage multistage tasks in Hadoop. You'll also learn how to use Hue, a browser-based front-end tool. This learning path can be used as part of the preparation for the Cloudera Certified Administrator for Apache Hadoop (CCA-500) exam.

Target Audience
Technical personnel with a background in Linux, SQL, and programming who intend to join a Hadoop engineering team in roles such as Hadoop developer, data architect, or data engineer, or in roles related to technical project management, cluster operations, or data analysis

Prerequisites
None

Expected Duration (hours)
2.7

Lesson Objectives

Data Factory with Oozie and Hue

  • start the course
  • describe the Hive metastore and HiveServer2
  • install and configure the Hive metastore
  • install and configure HiveServer2
  • describe HCatalog
  • install and configure WebHCat
  • use HCatalog to flow data
  • recall the Oozie terminology
  • recall the two categories of environment variables for configuring Oozie
  • install Oozie
  • configure Oozie
  • configure Oozie to use MySQL
  • enable the Oozie Web Console
  • describe Oozie workflows
  • submit an Oozie workflow job
  • create an Oozie workflow
  • run an Oozie workflow job
  • describe Hue
  • recall the configuration files that must be edited
  • install Hue
  • configure the hue.ini file
  • install and configure Hue on MySQL
  • use the Hue File Browser and Job Scheduler
  • configure Hive daemons, Oozie, and Hue

Course Number
df_ahec_a09_it_enus

Expertise Level
Intermediate