Batch Solutions with Hive and Apache Pig


Overview/Description
In this course, you will learn how to implement batch solutions with Hive and Apache Pig. It is one in a series of courses that prepare learners for exam 70-775: Perform Data Engineering on Microsoft Azure HDInsight.

Target Audience
IT professionals who implement and work with big data analytics and engineering workflows and use open-source technologies; IT professionals preparing for Microsoft exam 70-775

Prerequisites
None

Expected Duration (hours)
1.0

Lesson Objectives

Batch Solutions with Hive and Apache Pig

  • start the course
  • describe Apache Hive
  • describe Apache Pig
  • define external Hive tables
  • identify how to load data into Hive tables
  • describe how to improve Hive performance
  • use XML files with Hive
  • describe how to use JSON files with Hive
  • join tables with Hive
  • identify Hive query bottlenecks
  • describe Java UDFs with Hive
  • describe Python UDFs with Hive and Apache Pig
  • design scripts with Apache Pig
  • identify Hive storage formats
  • use Hive tables

Course Number
df_mahd_a04_it_enus

Expertise Level
Intermediate