Data Analysis Using Spark SQL and Hive


Overview/Description
In this course, you will learn how to perform data analysis using Spark SQL and Hive. It is one in a series of courses that prepares learners for exam 70-775: Perform Data Engineering on Microsoft Azure HDInsight.

Target Audience
IT professionals who implement and work with big data analytics and engineering workflows and use open-source technologies; IT professionals preparing for Microsoft exam 70-775

Prerequisites
None

Expected Duration (hours)
0.8

Lesson Objectives

Data Analysis Using Spark SQL and Hive

  • start the course
  • describe Jupyter and Apache Zeppelin
  • merge DataFrames using Spark SQL
  • describe Apache Parquet
  • manage interactive Livy sessions
  • describe what interactive querying is and how it's used with Hive
  • use Ambari Views
  • use HiveQL
  • describe how to parse files such as CSV files with Hive
  • use ORC for caching
  • use Hive tables
  • use Zeppelin to visualize data
  • use Spark SQL for data analysis

Course Number
df_mahd_a07_it_enus

Expertise Level
Intermediate