Getting Started with Hive: Viewing and Querying Complex Data



Overview/Description

In this Skillsoft Aspire course, learners explore working with complex data types in Apache Hive. The course assumes previous experience with Hive tables and the Hive query language, and comfort using a command-line interface or Hive client to run queries. Learners begin this 12-video, hands-on course by working with Hive tables whose columns are of complex data types (arrays, maps, and structs). Watch demonstrations of set operations and of transforming complex types into tabular form with the explode operation. Then use lateral views to combine exploded output with data from other columns. Finally, learners observe how to use views to aggregate the contents of multiple columns. Course labs use the Beeline client. The instructor's Beeline terminal runs on the master node of a Hadoop cluster provisioned on Google Cloud Platform using its Dataproc service; learners are assumed to have access to a Hadoop cluster and Beeline, on-premises or in the cloud. By the end of the course, you should be comfortable working with all types of data in Hive and performing analysis tasks on tables with both primitive and complex data types.



Expected Duration (hours)
1.2

Lesson Objectives

Getting Started with Hive: Viewing and Querying Complex Data

  • Course Overview
  • load and access data in the form of arrays
  • work with data in the form of key-value pairs - map data structures in Hive
  • define and use structured data in the form of Hive struct types
  • transform complex data types to a tabular format to facilitate analysis using the explode and posexplode functions
  • combine the results of the explode function with other columns of a table to generate a lateral view
  • flatten multi-dimensional data structures by chaining lateral views
  • use the UNION and UNION ALL operations on table data and distinguish between the two
  • search for values in the results of a subquery using the IN and EXISTS clauses
  • create and load data into tables efficiently by including these operations in a single query
  • define and work with views in Hive to simplify querying and control access to data
  • perform queries and utilize views on complex data types available in Hive

Course Number
it_dsgshvdj_03_enus

Expertise Level
Beginner