Dataproc Architecture




Overview/Description
Dataproc can run a range of big data workloads and works with Hadoop ecosystem tools such as Pig and Hive. This course looks more closely at Dataproc architecture and introduces the use of Pig and Hive.

Target Audience
Data professionals who are responsible for provisioning and optimizing big data solutions, and data enthusiasts getting started with Google Cloud Platform

Prerequisites
None

Expected Duration (hours)
0.8

Lesson Objectives

Dataproc Architecture

  • start the course
  • describe how to create a cluster with the Dataproc CLI
  • recognize implementations using the Dataproc REST API
  • describe the various Dataproc architecture types in GCP and common use cases
  • define Dataproc machine types and their uses
  • configure a custom machine type
  • describe how and when to execute Dataproc jobs
  • recognize connections between Apache Hadoop HDFS and Cloud Storage
  • describe the use of Pig and Hive
  • configure and execute a job using Pig and Hive with Dataproc
  • recall concepts of Dataproc jobs, including implementation of Pig and Hive

Course Number
cl_gcde_a06_it_enus

Expertise Level
Intermediate