Dataflow Autoscaling Pipelines


Overview/Description
Apache Beam, Cloud Dataflow, and Cloud Dataprep can be used to create data pipelines. In this course, you will learn how Apache Beam concepts, the Apache Beam SDKs, Cloud Dataflow, and Cloud Dataprep support pipeline creation and management.

Target Audience
Data professionals responsible for provisioning and optimizing big data solutions, and data enthusiasts getting started with Google Cloud Platform (GCP)

Prerequisites
None

Expected Duration (hours)
0.8

Lesson Objectives

Dataflow Autoscaling Pipelines

  • start the course
  • define Apache Beam concepts and SDKs
  • describe the Python SDK and its connection with data pipelines
  • describe the Java SDK and its connection with data pipelines
  • initialize Cloud Dataprep
  • demonstrate how to ingest data into a pipeline
  • create recipes in a Cloud Dataprep pipeline
  • work with the import/export process and demonstrate how to run Dataflow jobs in Cloud Dataprep
  • describe MapReduce and the benefits of Cloud Dataflow over MapReduce
  • outline serverless architecture and some of the GCP products supporting data analytics
  • describe the use of Apache Beam, Cloud Dataflow, and Cloud Dataprep in GCP to create and manage pipelines

Course Number
cl_gcde_a11_it_enus

Expertise Level
Intermediate