Spark Streaming


Overview/Description
Spark Streaming leverages Spark's language-integrated API to perform streaming analytics. This design lets you reuse application code written for batch processing, join streams against historical data, and run ad hoc queries on stream state. In this course, you will learn how to work with different input streams, perform transformations on streams, and tune performance.

Target Audience
Programmers and developers familiar with Apache Spark who wish to expand their skill sets

Prerequisites
None

Expected Duration (hours)
2.7

Lesson Objectives

Spark Streaming

  • start the course
  • describe what a DStream is
  • recall how TCP socket input streams are ingested
  • describe how file input streams are read
  • recall how Akka Actor input streams are received
  • describe how Kafka input streams are consumed
  • recall how Flume input streams are ingested
  • set up Kinesis input streams
  • configure Twitter input streams
  • implement custom input streams
  • describe receiver reliability
  • use the updateStateByKey operation
  • perform transform operations
  • perform window operations
  • perform join operations
  • use output operations on streams
  • use DataFrame and SQL operations on streaming data
  • use machine learning algorithms with MLlib
  • persist stream data in memory
  • enable and configure checkpointing
  • deploy applications
  • monitor applications
  • reduce batch processing times
  • set the right batch interval
  • tune memory usage
  • describe fault tolerance semantics

Course Number
df_apsa_a02_it_enus

Expertise Level
Expert