Accessing Data with Spark: Data Analysis using Spark SQL


Overview/Description

Analyze an Apache Spark DataFrame as though it were a relational database table. During this Aspire course, you will discover the different stages involved in optimizing any query or method call on the contents of a Spark DataFrame. You will also discover how to create views out of a Spark DataFrame's contents and run queries against them, and how to trim and clean a DataFrame. Next, learn how to analyze data by running different SQL queries, how to configure a DataFrame with an explicitly defined schema, and what a window is in the context of Spark. Finally, observe how to create and analyze categories of data in a dataset by using windows.



Expected Duration (hours)
0.9

Lesson Objectives

  • Course Overview
  • recall the different stages involved in optimizing any query or method call on the contents of a Spark DataFrame
  • create views out of a Spark DataFrame's contents and run queries against them
  • trim and clean a DataFrame before a view is created as a precursor to running SQL queries on it
  • perform an analysis of data by running different kinds of SQL queries, including grouping and aggregations
  • recognize how Spark DataFrames infer the schema of data loaded into them and configure a DataFrame with an explicitly defined schema
  • define what a window is in the context of Spark DataFrames and when it can be used
  • create and analyze categories of data in a dataset using windows
  • analyze data using Spark SQL

Course Number
it_dsadskdj_03_enus

Expertise Level
Intermediate