Data Pipeline: Using Frameworks for Advanced Data Management





Overview/Description

Discover how to implement data pipelines with Python Luigi, integrate Spark and Tableau to manage data pipelines, use Dask arrays, and build data pipeline visualizations with Python in this 10-video course. Begin with the features of Celery and Luigi that can be used to set up data pipelines, then implement Python Luigi to set one up. Next, turn to the Dask library: review the essential features it provides for task scheduling and big data collections, then implement Dask arrays, which expose much of the NumPy application programming interface (API). Explore frameworks that can be used for data exploration and visualization in data pipelines, and integrate Spark and Tableau to manage data pipelines. Move on to using Python to build visualizations for streaming data. Then learn about the data pipeline building capabilities provided by Kafka, Spark, and PySpark. The concluding exercise involves setting up Luigi to implement data pipelines, integrating Spark and Tableau, and building visualizations for data pipelines with Python.



Expected Duration (hours)
0.6

Lesson Objectives

Data Pipeline: Using Frameworks for Advanced Data Management

  • Course Overview
  • recognize the features of Celery and Luigi that can be used to set up data pipelines
  • implement Python Luigi in order to set up data pipelines
  • list Dask task scheduling and big data collection features
  • implement Dask arrays in order to manage NumPy APIs
  • list frameworks that can be used to implement data exploration and visualization in data pipelines
  • integrate Spark and Tableau to manage data pipelines
  • use Python to build visualizations for streaming data
  • recognize the data pipeline building capabilities provided by Kafka, Spark, and PySpark
  • set up Luigi to implement data pipelines, integrate Spark and Tableau for data pipeline management, and build visualizations for data pipelines using Python

Course Number
it_dsdptbdj_02_enus

Expertise Level
Intermediate