Data Silos, Lakes, & Streams: Introduction





Overview/Description

This 11-video course discusses the transition of data warehousing to cloud-based solutions using the AWS (Amazon Web Services) cloud platform. You will examine the implications of storing different types of data from different sources within an organization. To follow along, you should be familiar with provisioning and working with cloud resources, basic big data architecture, distributed systems, and running shell commands at a Linux terminal prompt. You will learn how data silos within an organization can prevent other teams from accessing data. You will learn how data lakes, centralized repositories that store data at scale, offer a viable solution to such silos. You will learn the difference between a data lake, which stores all kinds of raw data in its native format before the data has been processed, and a data warehouse, which contains data that can be used directly to generate business insights. Finally, this course demonstrates storing data with the Amazon Redshift data warehouse.



Expected Duration (hours)
1.3

Lesson Objectives

Data Silos, Lakes, & Streams: Introduction

  • Course Overview
  • recall the characteristics and drawbacks of data silos
  • specify what a data lake enables
  • recognize the advantages of using data lakes to store data
  • describe the architecture of a data lake and identify challenges in its design
  • recall the characteristics of a data warehouse
  • specify the differences between data warehouses and data lakes
  • distinguish between batch and streaming data and recognize the Stream-First Architecture
  • describe how data can be moved from on-premises systems to the AWS cloud platform
  • recognize the technologies used to build data lakes on AWS
  • describe various use cases and architectures of working with data lakes on AWS
  • recall characteristics of data silos, data lakes, and data streams

Course Number
it_dsdslsdj_01_enus

Expertise Level
Intermediate